What does random_state mean in the decision tree algorithm? - 慕课网

1 answer

liuyubobobo 2018-03-19 16:36:30

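In sklearn's decision tree implementation, the features are visited in a shuffled order when searching for the best split, that is, the split that most improves the criterion (for example, the largest reduction in entropy). When several candidate splits tie for the best improvement, which is very common for data with many 0/1 binary attributes, the split that ends up being chosen can therefore differ from run to run. One effect is that the tree tends to spread its splits across different features instead of always favouring the same one when there is a tie. See the following sentence in the sklearn documentation: http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html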

The features are always randomly permuted at each split. Therefore, the best found split may vary, even with the same training data and max_features=n_features, if the improvement of the criterion is identical for several splits enumerated during the search of the best split. To obtain a deterministic behaviour during fitting, random_state has to be fixed.
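To make the quoted behaviour concrete, here is a minimal sketch. The toy data and the helper name `root_feature` are made up for illustration: the two columns are identical, so a split on either one improves the criterion by exactly the same amount. With a fixed `random_state` the fit is reproducible. With different seeds, the tied feature chosen at the root may differ, but need not.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy data (an assumption for illustration): two identical binary columns,
# so splitting on feature 0 or on feature 1 is an exact tie.
X = np.array([[0, 0], [0, 0], [1, 1], [1, 1]])
y = np.array([0, 0, 1, 1])


def root_feature(seed):
    """Fit a tree and return the index of the feature used at the root split."""
    tree = DecisionTreeClassifier(criterion="entropy", random_state=seed)
    tree.fit(X, y)
    return int(tree.tree_.feature[0])  # node 0 is the root


# Same seed: the same feature is chosen at the root on every fit.
same_seed_roots = {root_feature(0) for _ in range(5)}

# Different seeds: the tie may be broken differently from seed to seed.
roots_across_seeds = {root_feature(seed) for seed in range(20)}

print("root feature with random_state=0:", same_seed_roots)
print("root features seen across seeds 0..19:", roots_across_seeds)
```

Either root choice gives a tree that fits this data perfectly, so the difference only matters when you need the fitted tree itself to be reproducible.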

In addition, the splitter parameter of sklearn's DecisionTreeClassifier can be set to "random", which introduces a further source of randomness.
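As a rough sketch of the `splitter="random"` case, the snippet below fits the classifier on the iris dataset (chosen here only as a convenient example, not taken from the original answer) and looks at the root split. The helper `root_split` is hypothetical. Fixing `random_state` still makes the fit reproducible, while different seeds will usually produce different thresholds.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)


def root_split(seed):
    """Return (feature index, threshold) of the root split for a given seed."""
    tree = DecisionTreeClassifier(splitter="random", random_state=seed)
    tree.fit(X, y)
    return int(tree.tree_.feature[0]), float(tree.tree_.threshold[0])


# With a fixed seed, the randomly drawn split is the same on every fit.
first = root_split(42)
second = root_split(42)
print("seed 42, two fits:", first, second)

# Across seeds, the root split usually changes.
for seed in range(3):
    print("seed", seed, "->", root_split(seed))
```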


Asker 烈焰卡卡 #1, 2018-03-19 16:41:02
Thank you very much!


