Machine learning: Why is the target class included in both arrays in train_test_split?

machine-learning scikit-learn

In this train_test_split example, result is a DataFrame and y_true is a NumPy array built from the DataFrame's target class column.

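My question: if y_true is already passed separately, why do we pass the entire result DataFrame as one of the inputs to train_test_split? In other words, shouldn't we first drop the target class column from the result DataFrame?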

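Scikit-learn supports pandas, but pandas is not required. With NumPy arrays it does not always make sense to keep features and labels in the same array, which is why train_test_split is designed the way it is: it splits every array you pass along the same rows and has no notion of which column is the target. It is therefore up to you to make sure the result DataFrame and its splits have the format you need. If y_true is part of the result DataFrame, you can (and should) exclude it, either before or after the function call.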
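The two options can be sketched as follows. This is a minimal illustration, not the asker's actual data: the toy result DataFrame, the column names feature_a, feature_b and target, and the random_state value are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for the `result` DataFrame.
result = pd.DataFrame({
    "feature_a": np.arange(20),
    "feature_b": np.arange(20) * 0.5,
    "target": [0, 1] * 10,  # assumed name of the target class column
})
y_true = result["target"].to_numpy()

# Option 1: drop the target column before splitting.
X = result.drop(columns=["target"])
X_train, X_test, y_train, y_test = train_test_split(
    X, y_true, stratify=y_true, test_size=0.2, random_state=0
)

# Option 2: split the full DataFrame, then drop the target from each part.
train_df, test_df, y_train2, y_test2 = train_test_split(
    result, y_true, stratify=y_true, test_size=0.2, random_state=0
)
X_train2 = train_df.drop(columns=["target"])
X_test2 = test_df.drop(columns=["target"])
```

Either way, the feature matrices passed to a model no longer contain the label. If the target column is left in, the model can read the answer directly from its inputs, which makes any evaluation score meaningless.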

from sklearn.model_selection import train_test_split
X_train, test_df, y_train, y_test = train_test_split(result, y_true, stratify=y_true, test_size=0.2)