Python fold indices for external cross-validation (python, pandas, cross-validation)

I have a pandas dataframe df containing data from two classes. I want to randomly generate indices for stratified K-fold cross-validation.

What I currently do is:

import numpy as np
import pandas as pd

df_folds = np.array_split(df, 5)
for k in range(5):
    # Copy the list of folds so that popping does not modify df_folds
    df_train = list(df_folds)
    df_test  = df_train.pop(k)
    df_train = pd.concat(df_train)
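However, this is not a stratified 5-fold cross-validation, because it simply splits the dataframe into 5 consecutive parts without regard to the class labels.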
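
To illustrate why a plain split is not stratified, here is a small sketch on hypothetical toy data (the column names `x` and `label` and the values are assumptions made for illustration, not taken from the question). It splits row positions rather than the dataframe itself, which yields the same contiguous chunks:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, sorted by class: 15 rows of class 0, then 5 of class 1.
df = pd.DataFrame({"x": np.arange(20), "label": [0] * 15 + [1] * 5})

# Split row positions into 5 contiguous chunks, mirroring np.array_split(df, 5).
chunks = np.array_split(np.arange(len(df)), 5)

# Fraction of class 1 in each chunk; a stratified split would give 0.25 everywhere.
fractions = [float(df["label"].iloc[idx].mean()) for idx in chunks]
print(fractions)
```

On this toy data the first chunks contain no class 1 rows at all, while the last chunk contains only class 1 rows. Shuffling the rows first would reduce the imbalance on average but would still not guarantee equal class proportions in every fold.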

from sklearn.model_selection import StratifiedKFold
skf = StratifiedKFold(n_splits=3)
skf.get_n_splits(df)

print(skf)  

for train_index, test_index in skf.split(df):
   print("TRAIN:", train_index, "TEST:", test_index)

TypeError: split() takes at least 3 arguments (2 given)

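sklearn already provides this: have you tried it?

I could not get it to work with a pandas DataFrame.

Please show the failing code in your question, since sklearn is compatible with pandas dataframes.

@EdChum Please see the code I tried and the error.

The documentation shows that 2 arguments are required: you need to pass the columns containing the data, followed by the column containing the class labels: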
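
A minimal sketch of what the last comment describes, assuming a hypothetical dataframe with two feature columns `x1` and `x2` and a class column named `label` (these names and the toy data are illustrative assumptions, not from the question). `StratifiedKFold.split` is given both the data and the class labels, and yields integer row positions for each fold:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold

# Hypothetical toy data: 20 rows, two classes in a 3:1 ratio.
df = pd.DataFrame({
    "x1": np.arange(20, dtype=float),
    "x2": np.arange(20, dtype=float) ** 2,
    "label": [0] * 15 + [1] * 5,
})

X = df[["x1", "x2"]]   # columns containing the data
y = df["label"]        # column containing the class labels

# shuffle=True randomizes fold membership; random_state makes it repeatable.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

folds = []
for train_index, test_index in skf.split(X, y):
    # The indices are positional, so select rows with .iloc rather than .loc.
    df_train = df.iloc[train_index]
    df_test = df.iloc[test_index]
    folds.append((train_index, test_index))
    print("TEST:", test_index, "class counts:", df_test["label"].value_counts().to_dict())
```

Passing `y` is what lets the splitter keep the class ratio roughly constant across folds; calling `skf.split(df)` with the data alone is what raises the TypeError shown in the question.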