Python/Sklearn - IndexError - index out of bounds

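I'm trying to run a kNN classifier on my dataset using 10-fold CV. I have some experience with models in WEKA, but I'm struggling to transfer it to sklearn.

Here is my code: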

filename = 'train4.csv'
names = ['attribut names are here']
dataframe = read_csv(filename, names=names)
array = dataframe.values
X = array[:,0:47]
Y = array[:,47]
num_folds = 10
kfold = KFold(n_splits=10, random_state=7)
model = KNeighborsClassifier()
results = cross_val_score(model, X, Y, cv=kfold)
print(results.mean())

I get this error:

>IndexError                                Traceback (most recent call last)
<ipython-input-19-8d9596c3368b> in <module>()
      4 array = dataframe.values
      5 X = array[:,0:47]
----> 6 Y = array[:,47]
      7 num_folds = 10
      8 kfold = KFold(n_splits=10, random_state=7)

> IndexError: index 47 is out of bounds for axis 1 with size 47

In my CSV, the 47th attribute is the target label, hence 48 (am I wrong here?).

I'm running pandas/sklearn in a Jupyter notebook.

Thanks.

Try this:

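import pandas as pd
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

filename = 'train4.csv'
names = ['attribut names are here']
target_col_name = 'name_of_your_target_column'

df = pd.read_csv(filename, names=names)

num_folds = 10
# Recent scikit-learn versions raise an error if random_state is set without shuffle=True
kfold = KFold(n_splits=num_folds, shuffle=True, random_state=7)
model = KNeighborsClassifier()
# Select features and target by column name instead of by position
results = cross_val_score(model,
                          df.drop(target_col_name, axis=1),
                          df[target_col_name],
                          cv=kfold)
print(results.mean())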
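
On the original error: the message says axis 1 has size 47, so the valid column indices are 0 through 46 and `array[:,47]` points one past the end. Here is a minimal sketch of that, assuming the target label is the last column. The 5x47 array is made-up stand-in data, not the asker's CSV.

```python
import numpy as np

# Hypothetical stand-in for dataframe.values: 5 rows, 47 columns.
array = np.arange(5 * 47).reshape(5, 47)
print(array.shape)

# Columns are numbered 0..46, so index 47 does not exist.
try:
    array[:, 47]
except IndexError as exc:
    print(exc)

# Assumption: the target label is the last column.
X = array[:, :-1]  # every column except the last one (46 features)
Y = array[:, -1]   # the last column (index 46) as the target
print(X.shape, Y.shape)
```

Printing `array.shape` before slicing is a quick way to confirm how many columns were actually read.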

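Does your CSV have column names? What is the column name of the target y column?

Hi mate, thanks for the reply. With this code, I replaced both occurrences of 'target_col_name' with my column name in quotes. I get the error "ValueError: labels ['mix1_instrument'] not contained in axis".

@Gareth, can you post the output of print(df.columns.tolist())?

I found that my column name had a typo. Thanks. I'm now getting a ValueError, which I assume is because I have two attributes with data type 'object'. How can I correct this?

@Gareth, that is a different question and should be asked separately... ;-)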
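
The column-name check from the comments can be sketched as below. The CSV content and the names `feat_a` and `feat_b` are hypothetical, and the trailing space in the target name is a made-up example of a mismatch (the asker's actual typo is not shown in the thread).

```python
import io
import pandas as pd

# Hypothetical three-column CSV standing in for train4.csv.
csv_text = "1.0,2.0,piano\n3.0,4.0,violin\n"
# The trailing space in the last name is a deliberate, made-up typo.
names = ["feat_a", "feat_b", "mix1_instrument "]

df = pd.read_csv(io.StringIO(csv_text), names=names)
print(df.columns.tolist())  # inspect the exact column labels

target_col_name = "mix1_instrument"
if target_col_name not in df.columns:
    # Remove stray whitespace from the labels before looking the column up.
    df.columns = df.columns.str.strip()

X = df.drop(target_col_name, axis=1)  # feature columns
y = df[target_col_name]               # target column
print(X.columns.tolist(), y.tolist())
```

Checking `target_col_name in df.columns` before calling `drop` turns the "not contained in axis" error into an explicit test.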