Python 3.x: error when loading a CSV file with pandas


When I run my feature-engineering Python file on the Quora duplicates data file, I get the error below. Here is the part of the code I am running:

data = pd.read_csv('train.csv', sep='\t')
data = data.drop(['id', 'qid1', 'qid2'], axis=1)
The output is:

runfile('/Volumes/Macintosh HD/chrome/is_that_a_duplicate_quora_question-master/feature_engineering.py', wdir='/Volumes/Macintosh HD/chrome/is_that_a_duplicate_quora_question-master')

Traceback (most recent call last):

File "<ipython-input-31-e29a1095cc40>", line 1, in <module>
runfile('/Volumes/Macintosh HD/chrome/is_that_a_duplicate_quora_question-master/feature_engineering.py', wdir='/Volumes/Macintosh HD/chrome/is_that_a_duplicate_quora_question-master')

File "/Users/Yash/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 705, in runfile
execfile(filename, namespace)

File "/Users/Yash/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "/Volumes/Macintosh HD/chrome/is_that_a_duplicate_quora_question-master/feature_engineering.py", line 55, in <module>
data = data.drop(['id','qid1','qid2'], axis=1)

File "/Users/Yash/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py", line 2530, in drop
obj = obj._drop_axis(labels, axis, level=level, errors=errors)

File "/Users/Yash/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py", line 2562, in _drop_axis
new_axis = axis.drop(labels, errors=errors)

File "/Users/Yash/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3744, in drop
labels[mask])

ValueError: labels ['id' 'qid1' 'qid2'] not contained in axis

Please help me figure out what the problem is.

You need to remove the separator argument sep='\t', because the contents of your CSV already use a comma (,) as the separator:

# sample.csv file contains following data

"id","qid1","qid2","question1","question2","is_duplicate"
"0","1","2","What is the step by step guide to invest in share market in india?","What is the step by step guide to invest in share ,"0"
"1","3","4","What is the story of Kohinoor (Koh-i-Noor) Diamond?","What would happen if the Indian government stole the Kohinoor(-i-Noor) diamond back?","0"



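sep='\t' means a tab character is used as the separator, but your data appears to be comma-separated. Since no tabs are found, each line is read as a single field, so the columns 'id', 'qid1' and 'qid2' never exist and drop() raises the ValueError. Could sep=',' work?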
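
To see the effect of the separator for yourself, here is a minimal sketch. It assumes a small in-memory stand-in for train.csv (the csv_text string and its two questions are made up for illustration, not taken from the real file) and compares the column names parsed with each separator:

```python
import io

import pandas as pd

# Hypothetical stand-in for train.csv: comma-separated, quoted fields.
csv_text = (
    '"id","qid1","qid2","question1","question2","is_duplicate"\n'
    '"0","1","2","first question?","second question?","0"\n'
)

# With sep='\t' no tab is ever found, so every line is parsed as one field
# and the frame ends up with a single column.
wrong = pd.read_csv(io.StringIO(csv_text), sep='\t')
print(len(wrong.columns), wrong.columns.tolist())

# With the default separator (a comma) the six columns are parsed separately.
right = pd.read_csv(io.StringIO(csv_text))
print(len(right.columns), right.columns.tolist())
```

Printing data.columns right after read_csv in your own script is a quick way to confirm which of the two cases you are in.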
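# Read sample.csv (contents shown above) with the default comma separator
import pandas as pd

df = pd.read_csv('sample.csv')
data = df.drop(['id', 'qid1', 'qid2'], axis=1)
print(data)  # print is a function in Python 3
# The output will look like this: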
"question1","question2","is_duplicate"
"What is the step by step guide to invest in share market in india?","What is the step by step guide to invest in share ,"0"
"What is the story of Kohinoor (Koh-i-Noor) Diamond?","What would happen if the Indian government stole the Kohinoor(-i-Noor) diamond back?","0"
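
If you want the script to fail with a clearer message when the file is parsed with the wrong separator, one option is to check the column names before dropping them. This is only a sketch: drop_id_columns is a hypothetical helper name, and csv_text is a made-up stand-in for the real file.

```python
import io

import pandas as pd


def drop_id_columns(data, labels=('id', 'qid1', 'qid2')):
    """Drop `labels` from `data`, raising a readable error if any are missing."""
    missing = [label for label in labels if label not in data.columns]
    if missing:
        raise KeyError(
            "columns %r not found; parsed columns are %r (check the sep argument)"
            % (missing, list(data.columns))
        )
    return data.drop(list(labels), axis=1)


# Hypothetical stand-in for the real CSV file.
csv_text = (
    '"id","qid1","qid2","question1","question2","is_duplicate"\n'
    '"0","1","2","first question?","second question?","0"\n'
)

data = pd.read_csv(io.StringIO(csv_text))
features = drop_id_columns(data)
print(features.columns.tolist())
```

Called on a frame that was read with sep='\t', the helper raises a KeyError that lists the columns pandas actually parsed, which points directly at the separator.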