Python: ValueError raised when trying to use a list of column names
The following code causes Pandas to raise a ValueError. I don't understand why, since using a plain list works fine.
fileFields = [str(input("Please enter the column name for the pedigree field in your request file.\n")),
              str(input("Please enter the column name for the pedigree field in the Tissue Library file.\n")),
              str(input("Please enter the column name for the sourceID field in the Tissue Library file.\n")),
              str(input("Please enter the column name for the pedigree field in the Gold Standard file.\n")),
              str(input("Please enter the column name for the sourceID field in the Gold Standard file.\n"))]
dfRequests = pd.read_csv(fileInputs[0], skipinitialspace=True,
                         usecols=fileFields[0])
dfTissueLibrary = pd.read_csv(fileInputs[1], skipinitialspace=True,
                              usecols=fileFields[1:2])
dfGoldStandard = pd.read_csv(fileInputs[2], skipinitialspace=True,
                             usecols=fileFields[3:4])
Result:
Traceback (most recent call last):
File "filepathway hidden for security", line 74, in <module>
usecols=fileFields[0])
File "filepathway hidden for security\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 529, in parser_f
return _read(filepath_or_buffer, kwds)
File "filepathway hidden for security\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 295, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "filepathway hidden for security\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 612, in __init__
self._make_engine(self.engine)
File "filepathway hidden for security\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 747, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "filepathway hidden for security\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 1154, in __init__
col_indices.append(self.names.index(u))
ValueError: 'd' is not in list
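It seems to me that Pandas is pulling the string out of each index in the fileFields list and converting it into a list of strings. I tried to work around this by passing a list of the indexed strings, but with no luck. Any suggestions?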
My approach would be to use a small helper function like the one below, which makes the process simple and safe:
import pandas as pd

# Python 2 syntax, as in the original answer; on Python 3 use input() and print().
def selective_read_csv(purpose, path):
    # read just the header row and get the column names
    columns = list(pd.read_csv(path, nrows=1).columns.values)
    df = None
    while df is None:
        # present the user with a selection of actual columns, taking
        # out the guesswork
        file_fields = raw_input("[%s] Enter columns as a comma-separated list %s " % (purpose, columns))
        try:
            # split(',') always yields a list, which is what usecols expects
            df = pd.read_csv(path, usecols=file_fields.split(','))
        except ValueError as e:
            # unknown column name: report it and prompt again
            print "Sorry, %s" % e
            df = None
    return df

df = selective_read_csv('requests file', '/tmp/data.csv')
This way the user is prompted with the columns that actually exist in the file, and bad input is handled gracefully:
[requests file] Enter columns as a comma-separated list [u'a', u'b'] aaa
Sorry, 'aaa' is not in list
[requests file] Enter columns as a comma-separated list [u'a', u'b']
Then call this function for each file type, for example:
dfRequests = selective_read_csv('requests file', fileInputs[0])
dfTissueLibrary = selective_read_csv('tissue library', fileInputs[1])
dfGoldStandard = selective_read_csv('gold standard', fileInputs[2])
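The same "show the real columns, then validate the reply" idea can also be sketched without Pandas, which may be handy for checking input before any file is parsed. This is only an illustration under stated assumptions: read_header and parse_columns are hypothetical helper names, and the in-memory sample stands in for a real CSV file.

```python
import csv
import io


def read_header(handle):
    """Return the column names from the first row of a CSV file object."""
    return next(csv.reader(handle, skipinitialspace=True))


def parse_columns(raw, available):
    """Split a comma-separated reply and check each name against the header."""
    wanted = [name.strip() for name in raw.split(",") if name.strip()]
    missing = [name for name in wanted if name not in available]
    if not wanted or missing:
        raise ValueError("unknown or empty column selection: %r" % (missing or raw))
    return wanted


# Hypothetical in-memory file standing in for a real CSV on disk.
sample = io.StringIO("a, b, c\n1, 2, 3\n")
header = read_header(sample)

# Whitespace around the names is stripped, and the result is always a list.
selected = parse_columns(" a ,c", header)
print(header, selected)
```

Because parse_columns always returns a list, its result can be handed to usecols directly, and a mistyped name is rejected before the file is read.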
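fileFields[0] returns a single string (the first column name entered), so 'd' is probably the first character of that column name, right? If so, pass a list instead, for example usecols=[fileFields[0]].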
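The difference between indexing and slicing a list can be checked in plain Python, without Pandas. The sketch below uses made-up column names in place of the five user replies. It also shows that a slice excludes its end index, so fileFields[1:2] and fileFields[3:4] in the question each select only one name; if two columns per file were intended (an assumption on my part), the slices would need to be [1:3] and [3:5].

```python
# Hypothetical column names standing in for the user's five replies.
file_fields = ["ped_req", "ped_tl", "src_tl", "ped_gs", "src_gs"]

# Indexing returns a single str; iterating over a str yields its characters,
# which is probably how a one-letter "column" ends up in the error message.
single = file_fields[0]
chars = list(single)

# Slicing returns a list, which is what a list-like usecols expects.
requests_cols = file_fields[0:1]

# A slice excludes its end index, so two columns need a span of two.
one_only = file_fields[1:2]
tissue_cols = file_fields[1:3]
gold_cols = file_fields[3:5]

print(type(single).__name__, chars)
print(requests_cols, one_only, tissue_cols, gold_cols)
```

Either form, [file_fields[0]] or file_fields[0:1], produces a one-element list, so which one to use is a matter of taste.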