Python: ValueError raised when trying to use a list of column names

python file pandas error-handling dataframe

The following code causes Pandas to raise a ValueError. I don't understand why, since using an ordinary list works fine.

fileFields = [
    str(input("Please enter the column name for the pedigree field in "
              "your request file.\n")),
    str(input("Please enter the column name for the pedigree field "
              "in the Tissue Library file.\n")),
    str(input("Please enter the column name for the sourceID field "
              "in the Tissue Library file.\n")),
    str(input("Please enter the column name for the pedigree field in "
              "the Gold Standard file.\n")),
    str(input("Please enter the column name for the sourceID field in "
              "the Gold Standard file.\n")),
]

dfRequests = pd.read_csv(fileInputs[0], skipinitialspace=True,
                         usecols=fileFields[0])
dfTissueLibrary = pd.read_csv(fileInputs[1], skipinitialspace=True,
                              usecols=fileFields[1:2])
dfGoldStandard = pd.read_csv(fileInputs[2], skipinitialspace=True,
                             usecols=fileFields[3:4])

Result:

Traceback (most recent call last):
  File "filepathway hidden for security", line 74, in <module>
    usecols=fileFields[0])
  File "filepathway hidden for security\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 529, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "filepathway hidden for security\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 295, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "filepathway hidden for security\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 612, in __init__
    self._make_engine(self.engine)
  File "filepathway hidden for security\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 747, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "filepathway hidden for security\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 1154, in __init__
    col_indices.append(self.names.index(u))
ValueError: 'd' is not in list

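It looks to me as though Pandas takes the string at each index of the fileFields list and converts it into a list of strings. I tried to work around this by indexing into the list of strings, but that did not work. Any suggestions?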

My approach would be to use a small helper function like the one below, which makes the process simple and safe:

import pandas as pd

def selective_read_csv(purpose, path):
    # read just the header row and get the column names
    columns = list(pd.read_csv(path, nrows=0).columns)
    df = None
    while df is None:
        # present the user with the actual column names, taking
        # out the guesswork
        file_fields = input("[%s] Enter columns as a comma-separated list %s " % (purpose, columns))
        try:
            df = pd.read_csv(path, usecols=file_fields.split(','))
        except ValueError as e:
            print("Sorry, %s" % e)
            df = None
    return df

df = selective_read_csv('requests file', '/tmp/data.csv')

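This way the user is shown the columns that actually exist in the file, and bad input is handled gracefully (sample session from Python 2; the exact list formatting and error text vary with the Python and pandas versions):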

[requests file] Enter columns as a comma-separated list [u'a', u'b'] aaa
Sorry, 'aaa' is not in list
[requests file] Enter columns as a comma-separated list [u'a', u'b']

Then call this function for each file type, for example:

dfRequests = selective_read_csv('requests file', fileInputs[0])
dfTissueLibrary = selective_read_csv('tissue library', fileInputs[1])
dfGoldStandard = selective_read_csv('gold standard', fileInputs[2])
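
One detail the helper above leaves open is what happens when the user types "a, b" with a space after the comma: a plain split(',') yields ' b', which will not match a column named 'b'. The following is a minimal sketch of a stricter parsing step, assuming a hypothetical function name parse_column_choice and illustrative error wording; neither comes from the original answer.

```python
def parse_column_choice(raw, available):
    """Split a comma-separated reply into column names and validate them.

    Hypothetical helper: the name and the error wording are illustrative only.
    """
    # Strip surrounding whitespace so "a, b" matches columns "a" and "b",
    # and ignore empty pieces such as the one after a trailing comma.
    chosen = [name.strip() for name in raw.split(",") if name.strip()]
    if not chosen:
        raise ValueError("no column names were entered")
    # Report every name that is not an actual column of the file.
    unknown = [name for name in chosen if name not in available]
    if unknown:
        raise ValueError("unknown column(s): %s" % ", ".join(unknown))
    return chosen


# Example with made-up column names.
print(parse_column_choice("a, b", ["a", "b", "c"]))  # prints ['a', 'b']
```

The returned list could then be passed as usecols inside the retry loop, so that a typo is reported before the file is read in full.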

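fileFields[0]

returns a string (the first column name that was entered), so the 'd' in the error message is probably the first character of that column name, right? If so, pass a list instead of a bare string, for example

usecols=[fileFields[0]]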
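
The difference between indexing and slicing can be checked in plain Python without reading any file. This is a small sketch using made-up column names in place of the five values typed at the prompts. It also shows that a slice's end index is exclusive, so fileFields[1:2] in the question holds only one of the two Tissue Library names.

```python
# Made-up column names standing in for the five values typed at the prompts.
file_fields = ["donor", "pedigree", "sourceID", "gs_pedigree", "gs_sourceID"]

# Indexing returns a bare string; iterating over a string yields its
# characters, which is where a lookup for 'd' can come from.
single = file_fields[0]
as_characters = list(single)
print(as_characters[0])   # prints d

# Wrapping the name in a list keeps it as a single column name.
requests_cols = [file_fields[0]]

# Slice end indices are exclusive: [1:2] has one name, [1:3] has two.
print(file_fields[1:2])   # prints ['pedigree']
tissue_cols = file_fields[1:3]
gold_cols = file_fields[3:5]
print(tissue_cols)        # prints ['pedigree', 'sourceID']
print(gold_cols)          # prints ['gs_pedigree', 'gs_sourceID']
```

Under these assumptions, the three read_csv calls would receive requests_cols, tissue_cols and gold_cols as their usecols arguments.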