Python/IPython: strange, non-reproducible "list index out of range" error

python indexing changelist

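I recently learned some Python and how to apply it to my work. I have successfully written a couple of scripts, but I have one problem I just can't figure out.

I am opening a file of about 4000 lines, with two tab-separated columns per line. While reading the input file, I get an IndexError saying the list index is out of range. However, although I get the error every time, it does not happen on the same line every time (that is, the error is thrown on a different line each time!). So for some reason it generally works, but then (seemingly) randomly fails.

Since I only started learning Python last week, I am stumped. I looked around for the same problem but found nothing similar. Also, I don't know whether this is a language-specific or an IPython-specific problem. Any help would be greatly appreciated.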

input = open("count.txt", "r")
changelist = []
listtosort = []
second = str()

output = open("output.txt", "w")

for each in input:
    splits = each.split("\t")
    changelist = list(splits[0])
    second = int(splits[1])

    print second

    if changelist[7] == ";":
        changelist.insert(6, "000")
        va = "".join(changelist)
        var = va + ("\t") + str(second)
        listtosort.append(var)
        output.write(var)

    elif changelist[8] == ";":
        changelist.insert(6, "00")
        va = "".join(changelist)
        var = va + ("\t") + str(second)
        listtosort.append(var)
        output.write(var)

    elif changelist[9] == ";":
        changelist.insert(6, "0")
        va = "".join(changelist)
        var = va + ("\t") + str(second)
        listtosort.append(var)
        output.write(var)

    else:
        #output.write(str("".join(changelist)))
        va = "".join(changelist)
        var = va + ("\t") + str(second)
        listtosort.append(var)
        output.write(var)

output.close()

Error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/home/a/Desktop/sharedfolder/ipytest/individ.ins.count.test/<ipython-input-87-32f9b0a1951b> in <module>()
     57     splits = each.split("\t")
     58     changelist = list(splits[0])
---> 59     second = int(splits[1])
     60 
     61     print second

IndexError: list index out of range

Expected output:

ID=cds0000;Name=NP_414542.1;Parent=gene0;Dbxref=ASAP:ABE-0000006,UniProtKB%2FSwiss-Prot:P0AD86,Genbank:NP_414542.1,EcoGene:EG11277,GeneID:944742;gbkey=CDS;product=thr  12
ID=cds1000;Name=NP_415538.1;Parent=gene1035;Dbxref=ASAP:ABE-0003451,UniProtKB%2FSwiss-Prot:P31545,Genbank:NP_415538.1,EcoGene:EG11735,GeneID:946500;gbkey=CDS;product=deferrrochelatase%2C  50
ID=cds1001;Name=NP_415539.1;Parent=gene1036;Note=PhoB-dependent%2C  36

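This happens when a line in count.txt does not contain a tab. Splitting such a line on the tab character yields a one-element list, so there is no splits[1], hence the "list index out of range" error.

To find out which line causes the error, add print(each) right after the split on line 57. The line printed just before the error message is your culprit. If your input file keeps changing, you will get a different location each time. Change your script to handle such malformed lines.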
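As an alternative to printing every line, the malformed lines can be collected together with their line numbers. The following is only a minimal sketch in Python 3 syntax (the question itself uses Python 2); the helper name `find_malformed_lines` and the sample data are made up for illustration.

```python
def find_malformed_lines(lines, expected_fields=2):
    """Return (line_number, raw_line) pairs that do not split into
    the expected number of tab-separated fields."""
    bad = []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if len(fields) != expected_fields:
            bad.append((number, line))
    return bad


# Hypothetical sample: one good line, one blank line, one line using a space.
sample = ["ID=cds0;Name=a\t12\n", "\n", "ID=cds10;Name=b 50\n"]
for number, line in find_malformed_lines(sample):
    # repr() makes invisible characters such as tabs and newlines visible.
    print(number, repr(line))
```

With a real file, the same function can be called on the open file object, for example `find_malformed_lines(open("count.txt"))`.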
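The reason you get an IndexError is that your input file is apparently not entirely tab-delimited. That is why there is nothing at splits[1] when you try to access it.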
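If some lines turn out to use spaces instead of a tab, one possible way to cope is to split on the last run of any whitespace and skip lines that still do not fit. This is a sketch under that assumption, in Python 3 syntax; `parse_line` is a hypothetical helper, not something from the question.

```python
def parse_line(line):
    """Split a line into (text, count) on its last run of whitespace.

    Returns None when the line is blank, has only one field, or the
    last field is not an integer.
    """
    # With sep=None, rsplit treats tabs and spaces alike and ignores
    # the trailing newline.
    parts = line.rsplit(None, 1)
    if len(parts) != 2:
        return None
    text, count = parts
    try:
        return text, int(count)
    except ValueError:
        return None
```

A reading loop can then call `parse_line(each)` and `continue` whenever the result is `None`, instead of crashing on the first malformed line.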
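Your code could use some refactoring. First, you are repeating yourself in the if-checks, which is unnecessary. All they do is pad cds0 to 7 characters, which is probably not what you want. I put the following together to demonstrate how the code could be refactored to be more Pythonic and DRY. I can't guarantee that it will work with your data set, but I hope it helps you see how to do things differently.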
    to_sort = []
    # We can open two files using the with statement. This will also handle
    # closing the files for us, when we exit the block.
    with open("count.txt", "r") as inp, open("output.txt", "w") as out:
        for each in inp:
            # Split at ';'... So you won't have to worry about whether or not
            # the file is tab delimited
            changed = each.split(";")

            # Get the value you want. This is called unpacking.
            # The value before '=' will always be 'ID', so we don't really care about it.
            # _ is generally used as a variable name when the value is discarded.
            _, value = changed[0].split("=")

            # Pad the value with zeros to 7 characters ('<' left-aligns, so the
            # zeros are added on the right). This replaces the current value in the list.
            changed[0] = "ID={:0<7}".format(value)

            # Join the changed-list with the original separator and
            # append it to the sort list.
            to_sort.append(";".join(changed))

        # Write the results to the file all at once. Your test data already
        # provided the newlines, you can just write it out as it is.
        out.writelines(to_sort)

        # Do what else you need to do. Maybe to_sort.sort()?
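Note that the format spec `{:0<7}` fills with zeros on the right, so `cds10` becomes `cds1000` rather than `cds0010`. A possible way to left-pad only the numeric part is sketched below. It assumes that IDs consist of letters followed by digits and that four digits are wanted, as the expected output in the question suggests; `pad_id` and `ID_PATTERN` are illustrative names.

```python
import re

# Assumed ID shape: a run of letters followed by a run of digits, e.g. "cds10".
ID_PATTERN = re.compile(r"^([A-Za-z]+)(\d+)$")


def pad_id(value, width=4):
    """Left-pad the numeric part of an ID with zeros.

    Values that do not match the assumed shape are returned unchanged.
    """
    match = ID_PATTERN.match(value)
    if match is None:
        return value
    prefix, digits = match.groups()
    return prefix + digits.zfill(width)


print("{:0<7}".format("cds10"))  # fills on the right: cds1000
print(pad_id("cds10"))           # pads the number on the left: cds0010
```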

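Can you provide some sample input and expected output? Please edit this information into your question.

Sorry, that's my inexperience!

Are you sure it is a \t and not just whitespace separating the columns?

I really don't know... What confuses me is that some lines sometimes raise the error and at other times do not. I ran exactly the same code several times and the error occurred at a different position each time!

Thanks for the reply! I will try to follow your suggestions... fingers crossed!

Thank you very much for your time and input! I have tried your code, but the numbering is not quite right. The input contains values such as cds0/cds10/cds100, which need to become cds0000/cds0010/cds0100 in order to sort them. Apart from the code above, I can't think of any other way to do it... I still have a lot to learn!

Without knowing the IDs, cds0/cds10/cds100 is itself a sortable sequence. The final implementation will depend on the number of POS. I suggest you take a look at this book to see if there are any hints.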
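As the last comment hints, sorting does not necessarily require rewriting the IDs. Plain string comparison would place cds2 after cds100, but a sort key that extracts the number avoids this. The sketch below assumes every line starts with `ID=cds<number>`, as in the sample data; `id_number` and `ID_NUMBER` are illustrative names, and the sample lines are made up.

```python
import re

# Assumption: every line starts with "ID=cds<number>", as in the sample data.
ID_NUMBER = re.compile(r"^ID=cds(\d+)")


def id_number(line):
    """Return the numeric part of the leading ID, or -1 if the line does not match."""
    match = ID_NUMBER.match(line)
    return int(match.group(1)) if match else -1


# Hypothetical lines in the shape of count.txt.
lines = ["ID=cds100;Name=c\t7\n", "ID=cds2;Name=d\t3\n", "ID=cds0;Name=a\t12\n"]
lines.sort(key=id_number)
print([line.split(";")[0] for line in lines])
```

Lines that do not match the assumed pattern get the key -1 and therefore sort to the front, which makes them easy to spot.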