python - List index out of range when working with a CSV?
I have a CSV that looks like this:
F02303521,"Smith,Andy",GHI,"Smith,Andy",GHI,,,
F04300621,"Parker,Helen",CERT,"Yu,Betty",IOUS,,,
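I want to delete every row where the second column equals the fourth column (for example, when Smith,Andy = Smith,Andy). I tried to do this in Python by using " as the delimiter and splitting the columns into:
F02303521,
Smith,Andy,GHI,
Smith,Andy,GHI,,,
I tried the following Python code: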
testCSV = 'test.csv'
deletionText = 'linestodelete.txt'
correct = 'correctone.csv'
i = 0
j = 0  # where i & j keep track of line number
with open(deletionText, 'w') as outfile:
    with open(testCSV, 'r') as csv:
        for line in csv:
            i = i + 1  # on the first line, i will equal 1.
            PI = line.split('"')[1]
            investigator = line.split('"')[3]
            # if they equal each other, write that line number into the
            # text file as to be deleted.
            if PI == investigator:
                outfile.write(i)
# From the TXT, create a list of line numbers you do not want to include in output
with open(deletionText, 'r') as txt:
    lines_to_be_removed_list = []
    # for each line number in the TXT
    # remove the return character at the end of line
    # and add the line number to list domains-to-be-removed list
    for lineNum in txt:
        lineNum = lineNum.rstrip()
        lines_to_be_removed_list.append(lineNum)
with open(correct, 'w') as outfile:
    with open(deletionText, 'r') as csv:
        # for each line in csv
        # extract the line number
        for line in csv:
            j = j + 1  # so for the first line, the line number will be 1
            # if csv line number is not in lines-to-be-removed list,
            # then write that to outfile
            if (j not in lines_to_be_removed_list):
                outfile.write(line)
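But for this line:
    PI = line.split('"')[1]
I get:
Traceback (most recent call last):
  File "C:/Users/sskadamb/PycharmProjects/vastDeleteLine/manipulation.py", line 11, in <module>
    PI = line.split('"')[1]
IndexError: list index out of range
I thought it would give PI = Smith,Andy and investigator = Smith,Andy... Why doesn't that happen?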
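Any help would be greatly appreciated, thanks!
Try splitting on commas instead of quotes: x.split(",")
When you think of CSV, think pandas, which is a great Python data analysis library. Here is how to accomplish what you want: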
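import pandas as pd

# The file has no header row, so name the eight columns field0..field7.
fields = ['field{}'.format(i) for i in range(8)]
df = pd.read_csv("data.csv", header=None, names=fields)
# Keep only the rows where the second and fourth columns differ.
df = df[df['field1'] != df['field3']]
print(df)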
This prints:
field0 field1 field2 field3 field4 field5 field6 field7
1 F04300621 Parker,Helen CERT Yu,Betty IOUS NaN NaN NaN
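As a rough illustration of where the IndexError in the question can come from, the sketch below splits two lines on double quotes the way the question's code does. The `split_on_quotes` helper and the blank second line are assumptions made for the example; the asker's actual file may differ.

```python
def split_on_quotes(line):
    """Split a raw CSV line on double quotes, as the question's code does."""
    return line.split('"')


quoted = 'F02303521,"Smith,Andy",GHI,"Smith,Andy",GHI,,,\n'
blank = '\n'  # hypothetical line with no quotes, e.g. a trailing empty line

print(split_on_quotes(quoted))
# ['F02303521,', 'Smith,Andy', ',GHI,', 'Smith,Andy', ',GHI,,,\n']
print(split_on_quotes(blank))
# ['\n']  (only one element, so index 1 does not exist)

for number, line in enumerate([quoted, blank], start=1):
    parts = split_on_quotes(line)
    try:
        pi, investigator = parts[1], parts[3]
    except IndexError:
        # Report the offending line instead of crashing.
        print('line {} has no quoted fields: {!r}'.format(number, parts))
```

Any line that contains no double quote at all (an empty line, or a row whose names have no embedded comma and were therefore written unquoted) produces a one-element list, which is enough to trigger the error.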
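It means line.split('"') has fewer than two elements in its list. Put it in a try block, and have the matching except print out line.split('"'). Do you have a random blank line? Also, why aren't you using the built-in csv module?
Why aren't you using the very nice csv module?
@NightShadeQueen I don't have a blank line anywhere, I know that. And I don't know, I could use csv, I just never looked into it because I felt like I didn't really need it.
As a format, CSV has no standard definition. Various implementations have their own little quirks. The csv module can handle those quirks for you.
If I split on commas, it splits the names in two, which is not what I want.
As the comments above state, CSV is not a simple comma-separated format: various implementations include extra rules to handle values that contain commas, like the ones in the OP's file.
When I think of CSV, I just think csv.
@TigerhawkT3 Probably because you haven't tried pandas yet :)
Oh, thank you! I've never used pandas before. I'll look into it more :)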
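For comparison, here is a minimal sketch of the same filtering done with the standard library's csv module, which parses the quoted names as single fields. The `drop_matching_rows` helper and the in-memory sample are assumptions made for illustration; they are not code from the thread.

```python
import csv
import io


def drop_matching_rows(rows, first=1, second=3):
    """Yield rows whose `first` and `second` columns differ.

    Rows too short to compare (e.g. blank lines) are skipped.
    """
    for row in rows:
        if len(row) <= max(first, second):
            continue
        if row[first] != row[second]:
            yield row


# In-memory stand-in for the file shown in the question.
sample = (
    'F02303521,"Smith,Andy",GHI,"Smith,Andy",GHI,,,\n'
    'F04300621,"Parker,Helen",CERT,"Yu,Betty",IOUS,,,\n'
)

reader = csv.reader(io.StringIO(sample))
output = io.StringIO()
writer = csv.writer(output, lineterminator='\n')
writer.writerows(drop_matching_rows(reader))
print(output.getvalue())
```

With real files, the in-memory buffers would be replaced by `open('test.csv', newline='')` for reading and `open('correctone.csv', 'w', newline='')` for writing. The writer re-quotes only the fields that contain a comma, so the surviving row keeps its original form.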