Python: iteration problem
I am doing text processing on a text file and have been trying to iterate inside a for loop:
fields = [1, 2, 3, 4, 5]
i = 0
with open('file path', 'r') as f:
    for line in f:
        # while i is smaller than the number of fields (=5)
        while i <= len(fields)-1:
            currentfield = fields[i]
            # if the first character of the line matches currentfield
            # (that being a number)
            if line[0] == currentfield:
                print(line[4:])  # print the value in the "third column"
            i += 1
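There are not actually any columns in the text file, but the field number (i.e. 1, 2, 3, 4, 5) and the value that follows it (e.g. 17824) are separated by two tab characters. I just did not know what else to call the 17824 part.
What I am trying to do is iterate over all the fields for each entry/year, but the output only gives the value of the first field, 1. So I get output like this: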
17824
3224
6563453
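It only iterates over the first field instead of all of them. How can I fix the code so that the output is laid out like a table, with fields 2, 3, 4 and 5 iterated as well? Like this: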
17824 20131125192004.9 690714s1969 dcu 000 0 eng ...and so on
3224 20w125192004.9 690714s1969 dcu 000 0 eng ...and so on
6563453 2013341524626245.9 484914s1969 dcu 000 0 eng ...and so on
Edit: I know I was not very clear, so I have added some parts.

This should help:
for line in f:
    print('\nline[0] is %s' % line[0])
    for currentfield in fields:  # loop through all fields
        # convert currentfield to a string before comparing
        if line[0] == str(currentfield):  # the first character of the line matches currentfield (that being a number)
            print('Printing field %d' % currentfield)  # debugging
            print(line[4:])  # print the value in the "third column"
This gives me:
u'''line[0] is -
line[0] is 1
Printing field 1
17824
line[0] is 2
Printing field 2
20131125192004.9
line[0] is 3
Printing field 3
690714s1969 dcu 000 0 eng
line[0] is 4
Printing field 4
a 75601809
line[0] is 4
Printing field 4
a DLC
line[0] is 4
Printing field 4
b eng
line[0] is 4
Printing field 4
c DLC
line[0] is 5
Printing field 5
a WA 750
line[0] is -
line[0] is 1
Printing field 1
3224
line[0] is 2
Printing field 2
20w125192004.9
line[0] is 3
Printing field 3
690714s1969 dcu 000 0 eng
line[0] is 5
Printing field 5
a WA 120
line[0] is -
line[0] is 1
Printing field 1
6563453
line[0] is 2
Printing field 2
2013341524626245.9
line[0] is 3
Printing field 3
484914s1969 dcu 000 0 eng
line[0] is 4
Printing field 4
a 75601809
line[0] is 4
Printing field 4
a eng
line[0] is 4
Printing field 4
c DLC
line[0] is 5
Printing field 5
a WA 345'''
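By the way, changing line[4:] to line[8:] will give the third column of the data pasted above. You can then use a regex to remove anything that comes after the whitespace following the third-column data.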
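Fixed slice offsets such as line[4:] depend on exactly how many characters come before the value. As an alternative, here is a minimal sketch that splits on the separator instead. It assumes each data line is a field number, then two tab characters, then the value, as the question describes; `parse_line` is a hypothetical helper name, not something from the original code.

```python
def parse_line(line):
    """Split 'N<tab><tab>value' into (field_number, value).

    Returns None for lines that do not look like that
    (for example separator lines or blank lines).
    """
    parts = line.rstrip("\n").split("\t\t", 1)
    if len(parts) != 2 or not parts[0].isdigit():
        return None
    return int(parts[0]), parts[1]


print(parse_line("1\t\t17824\n"))
print(parse_line("4\t\ta 75601809\n"))
print(parse_line("-\n"))
```

If the real file uses a different separator (a single tab, or spaces), the `"\t\t"` argument would need to change accordingly.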
Edit, after the question was changed: here I split each line on whitespace and keep the columns as a list, l = line.split(). You can then index a column by referring to it directly, e.g. l[4] for column 4, as long as the line has that many columns.
for line in f:
    l = line.split()  # columns of the line, with all whitespace removed
    if not l:
        continue  # skip blank lines
    print('\nline[0] is %s' % line[0])
    for currentfield in fields:  # loop through all fields
        # convert currentfield to a string before comparing
        if l[0] == str(currentfield):  # the first column matches currentfield (that being a number)
            print('Printing field %d' % currentfield)  # debugging
            print(' '.join(l[1:]))  # print everything after the field number
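I assume you are missing an i = 0 initialization?

@alexmcf Just edited; it is in my code. Thanks for the catch!

I am still not sure what you are trying to do. In general, I would suggest collapsing runs of whitespace characters into a single space, splitting on that space, and stripping leading and trailing whitespace. So something like re.sub(r'\s+', ' ', line).strip() could work. But as I said, it is still unclear what you want.

After the edit, the problem is that i = 0 is in the wrong place. Instead of counting from 0 to 4 every time, you count from 0 to 4 the first time, and then from 5 to 4 every time after that. You need to reset i = 0 inside the outer loop. Or, better, do not try to use a while loop to do the job of a for loop. If you just want to iterate over the numbers 0 to 4, use for i in range(5):. Or, better still, if the only reason you want those numbers is as indices into fields, just do for field in fields:.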
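To illustrate the point about where i = 0 sits, here is a minimal sketch. It assumes i += 1 runs on every pass of the while loop, and it uses two made-up sample lines; the function names are hypothetical and only record which fields each line gets to see.

```python
fields = [1, 2, 3, 4, 5]
lines = ["1\t\t17824", "2\t\t20131125192004.9"]  # hypothetical sample lines


def visited_without_reset(lines, fields):
    """Mimic the original loop: i is initialised once, before the loop over lines."""
    visited = []
    i = 0
    for _ in lines:
        seen = []
        while i <= len(fields) - 1:
            seen.append(fields[i])
            i += 1
        visited.append(seen)
    return visited


def visited_with_for(lines, fields):
    """Same bookkeeping, but with a for loop over fields for every line."""
    visited = []
    for _ in lines:
        seen = []
        for field in fields:
            seen.append(field)
        visited.append(seen)
    return visited


print(visited_without_reset(lines, fields))  # [[1, 2, 3, 4, 5], []]
print(visited_with_for(lines, fields))       # [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]
```

With the counter initialised outside the outer loop, only the first line is ever compared against the fields; every later line finds i already at 5 and skips the while loop entirely.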
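@abarnert: I tried for field in fields:, but it only went through the first item of fields, i.e. 1. It did not go through 2, 3, 4 and 5. Do you know why? I have: with open('file path', 'r') as f: for field in fields: for line in f: if line[0:3] == field: print line[4:]

Thanks for your help. Python keeps giving me an invalid syntax error.

By the way, I used f = "[your pasted file data]".split('\n') to recreate the file read. Maybe try this: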
for line in f:
    l = line.split()  # columns of the line, with all whitespace removed
    if not l:
        continue  # skip blank lines
    print('\nline[0] is %s' % line[0])
    for currentfield in fields:  # loop through all fields
        # convert currentfield to a string before comparing
        if l[0] == str(currentfield):  # the first column matches currentfield (that being a number)
            print('Printing field %d' % currentfield)  # debugging
            print(' '.join(l[1:]))  # print everything after the field number
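For the table-like output the question asks for, one possible approach is to collect the values of each record into a row and print the row once the record ends. This is only a sketch under assumptions: it treats any line starting with "-" as the separator between records (as the debug output suggests), and `rows` and `sample` are hypothetical names with made-up sample data.

```python
FIELDS = [1, 2, 3, 4, 5]


def rows(lines, fields=FIELDS):
    """Yield one list of values per record.

    Assumes a record starts at a line beginning with '-' and that each
    data line is a field number followed by whitespace and the value.
    """
    wanted = {str(n) for n in fields}
    current = []
    for line in lines:
        if line.startswith("-"):
            if current:
                yield current
            current = []
            continue
        parts = line.split()
        if parts and parts[0] in wanted:
            current.append(" ".join(parts[1:]))
    if current:  # last record has no trailing separator
        yield current


sample = [
    "-",
    "1\t\t17824",
    "2\t\t20131125192004.9",
    "3\t\t690714s1969 dcu 000 0 eng",
    "-",
    "1\t\t3224",
    "2\t\t20w125192004.9",
]

for row in rows(sample):
    print("\t".join(row))
```

Because a file object is itself an iterable of lines, `rows(f)` should work the same way inside the `with open(...) as f:` block. Note that `line.split()` collapses runs of whitespace inside a value into single spaces.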