Python: How to read columns in a text file?


I have a space-delimited text file, as shown below:

title 1 589.890 0.260 Fine 0.100 lex larry_page " " "
title 1 590.150 0.000 . 0.950 lex larry_page " " "
title 1 592.290 0.130 Here 0.990 lex larry_page " " "
title 1 592.420 0.160 I 0.990 lex larry_page " " "
title 1 592.580 0.280 go 0.990 lex larry_page " " "
title 1 592.860 0.000 , 0.100 lex larry_page " " "
title 1 593.180 0.270 taking 0.990 lex larry_page " " "
title 1 593.450 0.170 Russel 0.990 lex larry_page person_name russel_arnold object
title 1 593.640 0.060 for 0.990 lex larry_page " " "
title 1 593.700 0.110 the 0.990 lex larry_page " " "
title 1 593.810 0.460 team 0.990 lex larry_page " " "
title 1 594.270 0.000 . 0.950 lex larry_page " " "
title 1 594.920 0.140 In 0.990 lex larry_page " " "
title 1 595.060 0.090 the 0.990 lex larry_page " " "
title 1 595.150 0.360 sack 0.990 lex larry_page " " "
title 1 595.510 0.000 . 0.950 lex larry_page " " "
title 1 598.810 0.360 Hey 0.100 lex larry_page " " "
title 1 599.170 0.460 Helen 0.990 lex larry_page person_name helen_winkle addressee
title 1 599.630 0.000 . 0.950 lex larry_page " " "
title 1 600.490 0.170 Hi 0.530 lex helen_winkle " " "
title 1 600.740 0.290 guys 0.530 lex helen_winkle " " "
title 1 601.030 0.000 . 0.950 lex helen_winkle " " "
title 1 602.010 0.220 Helen 0.990 lex larry_page person_name helen_winkle addressee
title 1 602.230 0.000 , 0.100 lex larry_page " " "
title 1 602.280 0.140 I 0.990 lex larry_page " " "
title 1 602.470 0.100 have 0.990 lex larry_page " " "
title 1 602.600 0.030 a 0.990 lex larry_page " " "
title 1 602.950 0.350 question 0.990 lex larry_page " " "
title 1 603.300 0.180 for 0.990 lex larry_page " " "
title 1 603.480 0.190 you 0.990 lex larry_page " " "
title 1 603.670 0.000 , 0.100 lex larry_page " " "
title 1 603.670 0.060 and 0.990 lex larry_page " " "
title 1 603.730 0.070 it 0.990 lex larry_page " " "
title 1 603.840 0.180 might 0.990 lex larry_page " " "
title 1 604.020 0.200 be 0.990 lex larry_page " " "
title 1 604.220 0.460 a 0.100 lex larry_page " " "
title 1 604.680 0.170 little 0.990 lex larry_page " " "
title 1 604.850 0.550 awkward 0.990 lex larry_page " " "
title 1 605.400 0.000 , 0.100 lex larry_page " " "
title 1 605.610 0.090 you 0.990 lex larry_page " " "
title 1 605.700 0.320 know 0.990 lex larry_page " " "
title 1 606.020 0.000 , 0.100 lex larry_page " " "
title 1 606.340 0.260 given 0.990 lex larry_page " " "
title 1 606.660 0.130 that 0.990 lex larry_page " " "
title 1 606.870 0.330 I 0.990 lex larry_page " " 
Here, I am trying to reformat the data based on column[7], into this layout:

start_time end_time column[7] column[9] names
I am trying to reformat the text as follows:

589.890 599.630 larry_page russel_arnold helen_winkle   
600.490 601.030 helen_winkle "
602.010 607.200 larry_page helen_winkle
In the output above, helen_winkle has no names in column[9], so I put a double quote (") there instead.

P.S. Sometimes a line may have more names, as shown below:

589.890 599.630 larry_page russel_arnold helen_winkle jerome_halloy leo_cazenille
I am stuck here and do not know how to continue:

path = "path of the textfile"
with open(path, 'r') as f:
    for line in f:
        columns = line.strip().split()
        start = float(columns[2])
        end = start + float(columns[3])
        pro_name = columns[9]
        s_name = columns[7]

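What you are trying to do is an instance of a more general problem called a "corner turn": rows in, columns out. It is a hard problem to solve efficiently, although most of what you will read about it in academic papers deals with numeric data.

In this case, what I would do is create empty lists and then append values to them inside the for loop: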

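path = "path of the textfile"
# One list per column of interest
start_list = []
end_list = []
pro_name_list = []
s_name_list = []
with open(path, 'r') as f:
    for line in f:
        # Split on any run of whitespace
        columns = line.strip().split()
        # columns[2] is the start time, columns[3] the duration
        start = float(columns[2])
        start_list.append(start)
        end = start + float(columns[3])
        end_list.append(end)
        # columns[9] is the mentioned name, or " when there is none
        pro_name = columns[9]
        pro_name_list.append(pro_name)
        # columns[7] is the speaker
        s_name = columns[7]
        s_name_list.append(s_name)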
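
The append loop above is one way to do the rows-to-columns turn. As a hedged alternative, the built-in zip can transpose the split rows in one step. The sketch below reads from an in-memory sample rather than a real file; the sample lines are copied from the question, and the variable names (sample, rows, columns, and so on) are only illustrative. Note that zip stops at the shortest row, so it assumes every line has the same number of fields.

```python
import io

# Hypothetical stand-in for the real file: three lines in the question's format.
sample = io.StringIO(
    'title 1 589.890 0.260 Fine 0.100 lex larry_page " " "\n'
    'title 1 593.450 0.170 Russel 0.990 lex larry_page person_name russel_arnold object\n'
    'title 1 600.490 0.170 Hi 0.530 lex helen_winkle " " "\n'
)

# Rows in: one list of fields per non-empty line.
rows = [line.split() for line in sample if line.strip()]

# Columns out: columns[k] is a tuple holding field k of every row.
columns = list(zip(*rows))

starts = [float(x) for x in columns[2]]
ends = [s + float(d) for s, d in zip(starts, columns[3])]
speakers = list(columns[7])
names = list(columns[9])

print(speakers)
print(names)
```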

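Edit: I just re-read your question and realized I did not really answer it properly... but I think that once you have lists representing the columns, you can recreate the file in whatever order you need? Have I understood your question?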
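
To get from the parsed fields to the layout requested in the question, one possible approach is to group consecutive rows by speaker (column[7]) with itertools.groupby. This is a sketch under assumptions, not a definitive solution: it assumes a "turn" is a run of consecutive lines with the same speaker, that a lone double quote in column[9] means "no name", and that repeated names within a turn should be listed once. The function name summarize_turns and the sample list are hypothetical.

```python
from itertools import groupby


def summarize_turns(lines):
    """Collapse consecutive rows with the same speaker into one output line."""
    rows = [line.split() for line in lines if line.strip()]
    out = []
    # groupby only merges adjacent rows, so a speaker who talks twice
    # with someone else in between yields two separate turns.
    for speaker, group in groupby(rows, key=lambda r: r[7]):
        group = list(group)
        start = float(group[0][2])
        # End of the turn = start of its last row + that row's duration.
        end = float(group[-1][2]) + float(group[-1][3])
        names = []
        for r in group:
            # Assumption: '"' marks an empty column[9]; keep each name once.
            if len(r) > 9 and r[9] != '"' and r[9] not in names:
                names.append(r[9])
        out.append("{:.3f} {:.3f} {} {}".format(
            start, end, speaker, " ".join(names) or '"'))
    return out


# A few lines taken from the question's sample data.
sample = [
    'title 1 598.810 0.360 Hey 0.100 lex larry_page " " "',
    'title 1 599.170 0.460 Helen 0.990 lex larry_page person_name helen_winkle addressee',
    'title 1 599.630 0.000 . 0.950 lex larry_page " " "',
    'title 1 600.490 0.170 Hi 0.530 lex helen_winkle " " "',
    'title 1 601.030 0.000 . 0.950 lex helen_winkle " " "',
    'title 1 606.870 0.330 I 0.990 lex larry_page " " ',
]

for line in summarize_turns(sample):
    print(line)
```

For a real file, passing the open file object (for example summarize_turns(f) inside a with open(path) block) should work the same way, since the function only iterates over lines.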

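Nice idea. By the way, I am finding it hard to write the for loop that matches s_name_list with the start and end times.

You will have to clarify again. I am not entirely clear on what you mean.

How exactly do I reformat this into "589.890 599.630 larry_page russel_arnold helen_winkle"?

@Rangooski: If you have lists containing the fields, just write the strings to a file or to the screen:
file.write("{:.3f} {:.3f} {:s}\n".format(start_list[i], end_list[i], s_name_list[i]))
Obviously, you need to do this inside a for loop.

@Rangooski - if this answers your question, please mark it!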
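
The write-in-a-loop suggestion from the comments could look roughly like the sketch below. It uses zip to walk the lists in step instead of an index, and an in-memory io.StringIO buffer as a stand-in for a file opened for writing; the list contents are hypothetical sample values, and the three-decimal format is an assumption based on the timestamps in the question.

```python
import io

# Hypothetical sample values in the shape produced by the answer's loop.
start_list = [589.890, 600.490]
end_list = [599.630, 601.030]
s_name_list = ["larry_page", "helen_winkle"]

# Stand-in for a real file, e.g. open("out.txt", "w").
out = io.StringIO()

# zip pairs up the i-th element of each list, so no index is needed.
for start, end, name in zip(start_list, end_list, s_name_list):
    # The times are floats, so they need a float format such as {:.3f};
    # {:s} only accepts strings.
    out.write("{:.3f} {:.3f} {:s}\n".format(start, end, name))

print(out.getvalue(), end="")
```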