Python: load a list of lists from a file based on a grouping variable?

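If I have this file:

A pgm1
A pgm2
A pgm3
Z pgm4
Z pgm5
C pgm6
C pgm7
C pgm8
C pgm9

how do I create this list:

[['pgm1','pgm2','pgm3'],['pgm4','pgm5'],['pgm6','pgm7','pgm8','pgm9']]

I need to preserve the original order of the load file, so [pgm4,pgm5] must be the second sub-list.

My preference is that a new sub-list is triggered whenever the grouping variable changes from the previous one, e.g. "A, Z, C". But I could accept it if the grouping variable had to be sequential, e.g. "1, 2, 3".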
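One way to get "start a new sub-list whenever the key changes" is itertools.groupby, which groups consecutive items sharing a key. The sketch below is only an illustration under some assumptions: the function name group_programs is hypothetical, the lines are passed in as a list rather than read from a file, and every non-blank line is assumed to hold a key followed by a program name, separated by whitespace.

```python
from itertools import groupby


def group_programs(lines):
    """Group program names by consecutive runs of the same key.

    Assumes every non-blank line looks like "<key> <program>".
    """
    # Split on the first run of whitespace only, so the program part stays intact.
    rows = [line.split(None, 1) for line in lines if line.strip()]
    # groupby starts a new group each time the key differs from the previous row.
    return [[row[1].strip() for row in group]
            for _, group in groupby(rows, key=lambda row: row[0])]


sample = ["A pgm1", "A pgm2", "A pgm3", "Z pgm4", "Z pgm5",
          "C pgm6", "C pgm7", "C pgm8", "C pgm9"]
print(group_programs(sample))
```

Because groupby only looks at neighbouring rows, the sub-lists come out in file order without needing OrderedDict.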

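(This is to support running the programs in each sub-list concurrently, while waiting for all upstream programs to finish before moving on to the next list.)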
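The staged execution described here could look roughly like the following. This is a minimal sketch, not the poster's actual runner: run_in_stages is a hypothetical name, each program is assumed to be an argument list that subprocess.Popen accepts, and separate processes are used rather than threads. The demo stages just launch the current Python interpreter so the example runs anywhere.

```python
import subprocess
import sys


def run_in_stages(stages):
    """Run each stage's commands concurrently; finish a stage before the next.

    `stages` is a list of stages, each stage a list of argument lists.
    Returns the exit codes, grouped the same way as the input.
    """
    results = []
    for stage in stages:
        # Start every program in this stage without waiting for any of them.
        procs = [subprocess.Popen(cmd) for cmd in stage]
        # Block until all of them have finished, then move to the next stage.
        results.append([proc.wait() for proc in procs])
    return results


# Stand-in commands (assumption): trivial Python one-liners instead of pgm1..pgm9.
noop = [sys.executable, "-c", "pass"]
demo_stages = [[noop, noop], [noop]]
exit_codes = run_in_stages(demo_stages)
print(exit_codes)
```

A real runner would probably also want to stop, or at least report, when a program in an upstream stage exits with a non-zero code.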

I am using Python 2.6.6 on RHEL 2.6.32.

Simply use

Code:

Demo:


After my original post, further web searching turned up:

This is my current approach. Please let me know if it can be made more Pythonic.

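loadfile1.txt (no grouping variable; output is the same as for loadfile4.txt):

loadfile2.txt (arbitrary grouping variable):

loadfile3.txt (same grouping variable throughout; no dependencies; multi-threaded):

loadfile4.txt (a different grouping variable on every line; dependencies; single-threaded):

My Python script: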

#!/usr/bin/python

# See https://stackoverflow.com/questions/4842057/python-easiest-way-to-ignore-blank-lines-when-reading-a-file

from itertools import groupby

# convert file to list of lines, ignoring any blank lines
filename = 'loadfile2.txt'

with open(filename) as f_in:
    # a list comprehension yields a real list on both Python 2 and Python 3
    lines = [line.rstrip() for line in f_in if line.strip()]

print(lines)

# convert list to a list of lists, splitting on the first comma only
lines = [i.split(',', 1) for i in lines]
print(lines)

# create list of lists based on the key value (first item in sub-lists);
# groupby starts a new group whenever the key differs from the previous line
listofpgms = []
for key, group in groupby(lines, lambda x: x[0]):
    pgms = []
    for pgm in group:
        try:
            pgms.append(pgm[1].strip())
        except IndexError:
            # no comma on this line: the whole line is the program
            pgms.append(pgm[0].strip())

    listofpgms.append(pgms)

print(listofpgms)

Output when using loadfile2.txt:

['10, pgm1', '10, pgm2', '10, pgm3', 'ZZ, pgm4', 'ZZ, pgm5', '-5, pgm6', '-5, pgm7', '-5, pgm8', '-5, /a/path/with spaces/pgm9']
[['10', ' pgm1'], ['10', ' pgm2'], ['10', ' pgm3'], ['ZZ', ' pgm4'], ['ZZ', ' pgm5'], ['-5', ' pgm6'], ['-5', ' pgm7'], ['-5', ' pgm8'], ['-5', ' /a/path/with spaces/pgm9']]
[['pgm1', 'pgm2', 'pgm3'], ['pgm4', 'pgm5'], ['pgm6', 'pgm7', 'pgm8', '/a/path/with spaces/pgm9']]
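The behaviour described for the other load files (no key, one shared key, a different key per line) can be checked without touching the disk. The following is a sketch under assumptions: parse_groups is a hypothetical helper that mirrors the script's logic on an in-memory list of lines, and the short sample lists stand in for the real load files.

```python
from itertools import groupby


def parse_groups(lines):
    """Mirror the script: skip blanks, split on the first comma, group by key."""
    rows = [line.strip().split(',', 1) for line in lines if line.strip()]
    # row[0] is the key; for a line without a comma the whole line is the key.
    # row[-1] is the program whether or not a key was present.
    return [[row[-1].strip() for row in group]
            for _, group in groupby(rows, key=lambda row: row[0])]


no_key = ['pgm1', 'pgm2', '', 'pgm3']          # like loadfile1.txt
same_key = [',pgm1', ',pgm2', '', ',pgm3']     # like loadfile3.txt
distinct_keys = ['1, pgm1', '2, pgm2', '', '3, pgm3']  # like loadfile4.txt

print(parse_groups(no_key))
print(parse_groups(same_key))
print(parse_groups(distinct_keys))
```

With no key, each line acts as its own key, so every program lands in its own sub-list, which is why that case matches the one-key-per-line file. An empty key on every line puts all programs into a single sub-list.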

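Could you show what you have tried so far?

Before posting, I spent over an hour searching the web for "python file list". What stumped me was how to detect when the group changed. That said, from now on I will do my best to include sample code for what I have tried in all my SO posts.

I need to preserve the original order of the load file, so pgm4, pgm5 must be the second sub-list. I am on RHEL 2.6.32 with Python 2.6.6, so I don't have OrderedDict.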
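
infile = 'filename'  # placeholder path to the load file
with open(infile) as f:
    # split each non-blank line into [group, program] on whitespace
    a = [i.split() for i in f if i.strip()]

def orderset(seq):
    # return the unique items of seq in first-seen order (no OrderedDict needed)
    seen = set()
    seen_add = seen.add
    return [x for x in seq if not (x in seen or seen_add(x))]

l = []
for i in orderset([i[0] for i in a]):
    # gather every program whose group equals i, wherever it appears in the file
    l.append([j[1] for j in a if j[0] == i])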
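
The first-seen-order approach and itertools.groupby only differ when a key reappears after a different key. The sketch below illustrates that difference on a made-up three-row input; both function names are hypothetical, and the first-seen variant uses a plain dict plus an order list rather than the orderset helper.

```python
from itertools import groupby

# Made-up rows in which key 'A' appears, is interrupted by 'Z', then returns.
rows = [('A', 'pgm1'), ('Z', 'pgm2'), ('A', 'pgm3')]


def group_consecutive(rows):
    """New sub-list every time the key changes from the previous row."""
    return [[pgm for _, pgm in group]
            for _, group in groupby(rows, key=lambda row: row[0])]


def group_by_first_seen(rows):
    """One sub-list per distinct key, ordered by the key's first appearance."""
    order = []      # keys in the order they were first seen
    buckets = {}    # key -> list of programs
    for key, pgm in rows:
        if key not in buckets:
            buckets[key] = []
            order.append(key)
        buckets[key].append(pgm)
    return [buckets[key] for key in order]


print(group_consecutive(rows))
print(group_by_first_seen(rows))
```

For a file whose equal keys are always adjacent, as in the question, both give the same result. The consecutive version matches the stated preference of triggering a new sub-list on every key change.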

loadfile1.txt:
pgm1
pgm2
pgm3

pgm4
pgm5

pgm6
pgm7
pgm8
/a/path/with spaces/pgm9

loadfile2.txt:
10, pgm1
10, pgm2
10, pgm3

ZZ, pgm4
ZZ, pgm5

-5, pgm6
-5, pgm7
-5, pgm8
-5, /a/path/with spaces/pgm9

loadfile3.txt:
,pgm1
,pgm2
,pgm3

,pgm4
,pgm5

,pgm6
,pgm7
,pgm8
,/a/path/with spaces/pgm9

loadfile4.txt:
1, pgm1
2, pgm2
3, pgm3

4, pgm4
5, pgm5

6, pgm6
7, pgm7
8, pgm8
9, /a/path/with spaces/pgm9