Python: load a list of lists from a file based on a grouping variable?


If I have a file:

A pgm1
A pgm2
A pgm3
Z pgm4
Z pgm5
C pgm6
C pgm7
C pgm8
C pgm9

how can I create the list:

[['pgm1','pgm2','pgm3'],['pgm4','pgm5'],['pgm6','pgm7','pgm8','pgm9']]

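I need to preserve the original order of the load file, so ['pgm4', 'pgm5'] must be the second sub-list.

My preference is that a new sub-list is started whenever the grouping variable changes from the previous line's value, i.e. "A, Z, C". But I can accept it if the grouping variable has to be sequential, i.e. "1, 2, 3".

(This is to support running the programs in each sub-list concurrently, while waiting for all upstream programs to finish before moving on to the next list.)

I am using Python 2.6.6 on RHEL 2.6.32.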
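The goal described above (run every program in a sub-list concurrently, then wait for all of them before starting the next sub-list) could be sketched roughly as follows. This is only a sketch under assumptions: the function name `run_in_stages` is hypothetical, each program is given as an argument list suitable for `subprocess.Popen`, and exit codes are simply collected without any error handling.

```python
import subprocess
import sys


def run_in_stages(stages):
    """Run each stage's commands concurrently; finish a stage before starting the next.

    `stages` is a list of sub-lists; each entry of a sub-list is an argument
    list such as ['/path/to/pgm1'] (an assumption made for this sketch).
    Returns the exit codes, grouped the same way as the input.
    """
    exit_codes = []
    for stage in stages:
        # Start every program of this stage without waiting.
        procs = [subprocess.Popen(cmd) for cmd in stage]
        # Block until all of them have finished before moving on.
        exit_codes.append([proc.wait() for proc in procs])
    return exit_codes


# Demo with trivial Python child processes standing in for the real programs.
demo = [
    [[sys.executable, "-c", "print('stage 1, job a')"],
     [sys.executable, "-c", "print('stage 1, job b')"]],
    [[sys.executable, "-c", "print('stage 2')"]],
]
print(run_in_stages(demo))
```

The list-of-lists shape asked for in the question maps directly onto the `stages` argument, which is why preserving the file order of the sub-lists matters.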

Simply use

Code:

Demo:
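As a minimal sketch of one way to get the requested result, assuming `itertools.groupby` (available in Python 2.6) and assuming every non-blank line is a key followed by whitespace and a program name: the function name `load_groups` and the inline sample are hypothetical and only for illustration.

```python
from itertools import groupby


def load_groups(lines):
    """Group program names by runs of identical keys, preserving input order."""
    # Split each non-blank line into [key, program] on the first run of whitespace.
    rows = [line.split(None, 1) for line in lines if line.strip()]
    # groupby starts a new group each time the key differs from the previous row's key.
    return [[row[1].strip() for row in group]
            for _key, group in groupby(rows, key=lambda row: row[0])]


sample = """\
A pgm1
A pgm2
A pgm3
Z pgm4
Z pgm5
C pgm6
C pgm7
C pgm8
C pgm9
""".splitlines()

print(load_groups(sample))
# [['pgm1', 'pgm2', 'pgm3'], ['pgm4', 'pgm5'], ['pgm6', 'pgm7', 'pgm8', 'pgm9']]
```

To read from a real file, pass the open file object instead of `sample`, since a file iterates over its lines. Because `groupby` only compares each key with the previous one, the keys do not need to be sorted or sequential.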

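After posting my question, further web searches turned up:

This is my current approach. Please let me know if I can make it more Pythonic.

loadfile1.txt (no grouping variable; output is the same as for loadfile4.txt):

loadfile2.txt (arbitrary grouping variables):

loadfile3.txt (same grouping variable throughout: no dependencies, multi-threaded):

loadfile4.txt (a different grouping variable on every line: dependencies, single-threaded):

My Python script: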

#!/usr/bin/python

# See https://stackoverflow.com/questions/4842057/python-easiest-way-to-ignore-blank-lines-when-reading-a-file

from itertools import groupby

# convert file to list of lines, ignoring any blank lines
filename = 'loadfile2.txt'

with open(filename) as f_in:
    # list() keeps this working on Python 3 as well, where filter() is lazy
    lines = list(filter(None, (line.rstrip() for line in f_in)))

print(lines)

# convert list to a list of lists split on comma
lines = [i.split(',') for i in lines]
print(lines)

# create list of lists based on the key value (first item in sub-lists);
# groupby starts a new group whenever the key differs from the previous line's key
listofpgms = []
for key, group in groupby(lines, lambda x: x[0]):
    pgms = []
    for pgm in group:
        try:
            pgms.append(pgm[1].strip())
        except IndexError:
            # no comma on this line: the line itself is the program name
            pgms.append(pgm[0].strip())

    listofpgms.append(pgms)

print(listofpgms)

Output when using loadfile2.txt:

['10, pgm1', '10, pgm2', '10, pgm3', 'ZZ, pgm4', 'ZZ, pgm5', '-5, pgm6', '-5, pgm7', '-5, pgm8', '-5, /a/path/with spaces/pgm9']
[['10', ' pgm1'], ['10', ' pgm2'], ['10', ' pgm3'], ['ZZ', ' pgm4'], ['ZZ', ' pgm5'], ['-5', ' pgm6'], ['-5', ' pgm7'], ['-5', ' pgm8'], ['-5', ' /a/path/with spaces/pgm9']]
[['pgm1', 'pgm2', 'pgm3'], ['pgm4', 'pgm5'], ['pgm6', 'pgm7', 'pgm8', '/a/path/with spaces/pgm9']]
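On the question of making the script more Pythonic, one possible tightening is sketched below. It assumes the same comma-separated layout as the load files and keeps the rule that a line without a comma is its own key. The names `parse_line` and `group_programs` are hypothetical. One deliberate difference from the script: `str.partition` splits on the first comma only, so a program path that itself contains a comma is kept whole.

```python
from itertools import groupby


def parse_line(line):
    """Return (key, program) for one non-blank line."""
    key, sep, pgm = line.partition(',')
    if not sep:
        # No comma: the whole line is the program and also serves as its key.
        return line, line.strip()
    return key, pgm.strip()


def group_programs(lines):
    """Group programs by runs of equal keys, skipping blank lines."""
    parsed = [parse_line(line.rstrip()) for line in lines if line.strip()]
    return [[pgm for _k, pgm in run]
            for _key, run in groupby(parsed, key=lambda pair: pair[0])]


loadfile2 = [
    '10, pgm1', '10, pgm2', '10, pgm3', '',
    'ZZ, pgm4', 'ZZ, pgm5', '',
    '-5, pgm6', '-5, pgm7', '-5, pgm8', '-5, /a/path/with spaces/pgm9',
]
print(group_programs(loadfile2))
# [['pgm1', 'pgm2', 'pgm3'], ['pgm4', 'pgm5'], ['pgm6', 'pgm7', 'pgm8', '/a/path/with spaces/pgm9']]
```

Passing an open file object works the same way as passing a list of strings, since both iterate line by line.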

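Could you show what you have tried so far?

Before posting, I spent over an hour searching the web for "python file list". What stumped me was how to detect when the group changes. That said, in future I will do my best to include sample code showing what I have tried in all my SO posts.

I need to preserve the original order of the load file, so pgm4, pgm5 must be the second sub-list. I am on RHEL 2.6.32 with Python 2.6.6, so I don't have OrderedDict.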

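infile = 'filename'
with open(infile) as f:
    # strip the trailing newline (and surrounding whitespace) from every line
    a = [i.strip() for i in f]

# split each line into [key, program] on whitespace, skipping blank lines
a = [i.split() for i in a if i]

def orderset(seq):
    # return the unique items of seq in order of first appearance
    seen = set()
    seen_add = seen.add
    return [x for x in seq if not (x in seen or seen_add(x))]

# build one sub-list per distinct key, in the order the keys first appear
l = []
for i in orderset([i[0] for i in a]):
    l.append([j[1] for j in a if j[0] == i])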
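Note that the snippet above collects all lines sharing a key into one sub-list, ordered by the key's first appearance, which is not the same as starting a new sub-list whenever the key changes. The sketch below contrasts the two behaviours on a key that reappears later in the file. The function names are hypothetical, and the dict-plus-list approach is just one single-pass way to get first-seen ordering without `OrderedDict`.

```python
from itertools import groupby


def group_by_first_seen(pairs):
    """One sub-list per distinct key, ordered by the key's first appearance."""
    order = []    # keys in the order they are first seen
    groups = {}   # key -> list of values
    for key, value in pairs:
        if key not in groups:
            groups[key] = []
            order.append(key)
        groups[key].append(value)
    return [groups[key] for key in order]


def group_by_change(pairs):
    """A new sub-list every time the key differs from the previous pair's key."""
    return [[value for _k, value in run]
            for _key, run in groupby(pairs, key=lambda pair: pair[0])]


pairs = [("A", "pgm1"), ("Z", "pgm2"), ("A", "pgm3")]
print(group_by_first_seen(pairs))  # [['pgm1', 'pgm3'], ['pgm2']]
print(group_by_change(pairs))      # [['pgm1'], ['pgm2'], ['pgm3']]
```

For the sample file in the question the two give the same result, because each key occupies a single contiguous run.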

loadfile1.txt:

pgm1
pgm2
pgm3

pgm4
pgm5

pgm6
pgm7
pgm8
/a/path/with spaces/pgm9

loadfile2.txt:

10, pgm1
10, pgm2
10, pgm3

ZZ, pgm4
ZZ, pgm5

-5, pgm6
-5, pgm7
-5, pgm8
-5, /a/path/with spaces/pgm9

loadfile3.txt:

,pgm1
,pgm2
,pgm3

,pgm4
,pgm5

,pgm6
,pgm7
,pgm8
,/a/path/with spaces/pgm9

loadfile4.txt:

1, pgm1
2, pgm2
3, pgm3

4, pgm4
5, pgm5

6, pgm6
7, pgm7
8, pgm8
9, /a/path/with spaces/pgm9
