Python: load a list of lists from a file based on a grouping variable?
If I have the file:
A pgm1
A pgm2
A pgm3
Z pgm4
Z pgm5
C pgm6
C pgm7
C pgm8
C pgm9
How can I create the list:
[['pgm1','pgm2','pgm3'],['pgm4','pgm5'],['pgm6','pgm7','pgm8','pgm9']]
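I need to preserve the original order from the load file, so ['pgm4', 'pgm5'] must be the second sub-list.
My preference is that a new sub-list is triggered whenever the grouping variable changes from the previous line's value, i.e. "A, Z, C". But I could accept a requirement that the grouping variable be sequential, i.e. "1, 2, 3".
(This is to support running the programs in each sub-list concurrently, while waiting for all upstream programs to finish before moving on to the next list.)
I'm using Python 2.6.6 on RHEL 2.6.32.
Simply use: Code: Demo: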
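A minimal sketch of the grouping behaviour described in the question, assuming a whitespace-separated file with the grouping variable in the first column and a program on every line. `itertools.groupby` starts a new group each time the key differs from the previous line's key, which matches the "A, Z, C" trigger, and it is available in Python 2.6. The helper name `group_programs` and the inline sample data are illustrative, not part of the original post.

```python
from itertools import groupby

def group_programs(lines):
    """Return one sub-list of programs per consecutive run of the same key."""
    # Split each non-blank line into [key, program] on the first run of whitespace.
    # Assumption: every non-blank line has both a key and a program.
    rows = [line.split(None, 1) for line in lines if line.strip()]
    return [[pgm.strip() for _, pgm in run]
            for _, run in groupby(rows, key=lambda row: row[0])]

# Inline stand-in for the file shown in the question.
sample = """A pgm1
A pgm2
A pgm3
Z pgm4
Z pgm5
C pgm6
C pgm7
C pgm8
C pgm9
""".splitlines()

result = group_programs(sample)
print(result)
```

To read from a real file, pass the open file object instead of `sample`, since `group_programs` only needs an iterable of lines.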
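After my original post, further web searching found what I needed. This is my current approach. Please let me know if I can make it more Pythonic.
The four test files are:
loadfile1.txt (no grouping variable; output is the same as for loadfile4.txt)
loadfile2.txt (random grouping variables)
loadfile3.txt (same grouping variable; no dependencies; multi-threaded)
loadfile4.txt (different grouping variables; dependencies; single-threaded)
My Python script: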
#!/usr/bin/python
# See https://stackoverflow.com/questions/4842057/python-easiest-way-to-ignore-blank-lines-when-reading-a-file
from itertools import groupby

# convert file to list of lines, ignoring any blank lines
filename = 'loadfile2.txt'
with open(filename) as f_in:
    lines = [line.rstrip() for line in f_in if line.strip()]
print(lines)

# convert list to a list of lists split on comma
lines = [i.split(',') for i in lines]
print(lines)

# create list of lists based on the key value (first item in sub-lists)
listofpgms = []
for key, group in groupby(lines, lambda x: x[0]):
    pgms = []
    for pgm in group:
        try:
            pgms.append(pgm[1].strip())
        except IndexError:
            # no comma on this line, so the whole line is the program
            pgms.append(pgm[0].strip())
    listofpgms.append(pgms)
print(listofpgms)
Output when using loadfile2.txt:
['10, pgm1', '10, pgm2', '10, pgm3', 'ZZ, pgm4', 'ZZ, pgm5', '-5, pgm6', '-5, pgm7', '-5, pgm8', '-5, /a/path/with spaces/pgm9']
[['10', ' pgm1'], ['10', ' pgm2'], ['10', ' pgm3'], ['ZZ', ' pgm4'], ['ZZ', ' pgm5'], ['-5', ' pgm6'], ['-5', ' pgm7'], ['-5', ' pgm8'], ['-5', ' /a/path/with spaces/pgm9']]
[['pgm1', 'pgm2', 'pgm3'], ['pgm4', 'pgm5'], ['pgm6', 'pgm7', 'pgm8', '/a/path/with spaces/pgm9']]
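The question says the sub-lists exist so that the programs in each one can run concurrently, with every program finishing before the next sub-list starts. One way that could look is sketched below, using only `subprocess.Popen`, which is available in Python 2.6. The function name `run_batches` and the stand-in commands are hypothetical; real entries would be the program paths from the load file.

```python
import subprocess
import sys

def run_batches(batches):
    """Run each batch's commands in parallel; finish a batch before starting the next."""
    exit_codes = []
    for batch in batches:
        # Start every program in the batch without waiting.
        procs = [subprocess.Popen(cmd) for cmd in batch]
        # Block until all of them have finished, collecting their exit codes.
        exit_codes.append([proc.wait() for proc in procs])
    return exit_codes

# Hypothetical stand-in commands: tiny Python one-liners instead of real programs.
batches = [
    [[sys.executable, '-c', 'pass'], [sys.executable, '-c', 'pass']],
    [[sys.executable, '-c', 'import sys; sys.exit(3)']],
]
exit_codes = run_batches(batches)
print(exit_codes)
```

This sketch keeps going after a failure; whether a non-zero exit code should stop the later batches is a decision the original post does not address.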
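Can you show what you have tried so far?
Before posting, I spent over an hour on web searches for "python file list". What stumped me was how to detect when the group changes. That said, in future I'll do my best to include the sample code I've tried in all my SO posts.
I need to preserve the original order from the load file, so pgm4, pgm5 must be the second sub-list. I'm using Python 2.6.6 on RHEL 2.6.32, so I don't have OrderedDict.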
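Detecting when the group changes does not need `OrderedDict` or any import; remembering the previous key is enough. A small sketch follows, assuming the lines have already been split into (key, program) pairs. The name `split_on_key_change` and the sample rows are made up for illustration.

```python
def split_on_key_change(rows):
    """rows: iterable of (key, program) pairs in file order."""
    groups = []
    previous = object()  # sentinel that never equals a real key
    for key, pgm in rows:
        if key != previous:
            groups.append([])  # key changed, so start a new sub-list
            previous = key
        groups[-1].append(pgm)
    return groups

rows = [('A', 'pgm1'), ('A', 'pgm2'), ('Z', 'pgm4'), ('C', 'pgm6'), ('C', 'pgm7')]
groups = split_on_key_change(rows)
print(groups)
```

Because each sub-list is appended as soon as its first row is seen, file order is preserved without any sorting or ordered mapping.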
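infile = 'filename'
with open(infile) as f:
    a = [i.strip() for i in f]
# split each non-blank line into [key, program] on whitespace
a = [i.split() for i in a if i]

def orderset(seq):
    # unique items of seq, in order of first appearance
    seen = set()
    seen_add = seen.add
    return [x for x in seq if not (x in seen or seen_add(x))]

# one sub-list per distinct key, in order of first appearance
l = []
for i in orderset([i[0] for i in a]):
    l.append([j[1] for j in a if j[0] == i])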
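The ordered-set approach and `itertools.groupby` agree as long as each key appears in one contiguous run, but they differ when a key comes back later in the file. The sketch below shows the difference on made-up rows in which key A recurs; the variable names are illustrative.

```python
from itertools import groupby

rows = [('A', 'pgm1'), ('Z', 'pgm2'), ('A', 'pgm3')]  # key A appears twice

# Ordered unique keys: every row for a key lands in one sub-list.
seen = []
for key, _ in rows:
    if key not in seen:
        seen.append(key)
by_key = [[pgm for k, pgm in rows if k == key] for key in seen]

# groupby: one sub-list per consecutive run of the same key.
by_run = [[pgm for _, pgm in run]
          for _, run in groupby(rows, key=lambda row: row[0])]

print(by_key)  # [['pgm1', 'pgm3'], ['pgm2']]
print(by_run)  # [['pgm1'], ['pgm2'], ['pgm3']]
```

If a new sub-list should start every time the grouping variable changes from the previous line, the run-based result is the one that matches.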
loadfile1.txt:
pgm1
pgm2
pgm3
pgm4
pgm5
pgm6
pgm7
pgm8
/a/path/with spaces/pgm9

loadfile2.txt:
10, pgm1
10, pgm2
10, pgm3
ZZ, pgm4
ZZ, pgm5
-5, pgm6
-5, pgm7
-5, pgm8
-5, /a/path/with spaces/pgm9

loadfile3.txt:
,pgm1
,pgm2
,pgm3
,pgm4
,pgm5
,pgm6
,pgm7
,pgm8
,/a/path/with spaces/pgm9

loadfile4.txt:
1, pgm1
2, pgm2
3, pgm3
4, pgm4
5, pgm5
6, pgm6
7, pgm7
8, pgm8
9, /a/path/with spaces/pgm9