Python 3.x: Summing numbers by their preceding identifier in a messy file

My file is messily formatted:

a11      0.0
a12    132.0
b13      0.0
b42    584.0
randomstuff
etc
a11      0.0
a12      6.0
b13    138.0
b42      6.0

There are thousands of a###########################. I want to add up all the numbers for each item, so that I am left with only:

a11, 0
a12, 138
b13, 138
b42, 590

I need some way to generate each identifier (a11, a12, etc.), because there are thousands of different identifiers.

To generate all the combinations, a simple approach is three nested loops:

for letter in 'abcdefghijklmnopqrstuvwxyz':
    for digit1 in '0123456789':
        for digit2 in '0123456789':
            print(letter + digit1 + digit2)

This prints every code in order, from the first to the last:

a00
...
z99
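
The full set of 26 x 10 x 10 codes can also be built without nesting the loops by hand, for example with itertools.product from the standard library. A minimal sketch; the function name all_codes is made up for illustration and is not part of the original answer:

```python
import itertools
import string


def all_codes():
    """Return every letter+digit+digit code, from 'a00' to 'z99'."""
    return [
        letter + d1 + d2
        for letter, d1, d2 in itertools.product(
            string.ascii_lowercase, string.digits, string.digits
        )
    ]


codes = all_codes()
print(len(codes), codes[0], codes[-1])  # 2600 a00 z99
```

Taking the alphabet from string.ascii_lowercase avoids typing it out, which makes it harder to drop a letter by accident.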

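But to parse this data, it is probably easier to just check whether each input line matches the expected format, and then collect the values in a dictionary: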

code_sums = {}  # maps identifier (e.g. "a11") -> running total

with open("input_file.txt", "rt") as fin:
    for row in fin:
        # split() with no argument splits on any run of spaces or tabs
        fields = row.split()
        # only accept lines with exactly two values: a code and a number
        if len(fields) != 2:
            continue
        code, value = fields
        # the code must be one lowercase letter followed by two digits
        if (len(code) == 3 and
                code[0] in 'abcdefghijklmnopqrstuvwxyz' and
                code[1].isdigit() and
                code[2].isdigit()):
            try:
                float_val = float(value)
            except ValueError:
                continue  # probably a malformed input line
            # looks like we have valid input, tally the value
            code_sums[code] = code_sums.get(code, 0.0) + float_val
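
The same check-then-tally idea can be written more compactly with a regular expression and collections.defaultdict. This is only a sketch under stated assumptions: the names LINE_PATTERN and sum_by_code are hypothetical, and the pattern assumes values are plain decimals such as 132.0, so it is stricter than float(), which also accepts forms like 1e3.

```python
import re
from collections import defaultdict

# One lowercase letter, two digits, whitespace, then a plain decimal number.
# Assumption: values never use exponent notation or a leading "+".
LINE_PATTERN = re.compile(r"^([a-z][0-9]{2})\s+(-?[0-9]+(?:\.[0-9]+)?)$")


def sum_by_code(lines):
    """Add up the values for each identifier, skipping lines that do not match."""
    totals = defaultdict(float)
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        if match:
            code, value = match.groups()
            totals[code] += float(value)
    return dict(totals)


# The sample data from the question, including the two junk lines.
sample = [
    "a11      0.0", "a12    132.0", "b13      0.0", "b42    584.0",
    "randomstuff", "etc",
    "a11      0.0", "a12      6.0", "b13    138.0", "b42      6.0",
]
print(sum_by_code(sample))
# {'a11': 0.0, 'a12': 138.0, 'b13': 138.0, 'b42': 590.0}
```

Because only identifiers that actually appear in the input become keys, there is no need to generate the list of possible codes first.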

# To inspect the totals on screen instead:
# for code in code_sums:
#     print("%s -> %7.1f" % (code, code_sums[code]))

# the with-block closes the file even if a write fails
with open("output_file.csv", "wt") as fout:
    fout.write("Code,Sum\n")
    for code in code_sums:
        # %.1f rather than %7.1f, so the CSV field is not padded with spaces
        fout.write("%s,%.1f\n" % (code, code_sums[code]))
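
For the CSV step, the standard csv module is an alternative to formatting each row by hand. A minimal sketch, assuming the totals are already in a dictionary; the helper name write_sums and the sample values passed to it are illustrative only:

```python
import csv


def write_sums(code_sums, path):
    """Write one 'code,sum' row per identifier, sorted by identifier."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="") as fout:
        writer = csv.writer(fout)
        writer.writerow(["Code", "Sum"])
        for code in sorted(code_sums):
            writer.writerow([code, "%.1f" % code_sums[code]])


# Sample totals taken from the question's expected result.
write_sums({"b42": 590.0, "a11": 0.0, "a12": 138.0, "b13": 138.0}, "output_file.csv")
```

Sorting the keys is a design choice here so that the rows come out in a predictable order regardless of the order in which identifiers were first seen.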

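It sounds like groupby is all you need.

Do you want to parse this input file, or just generate all the different [a-z][0-9][0-9] combinations?

@Kingsley I want to 1) get all the identifier combinations, not every [a-z][0-9][0-9] combination, and then 2) use those to sum all the different values for each identifier in my file.

Instead of printing the output, can I write it to a .csv, with "a11, 0" etc. on each line?

... and with open('output.txt', 'a') as _file: _file.write(code, code_sums[code]) doesn't work :( Otherwise this is great!

@Liquidity - done.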
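
The attempt in the comment fails because a file object's write() takes exactly one string, so passing code and code_sums[code] as two arguments raises a TypeError. A minimal sketch of one way around it is to format each row into a single string first. The sample dictionary stands in for the real totals, and the "a11, 0" layout follows the format requested in the question:

```python
code_sums = {"a11": 0.0, "a12": 138.0}  # stand-in for the real totals

# "w" empties the file on each run; the "a" mode from the comment would
# append to whatever is already there.
with open("output.txt", "w") as out_file:
    for code, total in code_sums.items():
        # write() accepts one string, so build the whole line before writing it.
        # %.0f rounds to a whole number, which assumes fractions do not matter.
        out_file.write("%s, %.0f\n" % (code, total))
```

With the sample values this writes the lines "a11, 0" and "a12, 138".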