Python: Summing numbers across different files


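I know my question looks like it already has an answer, but what I found in other threads is not what I need. Here it is:
I have 200 files, each with 800 lines. Every line of a file contains 800 numbers. In short, all the files have exactly the same format. To keep it simple, let's say my files look like this:

File 1:

28 56 72 50 01
65 41 20 18 00

File 2:

01 32 09 05 42
00 23 14 52 99

What I need is the sum of the numbers at the same position in every file, which means I want an output file like this:

Output:

29 88 81 55 43
65 64 34 70 99

So far, my idea was to write each line to a separate file, but that would take up a lot of space.

I don't know how to do this. If anyone has a suggestion, I'm open to it. Thanks.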

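Assuming you know the file format in advance and have a list of file names, you can simply iterate over the files and accumulate the sums in a list of lists of the right size: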

nrows, ncols = 2, 5          # 800, 800 in your real code
sums = [[0] * ncols for _ in range(nrows)]

file_names = ["file1.txt", "file2.txt"]
for file_name in file_names:
    with open(file_name) as f:
        for i, row in enumerate(f):
            for j, col in enumerate(row.split()):
                sums[i][j] += int(col)

for row in sums:
    print(*row)
# 29 88 81 55 43
# 65 64 34 70 99
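
Since the question asks for an output file rather than printed rows, the same accumulation could be wrapped in a function and followed by a small writer. This is only a sketch: the helper names `sum_files` and `write_sums` and the output format (plain integers separated by single spaces, without zero padding) are assumptions for illustration, not part of the answer above.

```python
def sum_files(file_names, nrows, ncols):
    """Return an nrows x ncols list of position-wise sums over all files."""
    sums = [[0] * ncols for _ in range(nrows)]
    for file_name in file_names:
        with open(file_name) as f:
            for i, row in enumerate(f):
                for j, col in enumerate(row.split()):
                    sums[i][j] += int(col)
    return sums


def write_sums(sums, out_name):
    """Write one space-separated line per row of sums."""
    with open(out_name, "w") as out:
        for row in sums:
            out.write(" ".join(str(value) for value in row) + "\n")
```

Only one row of one file plus the running totals is held in memory at a time, so 200 files of 800 x 800 numbers should not need any intermediate files.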

Or use numpy:
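
A minimal sketch of what a numpy-based variant might look like, assuming every file is a purely numeric, whitespace-separated table and all files have the same shape. `np.loadtxt` parses each file into a 2-D array and the arrays are added element-wise. The function name `sum_with_numpy` is a hypothetical helper, not code taken from the answer.

```python
import numpy as np


def sum_with_numpy(file_names):
    """Add up equally shaped numeric text files element-wise."""
    total = None
    for file_name in file_names:
        # ndmin=2 keeps a 2-D shape even if a file has a single row.
        data = np.loadtxt(file_name, dtype=int, ndmin=2)
        total = data if total is None else total + data
    return total
```

The result could then be written out with `np.savetxt(out_name, total, fmt="%d")`, where `out_name` is whatever output path you choose.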

Use numpy.

Example:

import os
import numpy as np


result = {}
base_path = r"PATH_TO_FILES"
for filename in os.listdir(base_path):               # iterate over every file in the folder
    filename = os.path.join(base_path, filename)
    with open(filename) as infile:                   # open the file for reading
        for i, line in enumerate(infile):
            if i not in result:
                # first file: store the row as an integer array
                result[i] = np.array(line.split(), dtype=int)
            else:
                # later files: add the row element-wise to the running sum
                result[i] = result[i] + np.array(line.split(), dtype=int)

for k, v in result.items():
    print(v)

Output:

[29 88 81 55 43]
[65 64 34 70 99]


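First, you can load a single file to get the structure of the files. This also handles the case where not all rows have the same number of observations. Then, following that structure, add the individual values from all the remaining files, row by row and position by position: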

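further_files = ['file 2']

# Read the first file to establish the row structure.
sums = []
with open('file 1') as file:
    for row in file:
        sums.append([int(x) for x in row.split()])

# Add the values of every further file, position by position.
for file in further_files:
    with open(file) as open_file:
        for i, row in enumerate(open_file):
            sums[i] = [x + int(y) for x, y in zip(sums[i], row.split())]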


When I run the code, I get this error:

NameError: name 'further_files' is not defined

Yes, you need to define further_files as a list of the additional files you want to read.

Thanks, it works perfectly. Another question: what if every line starts with a string, followed by the numbers?

@Peter Then the numpy version no longer works (but there is numpy.genfromtxt; IIRC it is mentioned in the documentation I linked above). In any case, it depends: is it always the same string, is there a separator between the string and the numbers, does the string contain spaces, does it follow some regular structure, does it have a fixed length?

@Peter: Then the simplest way is to replace for j, col in enumerate(row.split()): with for j, col in enumerate(row[k:].split()):, where k is the index of the first character of the first numeric value (i.e. for the case foo:28 56 72 50 01, k=4), or with for j, col in enumerate(row.split()[1:]):, if there is whitespace between the string and the first value.

Thank you very much, your solution is exactly what I needed.
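
The two substitutions suggested in the comments can be sketched as one small parsing helper. The function name `parse_row` and its `skip_chars` parameter are hypothetical. The sketch assumes either a fixed-width label of `skip_chars` characters, or a label that contains no spaces and is separated from the numbers by whitespace.

```python
def parse_row(row, skip_chars=None):
    """Return the integers in a row that starts with a text label.

    If skip_chars is given, the first skip_chars characters are cut off
    (fixed-width label). Otherwise the first whitespace-separated token
    is treated as the label and dropped.
    """
    if skip_chars is not None:
        fields = row[skip_chars:].split()
    else:
        fields = row.split()[1:]
    return [int(field) for field in fields]
```

In the accumulation loop, `enumerate(row.split())` would then become `enumerate(parse_row(row))`, and the values arrive already converted to integers.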