Data extraction and summation using Python 2.7

Tags: python, python-2.7

In a file named data.txt I have the following data:

03/05/2016 16:43  502
03/05/2016 16:43  502
03/05/2016 16:44  501
03/05/2016 16:44  504
03/05/2016 16:44  505
03/05/2016 16:44  506
04/05/2016 16:44  501
04/05/2016 16:45  501
04/05/2016 16:45  501
04/05/2016 16:45  52
04/05/2016 17:08  50
05/05/2016 17:08  502
05/05/2016 17:08  503
05/05/2016 17:08  504
05/05/2016 17:09  506
06/05/2016 17:09  507
06/05/2016 17:09  507
07/05/2016 17:09  508
07/05/2016 17:09  50
08/05/2016 17:10  5
08/05/2016 17:10  504
09/05/2016 17:10  504
09/05/2016 17:10  503
09/05/2016 17:10  503
10/05/2016 17:11  505
10/05/2016 17:11  505

I want to do some math on it so that I end up with the following result:

03/05/2016   3020
04/05/2016   1605
05/05/2016   2015
06/05/2016   5023
07/05/2016   1014
08/05/2016   558
09/05/2016   5023
10/05/2016   5022

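The second column is the sum of the values.

The result should be stored in another text file, e.g. data1.txt.

I want to write this code in Python 2.7.

How can I implement this?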

You can use Counter to sum the values for a given date:

from collections import Counter

with open('data.txt') as f:
    res = sum((Counter({d: int(c)}) for d, t, c in (line.split() for line in f)), Counter())

with open('data1.txt', 'wb') as f:
    f.writelines('{0}\t{1}\n'.format(*x) for x in sorted(res.items()))

Output:

03/05/2016  3020
04/05/2016  1605
05/05/2016  2015
06/05/2016  1014
07/05/2016  558
08/05/2016  509
09/05/2016  1510
10/05/2016  1010

This solution doesn't require any libraries outside the standard Python installation.
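
The one-liner above assumes that every line splits into exactly three fields; a blank line (for instance a trailing one) makes the tuple unpacking fail with ValueError. Below is a minimal sketch of a more tolerant variant. The helper name sum_by_date is hypothetical, and silently skipping malformed lines is an assumption about the desired behaviour, not part of the original answer.

```python
from collections import Counter


def sum_by_date(lines):
    """Sum the third field per date, ignoring lines without exactly three fields."""
    totals = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            # Blank or malformed line: skip it (assumed behaviour).
            continue
        date, _time, value = fields
        totals[date] += int(value)
    return totals


# A few sample lines, including a blank one that would break the one-liner.
sample = [
    "03/05/2016 16:43  502\n",
    "03/05/2016 16:44  501\n",
    "\n",
    "04/05/2016 16:45  52\n",
]
totals = sum_by_date(sample)
```

With a real file, the open file object can be passed directly, e.g. totals = sum_by_date(f) inside a with open('data.txt') as f: block.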

Pure Python solution:

import collections

data=collections.defaultdict(int)
with open('data.txt', 'r') as f:
    for line in f:
        row=line.split()
        data[row[0]]+=int(row[2])

with open('data1.txt', 'w') as f:
    for key, value in sorted(data.items()):
        f.write(str(key)+" "+str(value)+"\n")

Output:

$ python a.py 
$ cat data1.txt 
03/05/2016 3020
04/05/2016 1605
05/05/2016 2015
06/05/2016 1014
07/05/2016 558
08/05/2016 509
09/05/2016 1510
10/05/2016 1010
$
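
sorted(data.items()) compares the date strings character by character. That gives the right order here because every date falls in the same month, but with day-first strings it would place 01/06/2016 before 03/05/2016. Below is a sketch that sorts by the parsed date instead. It assumes the dates in data.txt are in day/month/year format, and the helper name sorted_by_date is made up for illustration.

```python
from datetime import datetime


def sorted_by_date(totals):
    """Return (date_string, total) pairs in chronological order."""
    return sorted(
        totals.items(),
        # Assumes dd/mm/yyyy; change the format string if the data differs.
        key=lambda item: datetime.strptime(item[0], "%d/%m/%Y"),
    )


# Made-up totals spanning two months to show the difference.
totals = {"01/06/2016": 7, "03/05/2016": 3020, "10/05/2016": 1010}
ordered = sorted_by_date(totals)
# ordered == [("03/05/2016", 3020), ("10/05/2016", 1010), ("01/06/2016", 7)]
```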

You can use the following:

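from collections import OrderedDict

# OrderedDict keeps the dates in the order they first appear in the file.
res = OrderedDict()
f = open('data.txt')
for line in f:
    # Splitting on a single space turns the two spaces before the value
    # into an empty field, so a data line yields four items.
    values = line.split(' ')
    if len(values) == 4:
        date = values[0]
        val = values[3]
        if res.get(date):
            res[date] += int(val)
        else:
            res[date] = int(val)

f.close()

f = open('data1.txt', 'w')
for line in res.keys():
    f.write('{} {}\n'.format(line, res[line]))
f.close()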

Comments:

Can you elaborate on "some math"?

I want to sum the third column of data.txt; the second column of the output should be the sum of the third column of data.txt.

Do the output dates have to be sorted?

If you sum the third column, you get a single number. If you mean the sum grouped by date, how did you get 502 for 07/05/2016?

Sir, I want to add up the values in column 3.

Traceback (most recent call last):
  File "C:/Users/GhOsT/Desktop/oo.py", line 4
    res = sum((Counter({d: int(c)}) for d, t, c in (line.split() for line in f)), Counter())
  File "C:/Users/GhOsT/Desktop/oo.py", line 4
    res = sum((Counter({d: int(c)}) for d, t, c in (line.split() for line in f)), Counter())
ValueError: need more than 0 values to unpack

@DarrenSam: Did you try it with the same input as provided in the question? If not, please save the sample input to a file and try it with that. I have tested this with Python 2.7 and it works fine for me.

Traceback (most recent call last):
  File "C:/Users/GhOsT/Desktop/oo.py", line 7
    data[row[0]]+=int(row[2])
IndexError: list index out of range

@DarrenSam: With version 2.7.6 and the data you posted, it works fine here.

Sir, in "if res.get(date): res[date] += int(val) else: res[date] = int(val)", can you explain what res.get(date) does?

You can replace it with has_key. get returns the value from the dict by key, or None if the key does not exist.
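
The behaviour of get described above can be checked directly. One detail worth noting: if res.get(date) tests the truthiness of the stored value, not whether the key exists, so a running total of 0 is treated like a missing key. In that script both branches then give the same result, so it is harmless, but a membership test with in states the intent more directly (has_key exists only in Python 2 and was removed in Python 3). A small sketch with made-up values:

```python
res = {"03/05/2016": 0}

# get() returns the stored value, or None when the key is absent.
present = res.get("03/05/2016")   # 0, which is falsy
missing = res.get("04/05/2016")   # None

# A membership test checks the key itself, regardless of the value.
has_first = "03/05/2016" in res   # True
has_second = "04/05/2016" in res  # False

# get() with a default collapses the if/else into a single line.
res["04/05/2016"] = res.get("04/05/2016", 0) + 501
```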