Grouping data in Python

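I have the following (space-delimited) input:

I want to produce the following pipe-delimited output (where the columns are [date and name, number of entries, sum of the last column]):

This is my current code: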

s = open("output.txt", "r")
fn = s.readlines()
d = {}
for line in fn:
    parts = line.split()
    if parts[0] in d:
        d[parts[0]][1] += int(parts[2])
        d[parts[0]][2] += 1
    else:
        d[parts[0]] = [parts[1], int(parts[2]), 1]
for date in sorted(d):
    print("%s %s|%d|%d" % (date, d[date][0], d[date][2], d[date][1]))

The output I get is:

2012-10-06 PETER|2|70

instead of

2012-10-06 PETER|1|60

and TOM does not appear in the list.

What do I need to change to correct my code?

What is wrong with the current code?
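The posted code keys the dictionary on `parts[0]`, the date alone, so every name that shares a date is folded into the first name seen for that date. The sketch below reproduces that behaviour next to a version keyed on `(date, name)`. The `SAMPLE` lines and both function names are hypothetical: the original input is not shown, so the sample was chosen only to be consistent with the reported output.

```python
# Hypothetical sample lines; the question's real input is not shown.
SAMPLE = [
    "2012-10-06 PETER 60",
    "2012-10-06 TOM 10",
]


def group_by_date_only(lines):
    """Mirror the question's logic: the dict key is the date alone."""
    d = {}
    for line in lines:
        date, name, val = line.split()
        if date in d:
            # A second name on the same date lands in the first name's entry.
            d[date][1] += int(val)
            d[date][2] += 1
        else:
            d[date] = [name, int(val), 1]
    return ["%s %s|%d|%d" % (date, d[date][0], d[date][2], d[date][1])
            for date in sorted(d)]


def group_by_date_and_name(lines):
    """Key on (date, name) so each name gets its own count and total."""
    d = {}
    for line in lines:
        date, name, val = line.split()
        entry = d.setdefault((date, name), [0, 0])  # [count, total]
        entry[0] += 1
        entry[1] += int(val)
    return ["%s %s|%d|%d" % (date, name, count, total)
            for (date, name), (count, total) in sorted(d.items())]


print(group_by_date_only(SAMPLE))
print(group_by_date_and_name(SAMPLE))
```

With this sample, the date-only version merges TOM into PETER's row, while the `(date, name)` version reports them separately.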
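
# Option 1: itertools.groupby. It only groups adjacent items, so this
# relies on lines with the same date and name being next to each other.
import itertools
with open('output.txt', 'r') as f:
    # Skip blank lines and split each remaining line into its fields.
    splitlines = (line.split() for line in f if line.strip())
    for (date, name), bits in itertools.groupby(splitlines, key=lambda bits: bits[:2]):
        total = 0
        count = 0
        for _, _, val in bits:
            total += int(val)
            count += 1
        print('%s %s|%d|%d' % (date, name, count, total))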
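`itertools.groupby` starts a new group each time the key changes, so it only merges lines that are adjacent. If the file is not already ordered by date and name, one option is to sort the rows first. This is a minimal sketch under that assumption; `summarize` and the `UNSORTED` sample are hypothetical names and data, not part of the original answer.

```python
import itertools


def summarize(lines):
    """Sort the split rows, then group adjacent rows by (date, name)."""
    rows = sorted(line.split() for line in lines if line.strip())
    out = []
    for (date, name), group in itertools.groupby(rows, key=lambda r: (r[0], r[1])):
        vals = [int(v) for _, _, v in group]
        out.append("%s %s|%d|%d" % (date, name, len(vals), sum(vals)))
    return out


# Hypothetical sample in which the two PETER lines are not adjacent.
UNSORTED = [
    "2012-10-06 PETER 60",
    "2012-10-05 TOM 5",
    "2012-10-06 PETER 15",
]

print(summarize(UNSORTED))
```

Sorting reads the whole input into memory, which is the trade-off against the streaming version above.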

# Option 2: a plain dict keyed on (date, name); input order does not matter.
d = {}
with open('output.txt', 'r') as f:
    for line in f:
        date, name, val = line.split()
        key = (date, name)
        if key not in d:
            d[key] = [0, 0]  # [total, count]
        d[key][0] += int(val)
        d[key][1] += 1

for key in sorted(d):
    date, name = key
    total, count = d[key]
    print('%s %s|%d|%d' % (date, name, count, total))

# Option 3: collect the values per (date, name) in a defaultdict of lists.
import collections

d = collections.defaultdict(list)
with open('output.txt', 'r') as f:
    for line in f:
        date, name, val = line.split()
        d[date, name].append(int(val))

for (date, name), vals in sorted(d.items()):
    print('%s %s|%d|%d' % (date, name, len(vals), sum(vals)))