Python: grouping similar lines in a file onto one line


I currently have a very disordered file:

file.txt

vfc1 3435 4556
vfc1 2334 2123
vfc1 5556 1234
vfc2 8997 5969
vfc2 4543 3343
vfc2 1232 2123

What I want to do is sort this file so that all the values belonging to the same key appear on a single line, like this:

file_output.txt
vfc1 1234 2123 2334 3435 4556 5556 
vfc2 1232 2123 3343 4543 5969 8997


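This is not specific to Python. It is closer to pseudocode, but the idea is as follows:

Read all the lines into an array
Set up a target array
Set up a "last entry" array
Set up a global variable that holds the current index
Loop over the array:
- Split the string into an array `parts`, using `' '` (a space) as the separator
- Is `parts[0] == currentIndex`? If so, add `parts[1], parts[2]` to `lastEntry`
- If not, add `lastEntry` to `targetArray`. Set `currentIndex = parts[0]`. Clear `lastEntry`. Add `parts[1], parts[2]` to `lastEntry`

That's it! :-)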
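The outline above can be sketched in Python as follows. This is one possible reading, not the answerer's code: the function name `group_consecutive` is made up for illustration, the final flush after the loop and the numeric sort of each group are additions the outline leaves out, and it assumes lines with the same key are adjacent, as in the sample file.

```python
def group_consecutive(lines):
    """Merge consecutive lines that share the same first field."""
    target = []          # finished output lines
    current_key = None   # key of the group currently being built
    last_entry = []      # values collected so far for current_key
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if parts[0] != current_key:
            # Key changed: store the finished group (if any), start a new one.
            if current_key is not None:
                target.append(" ".join([current_key] + sorted(last_entry, key=int)))
            current_key = parts[0]
            last_entry = []
        last_entry.extend(parts[1:])
    # Store the final group, a step the outline leaves implicit.
    if current_key is not None:
        target.append(" ".join([current_key] + sorted(last_entry, key=int)))
    return target


sample = [
    "vfc1 3435 4556", "vfc1 2334 2123", "vfc1 5556 1234",
    "vfc2 8997 5969", "vfc2 4543 3343", "vfc2 1232 2123",
]
for out in group_consecutive(sample):
    print(out)
```

If the same key can reappear later in the file, this approach produces one output line per run of that key, so a dictionary-based solution is the safer choice in that case.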

Something like this, perhaps:


d = {}

# Collect every value after the key, grouped by the first column.
with open('file.txt') as f:
    for line in f:
        sl = line.split()
        if sl:  # skip blank lines
            d.setdefault(sl[0], []).extend(sl[1:])

# Write one line per key, with its values sorted numerically.
with open('file_output.txt', 'w') as fd:
    for key in sorted(d):
        fd.write('%s %s\n' % (key, ' '.join(sorted(d[key], key=int))))

You can also use itertools.groupby to group the lines by their first column:

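from collections import defaultdict
from itertools import groupby

with open('file.txt') as f:
    # Split each non-blank line into its fields.
    data = (x.split() for x in f if x.strip())
    grouped = defaultdict(list)
    # Group adjacent rows by their first field and accumulate per key.
    for key, group in groupby(data, key=lambda x: x[0]):
        for line in group:
            grouped[key] += line[1:]

for k, v in grouped.items():
    # Sort each key's values numerically before printing.
    print(k, ' '.join(sorted(v, key=int)))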
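A note on the approach above: `itertools.groupby` only merges adjacent rows with equal keys, which is why that answer still accumulates into a `defaultdict`. A minimal sketch of the alternative, assuming the whole file fits in memory, sorts the rows by key first so that `groupby` alone is enough. The function name `group_sorted` and the in-memory sample lines are illustrative and not part of the original answer.

```python
from itertools import groupby
from operator import itemgetter


def group_sorted(lines):
    """Group rows by first field after sorting, then sort each group's values."""
    rows = [line.split() for line in lines if line.strip()]
    rows.sort(key=itemgetter(0))  # groupby needs equal keys to be adjacent
    result = []
    for key, group in groupby(rows, key=itemgetter(0)):
        values = [v for row in group for v in row[1:]]
        result.append(" ".join([key] + sorted(values, key=int)))
    return result


# Keys are deliberately interleaved to show that the pre-sort matters.
sample_rows = ["vfc2 8997 5969", "vfc1 3435 4556", "vfc2 4543 3343", "vfc1 2334 2123"]
for out in group_sorted(sample_rows):
    print(out)
```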

How about this:

from collections import defaultdict

d = defaultdict(list)
with open('input.txt') as f:
    for line in f:
        data = line.split()
        if data:  # skip blank lines
            d[data[0]].extend(data[1:])

with open('output.txt', 'w') as f:
    for key, value in d.items():
        # Sort numerically so values of different widths order correctly.
        f.write('%s %s\n' % (key, ' '.join(sorted(value, key=int))))

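Give it a try. Use a dictionary.

In your example, all the `vfc1`s come before all the `vfc2`s, so it is already in order. Is that intentional?

@Blender I did try, using a dictionary, and the results were terrible. And I've never played with dictionaries before.

@Blender OK, but this isn't the best: `d = {}` `with open("temp.txt") as file:` `for line in file:` `value = line.split()` `key = value[0]` `val2 = value[1]` `val3 = value[2]` `d[str(key)] = val2 + ' ' + val3`

I get an error with this code. It says: TypeError: values() takes no arguments (2 given)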
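The dictionary attempt quoted in the comments assigns to `d[key]` on every line, so each new line replaces whatever was stored for that key and only the last line per key survives. A minimal sketch of the difference, using in-memory sample lines rather than `temp.txt` (the variable names are illustrative):

```python
lines = ["vfc1 3435 4556", "vfc1 2334 2123", "vfc2 8997 5969"]

# Assignment: each line overwrites the previous value for its key.
overwritten = {}
for line in lines:
    key, *values = line.split()
    overwritten[key] = " ".join(values)

# Accumulation: extend a per-key list instead of replacing it.
accumulated = {}
for line in lines:
    key, *values = line.split()
    accumulated.setdefault(key, []).extend(values)

print(overwritten)   # vfc1 keeps only its last line
print(accumulated)   # vfc1 keeps all four of its values
```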