Python: managing redundant data by incrementing the numbers in a text file

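I am new to Python. I have a text file in which I need to avoid redundancy, not by deleting lines but by incrementing the number in a line whenever an identical line has already appeared.

Please help! Answers would be appreciated! An example of the text file: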

hello ram1
hello ram1
hello gate1
hello gate1
Expected output:

hello ram1
hello ram2
hello gate1
hello gate2

Using a regular expression, slice notation, and a dictionary update:

import re

numbers = {}
with open('1.txt') as f:
    for line in f:
        # Split into alternating text and number parts, keeping the numbers
        row = re.split(r'(\d+)', line.strip())
        words = tuple(row[::2])  # Extract non-number parts to use as the key
        if words not in numbers:
            # First occurrence: remember the numbers exactly as they appear
            numbers[words] = [int(n) for n in row[1::2]]
        else:
            # Repeated line: increase every number by one
            numbers[words] = [n + 1 for n in numbers[words]]
        row[1::2] = map(str, numbers[words])  # Assign the numbers back
        print(''.join(row))
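
To see why the two slices line up, it may help to look at what re.split returns when the pattern contains a capturing group. The snippet below is only an illustration with made-up sample strings; it does not read 1.txt.

```python
import re

# With a capturing group, re.split keeps the matched digits in the result,
# so text and numbers alternate: even indexes are text, odd indexes are numbers.
row = re.split(r'(\d+)', 'he2llo wo5ld')
print(row)        # ['he', '2', 'llo wo', '5', 'ld']
print(row[::2])   # ['he', 'llo wo', 'ld']  -> used as the dictionary key
print(row[1::2])  # ['2', '5']              -> the numbers to increment

# A number at the very end leaves an empty string as the last text part.
tail = re.split(r'(\d+)', 'hello ram1')
print(tail)       # ['hello ram', '1', '']
```

Because the text parts form the key, two lines count as "the same" when they differ only in their numbers.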

I was going to suggest using defaultdict(lambda: count(1)) and then .format(line, next(numbers[line])). Also, a regex may not be needed; line.rstrip('0123456789\n') may be sufficient.

@falsetru This works well, but when I put a number in the middle, for example heo.1o heo.1o, it prints heo.1o1, heo.1o1 instead of heo.1o, heo.2o.

@JonClements, number[line] += 1, because the default factory function is called only for missing keys. And lambda: count(1) should be count(1).next or count(1).__next__.

Can a line contain more than one number? (For example, he2llo wo5ld.)

@falsetru Yes, it will contain multiple numbers! Could you explain the code?

@aseyraamsram, added comments.
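
The defaultdict and rstrip idea from the first comment could look roughly like the sketch below. It rests on assumptions that go beyond the comment: the helper name renumber is hypothetical, only a number at the end of the line is handled, and numbering always restarts at 1 regardless of the number stored in the file.

```python
from collections import defaultdict
from itertools import count


def renumber(lines):
    """Yield each line with its trailing number replaced by a running count.

    `renumber` is a hypothetical helper name, not part of the original answers.
    """
    # Each new prefix gets its own counter that starts at 1.
    counters = defaultdict(lambda: count(1))
    for line in lines:
        # Drop the trailing digits and newline; what remains identifies the line.
        prefix = line.rstrip('0123456789\n')
        yield '{}{}'.format(prefix, next(counters[prefix]))


sample = ['hello ram1\n', 'hello ram1\n', 'hello gate1\n', 'hello gate1\n']
result = list(renumber(sample))
print('\n'.join(result))
```

For lines with numbers in the middle, such as he2llo wo5ld, this shortcut is not enough and the re.split approach is the better fit.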
import re

# Alternative: keep one counter per word that precedes the trailing number
seen = {}
# Open the file; the with block closes it when the loop ends
with open('1.txt') as f:
    # Read through the file line by line
    for line in f:
        line = line.rstrip('\n')
        # Regex matching, for example, "(hello [space])(ram or gate)(number)"
        matched = re.match(r'(.*\s)(.*?)(\d+)$', line)
        if not matched:
            # Blank line or a line without a trailing number: print it unchanged
            print(line)
            continue
        words = matched.group(1)     # everything up to the last whitespace, e.g. "hello "
        key = matched.group(2)       # the text before the number, e.g. "ram"
        num = int(matched.group(3))  # the trailing number, one or more digits

        if key in seen:
            # { ram or gate } has been seen before: add 1
            seen[key] += 1
        else:
            # First time: start from the number found in the file
            seen[key] = num
        print('{}{}{}'.format(words, key, seen[key]))