Python: managing redundant data by incrementing numbers in a text file
I'm new to Python. I have a text file and I need to avoid redundancy, not by deleting lines but by incrementing the number in a line whenever an identical line has already appeared. Please help! Any answer would be much appreciated.
Example of a text file:
hello ram1
hello ram1
hello gate1
hello gate1
Expected output:
hello ram1
hello ram2
hello gate1
hello gate2
Using a regular expression, slice notation, and a dictionary update:
import re

numbers = {}
with open('1.txt') as f:
    for line in f:
        row = re.split(r'(\d+)', line.strip())
        words = tuple(row[::2])  # Extract the non-number parts to use as the key
        if words not in numbers:
            # First occurrence: remember the number parts as they are
            numbers[words] = [int(n) for n in row[1::2]]
        else:
            # Repeated line: increase every number by one
            numbers[words] = [n + 1 for n in numbers[words]]
        row[1::2] = map(str, numbers[words])  # Assign the numbers back
        print(''.join(row))
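To make the splitting and slicing steps easier to follow, here is a small sketch of what re.split with a capturing group returns for a line with several numbers. The helper name split_numbers is hypothetical and only used for illustration.

```python
import re

def split_numbers(line):
    # A capturing group makes re.split keep the digit runs in the result,
    # so text and numbers alternate: text, number, text, number, ...
    row = re.split(r'(\d+)', line.strip())
    words = tuple(row[::2])             # even positions: non-number parts
    nums = [int(n) for n in row[1::2]]  # odd positions: number parts
    return row, words, nums

row, words, nums = split_numbers('he2llo wo5ld\n')
print(row)    # ['he', '2', 'llo wo', '5', 'ld']
print(words)  # ('he', 'llo wo', 'ld')
print(nums)   # [2, 5]
```

Because the key is built only from the non-number parts, two lines that differ only in their numbers share the same key.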
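I was going to suggest using defaultdict(lambda: count(1)) and then .format(line, next(numbers[line])). Also, a regular expression may not be needed; line.rstrip('0123456789\n') may be sufficient.

@falsetru Nice, but when I put a number in the middle, as in heo.1o heo.1o, it shows heo.1o1, heo.1o1 instead of heo.1o, heo.2o.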
@JonClements, number[line] += 1, because the default factory function is only called for missing keys. And lambda: count(1) should be count(1).next or count(1).__next__.

Can a line contain multiple numbers (for example, he2llo wo5ld)?

@falsetru Yes, it will contain multiple numbers! Can you explain the code?

@aseyraamsram, added comments.
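The defaultdict and count idea from the comments could look roughly like the sketch below. It assumes the number is always at the end of the line and that numbering restarts at 1 for each distinct line, regardless of the number in the file. The function name renumber and the '{}{}' format string are assumptions, since the comment does not show them.

```python
from collections import defaultdict
from itertools import count

def renumber(lines):
    # Each distinct stem gets its own counter starting at 1;
    # the factory runs only the first time a stem is looked up.
    counters = defaultdict(lambda: count(1))
    result = []
    for line in lines:
        # Strip trailing digits and the newline without a regular expression
        stem = line.rstrip('0123456789\n')
        result.append('{}{}'.format(stem, next(counters[stem])))
    return result

sample = ['hello ram1\n', 'hello ram1\n', 'hello gate1\n', 'hello gate1\n']
print(renumber(sample))
# ['hello ram1', 'hello ram2', 'hello gate1', 'hello gate2']
```

This version does not handle numbers in the middle of a line, which is the case the regular expression answer covers.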
import re

seen = {}
# Open the file; the with block closes it afterwards
with open('1.txt') as f:
    # Read through the file line by line
    for line in f:
        line = line.strip()
        # Match, for example, "(hello[space])(ram or gate)(number)"
        matched = re.match(r'(.*\s)(.*?)(\d+)$', line)
        if not matched:
            # Blank line or no trailing number: print it unchanged
            print(line)
            continue
        words = matched.group(1)     # "hello" plus the space
        key = matched.group(2)       # the text before the number
        num = int(matched.group(3))  # the number itself
        if key in seen:
            # { ram or gate } has been seen before: add 1
            seen[key] += 1
        else:
            # { ram or gate } is new: store its initial number
            seen[key] = num
        print('{}{}{}'.format(words, key, seen[key]))
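A quick way to check what a pattern of this shape captures is to run it on a few sample lines. The sketch below compares a single-digit group (\d) with a multi-digit group (\d+) anchored at the end of the line. The names SINGLE_DIGIT, MULTI_DIGIT, and groups are hypothetical.

```python
import re

SINGLE_DIGIT = re.compile(r'(.*\s)(.*)(\d)')
MULTI_DIGIT = re.compile(r'(.*\s)(.*?)(\d+)$')

def groups(pattern, line):
    # Return the captured groups, or None when the line does not match
    m = pattern.match(line.strip())
    return m.groups() if m else None

print(groups(SINGLE_DIGIT, 'hello ram1'))     # ('hello ', 'ram', '1')
print(groups(SINGLE_DIGIT, 'hello ram10'))    # ('hello ', 'ram1', '0')
print(groups(MULTI_DIGIT, 'hello ram10'))     # ('hello ', 'ram', '10')
print(groups(MULTI_DIGIT, 'no number here'))  # None
```

With the single-digit group, a number such as 10 is split so that the key becomes ram1, which matters once a counter goes past 9.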