Python 3.x: extract n characters after the first match of a word in a file
I am a Python beginner. I have a file that contains a single line of data. I need to extract "n" characters after certain words, for the first occurrence only. The words are not consecutive. Data file:
{“id”:“1234566jnejnwfw”,“displayId”:“1234566jne”,“author”:{“name”:”abcd@xyz.com,“datetime”:15636378484,“displayId:“2342346JNE”,“datetime”:4353453}
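I want to get the value that comes after the first match of "displayId" and before "author", that is: 1234566jne. The same goes for "datetime".
I tried splitting the line at the index of the word and writing the rest to another file for further cleanup, in order to get the exact value: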
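tmpFile = "tmpFile.txt"
displayId = "displayId"  # the word to search for (not defined in the original snippet)
tmpFileOpen = open(tmpFile, "w+")
with open("data file") as openfile:
    for line in openfile:
        # Write everything that follows the first occurrence of the word
        tmpFileOpen.write(line[line.index(displayId) + len(displayId):])
tmpFileOpen.close()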
However, I don't think this is a good way to proceed.
Can anyone help me?

If I understand your question correctly, you can achieve this as follows:
import json

tmpFile = "tmpFile.txt"
with open("data.txt") as openfile, open(tmpFile, "w+") as tmpFileOpen:
    for line in openfile:
        # Load the JSON into a dict so it is easy to manipulate
        data = json.loads(line)
        # Write only the first 3 characters of the `displayId` field
        # to the temp file
        tmpFileOpen.write(data['displayId'][:3])
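This works because the data in your file is JSON, but it will stop working if the format changes.

This answer should work for any displayId in a format similar to the one in your question. I decided not to load the JSON file for this answer, because it isn't needed to accomplish the task.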
import re

tmpFile = "tmpFile.txt"

with open('data_file.txt', 'r') as infile:
    lines = infile.read()

# Use a regex to find the displayId element
# example: "displayId":"1234566jne
# \W matches a non-word character, such as " and :
# \d matches a digit
# {6,8} matches between 6 and 8 digits
# [a-z] matches a lowercase ASCII letter
# {3} matches exactly 3 of them
# The parentheses capture only the ID, so no extra cleanup step is needed
id_pattern = re.compile(r'\WdisplayId\W{3}(\d{6,8}[a-z]{3})')
clean_results = id_pattern.findall(lines)

with open(tmpFile, "w+") as tmpFileOpen:
    # Loop through the clean_results list
    for display_id in clean_results:
        # Write each ID to the temp file on its own line
        tmpFileOpen.write('{} \n'.format(display_id))

# Output in tmpFile.txt, as reported in the original answer:
# 1234566jne
# 23423426jne
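Since the question asks for the first occurrence only, and for "datetime" as well as "displayId", a variation on the regex approach could use re.search, which stops at the first match. The following is only a sketch: the helper name first_value_after, its n parameter, and the sample line (a well-formed version of the data in the question) are assumptions made for illustration.

```python
import re


def first_value_after(text, key, n=None):
    """Return the value that follows the first occurrence of `key` in `text`.

    If `n` is given, only the first `n` characters of the value are returned.
    Returns None when the key does not occur.
    """
    # Skip quotes, colons, or spaces after the key, then capture word characters
    pattern = re.escape(key) + r'\W*(\w+)'
    match = re.search(pattern, text)  # re.search returns the first match only
    if match is None:
        return None
    value = match.group(1)
    return value if n is None else value[:n]


# Hypothetical sample line, modeled on the data in the question
line = ('{"id":"1234566jnejnwfw","displayId":"1234566jne",'
        '"author":{"name":"abcd@xyz.com","datetime":15636378484,'
        '"displayId":"2342346JNE","datetime":4353453}}')

print(first_value_after(line, "displayId"))
print(first_value_after(line, "datetime"))
print(first_value_after(line, "displayId", n=3))
```

Because the pattern only skips non-word characters after the key, it does not depend on the exact kind of quotes or on the line being valid JSON.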

This answer does load the JSON file, but it will fail if the format of the JSON file changes.
import json

tmpFile = 'tmpFile.txt'

# Load the JSON file
with open('data_file.txt') as infile:
    jdata = json.load(infile)

with open(tmpFile, "w+") as tmpFileOpen:
    # Find the first ID and write it to the temp file
    first_id = jdata['displayId']
    tmpFileOpen.write('{} \n'.format(first_id))
    # Find the second ID and write it to the temp file
    second_id = jdata['author']['displayId']
    tmpFileOpen.write('{} \n'.format(second_id))

# Output in tmpFile.txt, as reported in the original answer:
# 1234566jne
# 23423426jne
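If the nesting of the JSON may change, one possible way to avoid hard-coding paths such as jdata['author']['displayId'] is to search the parsed structure for the first occurrence of a key. This is a minimal sketch, not part of the original answers: the function name find_first and the sample string (a well-formed, shortened version of the data in the question) are assumptions.

```python
import json


def find_first(obj, key):
    """Depth-first search for the first value stored under `key`.

    Returns None when the key is absent. In this sketch a stored value of
    None is indistinguishable from a missing key.
    """
    if isinstance(obj, dict):
        if key in obj:
            return obj[key]
        children = obj.values()
    elif isinstance(obj, list):
        children = obj
    else:
        return None
    for child in children:
        found = find_first(child, key)
        if found is not None:
            return found
    return None


# Hypothetical, well-formed sample based on the data in the question
sample = ('{"id": "1234566jnejnwfw", "displayId": "1234566jne", '
          '"author": {"name": "abcd@xyz.com", "datetime": 15636378484}}')
data = json.loads(sample)

print(find_first(data, "displayId"))
print(find_first(data, "datetime"))
```

A key at the current level is preferred over the same key deeper in the structure, so the top-level displayId is found before any nested one.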
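
Thank you very much. Your answer helped me, and I was able to work out a solution, since the JSON structure had changed.
Thank you, Milox. Your answer helped me too.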