Python: JSON -> text, how to write only unique values?
I have a JSON file from which I'm extracting quotes. It is the source file (formatted exactly the same). My goal is to extract all the quotes (just the quotes, not the authors or other metadata) into a simple text document. The first 5 lines would be:
# Don't cry because it's over, smile because it happened.
# I'm selfish, impatient and a little insecure. I make mistakes, I am out of control and at times hard to handle. But if you can't handle me at my worst, then you sure as hell don't deserve me at my best.
# Be yourself; everyone else is already taken.
# Two things are infinite: the universe and human stupidity; and I'm not sure about the universe.
# Be who you are and say what you feel, because those who mind don't matter, and those who matter don't mind.
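The challenge is that some quotes are repeated, and I want to write each quote only once. What's a good way to write only unique values to a text document?
The best I've come up with is: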
import json

with open('quotes.json', 'r') as json_f:
    data = json.load(json_f)
    quote_list = []
    with open('quotes.txt', 'w') as text_f:
        for quote_object in data:
            quote = quote_object['Quote']
            if quote not in quote_list:
                text_f.write(f'{quote}\n')
                quote_list.append(quote)
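But having to create and maintain a separate list of 40,000 values feels very inefficient.
I tried reading the file on every iteration of the write loop, but somehow the read always comes back empty: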
with open('quotes.json', 'r') as json_f:
    data = json.load(json_f)
    with open('quotes.txt', 'w+') as text_f:
        for quote_object in data:
            quote = quote_object['Quote']
            print(text_f.read())  # prints nothing?
            # if it can't read the doc, I can't check if quote already there
            text_f.write(f'{quote}\n')
I'm curious why text_f.read() comes back empty, and what more elegant solutions there are.

You can use a set:
import json

with open('quotes.json', 'r') as json_f:
    data = json.load(json_f)
    quotes = set()
    with open('quotes.txt', 'w') as text_f:
        for quote_object in data:
            quote = quote_object['Quote']
            quotes.add(quote)
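The snippet above collects the quotes in a set but never writes anything to quotes.txt. A minimal sketch of one way to complete it follows, assuming the JSON file is a list of objects that each have a 'Quote' key, as in the question. The function name `write_unique_quotes` and the explicit UTF-8 encoding are assumptions added for illustration, not part of the original answer. The set is used only for the membership check, so quotes are written in the order they first appear.

```python
import json


def write_unique_quotes(json_path, text_path):
    """Write each distinct 'Quote' value once, in first-seen order."""
    with open(json_path, 'r', encoding='utf-8') as json_f:
        data = json.load(json_f)

    seen = set()  # quotes that have already been written
    with open(text_path, 'w', encoding='utf-8') as text_f:
        for quote_object in data:
            quote = quote_object['Quote']
            if quote not in seen:  # hash lookup, O(1) on average
                seen.add(quote)
                text_f.write(f'{quote}\n')
    return len(seen)  # number of unique quotes written
```

It could be called as write_unique_quotes('quotes.json', 'quotes.txt'). Writing inside the loop rather than dumping the set afterwards matters only if the original order should be kept, since a set does not preserve insertion order.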
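Adding the same quote to a set more than once has no effect: only a single object is kept.

It may be better than what I have, but I'm not sure it solves the problem. You would still need to loop through the set on every quote iteration (your code is missing the write part, and therefore the loop part), which still feels inefficient. I wonder whether there is a way to check directly against the file being written, and whether that would be a better approach? Maybe I'm wrong.

Wait, I'm an idiot. I just realized that a membership check on a set is O(1) on average, whereas on a list it is O(n). You're right. This is super efficient. Thanks so much.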
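On the other half of the question, why text_f.read() came back empty: opening with 'w+' truncates the file, and after each write the file position sits at the end, so read() has nothing left to return until the position is moved back with seek(0). The sketch below illustrates this on a scratch temporary file; the helper name `read_back_after_write` is hypothetical and exists only for the demonstration.

```python
import os
import tempfile


def read_back_after_write(lines):
    """Write lines to a scratch file opened with 'w+', then read them back."""
    fd, path = tempfile.mkstemp(suffix='.txt')  # scratch file, illustration only
    os.close(fd)
    try:
        with open(path, 'w+', encoding='utf-8') as text_f:
            for line in lines:
                text_f.write(f'{line}\n')
            at_end = text_f.read()      # position is at end of file: ''
            text_f.seek(0)              # rewind to the start of the file
            from_start = text_f.read()  # now the full contents come back
        return at_end, from_start
    finally:
        os.remove(path)
```

Even with seek(0), re-reading the whole file before every write would cost time proportional to the file size for each quote, so the in-memory set remains the cheaper check.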