Python: JSON -> text, how do I write only unique values?

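I have a JSON file that I am extracting quotes from. It is the source file (exactly the same format).

My goal is to extract all the quotes (just the quotes, not the author or other metadata) into a simple text document. The first 5 lines would be: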

# Don't cry because it's over, smile because it happened.
# I'm selfish, impatient and a little insecure. I make mistakes, I am out of control and at times hard to handle. But if you can't handle me at my worst, then you sure as hell don't deserve me at my best.
# Be yourself; everyone else is already taken.
# Two things are infinite: the universe and human stupidity; and I'm not sure about the universe.
# Be who you are and say what you feel, because those who mind don't matter, and those who matter don't mind.
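The challenge is that some quotes are repeated, and I only want to write each quote once. What is a good way to write only unique values to a text document?

The best I have come up with is: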

import json

with open('quotes.json', 'r') as json_f:
    data = json.load(json_f)

    quote_list = []

    with open('quotes.txt', 'w') as text_f:
        for quote_object in data:
            quote = quote_object['Quote']
            if quote not in quote_list:
                text_f.write(f'{quote}\n')
                quote_list.append(quote)

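However, having to create and maintain a separate list of 40,000 values feels very inefficient.

I tried reading the file on each iteration of the write loop, but for some reason the read always comes back empty: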

with open('quotes.json', 'r') as json_f:
    data = json.load(json_f)

    with open('quotes.txt', 'w+') as text_f:
        for quote_object in data:
            quote = quote_object['Quote']

            print(text_f.read()) # prints nothing?
            # if it can't read the doc, I can't check if quote already there

            text_f.write(f'{quote}\n')
I would like to know why text_f.read() comes back empty, and what a more elegant solution would be.
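On the empty read: a file opened with 'w+' is truncated on open, and after each write the file position sits at the end of the file, so read() has nothing left to return. The sketch below illustrates this with a hypothetical scratch file (the path and contents are made up for the demonstration) and shows that seeking back to the start makes the data readable again.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "w+") as text_f:
    text_f.write("first line\n")

    # The file position is now at the end, so there is nothing left to read.
    at_end = text_f.read()

    # Rewind to the start of the file before reading.
    text_f.seek(0)
    after_seek = text_f.read()

    # Move back to the end so the next write appends instead of overwriting.
    text_f.seek(0, os.SEEK_END)
    text_f.write("second line\n")

print(repr(at_end))
print(repr(after_seek))
```

Even with the seek in place, re-reading the whole output file for every quote would do far more work than keeping the quotes already seen in memory.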

You can use a set:

import json

with open('quotes.json', 'r') as json_f:
    data = json.load(json_f)

    quotes = set()

    with open('quotes.txt', 'w') as text_f:
        for quote_object in data:
            quote = quote_object['Quote']
            quotes.add(quote)

Adding the same quote to a set more than once has no effect: only one copy is kept.
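The snippet above collects the quotes but never writes them out. One way to combine the set with the write step is sketched here; the function name `write_unique_quotes`, its `key` parameter, and the UTF-8 encoding are assumptions for illustration, not part of the original answer.

```python
import json


def write_unique_quotes(json_path, text_path, key="Quote"):
    """Write each distinct quote once, in first-seen order.

    Returns the number of quotes written. The name, the `key`
    parameter and the UTF-8 encoding are illustrative assumptions.
    """
    with open(json_path, "r", encoding="utf-8") as json_f:
        data = json.load(json_f)

    seen = set()  # quotes already written
    with open(text_path, "w", encoding="utf-8") as text_f:
        for quote_object in data:
            quote = quote_object[key]
            if quote not in seen:  # average O(1) membership check
                seen.add(quote)
                text_f.write(f"{quote}\n")
    return len(seen)
```

Calling `write_unique_quotes('quotes.json', 'quotes.txt')` would mirror the file names used in the question. Checking `seen` before writing keeps the quotes in the order they first appear in the JSON file, which writing the finished set out at the end would not guarantee.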

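You can use a set! It is probably better than what I have, but I am not sure it solves the problem. You still have to loop over the set on every quote iteration (your code is missing the write part, and so the looping part), which still feels inefficient. I wonder whether there is a way to check directly against the file you are writing to, and whether that would be a better approach? Maybe I am wrong.

Wait, I'm an idiot. I realized that a membership check on a set is O(1) on average, while on a list it is O(n). You are right. This is super efficient. Thank you very much.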
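For a rough feel of that difference, the following sketch times a membership check against a list and against a set of 40,000 made-up strings. The data and the `missing` value are invented for the demonstration, and the absolute timings will vary by machine.

```python
import timeit

# 40,000 made-up values standing in for the quotes.
values = [f"quote {i}" for i in range(40_000)]
as_list = values
as_set = set(values)

# A value that is not present: the list has to compare against every
# element, while the set only needs a hash lookup.
missing = "not a quote"

list_time = timeit.timeit(lambda: missing in as_list, number=100)
set_time = timeit.timeit(lambda: missing in as_set, number=100)

print(f"list: {list_time:.6f} s for 100 lookups")
print(f"set:  {set_time:.6f} s for 100 lookups")
```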