Filtering unique lines from a text file in Python

python python-2.7

I want to print the lines that occur exactly once in a text file.

For example, if my text file contains:

12345
12345
12474
54675
35949
35949
74564

I want my Python program to print:

12474
54675
74564

I am using Python 2.7.

Try this:

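from collections import OrderedDict

# Count occurrences while remembering the order in which lines were first seen
seen = OrderedDict()
with open('file.txt') as f:  # the with block closes the file when done
    for line in f:
        line = line.strip()
        seen[line] = seen.get(line, 0) + 1

# Keep only the lines that occurred exactly once
print("\n".join([k for k, v in seen.items() if v == 1]))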

This prints:

12474
54675
74564

Update: thanks to the comments below, this version is better:

from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    pass

with open('file.txt') as f:
    seen = OrderedCounter([line.strip() for line in f])
    print("\n".join([k for k,v in seen.items() if v == 1]))
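To try the counting idea without a file on disk, here is a minimal sketch that wraps it in a function accepting any iterable of lines. The function name `lines_occurring_once` and the sample list are illustrative assumptions, not part of the answer above. On Python 3.7 and later a plain `Counter` already remembers insertion order, so the `OrderedCounter` subclass mainly matters on older versions such as 2.7.

```python
from collections import Counter, OrderedDict


class OrderedCounter(Counter, OrderedDict):
    pass


def lines_occurring_once(lines):
    """Return the stripped lines that appear exactly once, in first-seen order."""
    counts = OrderedCounter(line.strip() for line in lines)
    return [line for line, n in counts.items() if n == 1]


# Hypothetical in-memory stand-in for the contents of file.txt
sample = ["12345\n", "12345\n", "12474\n", "54675\n",
          "35949\n", "35949\n", "74564\n"]
result = lines_occurring_once(sample)
print("\n".join(result))
```

Passing an open file object instead of `sample` should behave the same way, since iterating over a file yields its lines.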

Use count() to check how many times each element occurs in the list, and use index() inside a for loop to remove every occurrence of the duplicated ones:

with open("file.txt", "r") as f:
    data = f.readlines()

# Loop over a copy: removing items from the list being iterated skips elements
for x in list(data):
    if data.count(x) > 1:  # item is a duplicate
        for i in range(data.count(x)):
            data.pop(data.index(x))  # find each occurrence and remove it

with open("file.txt", "w") as f:
    f.write("".join(data))  # write the remaining lines back to the file

file.txt:

12474
54675
74564
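One caveat with removing items inside a loop: if the loop iterates over the very list it is shrinking, some elements can be skipped. The sketch below is only an illustration on a made-up list (the helper name `remove_duplicated` and its flag are assumptions); it compares looping over the list itself with looping over a copy.

```python
def remove_duplicated(items, iterate_over_copy):
    """Return a new list without any value that occurs more than once."""
    data = list(items)  # work on a private copy of the input
    loop_source = list(data) if iterate_over_copy else data
    for x in loop_source:
        if data.count(x) > 1:
            for _ in range(data.count(x)):
                data.remove(x)  # same effect as data.pop(data.index(x))
    return data


sample = ["a", "a", "b", "c", "c", "b"]
naive = remove_duplicated(sample, iterate_over_copy=False)
safe = remove_duplicated(sample, iterate_over_copy=True)
print(naive)  # ['b', 'b']  (both "b" entries were skipped by the iterator)
print(safe)   # []
```

The sample input from the question happens to give the same result either way, which is why the problem is easy to miss.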

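Not the most efficient, because it calls count for every line (each call scans the whole list, so the total cost is quadratic), but it is simple: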

with open("input.txt") as f:
    orig = list(f)
    filtered = [x for x in orig if orig.count(x)==1]

print("".join(filtered))

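Convert the file into a list of lines
Build a list comprehension that keeps only the lines occurring exactly once
Print the list (joined with an empty string, because the lines still contain their newline characters)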
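If the quadratic cost of calling count for every line becomes a problem, the same three steps can be sketched with a single counting pass. This is one possible variant, not the answer's own code; the function name `keep_single_occurrences` and the in-memory sample are assumptions made for illustration.

```python
from collections import Counter


def keep_single_occurrences(lines):
    """Keep the lines that occur exactly once; `lines` must be a list."""
    counts = Counter(lines)  # one pass to count every line
    # One more pass to filter; the original order is preserved
    return [line for line in lines if counts[line] == 1]


# Hypothetical stand-in for list(f) on the question's input file
orig = ["12345\n", "12345\n", "12474\n", "54675\n",
        "35949\n", "35949\n", "74564\n"]
filtered = keep_single_occurrences(orig)
print("".join(filtered))
```

Because the counts live in a dictionary, each lookup takes roughly constant time, so the work grows linearly with the number of lines.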

You can use OrderedDict and Counter to drop the duplicated lines while preserving order, like this:

from collections import OrderedDict, Counter

class OrderedCounter(Counter, OrderedDict):
    pass

with open('/tmp/hello.txt') as f:
    ordered_counter = OrderedCounter(f.readlines())

new_list = [k.strip() for k, v in ordered_counter.items() if v==1]
# ['12474', '54675', '74564']
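A detail worth checking when counting raw lines from readlines(): if the last line of the file has no trailing newline, it is counted as a different key from an otherwise identical earlier line. The sketch below uses a made-up list of lines (an assumption, not the file from the question) to show the difference between stripping after counting and stripping before counting.

```python
from collections import Counter

# Hypothetical file content whose last line lacks a trailing newline
raw_lines = ["12345\n", "12474\n", "12345"]

# Counting the raw lines treats "12345\n" and "12345" as different keys
raw_counts = Counter(raw_lines)
once_raw = [line.strip() for line in raw_lines if raw_counts[line] == 1]

# Stripping the newline first makes both spellings count as the same line
clean_lines = [line.rstrip("\n") for line in raw_lines]
clean_counts = Counter(clean_lines)
once_clean = [line for line in clean_lines if clean_counts[line] == 1]

print(once_raw)    # ['12345', '12474', '12345']
print(once_clean)  # ['12474']
```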

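And your own attempt? It looks like you want us to write some code for you. While many users are willing to write code for a programmer in distress, they usually only help when the poster has already tried to solve the problem on their own. A good way to demonstrate this effort is to include the code you have written so far, example input (if any), the expected output, and the output you actually get (output, tracebacks, etc.). The more detail you provide, the more answers you are likely to receive. Check and.

@Jean-Françoisfare it does deserve to be closed, but I don't think it is an exact dupe. This question requires removing entries with a count greater than 1 entirely.

OK, reopened. I can't close it again after that. Don't complain about duplicate answers :)

Well, I tried to hand over the fish, but explained the answer before the OP tried to eat it :)

True!

Didn't catch that. Wait, Eric is right. This code is not what I want.

@AltayKarakalpaklı I updated the code, so it now does what it should.

@AltayKarakalpaklı the comments above are of course right: did you actually try anything before posting the question?

@hansaplast: your code works now, but you shouldn't write any code if the OP doesn't bother. Some text explaining what your approach does would be better.

I beat you by 1 minute :-P

@hansaplast by the way, I only wrote this to share with you how to do it. I think you have already figured it out.

Yes, thanks for the hint in the comments, it was a fun exercise.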

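What is the purpose of pass?

@EricDuminil since the body of this class is empty, you need pass to complete the class scope. Check:

You can use readlines() directly, right?

Yes, thanks for the suggestion.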
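To illustrate the point about pass: Python does not allow a class body to be empty, so some statement has to stand in even when every behaviour is inherited. The sketch below is illustrative only; the second class name is an assumption introduced to show that a docstring can serve as the body too.

```python
from collections import Counter, OrderedDict


class OrderedCounter(Counter, OrderedDict):
    pass  # placeholder: nothing is overridden, everything is inherited


class OrderedCounterWithDocstring(Counter, OrderedDict):
    """A docstring is also a valid class body, so no pass is needed here."""


words = ["b", "a", "b"]
first = list(OrderedCounter(words).items())
second = list(OrderedCounterWithDocstring(words).items())
print(first)   # [('b', 2), ('a', 1)]
print(second)  # [('b', 2), ('a', 1)]
```

Both classes count like a Counter and keep keys in first-seen order like an OrderedDict.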