Python: check whether any item in a list appears in a line of a file, and if not, write that line to a new file
I have a file with many lines, and I need to create a new file that excludes the lines containing certain words. I already have code that works, but there are a lot of words, so it would be better to store them in a list and check each line against the items in that list. Here is the code:
# open() replaces file(), which exists only in Python 2
infile = open('./infile_test.txt')
newopen = open('./newfile.txt', 'w')
for line in infile:
    if 'ssh' not in line and 'snmp' not in line and 'etc' not in line:
        newopen.write(line)
This is just an example, but suppose infile_test.txt contains the following lines. A new file should be created that excludes lines 2, 4, and 6:
line 1: this is a file test
line 2: ssh, snmp
line 3: the idea is to iterate in each line of this file
line 4: if the list of words (ssh,etc) does not appears in any of the line
line 5: then write the line in another file
line 6: etc
line 7: itens have been removed or not ?
I believe it would be better to create a list such as:
list = ['ssh', 'snmp', 'etc']
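and then iterate over it, comparing each list item against the line. I tried a new "for" loop and the "all" and "any" functions, but it did not work well.
Does anyone know a better way to achieve this?

Try this: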
infile = open('./infile_test.txt')
newopen = open('./newfile.txt', 'w')
words = ['ssh', 'snmp', 'etc']
for line in infile:
    found = False  # becomes True once any unwanted word is seen
    for word in words:
        if word in line:
            found = True
            break
    if not found:
        newopen.write(line)
infile.close()
newopen.close()
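The nested loop can also be condensed with the built-in any(), which may be what the question was reaching for. The following is only a minimal sketch: the helper name keep_line and the sample lines are assumptions made for illustration, not part of the original answer.

```python
def keep_line(line, words):
    """Return True when none of the words occurs as a substring of line."""
    return not any(word in line for word in words)


words = ['ssh', 'snmp', 'etc']

# A few lines taken from the sample input in the question.
sample = [
    "line 1: this is a file test\n",
    "line 2: ssh, snmp\n",
    "line 6: etc\n",
]

kept = [line for line in sample if keep_line(line, words)]
print(kept)
```

Note that "word in line" is a substring test, so a word such as "fetch" would also count as containing "etc".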
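Try this:
# infile and newopen are assumed to be opened as in the question
word_list = ['ssh', 'snmp', 'etc']
result_lines = []
for line in infile:
    # keep the line only if no word occurs in it (case-insensitive)
    if all(line.lower().find(word.lower()) < 0 for word in word_list):
        result_lines.append(line)
newopen.writelines(result_lines)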
Full script:
my_unwanted_words = set(['ssh', 'snmp', 'etc'])
with open("infile_test.txt", 'r') as infile, open("newfile.txt", 'w') as newopen:
    lines = infile.readlines()
    [newopen.write(line) for line in lines if not (set(line.split()) & my_unwanted_words)]
First line:
my_unwanted_words = set(['ssh', 'snmp', 'etc'])
Use a set to collect the unwanted words. A set only holds unique values, so if you read the words in from a file or otherwise gather a large number of them, there will be no duplicates. A set also lets you use the intersection operator & later in the script.
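To make the two set properties concrete, here is a small sketch. The variable names unwanted, line_words and common are assumptions chosen for this illustration, and the sample text is line 2 of the input file from the question.

```python
# The duplicate 'ssh' is collapsed because a set keeps only unique values.
unwanted = set(['ssh', 'snmp', 'etc', 'ssh'])
print(len(unwanted))

# split() cuts on whitespace only, so 'ssh,' keeps its trailing comma.
line_words = set("line 2: ssh, snmp".split())

# & returns the elements present in both sets.
common = line_words & unwanted
print(common)
```

Only 'snmp' ends up in the intersection here, because the token 'ssh,' still carries its comma. That is enough to exclude the line, but it already hints at the punctuation caveat the answer raises later.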
Second line:
with open("infile_test.txt", 'r') as infile, open("newfile.txt", 'w') as newopen:
Opening files with "with" is considered good practice because it handles extra housekeeping for you, such as automatically closing the files when you are done with them. Note that you can open both files in this single line.
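The automatic closing can be checked with a short experiment. This is only a sketch: the temporary directory and the file name demo.txt are assumptions for the demonstration, not files from the question.

```python
import os
import tempfile

# Work in a throwaway directory so nothing in the current folder is touched.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, 'w') as out:
    out.write("hello\n")

# Leaving the with block has already closed the file.
closed_after = out.closed
print(closed_after)

with open(path) as src:
    content = src.read()
print(repr(content))
```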
Third line:
    lines = infile.readlines()
lines is now a list of strings, each representing one line of the original file.
Fourth and final line:
    [newopen.write(line) for line in lines if not (set(line.split()) & my_unwanted_words)]
This is where the real work gets done. It is a list comprehension that calls newopen.write(line) only if there is no intersection (&) between the set of words in the current line, set(line.split()), and your set of unwanted words.
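The comprehension is used only for its side effect and builds a throwaway list of write() return values. A plain loop does the same filtering and reads the input line by line instead of loading it all with readlines(). This is a sketch under stated assumptions: the function name filter_lines is hypothetical, and io.StringIO objects stand in for the real files so the example runs on its own.

```python
import io

my_unwanted_words = set(['ssh', 'snmp', 'etc'])


def filter_lines(infile, outfile, unwanted):
    """Copy lines to outfile, skipping any line that shares a
    whitespace-separated word with the unwanted set."""
    for line in infile:
        if not (set(line.split()) & unwanted):
            outfile.write(line)


# In-memory stand-ins for infile_test.txt and newfile.txt.
source = io.StringIO(
    "line 1: this is a file test\n"
    "line 2: ssh, snmp\n"
    "line 6: etc\n"
)
target = io.StringIO()

filter_lines(source, target, my_unwanted_words)
print(target.getvalue())
```

With real files, the same function could be called inside the with block, passing infile and newopen.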
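I was a bit lazy with the script above and have left the final solution to you. Without further arguments, split() only separates a line into words on whitespace. So if an unwanted word is wrapped in parentheses or sits next to other punctuation, as in line 4 of your input file, split() returns a troublesome token:
(ssh,etc)
...which does not match anything in the unwanted set, so the line is passed through to newfile.txt. Play with the arguments to split() to fix this. You could also look at Python's re module and replace line.split() with a regular expression of some kind.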
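One possible way to follow the re suggestion is sketched below. It is not the answer author's solution: the helper name has_unwanted and the pattern \w+ are assumptions, and lowercasing the line is an extra choice that makes the match case-insensitive.

```python
import re

my_unwanted_words = set(['ssh', 'snmp', 'etc'])


def has_unwanted(line, unwanted):
    """Return True if any whole word of line is in the unwanted set.

    re.findall(r"\\w+", ...) extracts runs of letters, digits and
    underscores, so surrounding punctuation is ignored.
    """
    tokens = set(re.findall(r"\w+", line.lower()))
    return bool(tokens & unwanted)


line4 = "line 4: if the list of words (ssh,etc) does not appears in any of the line"

# split() leaves '(ssh,etc)' as one token, so nothing matches.
print(bool(set(line4.split()) & my_unwanted_words))

# The regular expression separates 'ssh' and 'etc' from the punctuation.
print(has_unwanted(line4, my_unwanted_words))
```

Because whole tokens are compared, a word such as "fetch" is not treated as containing "etc", unlike the substring tests in the other answers.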
Good luck!
How did you try it? I believe a nested for (as you said) is enough: for x in mylist: if x in line: continue
You are using 'list' as the variable name for a Python list, and you should not do that. Built-in names such as list and dict should not be used as variable names, even though Python allows it. I got burned by that myself.
@Darshan, thank you. I tested your suggestion, but it wrote the lines containing the words. I also tested with "if x not in line", but then it wrote the words many times. Anyway, thanks for your comment.
@don_q, thanks, I did not know that about the variable name. I will change it.
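The warning about naming a variable list can be reproduced in a few lines. This is a sketch only, and the function name demo_shadowing is an assumption for the illustration.

```python
def demo_shadowing():
    # Assigning to the name 'list' hides the built-in inside this function.
    list = ['ssh', 'snmp', 'etc']
    try:
        list("abc")  # now calls the list object, not the built-in type
    except TypeError as exc:
        return str(exc)


message = demo_shadowing()
print(message)

# Outside the function the built-in is untouched.
words = list("abc")
print(words)
```

Renaming the variable to something like words or word_list avoids the problem entirely.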