Python: check whether any item in a list appears in a line of a file, and if not, write that line to a new file

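I have a file with many lines, and I need to create a new file that excludes the lines containing certain words.

I wrote code that works, but there are many words, so it would be better to store them in a list and check each line against the items in that list. Here is the code: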

infile = open('./infile_test.txt')  # file() only exists in Python 2; open() works in both
newopen = open('./newfile.txt', 'w')

for line in infile:
    if 'ssh' not in line and 'snmp' not in line and 'etc' not in line:
        newopen.write(line)

This is just an example, but suppose infile_test.txt has the following lines. The new file would be created without lines 2, 4 and 6:

line 1: this is a file test
line 2: ssh, snmp
line 3: the idea is to iterate in each line of this file
line 4: if the list of words (ssh,etc) does not appears in any of the line
line 5: then write the line in another file
line 6: etc
line 7: itens have been removed or not ?

I believe it would be better to create a list, such as:

list = ['ssh', 'snmp', 'etc']

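and then iterate over it, comparing each list item against the line. I tried a new "for" loop, and I tried the "all" and "any" functions, but could not get it to work.

Does anyone know a better way to achieve this?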

Try this:

infile = open('./infile_test.txt')
newopen = open('./newfile.txt', 'w')
words = ['ssh', 'snmp', 'etc']
for line in infile:
    found = False
    for word in words:
        if word in line:
            found = True  # the line contains an unwanted word
            break
    if not found:
        newopen.write(line)
infile.close()
newopen.close()
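
Since the question mentions trying "any", here is a minimal sketch of how the same check might look with any(). The helper name filter_lines and the in-memory sample list are assumptions made for illustration. The sample lines are copied from the question.

```python
def filter_lines(lines, unwanted):
    """Return the lines that contain none of the unwanted substrings."""
    return [line for line in lines
            if not any(word in line for word in unwanted)]


# Sample data copied from the question's infile_test.txt
sample = [
    "line 1: this is a file test\n",
    "line 2: ssh, snmp\n",
    "line 3: the idea is to iterate in each line of this file\n",
    "line 4: if the list of words (ssh,etc) does not appears in any of the line\n",
    "line 5: then write the line in another file\n",
    "line 6: etc\n",
    "line 7: itens have been removed or not ?\n",
]

kept = filter_lines(sample, ["ssh", "snmp", "etc"])
for line in kept:
    print(line, end="")
```

This is substring matching, so a word such as "fetch" would also be treated as containing "etc".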

Try this:

word_list = ['ssh', 'snmp', 'etc']
result_lines = []
for line in infile:
    if all(line.lower().find(word.lower()) < 0 for word in word_list):
        result_lines.append(line)
newopen.writelines(result_lines)

Full script:

my_unwanted_words = set(['ssh', 'snmp', 'etc'])
with open("infile_test.txt", 'r') as infile, open("newfile.txt", 'w') as newopen:
    lines = infile.readlines()
    [newopen.write(line) for line in lines if not (set(line.split()) & my_unwanted_words)]

First line:

my_unwanted_words = set(['ssh', 'snmp', 'etc'])

Use a set to hold the unwanted words. A set only allows unique values, so if you read the words in from a file or otherwise collect a large number of them, there will be no duplicates. It also lets you use the intersection operator & later in the script.
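
A small sketch of the two set properties mentioned here, using a line from the question's sample file. The variable names unwanted, line_words and common are made up for this illustration.

```python
# A repeated 'ssh' is stored only once
unwanted = set(['ssh', 'snmp', 'etc', 'ssh'])
print(len(unwanted))

# split() breaks on whitespace only, so 'ssh,' keeps its trailing comma
line_words = set("line 2: ssh, snmp".split())

# & returns the words present in both sets
common = line_words & unwanted
print(common)
```

Only 'snmp' ends up in the intersection here, because the token 'ssh,' still carries its comma.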

Second line:

with open("infile_test.txt", 'r') as infile, open("newfile.txt", 'w') as newopen:

Opening files with "with" is considered good practice because it handles extra housekeeping, such as automatically closing the files when you are done with them. Note that you can open both files on this one line.
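
To make the automatic closing visible, here is a minimal sketch that copies a file inside a "with" block and then checks the closed attribute of both file objects. The temporary directory and the variable files_closed are assumptions used only so the example can run anywhere.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "infile_test.txt")
dst_path = os.path.join(tmp_dir, "newfile.txt")

with open(src_path, "w") as f:
    f.write("line 1: this is a file test\nline 6: etc\n")

# Both files are opened on one line and closed when the block ends
with open(src_path, "r") as infile, open(dst_path, "w") as newopen:
    for line in infile:
        newopen.write(line)

files_closed = infile.closed and newopen.closed
print(files_closed)
```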

Third line:

    lines = infile.readlines()

lines is now a list of strings, each one representing a line of the original file.

Fourth and final line:

    [newopen.write(line) for line in lines if not (set(line.split()) & my_unwanted_words)]

Here is where the real work gets done. This is a list comprehension that only calls newopen.write(line) if there is no intersection (&) between the set of words in the current line, set(line.split()), and your set of unwanted words.

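I was a bit lazy with the script above and am leaving the final solution to you. Without further arguments, split() only separates the line into words on whitespace. So if an unwanted word is hidden inside parentheses or sits next to other punctuation, as in line 4 of your input file, split() returns a troublesome word

(ssh,etc)

...which does not match anything in the unwanted set, so the line gets passed through to newfile.txt. Experiment with the arguments to split() to solve this. You could also look at Python's re module and replace line.split() with some kind of regular expression.

Good luck.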
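
One possible way to follow the re suggestion, offered as a sketch rather than the answer's own solution: extract runs of word characters with re.findall so that punctuation no longer sticks to the words. The helper name has_unwanted, the pattern r"\w+", and the lowercasing step are assumptions of this example.

```python
import re

my_unwanted_words = set(['ssh', 'snmp', 'etc'])


def has_unwanted(line, unwanted=my_unwanted_words):
    """Return True if any whole word in the line is in the unwanted set."""
    # \w+ matches runs of letters, digits and underscores, so "(ssh,etc)"
    # yields the separate words "ssh" and "etc"
    words = set(re.findall(r"\w+", line.lower()))
    return bool(words & unwanted)


line4 = "line 4: if the list of words (ssh,etc) does not appears in any of the line"
line5 = "line 5: then write the line in another file"
print(has_unwanted(line4))
print(has_unwanted(line5))
```

In the full script, the condition in the last line could then become "if not has_unwanted(line)". Because this compares whole words, a word like "fetch" is not treated as a match for "etc".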

Comments:

How did you try it? I believe a nested for over the list you mentioned is enough: for x in mylist: if x in line: continue

You are using 'list' as the variable name for a Python list, and you shouldn't do that. There are some special words (list, dict, etc.) that should not be used as variable names, even though Python seems to allow it. I have been burned by that myself.

@Darshan, thank you. I tested your suggestion, but it wrote the lines containing the words. I also tested it with "if x not in line", but then it wrote the lines many times. Anyway, thanks for your comment.

@don_q, thanks, I did not know that about variable names. I will change it.