Python 3.x: Finding multiple keywords in a line with Python

I have a line like this:

20:28:26.684597 24:d5:6e:76:9s:10 (oui Unknown) > 45:83:r4:7u:9s:i2 (oui Unknown), ethertype 802.1Q (0x8100), length 78: vlan 64, p 0, ethertype IPv4, (tos 0x48, ttl 34, id 5643, offset 0, flags [none], proto TCP (6), length 60) 192.168.45.28.56982 > 172.68.54.28.webcache: Flags, cksum 0xg654 (correct), seq 576485934, win 65535, options [mss 1460,sackOK,TS val 2544789 ecr 0,wscale 0,eol], length 0

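In this line I need to extract the ID value from "id 5643" and another value (56982) from 192.168.45.28.56982. In every line, "id" is constant and 192.168.45.28 is constant.

I have written the script below. Please suggest a way to shorten the code, because my script involves several steps: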

file = open('test.txt')
fi = file.readlines()

for line in fi:
    test = (line.split(","))
    for word2 in test:
        if "id" in word2:
            find2 = word2.split(" ")[-1]
            print("************", find2)
    for word in test:
        if "192.168.45.28" in word:
            find = word.split(".")
            print(find)
            for word1 in find:
                if ">" in word1:
                    find1 = word1.split(">")[0]
                    print(find1)

You can use regular expressions. You could write it like this:

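import re
file = open('test.txt')
fi = file.readlines()

for line in fi:
    # re.match anchors at the start of the line, hence the leading ".*"
    match = re.match(r'.*id (\d+).*', line)
    if match:
        print("************ %s" % match.group(1))
    # Raw strings keep the backslashes in \. and \d intact
    match = re.match(r'.*192\.168\.45\.28\.(\d+).*', line)
    if match:
        print(match.group(1))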

**Update**

As jDo pointed out, it is better to use findall, to compile the regexes up front, and not to use readlines, so you end up with something like this:

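import re

re_id = re.compile(r"id (\d+)")
re_ip = re.compile(r"192\.168\.45\.28\.(\d+)")
with open("test.txt", "r") as f:
    # Iterating over the file object reads one line at a time
    for line in f:
        # findall returns a list of the captured strings, not match objects,
        # so there is no .group() to call; loop over the list instead
        for value in re_id.findall(line):
            print("************ %s" % value)
        for value in re_ip.findall(line):
            print(value)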
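To make the difference between match, search, and findall concrete, here is a minimal sketch that runs the two patterns against a shortened copy of the sample line instead of a file. The `SAMPLE` string and the variable names are assumptions made for illustration; they are not part of the original answer.

```python
import re

# Shortened copy of the sample line from the question (shortened for brevity).
SAMPLE = (
    "20:28:26.684597 ethertype IPv4, (tos 0x48, ttl 34, id 5643, offset 0, "
    "proto TCP (6), length 60) 192.168.45.28.56982 > 172.68.54.28.webcache: length 0"
)

re_id = re.compile(r"id (\d+)")
re_ip = re.compile(r"192\.168\.45\.28\.(\d+)")

# match() only tries at position 0, so without a leading ".*" it finds nothing.
anchored = re_id.match(SAMPLE)

# search() scans the whole string and returns the first match object (or None).
first_id = re_id.search(SAMPLE)

# findall() returns the captured group of every match as a list of strings.
all_ids = re_id.findall(SAMPLE)
all_ports = re_ip.findall(SAMPLE)

print(anchored)           # None
print(first_id.group(1))  # 5643
print(all_ids, all_ports)
```

If only the first occurrence per line matters, search plus group(1) is probably the closest replacement for the original match-based code; findall is the better fit when a line may contain several hits.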

You can use regular expressions:

import re

# This searches for the literal id
# followed by a space and 1 or more digits
idPattern = re.compile(r"id (\d+)")
# This searches for your IP followed by
# a dot and one or more digits
ipPattern = re.compile(r"192\.168\.45\.28\.(\d+)")

with open("test.txt", 'r') as data:
    for line in data:
        # Each call returns a list of the captured digit strings (possibly empty)
        id = idPattern.findall(line)
        ip = ipPattern.findall(line)

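This is the same approach as the others. However, it does not add empty lists to the results, it compiles the regular expressions for efficiency, it does not read the whole file into memory at once, and it does not use id as a variable name (id is a built-in function, so it is best to avoid shadowing it). The output may contain duplicates (I can't assume you only want unique entries).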
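The code for this answer did not survive on this page. The following is only a sketch of what the description above suggests, not the answerer's original code; the names `collect_values`, `ID_PATTERN`, and `PORT_PATTERN` are hypothetical.

```python
import re

# Compiled once, outside the loop.
ID_PATTERN = re.compile(r"id (\d+)")
PORT_PATTERN = re.compile(r"192\.168\.45\.28\.(\d+)")


def collect_values(lines):
    """Return (ids, ports) found across an iterable of lines."""
    found_ids, found_ports = [], []
    for line in lines:
        # extend() adds nothing when findall() returns [],
        # so no empty lists end up in the results.
        found_ids.extend(ID_PATTERN.findall(line))
        found_ports.extend(PORT_PATTERN.findall(line))
    return found_ids, found_ports


# In-memory lines stand in for the file here.
demo_lines = ["ttl 34, id 1, 192.168.45.28.80 > x", "nothing here", "id 2"]
print(collect_values(demo_lines))
```

Because a file object is itself an iterable of lines, passing the handle from `with open("test.txt") as handle:` to `collect_values(handle)` should read the file one line at a time rather than loading it whole.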

I get the following error: "AttributeError: 'set' object has no attribute 'extend'". Also, I want the values from each line stored in the variables id1 and ip1, because I need to perform more operations on them. Can you give me code for that?

@dantiston Are you sure set() has extend? That is a list method. Did you mean set.add()?

@jDo You're right. I wrote and tested it with a list and forgot to change extend when I switched to a set. @Zoro99 I updated the code to store the results for each line.

It gives no output even though the script runs without errors; I think the regex is not quite right.

I updated it.

I quickly tested it here and it should work fine, although you are reading the whole file into memory. As someone pointed out, "the efficient way to use readlines() is to not use it. Ever." Also, for efficiency, compile the regexes and use findall to search within the string rather than matching from the start (that way you can drop the asterisks).

You're right, but he only asked for shorter code, not for memory optimization.

@BramV I guess it is a matter of definition whether avoiding something you should almost never use can be called "optimization" :D

I just edited my question based on your suggestion. So for this case, is "readlines" the most appropriate, or is there a more efficient way available?

Sure, I'll do that.

... makes sense
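The set versus list confusion and the request for per-line values can be illustrated together. This is a hedged sketch, not the code the answerer eventually posted; `values_per_line` and the sample lines are hypothetical.

```python
import re

ID_PATTERN = re.compile(r"id (\d+)")
PORT_PATTERN = re.compile(r"192\.168\.45\.28\.(\d+)")


def values_per_line(lines):
    """Yield (line_ids, line_ports) for every line with at least one match."""
    for line in lines:
        line_ids = ID_PATTERN.findall(line)
        line_ports = PORT_PATTERN.findall(line)
        if line_ids or line_ports:
            yield line_ids, line_ports


sample_lines = ["id 7, 192.168.45.28.99 > x", "no hits", "id 7, id 8"]

unique_ids = set()
all_ids = []
for line_ids, line_ports in values_per_line(sample_lines):
    # Further per-line processing of line_ids / line_ports would go here.
    all_ids.extend(line_ids)     # list.extend: append every item, duplicates kept
    unique_ids.update(line_ids)  # set.update: the set counterpart of extend
    # set.add(item) would add a single element instead of an iterable of them

print(all_ids)
print(sorted(unique_ids))
```

A generator keeps the per-line grouping that the asker wanted while still reading lazily; whether to accumulate into a list or a set depends on whether duplicates matter.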