Using regular expression patterns with Python (python, regex)


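I am trying to write a script/program that finds matching IP addresses across two text files:

One text file containing a list of IP addresses (1.1.1.1)
One text file containing a list of subnets (1.1.1.0/28)

I would like to use regular expressions, but I don't know how to go about it.

For example: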

import re


def check(fname1, fname2):
    f2 = open(fname2)
    f1 = open(fname1)
    pattern = ('\d{1,3}\.\d{1,3}\.\d{1,3}')

    for line in f1:
        p1 = re.match(pattern, line)
        out_p1 = p1.group(0)
        for item in f2:
            p2 = re.match(pattern, item)
            out_p2 = p2.group(0)
            if out_p1 in out_p2:
                print(line, item)

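So I am trying to match the IP addresses in the first text file against the subnets in the second text file. I then want to output each IP address together with the subnet it matches.

Like this: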

#IP      #Subnet
1.1.1.1, 1.1.1.0/28
8.8.10.5, 8.8.8.0/23

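By running nested loops you do a lot of unnecessary processing. It makes more sense to append all the matches from the first file to a list, and then check the matches from the second file against that list. Here is an approximation of that process using two local lists: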

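import re

input1 = ['1.1.1.1', '233.123.4.125']
input2 = ['1.1.1.1/123', '123.55.2.235/236']
# Four groups of 1-3 digits, each optionally followed by a dot
pattern = r'^(\d{1,3}\.?){4}'
matchlist = []

# Collect the address matched on each line of the first input
for line in input1:
    p1 = re.match(pattern, line)
    if p1:
        matchlist.append(p1.group(0))

print(matchlist)

# Print each address from the second input that was also seen in the first
for item in input2:
    p2 = re.match(pattern, item)
    if p2 and p2.group(0) in matchlist:
        print(p2.group(0))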
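
The same idea carries over to lines read from files, since an open file can be iterated line by line. The following is a minimal sketch, assuming every relevant line starts with a dotted-quad address; the function name `common_addresses` and the use of a set instead of a list are illustrative choices, not part of the answer above. Note that it only tests whether the address part is identical in both inputs; it does not test subnet membership.

```python
import re

# Four dot-separated groups of 1-3 digits at the start of the line.
PATTERN = re.compile(r'(\d{1,3}\.){3}\d{1,3}')


def common_addresses(ip_lines, subnet_lines):
    """Return the lines of subnet_lines whose address also appears in ip_lines."""
    seen = set()
    for line in ip_lines:
        m = PATTERN.match(line)
        if m:
            seen.add(m.group(0))

    found = []
    for line in subnet_lines:
        m = PATTERN.match(line)
        if m and m.group(0) in seen:
            found.append(line.strip())
    return found


print(common_addresses(['1.1.1.1\n', '233.123.4.125\n'],
                       ['1.1.1.1/123\n', '123.55.2.235/236\n']))
# prints ['1.1.1.1/123']
```

Membership tests on a set take roughly constant time, so each file is read only once no matter how many addresses it holds.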

Leaving aside how the lines of the two text files get pulled into memory (for example, f1 = open(fname1, 'r').readlines()), suppose you have two lists of lines:

import re

f1 = ['1.1.1.1', '192.168.1.1', '192.35.192.1', 'some other line not desired']


f2 = ['1.1.1.0/28', '1.2.2.0/28', '192.168.1.1/8', 'some other line not desired']



def get_ips(text):
    # Collect the leading three octets of every line that starts with them
    pattern = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}')
    out = []
    for row in text:
        m = pattern.match(row)
        if m:
            out.append(m.group(0))
    return out


def is_match(ip, s):
    # True if ip occurs anywhere in string s
    return ip in s


def check(first, second):
    # First iterate over each IP found in the first file
    for ip in get_ips(first):
        # now check that against each subnet line in the second file
        for subnet in second:
            if is_match(ip, subnet):
                print(f'IP: {ip} matches subnet: {subnet}')

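Note that I have tried to split the functionality into separate concerns, so you should be able to modify each function independently. This assumes you have the lines in lists of strings. I am also not sure what you really want to match in F2, so this structure should let you modify is_match() without affecting the other parts.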
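
If the goal is true subnet membership rather than a textual match, one possible replacement for is_match() uses the standard-library `ipaddress` module instead of a regular expression. This is a sketch under that assumption, not the approach taken in the answer; it expects full four-octet addresses (get_ips() above returns only three octets), and `matching_pairs` is a hypothetical helper added for illustration.

```python
import ipaddress


def is_match(ip, subnet):
    """Return True if the address `ip` lies inside the CIDR block `subnet`."""
    try:
        address = ipaddress.ip_address(ip.strip())
        # strict=False tolerates host bits being set, e.g. '192.168.1.1/8'
        network = ipaddress.ip_network(subnet.strip(), strict=False)
    except ValueError:
        # Not a parseable address or network, e.g. an unrelated line
        return False
    return address in network


def matching_pairs(ips, subnets):
    """Collect (ip, subnet) pairs where the address lies in the subnet."""
    return [(ip, subnet) for ip in ips for subnet in subnets
            if is_match(ip, subnet)]


print(matching_pairs(['1.1.1.1', '192.168.1.1', 'some other line'],
                     ['1.1.1.0/28', '1.2.2.0/28', '192.168.1.1/8']))
# prints [('1.1.1.1', '1.1.1.0/28'), ('192.168.1.1', '192.168.1.1/8')]
```

Unlike a substring test, this compares addresses numerically, so a /23 or /8 block matches every address inside it even when the leading octets differ textually.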

Good luck.

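How hard would it be to pull both files into RAM before searching? Do you want to find the lines in F1 that match lines in F2? What do you mean by comparing the patterns of two files?

Yes, I want to compare the lines of f1 and f2.

Finally, as a comment, think about what you wrote in your code. You extract a match for the pattern from F1 and then want to look for that match in F2, but your code looks for a match of the pattern in F2 instead.

If you want to add details, please edit your question; do not add those details as comments.

How can I print the full IP instead of the first three octets, 1.1.1?

You only capture the first three octets of each line in F1, so you do not have the fourth one. You could extract all four octets from each line of F1 and then modify is_match() to take the first three octets and test whether they occur in s, or you could build a dictionary in get_ips() that maps each line's three octets to its four octets. As they say, there is more than one way to skin a cat...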
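
The dictionary approach from the last comment might look like the sketch below. It is an illustration only: the pattern with two capture groups, the list-valued dictionary (several addresses can share a prefix), and the returned list of pairs are assumptions. It also tightens the substring test to `startswith`, so that a prefix such as 1.1.1 does not match a line like 11.1.1.0/28.

```python
import re

# Group 1 is the full address, group 2 its first three octets.
PATTERN = re.compile(r'((\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3})')


def get_ips(text):
    """Map each three-octet prefix to the full addresses that share it."""
    out = {}
    for row in text:
        m = PATTERN.match(row)
        if m:
            full, prefix = m.group(1), m.group(2)
            out.setdefault(prefix, []).append(full)
    return out


def check(first, second):
    """Return (full_ip, subnet) pairs whose first three octets agree."""
    pairs = []
    for prefix, full_ips in get_ips(first).items():
        for subnet in second:
            if subnet.startswith(prefix + '.'):
                pairs.extend((ip, subnet) for ip in full_ips)
    return pairs


f1 = ['1.1.1.1', '192.168.1.1', '192.35.192.1', 'some other line not desired']
f2 = ['1.1.1.0/28', '1.2.2.0/28', '192.168.1.1/8', 'some other line not desired']

for ip, subnet in check(f1, f2):
    print(f'{ip}, {subnet}')
```

This still compares text rather than network ranges, so it only finds subnets whose first three octets are written identically to those of the address.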