Python: find the IP addresses of all DELETE requests that follow two consecutive GET requests

I have a log file and want to find the IP addresses of all DELETE requests that come after two consecutive GET requests.

192.168.10.20 - - [18/Jul/2017:08:41:37 +0000] "GET HTTP/1.0" "/Safari/5322"
10.30.24.3 - - [18/Jul/2017:08:45:15 +0000] "GET HTTP/1.0" "/Safari/5322"
98.5.45.3 - - [18/Jul/2017:08:45:49 +0000] "DELETE  Firefox/3.8"
94.5.6.3 - - [18/Jul/2017:08:48:56 +0000] "DELETE  Firefox/3.8"
192.168.1.2 - - [18/Jul/2017:08:41:37 +0000] "GET HTTP/1.0" "/Safari/5322"
10.30.24.12 - - [18/Jul/2017:08:45:15 +0000] "GET HTTP/1.0" "/Safari/5322"
98.5.45.34 - - [18/Jul/2017:08:45:49 +0000] "DELETE  Firefox/3.8"
Expected output:
['98.5.45.3', '94.5.6.3', '98.5.45.34']

My code:

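import re
# Read the whole log into one string so the pattern can span several lines.
with open(r'C:/Users/apache_log.log') as f:
    s = f.read()
# A GET, lazily up to the next GET, the rest of that line, then the IP that starts the following line.
expr = r'GET.*?GET[^\n]*\n(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})'
print(re.findall(expr, s, re.DOTALL))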

The expression above gives the IP address that follows two GET requests.

You can use a regex with some tempered dot logic here:

import re

log = """192.168.10.20 - - [18/Jul/2017:08:41:37 +0000] \"GET HTTP/1.0\" \"/Safari/5322\"
10.30.24.3 - - [18/Jul/2017:08:45:15 +0000] \"GET HTTP/1.0\" \"/Safari/5322\"
98.5.45.3 - - [18/Jul/2017:08:45:49 +0000] \"DELETE  Firefox/3.8\"
94.5.6.3 - - [18/Jul/2017:08:48:56 +0000] \"DELETE  Firefox/3.8\"
192.168.1.2 - - [18/Jul/2017:08:41:37 +0000] \"GET HTTP/1.0\" \"/Safari/5322\"
10.30.24.12 - - [18/Jul/2017:08:45:15 +0000] \"GET HTTP/1.0\" \"/Safari/5322\"
98.5.45.34 - - [18/Jul/2017:08:45:49 +0000] \"DELETE  Firefox/3.8\""""

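# (?:(?!(?:POST|PUT|GET|DELETE)).)* is a tempered dot: it consumes characters only
# while no HTTP method name starts at the current position. The pattern matches a GET,
# captures the IP at the start of the next line, and requires "DELETE on that line.
matches = re.findall(r'GET(?:(?!(?:POST|PUT|GET|DELETE)).)*(?:(?!(?:POST|PUT|GET|DELETE)).)*\n(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(?:(?!(?:POST|PUT|GET|DELETE)).)*\"DELETE', log, flags=re.DOTALL)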
print(matches)
This prints:

['98.5.45.3', '98.5.45.34']
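
As a rough illustration of what the tempered dot and the DOTALL flag each contribute, here is a small sketch on a made-up string. The names `text` and `tempered` are only for this demo and are not part of the pattern above.

```python
import re

# Hypothetical two-line input, just to show the mechanics.
text = "one\ntwo GET three"

# Tempered dot: consume one character at a time, but only at positions
# where the word GET does not start.
tempered = r'(?:(?!GET).)*'

# Without DOTALL, '.' does not match a newline, so the match ends at line 1.
without_dotall = re.match(tempered, text).group()

# With DOTALL, '.' also matches '\n', so the match runs on until just before GET.
with_dotall = re.match(tempered, text, flags=re.DOTALL).group()

print(repr(without_dotall))  # 'one'
print(repr(with_dotall))     # 'one\ntwo '
```

This is why the pattern needs DOTALL: it has to cross from the GET line into the line that carries the DELETE, while the negative lookahead keeps it from running past another request method.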

However, I would probably just write a simple parser that reads the log line by line and looks for the matching IPs.
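
A minimal sketch of such a line-by-line parser might look like the following. It rests on assumptions that are not stated in the question: each line starts with the IP, the request method is the first word inside the first quoted field, and a DELETE that directly follows a qualifying DELETE also counts (which is what the expected output suggests). The names `LINE_RE`, `deletes_after_two_gets` and `SAMPLE_LOG` are made up for this example.

```python
import re

# Assumed line shape: IP first, method as the first word of the first quoted field.
LINE_RE = re.compile(r'^(\d{1,3}(?:\.\d{1,3}){3}) .*?"([A-Z]+)\b')

def deletes_after_two_gets(lines):
    """Collect IPs of DELETE lines that follow a run of at least two GET lines."""
    ips = []
    get_run = 0      # number of GET lines seen in a row
    armed = False    # True once two consecutive GETs were seen
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip blank or unparseable lines without touching the state
        ip, method = m.groups()
        if method == "GET":
            get_run += 1
            armed = get_run >= 2
        elif method == "DELETE":
            if armed:
                ips.append(ip)
            get_run = 0   # stay armed so a run of DELETEs is collected as a whole
        else:
            get_run = 0
            armed = False
    return ips

# The sample log from the question.
SAMPLE_LOG = '''192.168.10.20 - - [18/Jul/2017:08:41:37 +0000] "GET HTTP/1.0" "/Safari/5322"
10.30.24.3 - - [18/Jul/2017:08:45:15 +0000] "GET HTTP/1.0" "/Safari/5322"
98.5.45.3 - - [18/Jul/2017:08:45:49 +0000] "DELETE  Firefox/3.8"
94.5.6.3 - - [18/Jul/2017:08:48:56 +0000] "DELETE  Firefox/3.8"
192.168.1.2 - - [18/Jul/2017:08:41:37 +0000] "GET HTTP/1.0" "/Safari/5322"
10.30.24.12 - - [18/Jul/2017:08:45:15 +0000] "GET HTTP/1.0" "/Safari/5322"
98.5.45.34 - - [18/Jul/2017:08:45:49 +0000] "DELETE  Firefox/3.8"'''

print(deletes_after_two_gets(SAMPLE_LOG.splitlines()))
# ['98.5.45.3', '94.5.6.3', '98.5.45.34']
```

For a real file, the same function could be fed the open file object directly, since iterating over a file yields one line at a time.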

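Does a regex have to be the only solution? If the requirement is that a client's IP address must issue GET before DELETE, it would be much easier to track each client separately in a map while walking through the log file line by line.
I think you missed 94.5.6.3 @TimBiegeleisen
Interesting that he wrote a question :)
@TimBiegeleisen Can you explain the regex, and why DOTALL is needed here? Weird regex, +1 :)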
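
The per-client tracking suggested in the comments could be sketched roughly as below. This reads the requirement differently from the question's expected output: here the same IP must send two GETs in a row (among its own requests) before its DELETE counts. The log lines in `demo` are invented for illustration, and `LINE_RE` and `clients_deleting_after_two_gets` are hypothetical names.

```python
import re
from collections import defaultdict

# Assumed line shape: IP first, method as the first word of the first quoted field.
LINE_RE = re.compile(r'^(\d{1,3}(?:\.\d{1,3}){3}) .*?"([A-Z]+)\b')

def clients_deleting_after_two_gets(lines):
    """Return IPs whose DELETE comes right after two or more of their own GETs."""
    get_runs = defaultdict(int)  # ip -> GETs in a row from that ip
    hits = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        ip, method = m.groups()
        if method == "GET":
            get_runs[ip] += 1
        else:
            if method == "DELETE" and get_runs[ip] >= 2:
                hits.append(ip)
            get_runs[ip] = 0  # any non-GET request ends that client's run
    return hits

# Made-up log lines in the same format as the question's sample.
demo = [
    '10.0.0.1 - - [18/Jul/2017:08:41:37 +0000] "GET HTTP/1.0" "/Safari/5322"',
    '10.0.0.2 - - [18/Jul/2017:08:42:00 +0000] "GET HTTP/1.0" "/Safari/5322"',
    '10.0.0.1 - - [18/Jul/2017:08:43:00 +0000] "GET HTTP/1.0" "/Safari/5322"',
    '10.0.0.2 - - [18/Jul/2017:08:44:00 +0000] "DELETE  Firefox/3.8"',
    '10.0.0.1 - - [18/Jul/2017:08:45:00 +0000] "DELETE  Firefox/3.8"',
]

print(clients_deleting_after_two_gets(demo))  # ['10.0.0.1']
```

On the sample log from the question no IP appears twice, so this variant would return an empty list there.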