Python: is there a way to make this function look better?
I need some logic to extract the URL from an Apache log file. This is what I have so far:
apache_log = {'@source': 'file://xxxxxxxxxxxxxxx//var/log/apache2/access.log', '@source_host': 'xxxxxxxxxxxxxxxxxxx', '@message': 'xxxxxxxxxxxxxxx xxxxxxxxxx - - [02/Aug/2013:12:38:37 +0000] "POST /user/12345/product/2 HTTP/1.1" 404 513 "-" "PycURL/7.26.0"', '@tags': [], '@fields': {}, '@timestamp': '2013-08-02T12:38:38.181000Z', '@source_path': '//var/log/apache2/access.log', '@type': 'Apache-access'}
data = apache_log['@message'].split()
if data.index('"POST') and data[data.index('"POST')+2].startswith('HTTP'):
    print(data[data.index('"POST')+1])
It returns:
/user/12345/product/2
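The result is basically correct, but I don't really like the way I'm doing it.
Can anyone suggest a better (more Pythonic) way to extract this path from an Apache log file?
A regular expression would work better: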
import re
post_path = re.compile(r'"POST (/\S+) HTTP')
match = post_path.search(apache_log['@message'])
if match:
    print(match.group(1))
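If requests other than POST also need to be handled, the same idea can be wrapped in a small helper. The following is only a sketch, not part of the original answer: the names REQUEST_LINE and extract_request are hypothetical, and the pattern assumes the request line has the usual quoted "METHOD path HTTP/version" shape seen in the sample message.

```python
import re

# Hypothetical helper, not from the original answer: matches the quoted
# request line for any upper-case HTTP method rather than only POST.
REQUEST_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+"')


def extract_request(message):
    """Return (method, path) from an access-log line, or None if absent."""
    match = REQUEST_LINE.search(message)
    if match is None:
        return None
    return match.group('method'), match.group('path')


# Sample line taken from the '@message' field in the question.
line = ('xxxxxxxxxxxxxxx xxxxxxxxxx - - [02/Aug/2013:12:38:37 +0000] '
        '"POST /user/12345/product/2 HTTP/1.1" 404 513 "-" "PycURL/7.26.0"')
print(extract_request(line))  # ('POST', '/user/12345/product/2')
```

Returning None for lines without a request keeps the caller's check as simple as the if match: test above.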
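This belongs on codereview.SE.
I don't think the if data.index('"POST') part works the way you want. For future reference, checking whether something is in a list is just '"POST' in data.
That's right, I missed that part. Thanks @Martijn, your answers are always great!
Demo: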
>>> import re
>>> apache_log = {'@source': 'file://xxxxxxxxxxxxxxx//var/log/apache2/access.log', '@source_host': 'xxxxxxxxxxxxxxxxxxx', '@message': 'xxxxxxxxxxxxxxx xxxxxxxxxx - - [02/Aug/2013:12:38:37 +0000] "POST /user/12345/product/2 HTTP/1.1" 404 513 "-" "PycURL/7.26.0"', '@tags': [], '@fields': {}, '@timestamp': '2013-08-02T12:38:38.181000Z', '@source_path': '//var/log/apache2/access.log', '@type': 'Apache-access'}
>>> post_path = re.compile(r'"POST (/\S+) HTTP')
>>> match = post_path.search(apache_log['@message'])
>>> if match:
...     print(match.group(1))
...
/user/12345/product/2
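The comment about data.index can be illustrated with a short sketch. list.index raises ValueError when the token is missing and returns 0, which is falsy, when the token comes first, so it is not a reliable existence check. The function name path_after_post is hypothetical, and the token carries a leading double quote because that is how split() leaves it.

```python
def path_after_post(message):
    """Return the token after '"POST' if an HTTP version follows, else None."""
    data = message.split()
    # list.index raises ValueError when the token is absent and returns 0
    # (falsy) when it is the first token, so test membership instead.
    if '"POST' not in data:
        return None
    i = data.index('"POST')
    # Guard against a truncated line before looking two tokens ahead.
    if i + 2 < len(data) and data[i + 2].startswith('HTTP'):
        return data[i + 1]
    return None


# '"POST' is the first token here, the case the original truthiness test skips.
print(path_after_post('"POST /user/12345/product/2 HTTP/1.1" 404 513'))
# No POST request at all: returns None instead of raising ValueError.
print(path_after_post('"GET /index.html HTTP/1.1" 200 1024'))
```

The regular expression from the answer avoids both pitfalls, so this version is mainly useful for seeing why the original check misbehaves.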