Python regex to get the words before and after a given word

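Can someone help me with a regex pattern in Python for the strings below? I have a .log file, I want to find the following lines in it, and I need to extract the user and the IP from each.

I want a regex that captures the one word before "from" and the one word after "from".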
Failed password for root from 123.183.209.132 port 39706 ssh2
From the string above I want root and 123.183.209.132.

Failed password for invalid user packer from 13.82.211.217 port 45832 ssh2
reverse mapping checking getaddrinfo for undefined.datagroup.ua [93.183.207.5] failed - POSSIBLE BREAK-IN ATTEMPT!

reverse mapping checking getaddrinfo for nsg-static-226.127.71.182.airtel.in [182.71.127.226] failed - POSSIBLE BREAK-IN ATTEMPT!

reverse mapping checking getaddrinfo for 179.185.44.168.static.gvt.net.br [179.185.44.168] failed - POSSIBLE BREAK-IN ATTEMPT!
From the strings above I want packer and 13.82.211.217.

I also want undefined.datagroup.ua and 93.183.207.5 from the reverse mapping line (this needs a new regex).

My working code:

import re

def parse(filename, date=None):
    try:
        # Earlier attempts, kept for reference:
        # string = 'Failed password for ([a-z]*|[a-z]* [a-z]* [a-z]*) from '
        # string_sub = 'for (?<user>[a-zA-Z\.]+).*?(?<ip>(?:\d{1,3}\.){3}\d{1,3})'
        # string_re = re.compile(r"^[^ ]+ - (C[^ ]*) \[([^ ]+)").match
        string = r'Failed password for ([a-z]*|[a-z]* [a-z]* [a-z]*) from [0-9]+(?:\.[0-9]+){3}'
        match_list = []
        with open(filename, 'r') as file:
            for line in file:
                for match in re.finditer(string, line, re.S):
                    match_text = match.group()
                    # Group 1 is the last word before "from" (the user), group 2 is the IP.
                    user_ip = re.search(r'Failed password for .*?(\w+) from (\d+(?:\.\d+){3})', match_text)
                    user, ip = user_ip.groups()
                    match_list.append((user, ip))
                    print(user, ip)
    except KeyError as e:
        msg = "key %s is missing" % str(e)
        return msg
    except Exception as e:
        return str(e)

I have been trying to do this with regular expressions.

If I understand correctly, you basically want the word after "for" on that line (the username) and the ip? If so, then:

for (?P<user>[a-zA-Z\.]+).*?(?P<ip>(?:\d{1,3}\.){3}\d{1,3})
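
Python's re module writes named groups as (?P<name>...), so the pattern above could be applied to the two sample lines from the question roughly as follows. This is a minimal sketch: the variable names and the pattern_invalid variant, which skips an optional "invalid user " prefix, are assumptions added for illustration and are not part of the original answer.

```python
import re

# Pattern from the answer, written with Python's (?P<name>...) named-group syntax.
pattern = re.compile(r'for (?P<user>[a-zA-Z.]+).*?(?P<ip>(?:\d{1,3}\.){3}\d{1,3})')

# Assumed variant: skip an optional "invalid user " prefix so the real user name is captured.
pattern_invalid = re.compile(
    r'for (?:invalid user )?(?P<user>[\w.]+) from (?P<ip>(?:\d{1,3}\.){3}\d{1,3})'
)

lines = [
    "Failed password for root from 123.183.209.132 port 39706 ssh2",
    "Failed password for invalid user packer from 13.82.211.217 port 45832 ssh2",
]

for line in lines:
    m1 = pattern.search(line)
    m2 = pattern_invalid.search(line)
    print(m1.group('user'), m1.group('ip'), '|', m2.group('user'), m2.group('ip'))
```

With the first pattern the second line yields "invalid" as the user, because it captures the first word after "for"; the variant yields "packer".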

Regex may be overkill for your use case... have you tried something simpler, such as:

s1 = "Failed password for root from 123.183.209.132 port 39706 ssh2"
s2 = "Failed password for invalid user packer from 13.82.211.217 port 45832 ssh2"

for s in (s1, s2):
    # Split once on ' from ': the user is the last word before it,
    # the IP is the first word after it.
    parsed = s.split(' from ', 1)
    user = parsed[0].split()[-1]
    ip = parsed[1].split()[0]
    print(f'User is {user} and IP is {ip}')
Output:

('root', '123.183.209.132')
('packer', '13.82.211.217')
('undefined.datagroup.ua', '93.183.207.5')

Explanation:

(?:                     # non-capturing group
    Failed password     # literally
  |                     # OR
    reverse mapping     # literally
    .+?                 # 1 or more of any character, not greedy
)                       # end group
 for                    # literally
 .*?                    # 0 or more of any character, not greedy
 ([\w.]+)               # group 1, 1 or more word characters or dots
 \s+                    # 1 or more spaces
 (?:from |\[)           # non-capturing group, "from " OR opening square bracket
(\d+(?:\.\d+){3})       # group 2, IP
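
The annotated pattern above can be joined into a single line and tried against the sample log lines from the question. The code of this answer did not survive on this page, so the following is only a sketch of how the pattern might be used; the variable names and the use of re.search are assumptions.

```python
import re

# The annotated pattern joined into one line (no re.VERBOSE, so the spaces are literal).
pattern = re.compile(
    r'(?:Failed password|reverse mapping.+?) for .*?([\w.]+)\s+(?:from |\[)(\d+(?:\.\d+){3})'
)

lines = [
    "Failed password for root from 123.183.209.132 port 39706 ssh2",
    "Failed password for invalid user packer from 13.82.211.217 port 45832 ssh2",
    "reverse mapping checking getaddrinfo for undefined.datagroup.ua "
    "[93.183.207.5] failed - POSSIBLE BREAK-IN ATTEMPT!",
]

# Collect one (name, ip) tuple per line that matches.
results = []
for line in lines:
    m = pattern.search(line)
    if m:
        results.append(m.groups())
        print(m.groups())
```

Because group 1 only allows word characters and dots, a host name that contains hyphens would be captured only in part; widening the character class is one way to handle such lines.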

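Does this answer your question?

I have edited my question. I have been dealing with two forms of the string for the user and IP. I have finished the regex for the first part and only need the regex for the second part.

Can someone help me with just one? If I can get the words before and after "from", this will be easy.

@aakankshakhandelwal But your last example doesn't contain the word "from" at all?

There are two different strings, so I need two different regexes: 1. for regex.sub(), to get the user and IP from the first string; 2. to get the object and IP from the second string.

This works great for the strings above. I also need the same kind of regex for the string given below:

reverse mapping checking getaddrinfo for undefined.datagroup.ua [93.183.207.5] failed - POSSIBLE BREAK-IN ATTEMPT!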
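
For the second line shape raised in the comments, a separate pattern could look like the sketch below. The name REVERSE_RE and the choice of \S+ for the host name are assumptions for illustration and do not come from the thread.

```python
import re

# Assumed pattern for "reverse mapping" lines: the host name is the run of
# non-space characters after "for", and the IP sits inside square brackets.
REVERSE_RE = re.compile(
    r'reverse mapping checking getaddrinfo for (\S+) \[(\d+(?:\.\d+){3})\]'
)

lines = [
    "reverse mapping checking getaddrinfo for undefined.datagroup.ua "
    "[93.183.207.5] failed - POSSIBLE BREAK-IN ATTEMPT!",
    "reverse mapping checking getaddrinfo for nsg-static-226.127.71.182.airtel.in "
    "[182.71.127.226] failed - POSSIBLE BREAK-IN ATTEMPT!",
    "reverse mapping checking getaddrinfo for 179.185.44.168.static.gvt.net.br "
    "[179.185.44.168] failed - POSSIBLE BREAK-IN ATTEMPT!",
]

for line in lines:
    m = REVERSE_RE.search(line)
    if m:
        host, ip = m.groups()
        print(host, ip)
```

Using \S+ rather than a word-character class keeps hyphenated host names such as nsg-static-226.127.71.182.airtel.in intact.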