Regex to match a URL in a log file gives a line continuation error: what do I need to escape?


regex python-3.x


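I am trying to parse a reverse access log: use a regular expression to match the base URL, store it in a variable, and then print that variable. Python says the print is illegal syntax. I have tried escaping the regex in various ways, and each attempt just makes other errors pop up. What am I missing?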

import re, sys, glob, os

with open('log.txt') as f:
    for line in f:
       match = re.search("http|https):\/\/(.*?)./"
        print("match")

Based on your original expression, I am guessing that we want to find URLs that end with a slash, so we can start with this simple expression:

https?:\/\/(.+?)\/

If the trailing slash may not be required, we can simplify it to:

https?:\/\/[^\s]+

Or we can keep adding or removing boundaries, if we wish.
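To make the difference between the two expressions concrete, here is a minimal sketch that applies both to two made-up log lines. The sample lines and the helper names `host_with_slash` and `url_no_slash` are assumptions for illustration only, not part of the original answer.

```python
import re

# Hypothetical log lines, invented for this illustration.
lines = [
    "GET https://somedomain/index.html 200",
    "GET http://somedomain 404",
]

with_slash = re.compile(r"https?:\/\/(.+?)\/")  # needs a slash after the host
no_slash = re.compile(r"https?:\/\/[^\s]+")     # runs up to the next whitespace


def host_with_slash(line):
    """Return capture group 1 of the first expression, or None if no match."""
    m = with_slash.search(line)
    return m.group(1) if m else None


def url_no_slash(line):
    """Return the whole match of the second expression, or None if no match."""
    m = no_slash.search(line)
    return m.group() if m else None


for line in lines:
    print(host_with_slash(line), url_no_slash(line))
```

The first expression finds nothing in the second line because no slash follows the host there, while the second expression returns the whole URL, path included, rather than only the host.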


Test

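You have an unmatched ) in your

http|https):\/\/(.*?)./

pattern, and the re.search call is incomplete. Use

match = re.search(r"https?://([^/]*)", line)

Pattern details

http - the string http
s? - an optional s
:// - the substring ://
([^/]*) - capturing group 1: zero or more characters other than /

If you plan to print the matched value rather than the whole line, access the appropriate .group() of the match object (group 1 holds the captured value).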
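As a quick check of the two points in this answer, the sketch below shows that the pattern from the question is rejected by the `re` module because of the unmatched parenthesis, while the corrected pattern compiles and captures the host. The helper names `compiles` and `base_url` and the sample strings are assumptions made for illustration.

```python
import re

broken = r"http|https):\/\/(.*?)./"  # pattern as posted in the question
fixed = r"https?://([^/]*)"          # corrected pattern from this answer


def compiles(pattern):
    """Return True if the re module accepts the pattern, False otherwise."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


def base_url(line):
    """Return capture group 1 (the host part), or None when nothing matches."""
    m = re.search(fixed, line)
    return m.group(1) if m else None


print(compiles(broken), compiles(fixed))
print(base_url("GET https://somedomain/path 200"))
```

Separately from the pattern itself, the call in the question never closes its parenthesis, so Python reports the syntax error on the following print line.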
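You have an unmatched ), and the re.search call is incomplete. Use match = re.search(r"https?://([^/]*)", line)

Try editing the question to block-quote the code (for legibility). The indentation looks inconsistent, which is a syntax error in Python. The call to re.search is wrong because it is missing its second argument, which should be the string containing the URL.

Wiktor Stribizew - your suggestion works well. If you want to post it as an answer, I will accept it. Thanks, everyone.

@kmkelmor I like that your answer sets up some iteration, but I can't figure out how to make it work going through a file line by line as strings without every match being reported as match 1.
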
# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r"https?:\/\/[^\s]+"

test_str = ("https://somedomain/\n"
    "https://somedomain")

matches = re.finditer(regex, test_str, re.MULTILINE)

for matchNum, match in enumerate(matches, start=1):

    print ("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))

    for groupNum in range(0, len(match.groups())):
        groupNum = groupNum + 1

        print ("Group {groupNum} found at {start}-{end}: {group}".format(groupNum = groupNum, start = match.start(groupNum), end = match.end(groupNum), group = match.group(groupNum)))

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
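One of the comments asks how to run this kind of iteration over a file line by line without every match being reported as match 1. A possible approach, sketched here under assumptions, is to keep a single counter outside the per-line loop instead of restarting `enumerate` for each line. The function name `numbered_matches` and the sample text are hypothetical, and `io.StringIO` stands in for `open('log.txt')`.

```python
import io
import re

regex = re.compile(r"https?://[^\s]+")


def numbered_matches(lines):
    """Yield (match_number, line_number, url); numbering continues across lines."""
    match_num = 0
    for line_num, line in enumerate(lines, start=1):
        for m in regex.finditer(line):
            match_num += 1  # one shared counter, never reset per line
            yield match_num, line_num, m.group()


# Invented sample content; a real file object from open() iterates the same way.
sample = io.StringIO("a https://somedomain/ b\nno url\nhttp://other/x http://third\n")
results = list(numbered_matches(sample))

for match_num, line_num, url in results:
    print("Match {} on line {}: {}".format(match_num, line_num, url))
```

Because the counter lives outside the inner loop, a line with no URL does not affect the numbering and a line with several URLs gets consecutive numbers.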

match = re.search(r"https?://([^/]*)", line)

import re, sys, glob, os

with open('log.txt') as f:
    for line in f:
        match = re.search(r"https?://([^/]*)", line)
        if match:                 # Always check if there is a match before accessing groups
            print(match.group(1)) # Only print capture group value, group() will print the whole match