How do I parse @font-face in CSS with Python?

How can I use Python with BeautifulSoup (or lxml/XPath, or anything else) to extract the font name "Open Sans" and the two links from (url)?


...
@font-face {
    font-family: "Open Sans";
    src: url("/fonts/OpenSans-Regular-webfont.woff2") format("woff2"),
         url("/fonts/OpenSans-Regular-webfont.woff") format("woff");
}
...
Thanks in advance for your help.

BeautifulSoup does not parse CSS; it is an HTML/XML parser. You can probably use the re module to extract the data instead, for example with this pattern:
^\s*font-family:\s*"(.*)";$|^.*\surl\("(.*?)"\).*$
# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r"^\s*font-family:\s*\"(.*)\";$|^.*\surl\(\"(.*?)\"\).*$"

test_str = ("<style>\n"
    "...\n\n"
    "    @font-face {\n"
    "        font-family: \"Open Sans\";\n"
    "        src: url(\"/fonts/OpenSans-Regular-webfont.woff2\") format(\"woff2\"),\n"
    "             url(\"/fonts/OpenSans-Regular-webfont.woff\") format(\"woff\");\n"
    "    }\n\n"
    "...\n"
    "</style>")

matches = re.finditer(regex, test_str, re.MULTILINE)

for matchNum, match in enumerate(matches, start=1):
    print("Match {matchNum} was found at {start}-{end}: {match}".format(
        matchNum=matchNum, start=match.start(), end=match.end(), match=match.group()))

    # Groups are numbered from 1; a group that did not take part in the match is None.
    for groupNum in range(1, len(match.groups()) + 1):
        print("Group {groupNum} found at {start}-{end}: {group}".format(
            groupNum=groupNum, start=match.start(groupNum), end=match.end(groupNum),
            group=match.group(groupNum)))

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
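
If you want the font name and the URLs collected together instead of printed match by match, something along these lines might work. This is only a sketch: the function name parse_font_faces and the three patterns are my own, not part of any library. It assumes each @font-face block contains no nested braces and that values are either unquoted or wrapped in matching quotes.

```python
import re

# Hypothetical helper patterns; they cover the simple case from the question only.
FONT_FACE_RE = re.compile(r"@font-face\s*\{(.*?)\}", re.DOTALL)
FAMILY_RE = re.compile(r"font-family\s*:\s*(['\"]?)(.*?)\1\s*;")
URL_RE = re.compile(r"url\(\s*(['\"]?)(.*?)\1\s*\)")


def parse_font_faces(css_text):
    """Return one dict per @font-face block with its family name and URLs."""
    fonts = []
    for block in FONT_FACE_RE.findall(css_text):
        family = FAMILY_RE.search(block)
        urls = [m.group(2) for m in URL_RE.finditer(block)]
        fonts.append({
            "family": family.group(2) if family else None,
            "urls": urls,
        })
    return fonts


css = """
@font-face {
    font-family: "Open Sans";
    src: url("/fonts/OpenSans-Regular-webfont.woff2") format("woff2"),
         url("/fonts/OpenSans-Regular-webfont.woff") format("woff");
}
"""

for font in parse_font_faces(css):
    print(font["family"], font["urls"])
```

The CSS text itself still has to come from somewhere. With BeautifulSoup you could take the text of the style element and pass it to this function. For real-world stylesheets (comments, escapes, nested rules) a dedicated CSS parser would likely be more robust than regular expressions.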