BeautifulSoup4 with lxml parser removes xmlns attribute from inline svg in xhtml file

I have BeautifulSoup4 v4.6.0 and lxml v3.8.0 installed, and I am trying to parse the following xhtml.

The code I use to parse it:

from bs4 import BeautifulSoup

xhtml_string = """
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    </head>

    <body class="sgc-1">
      <svg xmlns="http://www.w3.org/2000/svg" height="100%" preserveAspectRatio="xMidYMid meet" version="1.1" viewBox="0 0 600 800" width="100%" xmlns:xlink="http://www.w3.org/1999/xlink">
        <image height="800" width="573" xlink:href="../Images/Cover.jpg"></image>
      </svg>
    </body>
</html>
"""

soup = BeautifulSoup(xhtml_string, 'xml')

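I do not have the option of changing the source xhtml, and from what I have seen the xmlns declaration is valid. Is there any way to get BeautifulSoup to keep the xhtml as it is?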

You should use the 'lxml' parser (lxml's HTML parser) instead of 'xml' (lxml's XML parser):

soup = BeautifulSoup(xhtml_string, 'lxml')
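
One way to see what each parser does with the declaration is to look at the attributes of the parsed svg tag directly. The following is only a sketch: svg_attrs is a hypothetical helper name, SAMPLE is a shortened version of the markup from the question, and the result for 'xml' may depend on the installed BeautifulSoup and lxml versions.

```python
try:
    from bs4 import BeautifulSoup, FeatureNotFound
except ImportError:
    BeautifulSoup = None  # bs4 is not installed; svg_attrs() then returns None

# Shortened version of the markup from the question.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 600 800">
      <image height="800" width="573"></image>
    </svg>
  </body>
</html>"""


def svg_attrs(markup, parser):
    """Return the attributes of the first <svg> tag as seen by `parser`.

    Returns None if bs4 or the parser's backing library is missing,
    or if no <svg> tag is found.
    """
    if BeautifulSoup is None:
        return None
    try:
        soup = BeautifulSoup(markup, parser)
    except FeatureNotFound:
        return None  # e.g. lxml is not installed
    svg = soup.find("svg")
    return dict(svg.attrs) if svg is not None else None


if __name__ == "__main__":
    for parser in ("xml", "lxml"):
        attrs = svg_attrs(SAMPLE, parser)
        kept = attrs is not None and "xmlns" in attrs
        print(f"{parser}: xmlns kept = {kept}, attrs = {attrs}")
```

Comparing the two printed dictionaries shows which attributes each parser keeps on the svg tag.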

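On the other hand, the lxml parser does not preserve case (so the tags all become lowercase). Is there a parser, or an option in lxml, that preserves case for a complete solution?

You could try the html5 parser:

soup = BeautifulSoup(xhtml_string, 'html5lib')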
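
Whether a parser preserves case can be checked by serializing the parsed tree and looking for a mixed-case name such as viewBox. This is a minimal sketch, assuming the "html5 parser" means BeautifulSoup's 'html5lib' backend, which is a separate package. names_kept is a hypothetical helper, and the markup is a shortened version of the svg element from the question.

```python
try:
    from bs4 import BeautifulSoup, FeatureNotFound
except ImportError:
    BeautifulSoup = None  # bs4 is not installed; names_kept() then returns None

# Shortened version of the svg element from the question.
SVG_SNIPPET = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'preserveAspectRatio="xMidYMid meet" viewBox="0 0 600 800"></svg>'
)


def names_kept(markup, parser, names):
    """Map each name to True if it still appears with its original case
    after a parse/serialize round trip.

    Returns None if bs4 or the parser's backing library is missing.
    This is a plain substring check, so it is only a rough indicator.
    """
    if BeautifulSoup is None:
        return None
    try:
        soup = BeautifulSoup(markup, parser)
    except FeatureNotFound:
        return None  # e.g. html5lib is not installed
    serialized = str(soup)
    return {name: name in serialized for name in names}


if __name__ == "__main__":
    for parser in ("lxml", "html5lib"):
        print(parser, names_kept(SVG_SNIPPET, parser,
                                 ["viewBox", "preserveAspectRatio"]))
```

A result of None for 'html5lib' means the package is not installed; it can be installed separately with pip install html5lib.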