Python: BeautifulSoup XML parsing not working


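I'm trying to parse an XML page with BeautifulSoup, but for some reason it can't find the XML parser. I don't think it's a path problem, because I've used lxml to parse pages before, just not XML. Here is the code: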

from bs4 import *
import urllib2
import lxml
from lxml import *


BASE_URL = "http://auctionresults.fcc.gov/Auction_66/Results/xml/round/66_115_database_round.xml"

proxy = urllib2.ProxyHandler({'http': 'http://myProxy.com'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
page = urllib2.urlopen(BASE_URL)

soup = BeautifulSoup(page,"xml") 

print soup

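I'm probably missing something simple, but all the BeautifulSoup XML-parsing questions I've found here are about BS3, and I'm using BS4, which parses XML in a different way. Thanks.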

If you have lxml installed, just pass it to BeautifulSoup as the parser, as shown below.

Code:

from bs4 import BeautifulSoup as bsoup
import requests as rq

url = "http://auctionresults.fcc.gov/Auction_66/Results/xml/round/66_115_database_round.xml"
r = rq.get(url)

soup = bsoup(r.content, "lxml")
print soup

Result:

<html><body><dataroot xmlns:od="urn:schemas-microsoft-com:officedata" xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance" xsi:nonamespaceschemalocation="66_database.xsd"><all_bids>
<auction_id>66</auction_id>
<auction_description>Advanced Wireless Services</auction_description>
... really long list follows...
[Finished in 34.9s]

Let us know if this helps.
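Note that "lxml" selects lxml's HTML parser, which is why the result above is wrapped in <html><body> and the attribute names come out lowercased. As a rough sketch only (written for Python 3, with `make_soup` and `SAMPLE_XML` as made-up names for illustration and a tiny stand-in for the real feed), one way to prefer the XML parser and still fall back when it is missing is to catch `FeatureNotFound`, the exception bs4 raises when a requested parser is not available:

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical, tiny stand-in for the auction XML; the real file is far larger.
SAMPLE_XML = b"""<dataroot>
  <all_bids>
    <auction_id>66</auction_id>
    <auction_description>Advanced Wireless Services</auction_description>
  </all_bids>
</dataroot>
"""


def make_soup(markup):
    """Try lxml's XML parser, then lxml's HTML parser, then the stdlib parser."""
    for features in ("xml", "lxml", "html.parser"):
        try:
            return BeautifulSoup(markup, features), features
        except FeatureNotFound:
            # The requested parser is not installed for this interpreter.
            continue
    raise RuntimeError("no usable parser found")


soup, used = make_soup(SAMPLE_XML)
print("parser used:", used)
print(soup.find("auction_id").get_text())  # prints 66
```

If the first line of output is not "xml", lxml is probably not visible to the interpreter running the script, which would match the symptom in the question.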

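This must be a path problem after all, because it works in IPython but not in Eclipse PyDev. I'll sort it out. Thanks for the tip.

That's a bit odd. Still, good to know you found the root of the problem. In any case, the answer above can serve as a workaround.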
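When the same script behaves differently in IPython and in an IDE, a quick check is to run a small diagnostic in both places and compare the output. The sketch below is only an illustration (Python 3, standard library only; `report_environment` is a made-up helper name). It reports which interpreter is running and whether it can see `bs4` and `lxml`:

```python
import importlib.util
import sys


def report_environment(modules=("bs4", "lxml")):
    """Return the interpreter path and where (if anywhere) each module is found."""
    info = {"executable": sys.executable, "found": {}}
    for name in modules:
        spec = importlib.util.find_spec(name)
        # find_spec returns None when the module cannot be imported from here.
        info["found"][name] = spec.origin if spec is not None else None
    return info


env = report_environment()
print("interpreter:", env["executable"])
for name, location in env["found"].items():
    print(name, "->", location or "NOT FOUND")
```

If the two environments print different interpreter paths, or one of them reports `lxml` as not found, the IDE is most likely configured with a different Python installation than the one where lxml was installed.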