
Python BeautifulSoup XML parsing not working


I'm trying to parse an XML page with BeautifulSoup, but for some reason it can't find an XML parser. I don't think it's a path issue, since I've used lxml in the past to parse pages, just not XML. Here's the code:

from bs4 import *
import urllib2
import lxml
from lxml import *


BASE_URL = "http://auctionresults.fcc.gov/Auction_66/Results/xml/round/66_115_database_round.xml"

proxy = urllib2.ProxyHandler({'http': 'http://myProxy.com'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
page = urllib2.urlopen(BASE_URL)

soup = BeautifulSoup(page,"xml") 

print soup

I'm probably missing something simple, but every question about XML parsing with BS that I've found here is about bs3, and I'm using bs4, which takes a different approach to parsing XML. Thanks.
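
For reference, here is a minimal sketch (not from the original post; the sample markup is made up for illustration) of what this failure usually looks like: when the "xml" tree builder is requested but lxml cannot be imported by the running interpreter, bs4 raises FeatureNotFound and suggests installing a parser library.

# Minimal sketch, assuming a stock bs4 install; the markup below is
# invented for illustration. The "xml" builder is provided by lxml.
from bs4 import BeautifulSoup, FeatureNotFound

markup = "<all_bids><auction_id>66</auction_id></all_bids>"

try:
    soup = BeautifulSoup(markup, "xml")   # needs lxml's XML tree builder
    print soup.auction_id.string          # prints: 66
except FeatureNotFound as e:
    # Raised when no installed tree builder offers the requested feature,
    # e.g. "Couldn't find a tree builder with the features you requested: xml ..."
    print e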

If you have lxml installed, just name it as the parser for BeautifulSoup, like so.

Code:

from bs4 import BeautifulSoup as bsoup
import requests as rq

url = "http://auctionresults.fcc.gov/Auction_66/Results/xml/round/66_115_database_round.xml"
r = rq.get(url)

soup = bsoup(r.content, "lxml")
print soup
Result:

<html><body><dataroot xmlns:od="urn:schemas-microsoft-com:officedata" xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance" xsi:nonamespaceschemalocation="66_database.xsd"><all_bids>
<auction_id>66</auction_id>
<auction_description>Advanced Wireless Services</auction_description>
... really long list follows...
[Finished in 34.9s]
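
As a follow-up sketch (not part of the original answer, assuming lxml is installed): asking bs4 for the "xml" builder instead of the HTML-mode "lxml" builder avoids the <html><body> wrapper and the lowercased attribute names visible above. The tag names used below are the ones shown in the output snippet.

# Follow-up sketch; tag names (all_bids, auction_id, auction_description)
# are taken from the output shown above.
from bs4 import BeautifulSoup as bsoup
import requests as rq

url = "http://auctionresults.fcc.gov/Auction_66/Results/xml/round/66_115_database_round.xml"
r = rq.get(url)

soup = bsoup(r.content, "xml")             # XML mode: no <html><body> wrapper
for bid in soup.find_all("all_bids")[:5]:  # first few records only
    print bid.auction_id.string, bid.auction_description.string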


Let us know if this helps.

It must be a path issue after all, because it works in IPython but not in Eclipse PyDev. I'll sort it out. Thanks for the tip. — That's a bit odd, but good to know you found the root of the problem. In any case, the answer above should work as a workaround.
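
If it helps with the PyDev side, here is a small diagnostic sketch (an assumption about what the "path issue" is): print which interpreter is running and whether lxml is importable from it. If PyDev is configured with a different interpreter than the one IPython uses, lxml may be installed for one but not the other.

# Diagnostic sketch: which interpreter is this, and can it see lxml?
import sys
print sys.executable        # interpreter launched by PyDev (or IPython)

try:
    import lxml
    print "lxml found at:", lxml.__file__
except ImportError:
    print "lxml is not importable from this interpreter"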