Parsing error with Beautiful Soup 4 and Python


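I need to get the list of rooms from this website (see the URL in the code below).

I'm using BeautifulSoup4 to parse the page. This is the code I have written so far: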

from bs4 import BeautifulSoup
import urllib

pageFile = urllib.urlopen("http://studentroom.ch/dynasite.cfm?dsmid=106547")
pageHtml = pageFile.read()
pageFile.close()

soup = BeautifulSoup("".join(pageHtml))

roomsNoFilter = soup.find('div', {"id": "ImmoListe"})

rooms = roomsNoFilter.table.find_all('tr', recursive=False)

for room in rooms:
    print room
    print "----------------"

print len(rooms)

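Now I just want the rows of the table, but I only get 7 rows instead of 78 (or 77).

At first I thought I was receiving only part of the HTML, but I printed the whole HTML and it is received correctly. There are no AJAX calls that load new rows after the page has loaded.

Can anyone help me find the error?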
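One way to double-check that the downloaded HTML really contains all the rows, independently of Beautiful Soup, is to count the `<tr>` start tags with the standard library's `html.parser`. This is only a rough sketch in Python 3 syntax (the code in this thread is Python 2); `RowCounter`, `count_rows`, and the sample markup are made-up names and data, and the count covers every table on the page, not just the one inside `ImmoListe`.

```python
from html.parser import HTMLParser


class RowCounter(HTMLParser):
    """Counts every <tr> start tag in the document fed to it."""

    def __init__(self):
        super().__init__()
        self.total = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag == "tr":
            self.total += 1


def count_rows(html):
    """Return the number of <tr> start tags in an HTML string."""
    counter = RowCounter()
    counter.feed(html)
    counter.close()
    return counter.total


# Made-up stand-in for the real page.
sample = """
<div id="ImmoListe">
  <table>
    <tr><th>Room</th></tr>
    <tr><td>A</td></tr>
    <tr><td>B</td></tr>
  </table>
</div>
"""
row_total = count_rows(sample)
print(row_total)
```

If this count is close to the expected number of rows while Beautiful Soup returns far fewer, the difference comes from how the page is parsed, not from the download. Under Python 3 the downloaded bytes would have to be decoded to `str` before being passed to `count_rows`.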

This works for me:

soup = BeautifulSoup(pageHtml)
div = soup.select('#ImmoListe')[0]
table = div.select('table > tbody')[0]
k = 0
for room in table.find_all('tr'):
    if 'onmouseout' in str(room):
        print room
        k = k + 1
print "Total ",k

Let me know how it goes.
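The question looks for `tr` elements that are direct children of the table, while this answer goes through `table > tbody`. Whether a `tbody` exists in the parse tree can depend on which parser Beautiful Soup uses, and when none is named it picks the best one installed. Whether that is the cause here is an assumption, not something confirmed in this thread. The sketch below (Python 3 syntax, made-up sample markup, hypothetical helper `row_counts`) reports the counts for each parser that happens to be installed, so the same script can be compared across machines.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Made-up stand-in for the real page: a table without an explicit <tbody>.
SAMPLE = """
<div id="ImmoListe">
  <table>
    <tr><th>Room</th></tr>
    <tr onmouseout="hl(this)"><td>A</td></tr>
    <tr onmouseout="hl(this)"><td>B</td></tr>
  </table>
</div>
"""


def row_counts(html, parsers=("html.parser", "lxml", "html5lib")):
    """Return {parser: (direct_rows, tbody_rows, marked_rows)} for installed parsers."""
    results = {}
    for name in parsers:
        try:
            soup = BeautifulSoup(html, name)
        except FeatureNotFound:
            continue  # this parser is not installed on this machine
        table = soup.find("div", id="ImmoListe").table
        # Rows that are direct children of <table> (the question's approach).
        direct = len(table.find_all("tr", recursive=False))
        # Rows that are direct children of a <tbody> directly under <table>.
        tbody = table.find("tbody", recursive=False)
        via_tbody = len(tbody.find_all("tr", recursive=False)) if tbody else 0
        # Rows carrying an onmouseout attribute, wherever they sit in the table.
        marked = len(table.find_all("tr", onmouseout=True))
        results[name] = (direct, via_tbody, marked)
    return results


counts = row_counts(SAMPLE)
for parser_name, numbers in counts.items():
    print(parser_name, numbers)
```

Filtering with `onmouseout=True` checks for the attribute itself instead of searching the row's text for the word, and it does not care whether the rows sit under a `tbody`. Passing the parser name explicitly, as in `BeautifulSoup(html, "html.parser")`, keeps the result from depending on what else is installed.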

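Why are you using "".join(pageHtml)? Isn't pageHtml already one big string?

After running the code you provided, I got . I switched computers and it works fine there; the other one still doesn't work. Anyway, thank you.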
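The point about `"".join(pageHtml)` can be checked in isolation. A minimal sketch in Python 3 syntax, with a made-up string standing in for `pageHtml`:

```python
# Made-up stand-in for the pageHtml string from the question.
page_html = "<table><tr><td>room</td></tr></table>"

# str.join iterates over its argument. Iterating a string yields its
# characters, so joining them with an empty separator rebuilds the same string.
rejoined = "".join(page_html)
print(rejoined == page_html)

# join only does real work when it is given several pieces.
chunks = ["<table>", "<tr></tr>", "</table>"]
combined = "".join(chunks)
print(combined)
```

So the call is harmless but redundant when `pageHtml` is already a string, and `BeautifulSoup(pageHtml)` receives the same input. Under Python 3, where `read()` returns bytes, `"".join(...)` on the raw bytes would raise a `TypeError` instead.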