Python KeyError: 0 on a <dl> tag

I am trying to parse an HTML page, but I am getting an error.

The code is as follows:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = "http://www.kontrakt.szczecin.pl/mieszkanie-sprzedaz-6664m2-339600pln-potulicka-nowe-miasto-szczecin-zachodniopomorskie,351165"

# Opens a connection to the chosen page and downloads its content (urllib)

uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

#html parsing (BeautifulSoup)
page_soup = soup(page_html, "html.parser") # html.parser -> parse as HTML, not e.g. as XML

# Grabs the table with the offer number, kitchen type and other data about the property
containers = page_soup.findAll("section",{"class":"clearfix"},{"id":"quick-summary"})

# print(len(containers)) checks how many such objects exist on the page
# There is only one such table on the page, but for the sake of learning I treat it as if there could be many.
container = containers[0]
print(len(container.dl))
print(container.dl[0])
Here is the log with the error:

runfile('/home/bartosz/Pulpit/web_scrap.py', wdir='/home/bartosz/Pulpit')
36
Traceback (most recent call last):

  File "<ipython-input-70-e826e21c585a>", line 1, in <module>
    runfile('/home/bartosz/Pulpit/web_scrap.py', wdir='/home/bartosz/Pulpit')

  File "/home/bartosz/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)

  File "/home/bartosz/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "/home/bartosz/Pulpit/web_scrap.py", line 30, in <module>
    print(container.dl[0])

  File "/home/bartosz/anaconda3/lib/python3.6/site-packages/bs4/element.py", line 1011, in __getitem__
    return self.attrs[key]

KeyError: 0

len(container.dl) shows that there are 36 items in the dl. If I use len(container.dl.dt), it shows 1.

You need to access the element's children through the .contents attribute rather than by direct indexing:

print(container.dl.contents[0])
That should work.
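One caveat with .contents[0]: the list holds every child node, including the whitespace strings between tags, so the first entry is not necessarily a tag. The sketch below illustrates this on a small hypothetical stand-in for the page's <dl> (the markup is an assumption, not copied from the site) and shows one way to keep only real tags.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical stand-in for the page's <dl>; the real markup may differ.
html = """<dl>
<dt>Numer oferty</dt>
<dd>351165</dd>
</dl>"""

dl = BeautifulSoup(html, "html.parser").dl

# .contents holds every child node, including the newline strings between tags.
print(len(dl.contents))      # the same number that len(dl) reports
print(repr(dl.contents[0]))  # a whitespace string, not a tag

# Keep only real tags to skip the whitespace nodes.
tags = [child for child in dl.contents if isinstance(child, Tag)]
first_tag = tags[0]
print(first_tag)
```

This would also explain why len(container.dl) can be roughly twice the number of visible rows: the newlines between the dt and dd elements are counted as children too.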

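Direct indexing accesses the tag's attributes instead. For example, if the tag is <dl class="myclass">, then dl['class'] returns ['myclass'] (Beautiful Soup treats class as a multi-valued attribute, so the value is a list).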
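To make the difference concrete, here is a minimal sketch on hypothetical markup (the class name "myclass" is only the placeholder from the explanation above). It shows that square brackets look up attributes, that an integer key therefore raises the KeyError from the question, and that children are reached through .contents.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
html = '<dl class="myclass"><dt>Numer oferty</dt><dd>351165</dd></dl>'
dl = BeautifulSoup(html, "html.parser").dl

# Square brackets look up HTML attributes by name.
css_classes = dl["class"]
print(css_classes)  # class is multi-valued, so this is a list

# An integer is treated as an attribute name too, which is why dl[0] fails.
try:
    dl[0]
    failed = False
except KeyError as exc:
    failed = True
    print("KeyError:", exc)

# Children are reached through .contents instead.
first_child = dl.contents[0]
print(first_child)

# .get() returns None instead of raising when an attribute is missing.
missing = dl.get("id")
print(missing)
```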

Edit:

To print all the contents of container.dl, do the following:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = "http://www.kontrakt.szczecin.pl/mieszkanie-sprzedaz-6664m2-339600pln-potulicka-nowe-miasto-szczecin-zachodniopomorskie,351165"

with uReq(my_url) as uClient:
    page_soup = soup(uClient.read(), "html.parser")

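# Note: findAll takes (name, attrs, recursive, ...), so the third dict is passed as
# `recursive` and is not used as a filter; only the class filter applies here.
container = page_soup.findAll("section",{"class":"clearfix"},{"id":"quick-summary"})[0]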

print(len(container.dl))
print('-' * 80)
for content in container.dl.contents:
    print(content)
    print('-' * 80)
This prints (the first line is the length of container.dl.contents):

36
--------------------------------------------------------------------------------


--------------------------------------------------------------------------------
<dt>Numer oferty</dt>
--------------------------------------------------------------------------------
<dd>351165</dd>
--------------------------------------------------------------------------------
<dt>Liczba pokoi</dt>
--------------------------------------------------------------------------------
<dd>4</dd>
--------------------------------------------------------------------------------
<dt>Cena</dt>
--------------------------------------------------------------------------------
<dd><span class="tag price">339 600 PLN</span></dd>
--------------------------------------------------------------------------------
<dt>Cena za m2</dt>
--------------------------------------------------------------------------------
<dd>5 096 PLN</dd>
--------------------------------------------------------------------------------
<dt>Powierzchnia</dt>
--------------------------------------------------------------------------------
<dd>66,64 m2</dd>
--------------------------------------------------------------------------------
<dt>Piętro</dt>
--------------------------------------------------------------------------------
<dd>1</dd>
--------------------------------------------------------------------------------
<dt>Liczba pięter</dt>
--------------------------------------------------------------------------------
<dd>6</dd>
--------------------------------------------------------------------------------
<dt>Typ kuchni</dt>
--------------------------------------------------------------------------------
<dd>Aneks</dd>
--------------------------------------------------------------------------------
<dt>Balkon</dt>
--------------------------------------------------------------------------------
<dd>Tak</dd>
--------------------------------------------------------------------------------
<dt>Rodzaj ogrzewania</dt>
--------------------------------------------------------------------------------
<dd>CO miejskie</dd>
--------------------------------------------------------------------------------
<dt>Gorąca woda</dt>
--------------------------------------------------------------------------------
<dd>Wodociąg miejski</dd>
--------------------------------------------------------------------------------
<dt>Rodzaj budynku</dt>
--------------------------------------------------------------------------------
<dd>Wysoki blok</dd>
--------------------------------------------------------------------------------
<dt>Materiał</dt>
--------------------------------------------------------------------------------
<dd>Silikat</dd>
--------------------------------------------------------------------------------
<dt>Rok budowy</dt>
--------------------------------------------------------------------------------
<dd>2019</dd>
--------------------------------------------------------------------------------
<dt>Winda</dt>
--------------------------------------------------------------------------------
<dd>Tak</dd>
--------------------------------------------------------------------------------
<dt>Stan nieruchomości</dt>
--------------------------------------------------------------------------------
<dd>Stan deweloperski</dd>
--------------------------------------------------------------------------------
<dt>Rynek</dt>
--------------------------------------------------------------------------------
<dd>Pierwotny</dd>
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------

Something is off. I did print(container.dl.dt.contents), but it only shows the first entry in the list, ['Numer oferty']. I don't know why I can't see the rest of the dt elements.

@bart I edited my answer; I now print all of container.dl.contents.

Thanks! I handled it another way, but yours seems easier. By the way, this is the project thread; I would like to ask you to take a look at it.
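On the comment about container.dl.dt showing only one entry: attribute-style access such as dl.dt is shorthand for dl.find("dt"), so it returns only the first matching tag. A possible way to collect every label with its value is sketched below. The HTML is a hypothetical excerpt shaped like the output shown above, and the name summary is an illustrative choice rather than anything from the original code.

```python
from bs4 import BeautifulSoup

# Hypothetical excerpt shaped like the printed output; the live page may differ.
html = """<dl>
<dt>Numer oferty</dt>
<dd>351165</dd>
<dt>Liczba pokoi</dt>
<dd>4</dd>
<dt>Cena</dt>
<dd><span class="tag price">339 600 PLN</span></dd>
</dl>"""

dl = BeautifulSoup(html, "html.parser").dl

# dl.dt is shorthand for dl.find("dt"): it returns only the first match.
print(dl.dt.contents)

# find_all returns every <dt>; pair each one with the <dd> that follows it.
summary = {}
for dt in dl.find_all("dt"):
    dd = dt.find_next_sibling("dd")
    if dd is not None:
        summary[dt.get_text(strip=True)] = dd.get_text(strip=True)

for label, value in summary.items():
    print(f"{label}: {value}")
```

Building a dictionary this way sidesteps the whitespace nodes in .contents entirely, since find_all and find_next_sibling only consider tags with the requested name.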