Python: displaying the contents of a web scrape (python, html, beautifulsoup)


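The code below prints every field to the screen, one value after another. Is there a way to get the fields "side by side", the way they would appear in a database or a spreadsheet? In the page source, the fields track, date, datetime, grade, distance and prizes are found in the resultsBlockHeader div class, while Fin (finishing position), Greyhound, Trap, SP, timeSec and Time Distance are found in the resultsBlock div. I am trying to get them displayed like this: track, date, datetime, grade, distance, prizes, fin, greyhound, trap, sp, timeSec, timeDistance, all on one line. Any help would be appreciated.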

from urllib.request import urlopen  # Python 3; on Python 2 this was "from urllib import urlopen"

from bs4 import BeautifulSoup

html = urlopen("http://www.gbgb.org.uk/resultsMeeting.aspx?id=135754")
bsObj = BeautifulSoup(html, 'lxml')

# Each block below prints every value of one field before moving on to the next field.
nameList = bsObj.find_all("div", {"class": "track"})
for name in nameList:
    print(name.get_text())

nameList = bsObj.find_all("div", {"class": "date"})
for name in nameList:
    print(name.get_text())

nameList = bsObj.find_all("div", {"class": "datetime"})
for name in nameList:
    print(name.get_text())

nameList = bsObj.find_all("div", {"class": "grade"})
for name in nameList:
    print(name.get_text())

nameList = bsObj.find_all("div", {"class": "distance"})
for name in nameList:
    print(name.get_text())

nameList = bsObj.find_all("div", {"class": "prizes"})
for name in nameList:
    print(name.get_text())

nameList = bsObj.find_all("li", {"class": "first essential fin"})
for name in nameList:
    print(name.get_text())

nameList = bsObj.find_all("li", {"class": "essential greyhound"})
for name in nameList:
    print(name.get_text())

nameList = bsObj.find_all("li", {"class": "trap"})
for name in nameList:
    print(name.get_text())

nameList = bsObj.find_all("li", {"class": "sp"})
for name in nameList:
    print(name.get_text())

nameList = bsObj.find_all("li", {"class": "timeSec"})
for name in nameList:
    print(name.get_text())

nameList = bsObj.find_all("li", {"class": "timeDistance"})
for name in nameList:
    print(name.get_text())

nameList = bsObj.find_all("li", {"class": "essential trainer"})
for name in nameList:
    print(name.get_text())

nameList = bsObj.find_all("li", {"class": "first essential comment"})
for name in nameList:
    print(name.get_text())

nameList = bsObj.find_all("div", {"class": "resultsBlockFooter"})
for name in nameList:
    print(name.get_text())

nameList = bsObj.find_all("li", {"class": "first essential"})
for name in nameList:
    print(name.get_text())

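First of all, make sure you are not violating the website's Terms of Use; stay on the legal side.

The markup is not very easy to scrape, but what I would do is iterate over the race headers and, for every header, get the race information you need. Then get the sibling results block and extract the rows. Sample code to get you started, extracting the track and the greyhound: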

from pprint import pprint
from urllib.request import urlopen  # Python 3; the original answer used Python 2's urllib2

from bs4 import BeautifulSoup


html = urlopen("http://www.gbgb.org.uk/resultsMeeting.aspx?id=135754")
soup = BeautifulSoup(html, 'lxml')

rows = []
# Every race starts with a header block that holds the race-level fields.
for header in soup.find_all("div", class_="resultsBlockHeader"):
    track = header.find("div", class_="track").get_text(strip=True)

    # The results block following the header holds one "line1" list per greyhound.
    results = header.find_next_sibling("div", class_="resultsBlock").find_all("ul", class_="line1")
    for result in results:
        greyhound = result.find("li", class_="greyhound").get_text(strip=True)

        rows.append({
            "track": track,
            "greyhound": greyhound
        })

pprint(rows)

Note that every row you see in the table is actually represented by 3 lines in the markup:

<ul class="contents line1">
   ...
</ul>
<ul class="contents line2">
   ...
</ul>
<ul class="contents line3">
   ...
</ul>

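The greyhound value is located in the first ul (the one with the line1 class). To read the other two lines, you may need to get line2 and line3 with result.find_next_sibling("ul", class_="line2") and result.find_next_sibling("ul", class_="line3").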
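
To show how the header fields and the per-greyhound lines could end up side by side, here is a minimal sketch under stated assumptions. SAMPLE_HTML is a made-up fragment that only mimics the class names mentioned in the question; which field sits on line1 and which on line2 is an assumption, so check it against the real page. The names text_of, build_rows and the *_FIELDS lists are hypothetical helpers, and html.parser is used only so the sketch runs without lxml.

```python
import csv
import sys

from bs4 import BeautifulSoup

# Made-up markup that imitates the class names from the question (not real data).
SAMPLE_HTML = """
<div class="resultsBlockHeader">
  <div class="track">Example Track</div>
  <div class="date">01/01/16</div>
  <div class="grade">A1</div>
</div>
<div class="resultsBlock">
  <ul class="contents line1">
    <li class="first essential fin">1</li>
    <li class="essential greyhound">Dog A</li>
    <li class="trap">4</li>
  </ul>
  <ul class="contents line2"><li class="timeSec">24.50</li></ul>
  <ul class="contents line1">
    <li class="first essential fin">2</li>
    <li class="essential greyhound">Dog B</li>
    <li class="trap">1</li>
  </ul>
  <ul class="contents line2"><li class="timeSec">24.80</li></ul>
</div>
"""

HEADER_FIELDS = ["track", "date", "grade"]    # race-level fields
LINE1_FIELDS = ["fin", "greyhound", "trap"]   # assumed to sit on line1
LINE2_FIELDS = ["timeSec"]                    # assumed to sit on line2


def text_of(parent, css_class):
    """Return the stripped text of the first descendant carrying css_class, or ''."""
    if parent is None:
        return ""
    node = parent.find(class_=css_class)
    return node.get_text(strip=True) if node is not None else ""


def build_rows(html):
    """Build one flat dict per greyhound, repeating the race fields in each."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for header in soup.find_all("div", class_="resultsBlockHeader"):
        race = {name: text_of(header, name) for name in HEADER_FIELDS}
        block = header.find_next_sibling("div", class_="resultsBlock")
        if block is None:
            continue
        for line1 in block.find_all("ul", class_="line1"):
            line2 = line1.find_next_sibling("ul", class_="line2")
            row = dict(race)
            row.update({name: text_of(line1, name) for name in LINE1_FIELDS})
            row.update({name: text_of(line2, name) for name in LINE2_FIELDS})
            rows.append(row)
    return rows


rows = build_rows(SAMPLE_HTML)

# Print a header line, then one comma-separated line per greyhound.
writer = csv.DictWriter(sys.stdout, fieldnames=HEADER_FIELDS + LINE1_FIELDS + LINE2_FIELDS)
writer.writeheader()
writer.writerows(rows)
```

Because every row is a dict with the same keys, the same list can be written to a file instead of sys.stdout and opened in a spreadsheet. Fields from line3 could be added the same way with another find_next_sibling call.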
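Store the contents in an array and print them later. PS: "All information published on this Website ('Content') is for your personal, non-commercial use only. You may make one copy of the information published on this Website for your personal, non-commercial use as a backup. Other than that, you may not copy, modify, redistribute, transmit, rent, lease or re-license any of the Content contained in this Website, and you are not granted any further rights or copyright."

Hi, the program only gave me one very long list. How do I split the list into an array so that the elements correspond to each other? Any suggestions are greatly appreciated.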
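
One way to read the "store it in an array, print later" suggestion is sketched below. It assumes each field was collected into its own list, in the same order and with the same length; the values are made up for illustration, and the variable names are hypothetical.

```python
# Made-up values standing in for the text scraped for each field.
greyhounds = ["Dog A", "Dog B", "Dog C"]
traps = ["4", "1", "6"]
times = ["24.50", "24.80", "25.10"]

# zip() pairs the i-th element of every list, giving one tuple per greyhound.
rows = list(zip(greyhounds, traps, times))

for row in rows:
    print(", ".join(row))

# If everything ended up in one flat list instead, slice it into fixed-size chunks.
flat = ["Dog A", "4", "24.50", "Dog B", "1", "24.80"]
fields_per_row = 3
chunked = [flat[i:i + fields_per_row] for i in range(0, len(flat), fields_per_row)]
```

Both approaches rely on position alone: zip() stops at the shortest list, and a single missing value shifts every later element. Building each row from its own header and result line, as in the answer above, avoids that kind of misalignment.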