Python: formatting the output from BeautifulSoup

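By reading the BeautifulSoup documentation, I managed to write a short Python script that scrapes a table and prints it, but I can't figure out how to format the result as a table. The end goal is to get football match predictions from a website (the URL is in the code below) and save them to a text file.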

Here is the code I have written so far:

import urllib.request
from bs4 import BeautifulSoup

def make_soup(url):
    # Download the page and parse it with the built-in HTML parser
    thepage = urllib.request.urlopen(url)
    soupdata = BeautifulSoup(thepage, "html.parser")
    return soupdata

soup = make_soup("https://afootballreport.com/predictions/over-1.5-goals/")
# Print the text of every table cell, one cell per line
for record in soup.find_all('tr'):
    for data in record.find_all('td'):
        print(data.text.strip())

This is the output:

03/28
17:30
Iceland Reykjavik Youth Cup


Fjölnir / Vængir U19
Valur / KH U19
Over 1.5
Valur / KH U19 have over 1.5 goals in 100% of their games in the last 2 months (total games 6).
03/28
17:30
Saudi Arabia Pro League


Al Ittifaq
Al Quadisiya
Over 1.5
Al Ittifaq have over 1.5 goals in 100% of their games in the last 2 months (total games 8).
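
I would like each record on a single line, with one column each for Date, Time, Football League, HomeTeam, AwayTeam, Tip and Description. Like this: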

Date, Time, Football League, HomeTeam, AwayTeam, Tip, Description
03/28, 17:30, Iceland Reykjavik Youth Cup, Fjölnir / Vængir U19, Valur / KH U19, Over 1.5, Valur / KH U19 have over 1.5 goals in 100% of their games in the last 2 months (total games 6).

Can anyone help me?
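
One way to get the comma-separated layout described above while staying with BeautifulSoup is to collect the cells of each row into a list instead of printing them one by one. The following is only a minimal sketch: `SAMPLE_HTML` is a hypothetical stand-in for the real page (its markup is an assumption based on the printed output), and the helper names `rows_from_html` and `rows_to_csv` are made up for illustration.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page's structure may differ.
SAMPLE_HTML = """
<table>
  <tr><th>Match</th><th>Tip</th></tr>
  <tr>
    <td>03/28</td><td>17:30</td><td>Iceland Reykjavik Youth Cup</td>
    <td></td><td></td>
    <td>Fjölnir / Vængir U19</td><td>Valur / KH U19</td>
    <td>Over 1.5</td>
    <td>Valur / KH U19 have over 1.5 goals in 100% of their games in the last 2 months (total games 6).</td>
  </tr>
</table>
"""

HEADER = ["Date", "Time", "Football League", "HomeTeam",
          "AwayTeam", "Tip", "Description"]


def rows_from_html(html):
    """Return one list of non-empty cell texts per table row."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for record in soup.find_all("tr"):
        cells = [td.text.strip() for td in record.find_all("td")]
        cells = [cell for cell in cells if cell]  # drop empty cells
        if cells:  # rows without <td> cells (e.g. header rows) are skipped
            rows.append(cells)
    return rows


def rows_to_csv(rows, header):
    """Serialise the header and rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


rows = rows_from_html(SAMPLE_HTML)
csv_text = rows_to_csv(rows, HEADER)
print(csv_text)
```

To save the result, the same `csv.writer` can be pointed at a file opened with `open("predictions.txt", "w", newline="", encoding="utf-8")` (the file name is just an example). Using the `csv` module rather than joining with `", "` means a description that happens to contain a comma is quoted correctly.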

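You're doing a lot of work there. Whenever I see <table> tags, I first try pandas' .read_html(). It does most of the work for you, and you can then manipulate the DataFrame as needed.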

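import pandas as pd

# read_html returns a list of DataFrames, one per <table> found on the page
tables = pd.read_html('https://afootballreport.com/predictions/over-1.5-goals/')
table = tables[0]

# Split the first column into Date and Time on the double space, then drop it
table[['Date', 'Time']] = table['Home team - Away team'].str.split('  ', expand=True)
table = table.drop(['Home team - Away team'], axis=1)
table = table.rename(columns={"Unnamed: 3": "Description"})

# Split the 'Tip' column into league, home team and away team, then drop it
table[['Football League', 'Home Team', 'Away Team']] = table['Tip'].str.split('  ', expand=True)
table = table.drop(['Tip'], axis=1)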

Output:

print (table.head(5).to_string())
      Logic                                        Description   Date   Time              Football League             Home Team       Away Team
0  Over 1.5  Valur / KH U19 have over 1.5 goals in 100% of ...  03/28  17:30  Iceland Reykjavik Youth Cup  Fjölnir / Vængir U19  Valur / KH U19
1  Over 1.5  Al Ittifaq have over 1.5 goals in 100% of thei...  03/28  17:30      Saudi Arabia Pro League            Al Ittifaq    Al Quadisiya
2  Over 1.5  Sarreguemines have over 1.5 goals in 100% of t...  03/28  19:00            France National 3         Sarreguemines   Strasbourg II
3  Over 1.5  Mons Calpe have over 1.5 goals in 100% of thei...  03/28  19:29   Gibraltar Premier Division            Mons Calpe   Glacis United
4  Over 1.5  Glacis United have over 1.5 goals in 100% of t...  03/28  19:29   Gibraltar Premier Division            Mons Calpe   Glacis United
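
The question's end goal was a text file, which the DataFrame can produce directly with `to_csv`. This is a minimal sketch, assuming a DataFrame shaped like the output above; the two sample rows are copied from the question's output so the example runs without fetching the page, and the column order is the one the question asked for.

```python
import pandas as pd

# Stand-in for the scraped DataFrame (first two rows of the output shown above).
table = pd.DataFrame({
    "Logic": ["Over 1.5", "Over 1.5"],
    "Description": [
        "Valur / KH U19 have over 1.5 goals in 100% of their games "
        "in the last 2 months (total games 6).",
        "Al Ittifaq have over 1.5 goals in 100% of their games "
        "in the last 2 months (total games 8).",
    ],
    "Date": ["03/28", "03/28"],
    "Time": ["17:30", "17:30"],
    "Football League": ["Iceland Reykjavik Youth Cup", "Saudi Arabia Pro League"],
    "Home Team": ["Fjölnir / Vængir U19", "Al Ittifaq"],
    "Away Team": ["Valur / KH U19", "Al Quadisiya"],
})

# Put the columns in the order requested in the question.
ordered = table[["Date", "Time", "Football League",
                 "Home Team", "Away Team", "Logic", "Description"]]

# Without a path, to_csv returns the CSV text instead of writing a file.
csv_text = ordered.to_csv(index=False)
print(csv_text)
```

Passing a path instead, for example `ordered.to_csv("predictions.txt", index=False)` (a hypothetical file name), writes the same text to disk.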

Edit:

If you are using pandas version 0.24.2:

import pandas as pd

# read_html returns a list of DataFrames, one per <table> found on the page
tables = pd.read_html('https://afootballreport.com/predictions/over-1.5-goals/')
table = tables[0]

# Split the first column into Date and Time on the double space, then drop it
table[['Date', 'Time']] = table['Home team - Away team'].str.split('  ', expand=True)
table = table.drop(['Home team - Away team'], axis=1)
table = table.rename(columns={"Logic": "Description"})

# Split the second column into league, home team and away team, then drop it
table[['Football League', 'Home Team', 'Away Team']] = table['Home team - Away team.1'].str.split('  ', expand=True)
table = table.drop(['Home team - Away team.1'], axis=1)

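Take a look at pprint. I like to call it the "pretty printer".

Thank you, that looks much easier. However, the home team and away team columns aren't showing? It would be perfect if they were.

Oh, yes. I didn't notice. I'll see if I can fix it.

Actually they are there, just offset. I'll print it out and show you in the edit above.

If you give me a few minutes, I'll fix the table so that it holds the data the way you want.

Great, thanks. How would I then name the columns? Rename them to Date, Time, Football League, HomeTeam, AwayTeam, Tip, Logic? I want to be able to use the data in another script, so I need to be able to search by HomeTeam and AwayTeam.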
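
The renaming and searching asked about in the last comment could look roughly like the sketch below. It assumes a DataFrame with the column names from the answer's output; the sample rows are taken from that output, the mapping of "Logic" to "Tip" is an assumption about which name is wanted, and `find_matches` is a hypothetical helper name.

```python
import pandas as pd

# Stand-in for the scraped DataFrame (rows taken from the output above).
table = pd.DataFrame({
    "Logic": ["Over 1.5", "Over 1.5", "Over 1.5"],
    "Date": ["03/28", "03/28", "03/28"],
    "Time": ["17:30", "19:29", "19:29"],
    "Football League": ["Saudi Arabia Pro League",
                        "Gibraltar Premier Division",
                        "Gibraltar Premier Division"],
    "Home Team": ["Al Ittifaq", "Mons Calpe", "Mons Calpe"],
    "Away Team": ["Al Quadisiya", "Glacis United", "Glacis United"],
})

# Rename columns; names not listed in the mapping are left unchanged.
table = table.rename(columns={
    "Logic": "Tip",
    "Home Team": "HomeTeam",
    "Away Team": "AwayTeam",
})


def find_matches(df, team):
    """Return the rows where `team` plays at home or away (exact match)."""
    mask = (df["HomeTeam"] == team) | (df["AwayTeam"] == team)
    return df[mask]


matches = find_matches(table, "Glacis United")
print(matches.to_string(index=False))
```

The boolean mask keeps the lookup to a single expression; for partial or case-insensitive matches, `df["HomeTeam"].str.contains(team, case=False)` would be one alternative.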