Python: BeautifulSoup doesn't return the real text shown in the page source

Tags: python, web-scraping, beautifulsoup, python-3.7

I'm trying to scrape football match results from livescore.com using requests and BeautifulSoup. For some reason, instead of the team names and scores, it returns:

03-12-2019 - __home_team__ - __home_score__ - __away_team__ - __away_score__
My code:

import requests
from bs4 import BeautifulSoup
from datetime import date, timedelta

yesterday = date.today() - timedelta(days=1)
checkDate = yesterday.strftime('%Y-%m-%d')  # date as used in the URL, e.g. 2019-12-03
url = 'https://www.livescore.com/soccer/' + checkDate
playDate = yesterday.strftime('%d-%m-%Y')  # date as written to the output, e.g. 03-12-2019

response = requests.get(url)

soup = BeautifulSoup(response.text, 'html.parser')

home = soup.find_all('div', class_='ply tright name')
away = soup.find_all('div', class_='ply name')
hScore = soup.find_all('span', class_='hom')
aScore = soup.find_all('span', class_='awy')

with open('Scores.csv', 'a') as f:
    for h, a, hs, aws in zip(home, away, hScore, aScore):
        f.write(playDate + ',' + h.text + ',' + hs.text + ',' + a.text + ',' + aws.text + '\n')
        print(playDate + ' - ' + h.text + ' ' + hs.text + ' - ' + a.text + ' ' + aws.text)
Page source:

<a href="/soccer/england/premier-league/crystal-palace-vs-afc-bournemouth/6-18427820/" class="match-row scorelink even  " data-type="evt" data-id="soccer-6-18427820" data-stg-id="159">
   <div class="min ">
      <div>
         <span>FT</span> 
         <span class="ico-alert tt hidden">
            <svg class="inc icon-warning">
               <use xlink:href="#icon-warning"></use>
            </svg>
            <span class="tip" data-type="tooltip">Limited coverage</span>
         </span>
      </div>
   </div>
   <div class="ply tright name"><span>Crystal Palace</span></div>
   <div class="sco"> <span class="hom">1</span><span> - </span><span class="awy">0</span> </div>
   <div class="ply name"><span>AFC Bournemouth</span></div>
   <div class="star-container" data-type="star-container">
      <div class=" " data-type="star">
         <svg>
            <use xlink:href="#icon-star"></use>
         </svg>
      </div>
   </div>
</a>

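What I've tried:

1.) Fetching the 'a' tags (returns nothing)

2.) Using
find_all('span', class_=None)
(returns a single space character)

Expected output (team names chosen at random, as an example):

04-12-2019,Chelsea,1,Liverpool,1
(for the CSV file)


04-12-2019 - Chelsea 1 - Liverpool 1
(for the print() function)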

You have to use Selenium to render the page:

from bs4 import BeautifulSoup
from datetime import date, timedelta
from selenium import webdriver

yesterday = date.today() - timedelta(days=1)
checkDate = yesterday.strftime('%Y-%m-%d')  # date as used in the URL, e.g. 2019-12-03
url = 'https://www.livescore.com/soccer/' + checkDate
playDate = yesterday.strftime('%d-%m-%Y')  # date as written to the output, e.g. 03-12-2019

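# Path to the local chromedriver binary. Newer Selenium releases expect it to be
# passed through a Service object instead of as a positional argument.
driver = webdriver.Chrome('C:/chromedriver_win32/chromedriver.exe')
driver.get(url)

# page_source holds the DOM after the browser has run the page's JavaScript
soup = BeautifulSoup(driver.page_source, 'html.parser')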

home = soup.find_all('div', class_='ply tright name')
away = soup.find_all('div', class_='ply name')
hScore = soup.find_all('span', class_='hom')
aScore = soup.find_all('span', class_='awy')

with open('Scores.csv', 'a') as f:
    for h, a, hs, aws in zip(home, away, hScore, aScore):
        f.write(playDate + ',' + h.text + ',' + hs.text + ',' + a.text + ',' + aws.text + '\n')
        print(playDate + ' - ' + h.text + ' ' + hs.text + ' - ' + a.text + ' ' + aws.text)

driver.quit()  # quit() ends the browser session and the chromedriver process
Output:

03-12-2019 - Crystal Palace 1 - AFC Bournemouth 0
03-12-2019 - Burnley 1 - Manchester City 4
03-12-2019 - Burton Albion 1 - Southend United 1
03-12-2019 - Eastleigh 0 - Wrexham 2
03-12-2019 - Farsley Celtic 1 - Brackley Town 1
03-12-2019 - Hereford 2 - York City  2
03-12-2019 - Kidderminster Harriers 1 - Gateshead 1
03-12-2019 - Leamington 3 - Darlington 0
03-12-2019 - Hungerford Town 1 - Tonbridge Angels 0
03-12-2019 - Brighton & Hove Albion U21 0 - Newport County * 0
03-12-2019 - Colchester United 1 - Stevenage 2
03-12-2019 - Shrewsbury Town 1 - Manchester City Academy * 1
03-12-2019 - Milton Keynes Dons 2 - Coventry City 0
03-12-2019 - Port Vale * 2 - Mansfield Town 2
03-12-2019 - Portsmouth 2 - Northampton Town 1
03-12-2019 - Salford City 3 - Wolverhampton Wanderers Academy 0
03-12-2019 - Walsall 3 - Chelsea U21 2
03-12-2019 - Cremonese 1 - Empoli 0
03-12-2019 - Genoa 3 - Ascoli 2
03-12-2019 - Fiorentina 2 - Cittadella 0
03-12-2019 - Angers 0 - Marseille 2
03-12-2019 - Bordeaux 6 - Nimes 0
03-12-2019 - Brest 5 - Strasbourg 0
03-12-2019 - Lyon 0 - Lille 1
03-12-2019 - Le Havre 2 - Le Mans 0
03-12-2019 - Auxerre 1 - Valenciennes 1
03-12-2019 - Niort 0 - AC Ajaccio 1
03-12-2019 - Troyes 1 - Rodez 0
03-12-2019 - Grenoble 1 - Clermont Foot 1
03-12-2019 - Chateauroux 1 - Sochaux 1
03-12-2019 - Paris FC 0 - Guingamp 3
03-12-2019 - Lens 3 - Chambly 0
03-12-2019 - Orleans 0 - Lorient 4
03-12-2019 - Royal Antwerp * 3 - Genk 3
03-12-2019 - Sporting Covilha 1 - Benfica 1
03-12-2019 - Brora Rangers 1 - Greenock Morton 3
03-12-2019 - Ayr United 0 - Dunfermline Athletic 1
03-12-2019 - Stenhousemuir 2 - Elgin City 2
03-12-2019 - Panetolikos 5 - Ialysos 1
03-12-2019 - Ergotelis 0 - Trikala 1
03-12-2019 - Fatih Karagumruk SK 1 - Goztepe 2
03-12-2019 - Yeni Malatyaspor 3 - Keciorengucu 1
03-12-2019 - Alanyaspor 5 - Adanaspor 1
03-12-2019 - Esenler Erokspor 0 - Sivasspor 2
03-12-2019 - Fenerbahce 4 - Istanbulspor AS 0
03-12-2019 - Cefn Druids AFC 2 - Cardiff Met University 1
03-12-2019 - TNS 1 - Carmarthen 0
03-12-2019 - Glentoran ? - Glenavon ?
03-12-2019 - Legia Warszawa II 0 - Piast Gliwice 2
03-12-2019 - Gornik Leczna 0 - Legia Warszawa 2
03-12-2019 - Sibenik 0 - NK Lokomotiva 4
03-12-2019 - MTK Budapest 0 - Diosgyori VTK 0
03-12-2019 - Szeged-Grosics Akademia 0 - Fehervar FC 1
03-12-2019 - Gaz Metan Medias 1 - FC Voluntari 0
03-12-2019 - CSM Politehnica Iasi 1 - FC FCSB 2
03-12-2019 - Beroe 3 - CSKA 1948 4
03-12-2019 - Slavia Sofia 1 - Botev Plovdiv 2
03-12-2019 - Bnei Yehuda Tel Aviv FC 1 - Hapoel Raanana FC 1
03-12-2019 - Maccabi Netanya FC 1 - Hapoel Ironi Kiryat Shmona 0
03-12-2019 - Hapoel Kfar Saba FC 0 - Hapoel Beer Sheva FC 1
03-12-2019 - Union 1 - Huracan 0
03-12-2019 - Club Atletico Platense 2 - Atlanta 1
03-12-2019 - Club Atletico Mitre 0 - Independiente Rivadavia 0
03-12-2019 - San Martin San Juan 1 - CA Alvarado 1
03-12-2019 - Santamarina 2 - Villa Dalmine 1
03-12-2019 - Atletico Rafaela 2 - Chacarita Juniors 0
03-12-2019 - Quilmes 1 - Brown de Adrogue 1
03-12-2019 - Gimnasia Mendoza 0 - San Martin de Tucuman 3
03-12-2019 - CR Vasco DA Gama RJ 1 - Cruzeiro 0
03-12-2019 - Royal Pari 0 - San Jose 1
03-12-2019 - Luqueno 0 - General Diaz 5
03-12-2019 - CD Motagua 5 - CD Vida 2
03-12-2019 - Laos U23 0 - Thailand U23 2
03-12-2019 - Indonesia U23 8 - Brunei U23 0
03-12-2019 - Singapore U23 0 - Vietnam U23 1
03-12-2019 - Al Riffa ? - Al Hidd ?
03-12-2019 - Al-Najma Manama ? - Busaiteen ?
03-12-2019 - East Riffa ? - Manama Club ?
03-12-2019 - PSS Sleman 5 - Perseru Badak Lampung 1
03-12-2019 - Persib Bandung 0 - Persela Lamongan 2
03-12-2019 - Al Akhdoud 1 - Ohod 2
03-12-2019 - Al-Wehda 1 - Al Khaleej 0
03-12-2019 - FC Masr * 0 - El Gounah 0
03-12-2019 - Al Ahly 3 - Bani Sweef 1
03-12-2019 - AS Slimane ? - Esperance ?
03-12-2019 - Etoile Metlaoui ? - Etoile du Sahel ?
03-12-2019 - __home_team__ __home_score__ - __away_team__ __away_score__
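The last line of that output still shows the `__home_team__` placeholders, which appears to be an unfilled template row that the page keeps in its markup. A minimal sketch of one way to skip such rows and write the CSV with the standard `csv` module follows. The names `PLACEHOLDER`, `is_template_row` and `write_scores` are made up for this illustration, and the placeholder pattern is an assumption based only on the row shown above.

```python
import csv
import io
import re

# Assumed pattern for unrendered template fields such as __home_team__,
# inferred from the placeholder row in the output above.
PLACEHOLDER = re.compile(r"^__\w+__$")


def is_template_row(fields):
    """Return True if any field is still an unfilled __placeholder__."""
    return any(PLACEHOLDER.match(field.strip()) for field in fields)


def write_scores(rows, fh):
    """Write (date, home, home_score, away, away_score) rows as CSV.

    csv.writer quotes fields that contain commas, which plain string
    concatenation does not. Returns the number of rows written.
    """
    writer = csv.writer(fh, lineterminator="\n")
    written = 0
    for row in rows:
        if is_template_row(row):
            continue  # skip the unfilled template row
        # Strip stray whitespace around names and scores before writing
        writer.writerow([field.strip() for field in row])
        written += 1
    return written


# Demo with two rows taken from the output above, written to memory
rows = [
    ("03-12-2019", "Crystal Palace", "1", "AFC Bournemouth", "0"),
    ("03-12-2019", "__home_team__", "__home_score__",
     "__away_team__", "__away_score__"),
]
buffer = io.StringIO()
count = write_scores(rows, buffer)
print(count)
print(buffer.getvalue(), end="")
```

To write to a real file, pass a handle opened with `open('Scores.csv', 'a', newline='')` in place of the in-memory buffer. The rows would be built from `h.text`, `hs.text`, `a.text` and `aws.text` in the loop above.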

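The scores are loaded dynamically. I can't seem to find an API endpoint that serves them, so you'll need to use Selenium to render the page first; then you can pull the HTML and parse it.

I'm guessing you used "inspect element" or something similar to look at the page?

@AlexanderCécile Yes.

@chitown88 Perfect! I had considered using Selenium, but it seems like a slow version of what I want. I guess there's no way around it, since the scores are loaded dynamically. Thanks a lot.

@DeusExPersona Selenium isn't the only alternative. You could query whatever they use to fetch the scores.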
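The last comment suggests querying the page's own data source directly. Nobody in this thread identified that source, so the following is only a rough sketch of the idea. The URL, the JSON layout, and the names `SCORES_URL`, `fetch_json` and `extract_scores` are all hypothetical and would have to be replaced with whatever the browser's Network tab actually shows. It uses `urllib` from the standard library instead of requests so the example has no dependencies.

```python
import json
from urllib.request import Request, urlopen

# HYPOTHETICAL: the real endpoint and JSON layout are not given in this thread.
# Look them up in the browser's developer tools (Network tab) and adjust.
SCORES_URL = "https://example.com/api/scores?date=2019-12-03"


def fetch_json(url):
    """Download and decode a JSON document (not called in the demo below)."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_scores(payload):
    """Convert the assumed payload shape into (home, score, away, score) tuples."""
    results = []
    for match in payload.get("matches", []):
        results.append((
            match["home"]["name"],
            match["home"]["score"],
            match["away"]["name"],
            match["away"]["score"],
        ))
    return results


# Offline demo with a hand-written payload in the assumed shape
sample = json.loads("""
{"matches": [{"home": {"name": "Crystal Palace", "score": 1},
              "away": {"name": "AFC Bournemouth", "score": 0}}]}
""")
for home, home_score, away, away_score in extract_scores(sample):
    print(f"{home} {home_score} - {away} {away_score}")
```

If such an endpoint exists, this avoids starting a browser at all, which addresses the speed concern raised about Selenium.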