Python BeautifulSoup not scraping this URL

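I am trying to scrape some rows (tr) of player data from a URL, but nothing seems to happen when I run my code. I am sure my code is fine, because it works with other stats websites that contain tables. Can someone tell me why nothing happens? Thanks in advance.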

import urllib.request
from bs4 import BeautifulSoup

def make_soup(url):
    # Download the page and parse it with Python's built-in HTML parser
    thepage = urllib.request.urlopen(url)
    soupdata = BeautifulSoup(thepage, "html.parser")
    return soupdata

soup = make_soup("https://www.whoscored.com/Regions/252/Tournaments/7/Seasons/6365/Stages/13832/PlayerStatistics/England-Championship-2016-2017")
# Print the text of every table row present in the downloaded HTML
for record in soup.find_all('tr'):
    print(record.text)

This page uses JavaScript to fetch its data. You can find the raw data at this link:

https://www.whoscored.com/StatisticsFeed/1/GetPlayerStatistics?category=summary&subcategory=all&statsAccumulationType=0&isCurrent=true&playerId=&teamIds=&matchId=&stageId=13832&tournamentOptions=7&sortBy=Rating&sortAscending=&age=&ageComparisonType=&appearances=&appearancesComparisonType=&field=Overall&nationality=&positionOptions=&timeOfTheGameEnd=&timeOfTheGameStart=&isMinApp=true&page=&includeZeroValues=&numberOfPlayersToPick=10

Every field in the URL's query string can be changed to fetch the data you need.
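As a rough sketch of changing those fields, the query string can be rebuilt from a dictionary instead of being edited by hand. The field names and default values are copied from the link above; the helper name `build_feed_url` is made up for illustration, and what the server actually does with a given value (for example `page=2`) is an assumption that has to be checked against real responses.

```python
from urllib.parse import urlencode

FEED_URL = "https://www.whoscored.com/StatisticsFeed/1/GetPlayerStatistics"

# Query fields copied from the link above; empty values are kept empty.
DEFAULT_PARAMS = {
    "category": "summary",
    "subcategory": "all",
    "statsAccumulationType": "0",
    "isCurrent": "true",
    "playerId": "",
    "teamIds": "",
    "matchId": "",
    "stageId": "13832",
    "tournamentOptions": "7",
    "sortBy": "Rating",
    "sortAscending": "",
    "age": "",
    "ageComparisonType": "",
    "appearances": "",
    "appearancesComparisonType": "",
    "field": "Overall",
    "nationality": "",
    "positionOptions": "",
    "timeOfTheGameEnd": "",
    "timeOfTheGameStart": "",
    "isMinApp": "true",
    "page": "",
    "includeZeroValues": "",
    "numberOfPlayersToPick": "10",
}


def build_feed_url(**overrides):
    """Return the feed URL with selected query fields replaced (hypothetical helper)."""
    params = dict(DEFAULT_PARAMS)
    for key, value in overrides.items():
        if key not in params:
            # Refuse fields that do not appear in the original link
            raise KeyError(f"unknown query field: {key}")
        params[key] = str(value)
    return FEED_URL + "?" + urlencode(params)


# Assumption: "page" selects a result page and "numberOfPlayersToPick" the page size.
url = build_feed_url(page=2, numberOfPlayersToPick=20)
print(url)
```

Calling `build_feed_url()` with no arguments reproduces the original link, which is a quick way to confirm that no field was lost.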

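This happens because the website does not want you to scrape it.

I sent the request with a tool that simulates a browser and took a screenshot of the browser it creates.

The site uses a security service (they even have some information about scraping on their website); have a look, it is interesting.

  • may be helpful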

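Short answer: the player data you are looking for is not at that URL.

Then you may want to ask: I have seen the players on that page, so how can they not be there?

So I will try to explain what happens when you browse that URL with a modern browser such as Chrome.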

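You: type the URL and press Enter.

Chrome: Gotcha. I will get you that page as soon as possible, hang on. (fetches the content from that URL) Great, now I have it! But wait, let me read/parse it before I show it to you. (reads what is in the content) Oh no, this JavaScript tells me to get extra information from another URL; okay, I will do that. Oh wait, another one tells me to load ads in the header; I do not like it, but I will just do as I am told. Hold on, this CSS tells me to show the player names in bold; okay, not bad. Oh, here is another picture at URL xxx that I need to load, no problem... Oh man, how much stuff do I have to handle? I am not happy with this site... (after a bunch of other things...) Finally everything is ready! Take a look now.

You: Player xxx is actually pretty good, let me check him out. (clicks on player xxx)

Chrome: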

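As you can see, every time you browse a web page the browser does a lot of work behind the scenes to display it to the user. So basically: URL entered >> content fetched from the URL >> content parsed >> additional content fetched >> everything rendered >> page displayed (one or more steps may happen at the same time).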

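Your code only performs the "content fetched from the URL" step, and the stats you want happen to be "additional content" that has to be loaded from somewhere else, so you get nothing.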
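The difference between the two steps can be illustrated offline. The sketch below counts `<tr>` tags in two hand-written HTML strings: one standing in for what the server returns, the other for the page after the browser has run the script. Both strings and the `loadStats` call are invented for illustration, not copied from the real site, and the standard-library `HTMLParser` is used in place of BeautifulSoup only so that the example runs without extra packages.

```python
from html.parser import HTMLParser


class RowCounter(HTMLParser):
    """Count <tr> start tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.rows = 0

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows += 1


def count_rows(html):
    parser = RowCounter()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical stand-in for the server's response: an empty container
# plus a script that would fill it in later, inside a browser.
STATIC_HTML = """
<html><body>
  <div id="statistics-table"></div>
  <script>loadStats('/feed');</script>
</body></html>
"""

# Hypothetical stand-in for the page after the browser has run the script.
RENDERED_HTML = """
<html><body>
  <div id="statistics-table">
    <table>
      <tr><td>Conor Hourihane</td><td>Barnsley</td></tr>
      <tr><td>Anthony Knockaert</td><td>Brighton</td></tr>
    </table>
  </div>
</body></html>
"""

print(count_rows(STATIC_HTML))    # rows visible to a plain HTTP client
print(count_rows(RENDERED_HTML))  # rows visible after JavaScript has run
```

A script that only downloads the HTML is in the first situation: the parser works correctly, there are simply no rows in what it was given.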

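So how do I get that data? Once you know the URLs responsible for loading the stats, simply request them. How do I find those URLs? You can always read the JavaScript... if you have enough patience.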

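The easiest way to get what you need is to analyze the traffic while the page loads and look at all the requests made behind the scenes. A dedicated traffic-capture tool is what I would recommend, but you can use whatever tool you see fit.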
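One possible way to sift through captured traffic, assuming it was exported as a HAR file (the JSON format that browser developer tools and many proxies can save), is to list only the requests whose responses were JSON. The sample capture below is invented and uses placeholder `example.com` URLs; `json_requests` is a hypothetical helper name.

```python
import json

# Invented miniature capture in HAR layout: log -> entries -> request/response.
SAMPLE_HAR = json.loads("""
{"log": {"entries": [
  {"request": {"method": "GET", "url": "https://example.com/stats-page"},
   "response": {"status": 200,
                "content": {"mimeType": "text/html; charset=utf-8"}}},
  {"request": {"method": "GET", "url": "https://example.com/static/site.css"},
   "response": {"status": 200,
                "content": {"mimeType": "text/css"}}},
  {"request": {"method": "GET", "url": "https://example.com/feed/players?page=1"},
   "response": {"status": 200,
                "content": {"mimeType": "application/json; charset=utf-8"}}}
]}}
""")


def json_requests(har):
    """Return the URLs of captured requests whose response was JSON."""
    hits = []
    for entry in har["log"]["entries"]:
        mime = entry["response"]["content"].get("mimeType", "")
        if "json" in mime.lower():
            hits.append(entry["request"]["url"])
    return hits


for url in json_requests(SAMPLE_HAR):
    print(url)
```

With a real capture, `SAMPLE_HAR` would be replaced by `json.load()` on the exported file, and the short list that comes out is where a data feed is most likely to be found.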

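Now let's see what happens when that page loads:

In fact, fully rendering the page you visited takes hundreds of requests; all you need to do is find out which one delivers the actual stats. Here is one URL that even has "StatisticsFeed" in it; could that be the one? Let's take a look: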

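Exactly! So what now? Simulate this request and parse the content. Since it is already in JSON format, the built-in json module can do the job easily, and you do not even need BeautifulSoup.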
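As a small illustration of the parsing step, the sketch below runs `json.loads` on a trimmed sample in the shape of the feed's response (two players with values taken from the response shown in this answer, most fields omitted). The function name `summarize` is made up for illustration.

```python
import json

# Trimmed sample in the shape of the feed's response; most fields are omitted.
SAMPLE_FEED = """
{
    "playerTableStats": [
        {"name": "Conor Hourihane", "teamName": "Barnsley",
         "rating": 7.8705882352941181, "goal": 3},
        {"name": "Anthony Knockaert", "teamName": "Brighton",
         "rating": 7.6722222222222216, "goal": 6}
    ],
    "paging": {"currentPage": 1, "totalPages": 34}
}
"""


def summarize(feed_text):
    """Return (name, team, rating rounded to 2 places) for each player."""
    data = json.loads(feed_text)
    return [
        (player["name"], player["teamName"], round(player["rating"], 2))
        for player in data["playerTableStats"]
    ]


for name, team, rating in summarize(SAMPLE_FEED):
    print(f"{name} ({team}): {rating}")
```

Because the feed is plain JSON, each player is an ordinary dictionary after parsing and no HTML parser is involved.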

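You may ask: why do I get nothing when I browse this link directly? That is because they set restrictions on the server so that only requests with valid headers get the feed. So how do I get around it? Simulate the request faithfully, with the right parameters (mainly headers), so that the server believes you.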
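A minimal sketch of attaching headers with `urllib.request` follows. Which headers the server actually checks is not stated here, so the set below (`User-Agent`, `Accept`, `X-Requested-With`, `Referer`) and their values are assumptions modelled on what a browser typically sends with a background request; the real ones should be copied from the captured traffic. The function names are hypothetical, and `fetch_feed` is defined but not called, so nothing is sent when the example runs.

```python
import json
import urllib.request


def build_feed_request(feed_url, referer):
    """Build a request that carries browser-like headers (assumed set)."""
    headers = {
        # Assumption: copy the real values from your own captured traffic.
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)",
        "Accept": "application/json, text/javascript, */*; q=0.01",
        "X-Requested-With": "XMLHttpRequest",
        "Referer": referer,
    }
    return urllib.request.Request(feed_url, headers=headers)


def fetch_feed(request, timeout=10):
    """Send the request and decode the JSON body (performs network access)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_feed_request(
    "https://www.whoscored.com/StatisticsFeed/1/GetPlayerStatistics?stageId=13832",
    referer="https://www.whoscored.com/Regions/252/Tournaments/7/Seasons/6365/"
            "Stages/13832/PlayerStatistics/England-Championship-2016-2017",
)
# Inspect what would be sent without contacting the server.
for name, value in request.header_items():
    print(f"{name}: {value}")
```

The query string is shortened here for readability; in practice the full feed URL would be passed in. If the server still refuses the request, comparing these headers against the ones in the captured browser request is the first thing to check.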

{
    "playerTableStats": [{
        "name": "Conor Hourihane",
        "firstName": "Conor",
        "lastName": "Hourihane",
        "playerId": 134172,
        "height": 181,
        "weight": 62,
        "age": 25,
        "isManOfTheMatch": false,
        "isActive": true,
        "isOpta": true,
        "playedPositions": "-MC-",
        "positionText": "Midfielder",
        "playedPositionsShort": "M(C)",
        "teamId": 142,
        "teamName": "Barnsley",
        "seasonId": 6365,
        "seasonName": "2016/2017",
        "tournamentId": 7,
        "tournamentRegionId": 252,
        "tournamentRegionCode": "gb-eng",
        "regionCode": "ie",
        "tournamentName": "Championship",
        "tournamentShortName": "EC",
        "rating": 7.8705882352941181,
        "ranking": 1,
        "apps": 17,
        "subOn": 0,
        "minsPlayed": 1530,
        "manOfTheMatch": 4,
        "yellowCard": 5.0,
        "redCard": 0.0,
        "goal": 3,
        "assistTotal": 8,
        "shotsPerGame": 2.2352941176470589,
        "aerialWonPerGame": 0.6470588235294118,
        "passSuccess": 81.370449678800867
    },
    {
        "name": "Anthony Knockaert",
        "firstName": "Anthony",
        "lastName": "Knockaert",
        "playerId": 86794,
        "height": 172,
        "weight": 69,
        "age": 25,
        "isManOfTheMatch": false,
        "isActive": true,
        "isOpta": true,
        "playedPositions": "-AML-AMR-",
        "positionText": "Midfielder",
        "playedPositionsShort": "AM(LR)",
        "teamId": 211,
        "teamName": "Brighton",
        "seasonId": 6365,
        "seasonName": "2016/2017",
        "tournamentId": 7,
        "tournamentRegionId": 252,
        "tournamentRegionCode": "gb-eng",
        "regionCode": "fr",
        "tournamentName": "Championship",
        "tournamentShortName": "EC",
        "rating": 7.6722222222222216,
        "ranking": 2,
        "apps": 18,
        "subOn": 1,
        "minsPlayed": 1471,
        "manOfTheMatch": 5,
        "yellowCard": 4.0,
        "redCard": 0.0,
        "goal": 6,
        "assistTotal": 0,
        "shotsPerGame": 2.3888888888888888,
        "aerialWonPerGame": 0.22222222222222221,
        "passSuccess": 83.420593368237348
    },
    {
        "name": "Lewis Dunk",
        "firstName": "Lewis",
        "lastName": "Dunk",
        "playerId": 86441,
        "height": 192,
        "weight": 88,
        "age": 25,
        "isManOfTheMatch": false,
        "isActive": true,
        "isOpta": true,
        "playedPositions": "-DC-",
        "positionText": "Defender",
        "playedPositionsShort": "D(C)",
        "teamId": 211,
        "teamName": "Brighton",
        "seasonId": 6365,
        "seasonName": "2016/2017",
        "tournamentId": 7,
        "tournamentRegionId": 252,
        "tournamentRegionCode": "gb-eng",
        "regionCode": "gb-eng",
        "tournamentName": "Championship",
        "tournamentShortName": "EC",
        "rating": 7.660000000000001,
        "ranking": 3,
        "apps": 18,
        "subOn": 0,
        "minsPlayed": 1620,
        "manOfTheMatch": 3,
        "yellowCard": 8.0,
        "redCard": 0.0,
        "goal": 1,
        "assistTotal": 1,
        "shotsPerGame": 0.61111111111111116,
        "aerialWonPerGame": 3.5,
        "passSuccess": 79.72251867662753
    },
    {
        "name": "Tom Clarke",
        "firstName": "Tom",
        "lastName": "Clarke",
        "playerId": 133974,
        "height": 180,
        "weight": 77,
        "age": 28,
        "isManOfTheMatch": false,
        "isActive": true,
        "isOpta": true,
        "playedPositions": "-DC-",
        "positionText": "Defender",
        "playedPositionsShort": "D(C)",
        "teamId": 181,
        "teamName": "Preston",
        "seasonId": 6365,
        "seasonName": "2016/2017",
        "tournamentId": 7,
        "tournamentRegionId": 252,
        "tournamentRegionCode": "gb-eng",
        "regionCode": "gb-eng",
        "tournamentName": "Championship",
        "tournamentShortName": "EC",
        "rating": 7.6126315789473677,
        "ranking": 4,
        "apps": 19,
        "subOn": 0,
        "minsPlayed": 1692,
        "manOfTheMatch": 4,
        "yellowCard": 0.0,
        "redCard": 0.0,
        "goal": 2,
        "assistTotal": 0,
        "shotsPerGame": 0.89473684210526316,
        "aerialWonPerGame": 5.4736842105263159,
        "passSuccess": 66.666666666666657
    },
    {
        "name": "Pontus Jansson",
        "firstName": "Pontus",
        "lastName": "Jansson",
        "playerId": 121123,
        "height": 194,
        "weight": 89,
        "age": 25,
        "isManOfTheMatch": false,
        "isActive": true,
        "isOpta": true,
        "playedPositions": "-DC-",
        "positionText": "Defender",
        "playedPositionsShort": "D(C)",
        "teamId": 19,
        "teamName": "Leeds",
        "seasonId": 6365,
        "seasonName": "2016/2017",
        "tournamentId": 7,
        "tournamentRegionId": 252,
        "tournamentRegionCode": "gb-eng",
        "regionCode": "se",
        "tournamentName": "Championship",
        "tournamentShortName": "EC",
        "rating": 7.5976923076923066,
        "ranking": 5,
        "apps": 13,
        "subOn": 0,
        "minsPlayed": 1126,
        "manOfTheMatch": 1,
        "yellowCard": 6.0,
        "redCard": 0.0,
        "goal": 1,
        "assistTotal": 0,
        "shotsPerGame": 0.53846153846153844,
        "aerialWonPerGame": 3.5384615384615383,
        "passSuccess": 86.336633663366342
    },
    {
        "name": "Angus MacDonald",
        "firstName": "Angus",
        "lastName": "MacDonald",
        "playerId": 110825,
        "height": 184,
        "weight": 70,
        "age": 24,
        "isManOfTheMatch": false,
        "isActive": true,
        "isOpta": true,
        "playedPositions": "-DC-",
        "positionText": "Defender",
        "playedPositionsShort": "D(C)",
        "teamId": 142,
        "teamName": "Barnsley",
        "seasonId": 6365,
        "seasonName": "2016/2017",
        "tournamentId": 7,
        "tournamentRegionId": 252,
        "tournamentRegionCode": "gb-eng",
        "regionCode": "gb-eng",
        "tournamentName": "Championship",
        "tournamentShortName": "EC",
        "rating": 7.5066666666666677,
        "ranking": 6,
        "apps": 12,
        "subOn": 0,
        "minsPlayed": 1080,
        "manOfTheMatch": 0,
        "yellowCard": 3.0,
        "redCard": 0.0,
        "goal": 0,
        "assistTotal": 0,
        "shotsPerGame": 0.33333333333333331,
        "aerialWonPerGame": 4.833333333333333,
        "passSuccess": 72.147651006711413
    },
    {
        "name": "Marc Roberts",
        "firstName": "Marc",
        "lastName": "Roberts",
        "playerId": 138949,
        "height": 183,
        "weight": 81,
        "age": 26,
        "isManOfTheMatch": false,
        "isActive": true,
        "isOpta": true,
        "playedPositions": "-DC-",
        "positionText": "Defender",
        "playedPositionsShort": "D(C)",
        "teamId": 142,
        "teamName": "Barnsley",
        "seasonId": 6365,
        "seasonName": "2016/2017",
        "tournamentId": 7,
        "tournamentRegionId": 252,
        "tournamentRegionCode": "gb-eng",
        "regionCode": "gb-eng",
        "tournamentName": "Championship",
        "tournamentShortName": "EC",
        "rating": 7.503125,
        "ranking": 7,
        "apps": 16,
        "subOn": 0,
        "minsPlayed": 1440,
        "manOfTheMatch": 1,
        "yellowCard": 3.0,
        "redCard": 0.0,
        "goal": 2,
        "assistTotal": 2,
        "shotsPerGame": 0.625,
        "aerialWonPerGame": 7.0625,
        "passSuccess": 61.595547309833023
    },
    {
        "name": "Bradley Johnson",
        "firstName": "Bradley",
        "lastName": "Johnson",
        "playerId": 12490,
        "height": 178,
        "weight": 68,
        "age": 29,
        "isManOfTheMatch": false,
        "isActive": true,
        "isOpta": true,
        "playedPositions": "-MC-ML-",
        "positionText": "Midfielder",
        "playedPositionsShort": "M(CL)",
        "teamId": 20,
        "teamName": "Derby",
        "seasonId": 6365,
        "seasonName": "2016/2017",
        "tournamentId": 7,
        "tournamentRegionId": 252,
        "tournamentRegionCode": "gb-eng",
        "regionCode": "gb-eng",
        "tournamentName": "Championship",
        "tournamentShortName": "EC",
        "rating": 7.4954545454545443,
        "ranking": 8,
        "apps": 11,
        "subOn": 0,
        "minsPlayed": 952,
        "manOfTheMatch": 1,
        "yellowCard": 4.0,
        "redCard": 0.0,
        "goal": 2,
        "assistTotal": 1,
        "shotsPerGame": 1.3636363636363635,
        "aerialWonPerGame": 4.0909090909090908,
        "passSuccess": 71.908127208480565
    },
    {
        "name": "Christophe Berra",
        "firstName": "Christophe",
        "lastName": "Berra",
        "playerId": 8287,
        "height": 186,
        "weight": 81,
        "age": 31,
        "isManOfTheMatch": false,
        "isActive": true,
        "isOpta": true,
        "playedPositions": "-DC-",
        "positionText": "Defender",
        "playedPositionsShort": "D(C)",
        "teamId": 165,
        "teamName": "Ipswich",
        "seasonId": 6365,
        "seasonName": "2016/2017",
        "tournamentId": 7,
        "tournamentRegionId": 252,
        "tournamentRegionCode": "gb-eng",
        "regionCode": "gb-sct",
        "tournamentName": "Championship",
        "tournamentShortName": "EC",
        "rating": 7.4789473684210526,
        "ranking": 9,
        "apps": 19,
        "subOn": 0,
        "minsPlayed": 1710,
        "manOfTheMatch": 3,
        "yellowCard": 4.0,
        "redCard": 0.0,
        "goal": 0,
        "assistTotal": 1,
        "shotsPerGame": 0.94736842105263153,
        "aerialWonPerGame": 6.2105263157894735,
        "passSuccess": 58.636363636363633
    },
    {
        "name": "Adam Webster",
        "firstName": "Adam",
        "lastName": "Webster",
        "playerId": 109922,
        "height": 191,
        "weight": 0,
        "age": 21,
        "isManOfTheMatch": false,
        "isActive": true,
        "isOpta": true,
        "playedPositions": "-DC-",
        "positionText": "Defender",
        "playedPositionsShort": "D(C)",
        "teamId": 165,
        "teamName": "Ipswich",
        "seasonId": 6365,
        "seasonName": "2016/2017",
        "tournamentId": 7,
        "tournamentRegionId": 252,
        "tournamentRegionCode": "gb-eng",
        "regionCode": "gb-eng",
        "tournamentName": "Championship",
        "tournamentShortName": "EC",
        "rating": 7.4780000000000006,
        "ranking": 10,
        "apps": 15,
        "subOn": 1,
        "minsPlayed": 1227,
        "manOfTheMatch": 2,
        "yellowCard": 1.0,
        "redCard": 0.0,
        "goal": 0,
        "assistTotal": 0,
        "shotsPerGame": 0.2,
        "aerialWonPerGame": 5.0666666666666664,
        "passSuccess": 58.256029684601117
    }],
    "paging": {
        "currentPage": 1,
        "totalPages": 34,
        "resultsPerPage": 10,
        "totalResults": 338,
        "firstRecordIndex": 1,
        "lastRecordIndex": 10
    },
    "statColumns": ["apps",
    "subOn",
    "minsPlayed",
    "goal",
    "assistTotal",
    "yellowCard",
    "redCard",
    "shotsPerGame",
    "passSuccess",
    "aerialWonPerGame",
    "manOfTheMatch"]
}
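The response also describes itself: "statColumns" lists the stat fields and "paging" reports how many pages exist. A possible way to use both is sketched below, turning a parsed response into CSV text and listing the pages still to be fetched. The sample holds one player copied from the response above; the helper names are hypothetical, and the idea that the `page` query field accepts these page numbers is an assumption.

```python
import csv
import io

# One player copied from the response above, already parsed into Python objects.
SAMPLE_FEED = {
    "playerTableStats": [{
        "name": "Conor Hourihane", "teamName": "Barnsley",
        "rating": 7.8705882352941181, "apps": 17, "subOn": 0,
        "minsPlayed": 1530, "goal": 3, "assistTotal": 8,
        "yellowCard": 5.0, "redCard": 0.0,
        "shotsPerGame": 2.2352941176470589,
        "passSuccess": 81.370449678800867,
        "aerialWonPerGame": 0.6470588235294118, "manOfTheMatch": 4,
    }],
    "paging": {"currentPage": 1, "totalPages": 34, "resultsPerPage": 10,
               "totalResults": 338},
    "statColumns": ["apps", "subOn", "minsPlayed", "goal", "assistTotal",
                    "yellowCard", "redCard", "shotsPerGame", "passSuccess",
                    "aerialWonPerGame", "manOfTheMatch"],
}


def feed_to_csv(feed):
    """Render the players as CSV, using statColumns to pick the stat fields."""
    columns = ["name", "teamName", "rating"] + list(feed["statColumns"])
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for player in feed["playerTableStats"]:
        writer.writerow(player)  # fields not listed in columns are skipped
    return buffer.getvalue()


def remaining_pages(paging):
    """Page numbers after the current one, up to and including totalPages."""
    return list(range(paging["currentPage"] + 1, paging["totalPages"] + 1))


print(feed_to_csv(SAMPLE_FEED))
print(len(remaining_pages(SAMPLE_FEED["paging"])), "pages left to fetch")
```

Driving the column list from "statColumns" means the same code keeps working if a different `category` returns a different set of stats.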