Problem scraping a JavaScript-rendered page with requests-html (Python 3.6)
Last week I tried to get information from the Epic Games Store webpage. I first tried the Requests module, but I soon realized I needed a module that supports JavaScript-rendered pages. That is what I am trying now with requests-html, but there is a problem: when I use "inspect element" on the page everything looks fine, but when I run this:
from requests_html import HTMLSession
session = HTMLSession()
r = session.get("https://www.epicgames.com/store/en-US/")
r.html.render()
print(r.html.html)
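The result is an unreadable HTML file in which most of the elements have not been loaded.
You can test this by picking a game from the site and pressing Ctrl+F in the result file to search for its name. You will find no matches.
What can I do?
Thanks in advance! :)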
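EDIT: Exactly the same thing happens when I download the HTML manually from my browser.

So the main page does not contain the data you are looking for, which means the store data is received afterwards. We can therefore use requests to mimic what the browser does and fetch the data directly.
If you look at the Network tab in the developer tools, you will see that when the page loads it receives the store data from a GraphQL endpoint. This means that if you mimic that request, you can get the store data: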
import requests
endpoint = "https://graphql.epicgames.com/graphql"
# This query is what is sent to the server when the page loads.
# I don't know how to write one myself, so I basically
# copy-pasted the binary data from the request payload.
query = b'{"query":"\\n query storefrontDiscoverQuery(\\n $locale: String,\\n $country: String!\\n ) {\\n Storefront {\\n storefrontModules(locale: $locale) {\\n ... on StorefrontBreaker {\\n type\\n title\\n titleGroup\\n description\\n backgroundColors\\n layout\\n link {\\n src\\n linkText\\n }\\n image {\\n src\\n alt\\n }\\n }\\n ... on StorefrontFreeGames {\\n type\\n title\\n }\\n ... on StorefrontCardGroup {\\n type\\n title\\n link {\\n src\\n linkText\\n }\\n offers {\\n namespace\\n id\\n offer {\\n title\\n id\\n namespace\\n description\\n keyImages {\\n type\\n url\\n }\\n seller {\\n id\\n name\\n }\\n urlSlug\\n items {\\n id\\n namespace\\n }\\n customAttributes {\\n key\\n value\\n }\\n categories {\\n path\\n }\\n price(country: $country) {\\n totalPrice {\\n discountPrice\\n originalPrice\\n voucherDiscount\\n discount\\n fmtPrice(locale: $locale) {\\n originalPrice\\n discountPrice\\n intermediatePrice\\n }\\n }\\n lineOffers {\\n appliedRules {\\n id\\n endDate\\n }\\n }\\n }\\n linkedOfferId\\n linkedOffer {\\n effectiveDate\\n customAttributes {\\n key\\n value\\n }\\n }\\n }\\n }\\n }\\n ... on StorefrontFeaturedCarousel {\\n type\\n title\\n slides {\\n title\\n eyebrow\\n description\\n backgroundColor\\n image {\\n src\\n alt\\n }\\n mobileImage {\\n src\\n alt\\n }\\n link {\\n src\\n linkText\\n }\\n }\\n }\\n ... on StorefrontTiles {\\n type\\n title\\n tiles {\\n label\\n genre\\n link {\\n src\\n linkText\\n }\\n }\\n }\\n }\\n }\\n }\\n ","variables":{"locale":"en-US","country":"US"}}'
# Send the payload exactly as the browser does, as a JSON body.
data = requests.post(endpoint, headers={"Content-Type": "application/json;charset=UTF-8"}, data=query)
print(data.json())
This gives us the store data (careful, it is quite large).
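As a rough sketch of how such a request body could be built from a plain dict, and how a large response might be searched for game names: the helpers `build_payload` and `find_titles` are hypothetical names introduced here, and the sample data is hand-made for illustration. The walk makes no assumption about the exact response shape; it simply collects every string stored under a `title` key.

```python
import json


def build_payload(query, locale="en-US", country="US"):
    """Wrap a GraphQL query string and its variables in a JSON-ready dict."""
    return {"query": query, "variables": {"locale": locale, "country": country}}


def find_titles(node):
    """Recursively collect every string stored under a "title" key."""
    titles = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "title" and isinstance(value, str):
                titles.append(value)
            else:
                titles.extend(find_titles(value))
    elif isinstance(node, list):
        for item in node:
            titles.extend(find_titles(item))
    return titles


# Hypothetical, hand-made response fragment (not real store data).
sample = {"data": {"Storefront": {"storefrontModules": [
    {"type": "group", "title": "New Releases",
     "offers": [{"offer": {"title": "Example Game"}}]},
]}}}

print(find_titles(sample))
# The payload is an ordinary dict, so it serializes with json.dumps.
print(json.dumps(build_payload("query { ... }")))
```

With Requests, passing the dict as `requests.post(endpoint, json=build_payload(query_text))` serializes it and sets a JSON Content-Type header, which avoids hand-escaping a bytes literal.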
In addition, you can get the information for each individual product like this:
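import requests, json
endpoint = "https://graphql.epicgames.com/graphql"
query = {
    "query": "\n query catalogQuery(\n $productNamespace: String!,\n $offerId: String!,\n $locale: String,\n $country: String!,\n $lineOffers: [LineOfferReq]!) {\n Catalog {\n catalogOffer(namespace: $productNamespace,\n id: $offerId,\n locale: $locale) {\n namespace\n effectiveDate\n id\n ...
[the rest of this snippet is cut off; the JSON below is the response it returns]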
{
  "data": {
    "Catalog": {
      "catalogOffer": {
        "namespace": "cosmos",
        "effectiveDate": "2019-07-12T00:00:00.000Z",
        "id": "1c55202badfc4212b4f82553d5d22c3e",
        "customAttributes": [
          {
            "key": "com.epicgames.app.blacklist",
            "value": "KR"
          },
          {
            "key": "isPrepurchase",
            "value": "true"
          },
          {
            "key": "availableDate",
            "value": "1573570800"
          },
          {
            "key": "developerName",
            "value": "Human Head Studios, Inc."
          }
        ],
        "items": [
          {
            "id": "70c30983cf0948e4bffc23505f232b11",
            "status": "ACTIVE",
            "customAttributes": [
              {
                "key": "SupportedPlatforms",
                "value": "Windows"
              }
            ]
          },
          {
            "id": "974e25b4bce6425d9af79cd5ffd64152",
            "status": "ACTIVE",
            "customAttributes": [
              {
                "key": "SupportedPlatforms",
                "value": "Windows"
              }
            ]
          },
          {
            "id": "159d92ebec254ecf8373709a99388a62",
            "status": "ACTIVE",
            "customAttributes": [
              {
                "key": "SupportedPlatforms",
                "value": "Windows"
              }
            ]
          },
          {
            "id": "cc67628ab455419cb3d4ecc907febbb7",
            "status": "ACTIVE",
            "customAttributes": [
              {
                "key": "SupportedPlatforms",
                "value": "Windows"
              }
            ]
          },
          {
            "id": "2f742aa604a441d1a145f70411e9d8d2",
            "status": "ACTIVE",
            "customAttributes": [
              {
                "key": "SupportedPlatforms",
                "value": "Windows"
              }
            ]
          }
        ]
      }
    },
    "PriceEngine": {
      "price": {
        "totalPrice": {
          "discountPrice": 2999,
          "originalPrice": 2999,
          "voucherDiscount": 0,
          "discount": 0,
          "currencyCode": "USD",
          "currencyInfo": {
            "decimals": 2
          },
          "fmtPrice": {
            "originalPrice": "$29.99",
            "discountPrice": "$29.99",
            "intermediatePrice": "$29.99"
          }
        },
        "lineOffers": [
          {
            "appliedRules": []
          }
        ]
      }
    }
  },
  "extensions": {
    "cacheControl": {
      "version": 1,
      "hints": [
        {
          "path": [
            "Catalog"
          ],
          "maxAge": 0
        },
        {
          "path": [
            "Catalog",
            "catalogOffer"
          ],
          "maxAge": 0
        },
        {
          "path": [
            "PriceEngine"
          ],
          "maxAge": 0
        },
        {
          "path": [
            "PriceEngine",
            "price"
          ],
          "maxAge": 0
        },
        {
          "path": [
            "Catalog",
            "catalogOffer",
            "customAttributes"
          ],
          "maxAge": 0
        },
        {
          "path": [
            "Catalog",
            "catalogOffer",
            "items"
          ],
          "maxAge": 0
        },
        {
          "path": [
            "Catalog",
            "catalogOffer",
            "items",
            0,
            "customAttributes"
          ],
          "maxAge": 0
        },
        {
          "path": [
            "Catalog",
            "catalogOffer",
            "items",
            1,
            "customAttributes"
          ],
          "maxAge": 0
        },
        {
          "path": [
            "Catalog",
            "catalogOffer",
            "items",
            2,
            "customAttributes"
          ],
          "maxAge": 0
        },
        {
          "path": [
            "Catalog",
            "catalogOffer",
            "items",
            3,
            "customAttributes"
          ],
          "maxAge": 0
        },
        {
          "path": [
            "Catalog",
            "catalogOffer",
            "items",
            4,
            "customAttributes"
          ],
          "maxAge": 0
        },
        {
          "path": [
            "PriceEngine",
            "price",
            "totalPrice"
          ],
          "maxAge": 0
        },
        {
          "path": [
            "PriceEngine",
            "price",
            "totalPrice",
            "currencyInfo"
          ],
          "maxAge": 0
        },
        {
          "path": [
            "PriceEngine",
            "price",
            "totalPrice",
            "fmtPrice"
          ],
          "maxAge": 0
        },
        {
          "path": [
            "PriceEngine",
            "price",
            "lineOffers"
          ],
          "maxAge": 0
        },
        {
          "path": [
            "PriceEngine",
            "price",
            "lineOffers",
            0,
            "appliedRules"
          ],
          "maxAge": 0
        }
      ]
    }
  }
}
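
A response shaped like the one above can be reduced to a few readable fields. The following is a minimal sketch, not part of the original answer: `summarize_offer` is a hypothetical helper, the sample is trimmed from the response shown above, and treating `availableDate` as a Unix timestamp in seconds is an assumption based on how the value looks.

```python
from datetime import datetime, timezone


def summarize_offer(response):
    """Pull a few readable fields out of a catalogOffer-style response."""
    offer = response["data"]["Catalog"]["catalogOffer"]
    total = response["data"]["PriceEngine"]["price"]["totalPrice"]

    # customAttributes is a list of {"key": ..., "value": ...} pairs.
    attrs = {a["key"]: a["value"] for a in offer["customAttributes"]}

    # Prices are integers in minor units; currencyInfo.decimals gives the scale.
    price = total["discountPrice"] / 10 ** total["currencyInfo"]["decimals"]

    summary = {
        "id": offer["id"],
        "developer": attrs.get("developerName"),
        "price": price,
        "currency": total["currencyCode"],
    }
    # Assumption: availableDate is a Unix timestamp in seconds.
    if "availableDate" in attrs:
        ts = int(attrs["availableDate"])
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date()
        summary["available"] = day.isoformat()
    return summary


# Trimmed copy of the response shown above.
sample = {"data": {
    "Catalog": {"catalogOffer": {
        "id": "1c55202badfc4212b4f82553d5d22c3e",
        "customAttributes": [
            {"key": "availableDate", "value": "1573570800"},
            {"key": "developerName", "value": "Human Head Studios, Inc."},
        ],
    }},
    "PriceEngine": {"price": {"totalPrice": {
        "discountPrice": 2999,
        "currencyCode": "USD",
        "currencyInfo": {"decimals": 2},
    }}},
}}

print(summarize_offer(sample))
```

Building a dict from `customAttributes` first keeps the lookups short and lets missing keys fall back to `None` instead of raising.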