Python: How can I send the calendar options in a URL as headers using requests?


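I'm trying to use BeautifulSoup on this URL (the one requested in the code below).

My problem is that I want to set a specific date range to display. I looked in the Chrome network tab to see what the browser sends, but I can't get the right page: I'm requesting data from 2019-01-01 to 2020-12-31, yet I always get the 2021 data.

This is what I've tried so far: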

from bs4 import BeautifulSoup
import requests
from urllib.parse import parse_qsl

session = requests.Session()
session.headers.update({'User-Agent':'Mozilla/5.0'})

qs = 'country%5B%5D=26&dateFrom=2019-01-01&dateTo=2020-12-31&currentTab=custom&limit_from=0'
payload = dict(parse_qsl(qs))

html_text = session.post('https://es.investing.com/dividends-calendar/', data=payload).text
soup = BeautifulSoup(html_text,'lxml')
job = soup.find_all('table')

This qs is (I guess) the data that gets sent to display the right calendar range; I copied it from the Chrome network tab.
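For reference, here is a small standard-library sketch of what that query string decodes to and how a multi-country selection would be form-encoded. The `multi` dictionary is a made-up example, not something captured from the site. Note that `dict()` keeps only the last value for a repeated key, so a query string with several `country%5B%5D` entries would lose all but one of them.

```python
from urllib.parse import parse_qsl, urlencode

qs = 'country%5B%5D=26&dateFrom=2019-01-01&dateTo=2020-12-31&currentTab=custom&limit_from=0'

# parse_qsl percent-decodes names and values, so 'country%5B%5D' becomes 'country[]'.
pairs = parse_qsl(qs)
payload = dict(pairs)  # a repeated key would keep only its last value here
print(payload)

# Hypothetical multi-country payload: with doseq=True a list value is encoded
# as one 'country[]' entry per item.
multi = {'country[]': [26, 5], 'dateFrom': '2019-01-01'}
encoded = urlencode(multi, doseq=True)
print(encoded)
```

Passing a dictionary with a list value as `data=` to requests is form-encoded in the same repeated-key way.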

Any idea what I'm doing wrong?

Many thanks.

You're missing some other required headers, and the URL that serves the data is different from the main page URL.

from bs4 import BeautifulSoup
import requests

session = requests.Session()

headers = {
    "User-Agent" : "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0",
    "Accept" : "*/*",
    "Accept-Language" : "en-GB,en;q=0.7,de;q=0.3",
    "Referer": "https://es.investing.com/dividends-calendar/",
    "Content-Type" : "application/x-www-form-urlencoded",
    "X-Requested-With" : "XMLHttpRequest",
    "Origin" : "https://es.investing.com",
}

payload = {
    'country[]': [26, 5, 22, 4], 
    'dateFrom': '2020-12-01', 
    'dateTo': '2020-12-24', 
    'currentTab': 'custom', 
    'limit_from': 0
}

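# The data comes from a separate XHR endpoint, not from the main calendar page.
r = session.post('https://es.investing.com/dividends-calendar/Service/getCalendarFilteredData', data=payload, headers=headers)
j = r.json()

# The JSON response carries the table rows as an HTML fragment under 'data'.
soup = BeautifulSoup(j['data'], 'lxml')

# Skip the first row, then print every cell except the first one in each row.
for row in soup.find_all('tr')[1:]:
    values = [td.text for td in row.find_all('td')]
    print(values[1:])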

This gives you output that starts:

['Annaly Capital Management Pd Pref\xa0(Only_Pd)', '01.12.2020', '046875', '31.12.2020', '7,50%']
['Kimberly-Clark de Mexico\xa0(KCDMY)', '01.12.2020', '0461541', '5,83%']
['Thales\xa0(TCFP)', '01.12.2020', '0,4', '03.12.2020', '0,53%']
['McCormick&Co\xa0(MKC)', '01.12.2020', '0,34', '30.11.2020', '1,45%']
['McCormick&Comp\xa0(MKCv)', '01.12.2020', '0,34', '30.11.2020', '1,48%']
['Goldman Sachs\xa0(GS)', '01.12.2020', '1,25', '30.12.2020', '1,66%']
['Schlumberger\xa0(SLB)', '01.12.2020', '0125', '14.01.2021', '1,97%']
['Avery Dennison\xa0(AVY)', '01.12.2020', '0,62', '16.12.2020', '1,55%']
['Ardagh Group\xa0(ARD)', '01.12.2020', '0,15', '16.12.2020', '3,45%']
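The values come back as strings in the site's Spanish locale (decimal comma, `dd.mm.yyyy` dates). If you want numbers and dates, something like the sketch below might help. The helper names and the column meanings (ex-dividend date, dividend, payment date, yield) are my assumptions and are not confirmed by the response; I also assume `.` only ever appears in numbers as a thousands separator.

```python
from datetime import date, datetime

def parse_decimal(text):
    """Convert '0,34' or '1,45%' (decimal comma) to a float; None if empty."""
    # Assumption: '.' is a thousands separator and ',' is the decimal mark.
    cleaned = text.strip().rstrip('%').replace('.', '').replace(',', '.')
    return float(cleaned) if cleaned else None

def parse_date(text):
    """Convert 'dd.mm.yyyy' to a datetime.date; None if empty."""
    text = text.strip()
    return datetime.strptime(text, '%d.%m.%Y').date() if text else None

# One row in the shape printed by the loop (column meanings are assumed).
row = ['McCormick&Co\xa0(MKC)', '01.12.2020', '0,34', '30.11.2020', '1,45%']
name, ex_date, dividend, pay_date, dividend_yield = row

record = {
    'name': name.replace('\xa0', ' '),  # non-breaking space -> plain space
    'ex_date': parse_date(ex_date),
    'dividend': parse_decimal(dividend),
    'pay_date': parse_date(pay_date),
    'yield_pct': parse_decimal(dividend_yield),
}
print(record)
```

Rows are not guaranteed to have five cells, so check `len(row)` before unpacking real data.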

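Thank you, sir! It works, but how do you know which headers are required and which are not?

I use the browser's network tools to see which headers are sent. Start by adding all of them, then try removing them one at a time. You can probably reduce them further.

Any software you'd recommend to get started?

I just use the standard Firefox browser.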
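The "add them all, then remove one at a time" approach can be automated. Below is a rough sketch, not what was actually used for the answer: `minimize_headers`, `make_live_check`, and `fake_check` are hypothetical names, and `fake_check` only pretends that two headers are required so the demo runs without touching the network. Which headers the site really needs is something you would have to find out with a live check.

```python
def minimize_headers(headers, still_works):
    """Drop headers one at a time; keep a removal only if still_works() passes."""
    required = dict(headers)
    for name in list(headers):
        trial = {k: v for k, v in required.items() if k != name}
        if still_works(trial):
            required = trial  # the request survived without this header
    return required

def make_live_check(session, url, payload):
    """Build a still_works callable that sends a real POST (not called in this demo)."""
    def still_works(trial_headers):
        r = session.post(url, data=payload, headers=trial_headers)
        try:
            return r.ok and 'data' in r.json()
        except ValueError:  # the body was not valid JSON
            return False
    return still_works

# Stand-in check for the demo: pretends only these two headers matter.
def fake_check(trial_headers):
    return 'X-Requested-With' in trial_headers and 'User-Agent' in trial_headers

all_headers = {
    'User-Agent': 'Mozilla/5.0',
    'Accept': '*/*',
    'Referer': 'https://es.investing.com/dividends-calendar/',
    'X-Requested-With': 'XMLHttpRequest',
}
minimal = minimize_headers(all_headers, fake_check)
print(minimal)
```

For a real run you would pass `make_live_check(session, url, payload)` instead of `fake_check`, and pause between requests so you don't hammer the server.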