Scraping a website's '__hpKey' and then logging in with requests and BeautifulSoup in Python

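This is my first coding project, so I may not have all the terminology right. I am trying to log in to the NHS blood donation website using Python's requests and BeautifulSoup libraries. I have managed to do this, but it only works when I use a '__hpKey' value copied and pasted from the login request in the browser's Network tab. I would like to scrape the site for this token instead of relying on the one I copied and pasted.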

I have managed to find the '__hpKey', but this key does not seem to work when I try to log in:

import requests
from bs4 import BeautifulSoup

s = requests.Session()
soup_key = BeautifulSoup(s.get('https://my.blood.co.uk/Account/SignIn').content, 'html.parser')
key = soup_key.find('input', {'name': '__hpKey'})['value']
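For now I have just put the value copied from the login request in the Network tab in for 'key', because I cannot log in successfully using the code above. I have narrowed it down to four elements that need to be passed to the login portal. These are: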

data = {
  'LoginEmailAddress': 'email',
  'LoginPassword': 'password',
  'Question-Reason': '',
  '__hpKey': 'key'                # 'key' is a 216-character key ending in ==
}
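I then pass these four elements to the login portal and use BeautifulSoup to parse the page title from my donor profile. The title tells me whether the login succeeded: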

login_req = s.post('https://my.blood.co.uk/Account/Login', data=data)
soup = BeautifulSoup(s.get('https://my.blood.co.uk/Home/Landing?load=Yourdonations').content, 'html.parser')
print(soup.title)       # If logged in prints "My Donor Record", else prints "My Donor Record - Sign in or Register"
So, how do I find a '__hpKey' value that works when passed to the login portal?


Thanks

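The request includes some validation fields. These sit in hidden input tags inside the form. The quickest approach is to collect all the inputs under the form and send them all as-is in the payload: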

import requests
from bs4 import BeautifulSoup

s = requests.Session()

email = "your@email.com"
password = "your_password"

r = s.get("https://my.blood.co.uk/Account/SignIn")
soup = BeautifulSoup(r.text, "html.parser")
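form = soup.find_all("form")[1]  # second <form> on the page

# Copy every named input that carries a value (this includes the hidden fields)
payload = {
    t["name"]: t["value"]
    for t in form.find_all("input")
    if t.has_attr("name") and t.has_attr("value")
}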
payload["Type-Fax"] = "" # maybe not necessary ?
payload["LoginEmailAddress"] = email
payload["LoginPassword"] = password

print(payload)
r = s.post("https://my.blood.co.uk/Account/Login", data = payload)

soup = BeautifulSoup(s.get('https://my.blood.co.uk/Home/Landing?load=Yourdonations').content, 'html.parser')
print(soup.title)

Note that I have not tested the above code with a valid account.
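
To see what the "collect every input under the form" step produces without touching the real site, here is a minimal offline sketch. It deliberately swaps BeautifulSoup for the standard library's HTMLParser so it runs with no third-party packages and no network access. SAMPLE_HTML, FormFieldCollector and collect_form_fields are hypothetical names made up for this illustration, and the sample markup is only a guess at the general shape of a sign-in page, not the actual HTML served by my.blood.co.uk.

```python
from html.parser import HTMLParser

# Made-up page with two forms; the second one mimics a login form.
SAMPLE_HTML = """
<form id="search"><input type="text" name="q" value=""></form>
<form id="login" action="/Account/Login" method="post">
  <input type="hidden" name="__hpKey" value="abc123==">
  <input type="text" name="LoginEmailAddress" value="">
  <input type="password" name="LoginPassword">
  <input type="submit" value="Sign in">
</form>
"""


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs of <input> tags, grouped per <form>."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per <form>, in document order
        self._current = None  # dict of the form currently being parsed

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {}
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None:
            name = attrs.get("name")
            # Keep only inputs that have both a name and a value attribute
            if name is not None and "value" in attrs:
                self._current[name] = attrs["value"] or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def collect_form_fields(html, form_index):
    """Return the input fields of the form at position form_index."""
    parser = FormFieldCollector()
    parser.feed(html)
    return parser.forms[form_index]


# Start from whatever the form already contains, then add the credentials
payload = collect_form_fields(SAMPLE_HTML, 1)
payload["LoginEmailAddress"] = "your@email.com"
payload["LoginPassword"] = "your_password"
print(payload)
```

The point of starting from the form's own fields is that the hidden token is sent exactly as the server issued it for this session, alongside any other hidden fields, rather than being picked out by name on its own.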
