Python - BeautifulSoup: extracting a value when multiple inputs match
Tags: python, python-3.x, beautifulsoup
Code:
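from bs4 import BeautifulSoup
import requests
html = ''' <div class="fieldset-content checkout-login__footer">
<hr class="checkout-login__newsletter-rule">
<div class="row checkout-login__footer with-info">
<div class="columns medium-9 checkout-login__subscribtion-container">
<input type="checkbox" name="subscriptionForms[0]_value" id="emailSignUp">
<label for="emailSignUp">
I have read the <a href="/en-gb/customer-care/privacy-policy">Privacy Policy</a> and give consent to email me with updates about Clos19, including the latest product releases and special events.<small class="info js-privacy-policy-info hide"> Uncheck this box to opt out</small>
</label>
<input type="hidden" class="hidden" name="subscriptionForms[0]_type" value="NEWSLETTER" />
<input type="hidden" class="hidden" name="subscriptionForms[0]_location" value="CHECKOUT" />
</div>
<div class="columns medium-3 small-12">
<button type="submit" class="button button-primary float-right positive">Continue</button>
</div>
</div>
</div>
<div>
<input type="hidden" name="CSRFToken" value="80f1330a-7a4e-4878-a6ab-710356f47961" /> '''
soup = BeautifulSoup(html, 'html.parser')
CSRFToken = soup.find('input', type='hidden')["value"]
print(CSRFToken)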
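I'm trying to print 80f1330a-7a4e-4878-a6ab-710356f47961, but the terminal returns NEWSLETTER, because that is the first hidden input it finds.
How do I look up the value of value when name="CSRFToken"?

You can filter on the name attribute in the find call. Because find already uses the name keyword for the tag name, pass the attribute in a dictionary: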
from bs4 import BeautifulSoup
import requests
html = ''' <div class="fieldset-content checkout-login__footer">
<hr class="checkout-login__newsletter-rule">
<div class="row checkout-login__footer with-info">
<div class="columns medium-9 checkout-login__subscribtion-container">
<input type="checkbox" name="subscriptionForms[0]_value" id="emailSignUp">
<label for="emailSignUp">
I have read the <a href="/en-gb/customer-care/privacy-policy">Privacy Policy</a> and give consent to email me with updates about Clos19, including the latest product releases and special events.<small class="info js-privacy-policy-info hide"> Uncheck this box to opt out</small>
</label>
<input type="hidden" class="hidden" name="subscriptionForms[0]_type" value="NEWSLETTER" />
<input type="hidden" class="hidden" name="subscriptionForms[0]_location" value="CHECKOUT" />
</div>
<div class="columns medium-3 small-12">
<button type="submit" class="button button-primary float-right positive">Continue</button>
</div>
</div>
</div>
<div>
<input type="hidden" name="CSRFToken" value="80f1330a-7a4e-4878-a6ab-710356f47961" /> '''
soup = BeautifulSoup(html, 'html.parser')
CSRFToken = soup.find('input', {'name': 'CSRFToken'})["value"]
print(CSRFToken)
Output:
80f1330a-7a4e-4878-a6ab-710356f47961
The same lookup as a one-liner:
from bs4 import BeautifulSoup as soup
print(soup(html, 'html.parser').find('input', {'name':"CSRFToken"})['value'])
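To make the difference concrete, here is a minimal sketch that compares the original lookup with two ways of filtering on the name attribute. It assumes the page is trimmed down to its three hidden inputs (an illustrative stand-in for the full snippet), and the variable names are made up for this example.

```python
from bs4 import BeautifulSoup

# Trimmed-down stand-in for the page above: only the hidden inputs are kept.
html = '''
<input type="hidden" class="hidden" name="subscriptionForms[0]_type" value="NEWSLETTER" />
<input type="hidden" class="hidden" name="subscriptionForms[0]_location" value="CHECKOUT" />
<input type="hidden" name="CSRFToken" value="80f1330a-7a4e-4878-a6ab-710356f47961" />
'''

soup = BeautifulSoup(html, 'html.parser')

# find() returns only the first match, so filtering on type alone hits NEWSLETTER.
first_hidden = soup.find('input', type='hidden')['value']

# Filtering on the name attribute through attrs selects the intended element.
by_attrs = soup.find('input', attrs={'name': 'CSRFToken'})['value']

# The same lookup written as a CSS attribute selector.
by_css = soup.select_one('input[name="CSRFToken"]')['value']

print(first_hidden)  # NEWSLETTER
print(by_attrs)      # 80f1330a-7a4e-4878-a6ab-710356f47961
print(by_css)        # 80f1330a-7a4e-4878-a6ab-710356f47961
```

Whether to use attrs or a CSS selector is mostly a matter of taste; both pick the element by its name attribute instead of relying on document order.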

Comments:
Thanks, this works, but is there any way to avoid the from bs4 import BeautifulSoup line, since many other calls in my script use it? In the end I adapted my script to use from bs4 import BeautifulSoup as soup, as it seems easier to use :)
@PiersThomas I'm not sure I understand. Are you trying to make your program more efficient? Either way, if you have already imported BeautifulSoup, you don't need to import it again.
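
As a rough illustration of the point about importing only once, the lookup could be wrapped in a small helper that reuses the single module-level import. The helper name hidden_input_value and the page string are hypothetical and not part of the original script.

```python
from bs4 import BeautifulSoup  # imported once, reused by every call below


def hidden_input_value(html, field_name):
    """Return the value of the <input> whose name attribute is field_name, or None."""
    tag = BeautifulSoup(html, 'html.parser').find('input', attrs={'name': field_name})
    # find() returns None when nothing matches; Tag.get() returns None for a missing attribute.
    return tag.get('value') if tag is not None else None


# Hypothetical one-line page used only for this example.
page = '<form><input type="hidden" name="CSRFToken" value="80f1330a-7a4e-4878-a6ab-710356f47961" /></form>'

print(hidden_input_value(page, 'CSRFToken'))  # 80f1330a-7a4e-4878-a6ab-710356f47961
print(hidden_input_value(page, 'missing'))    # None
```

Returning None for a missing field avoids the TypeError that subscripting a failed find() result would raise.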