Python: Parsing all options under a select in BeautifulSoup

python python-2.7 web-scraping beautifulsoup mechanize

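I have an HTML page that contains multiple select tags, each with several dropdown options under it. I want to parse all the options under each select and store them.

This is what the HTML looks like: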

<select name="primary_select">
    <option></option>
    <option></option>
</select>
<select name="secondary_select">
    <option></option>
    <option></option>
</select>

I get the following error:

AttributeError: 'ResultSet' object has no attribute 'findAll'
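The code that raised this error is not shown in the question, so the following is only a guess at the pattern that triggers it: calling a search method on the list returned by a previous search. The option texts `a` and `b` are placeholders, and `find_all` is the bs4 spelling of `findAll`.

```python
from bs4 import BeautifulSoup

html = """<select name="primary_select">
    <option>a</option>
    <option>b</option>
</select>"""

soup = BeautifulSoup(html, "html.parser")

# find_all returns a ResultSet (a list of Tag objects), not a single Tag
selects = soup.find_all("select", attrs={"name": "primary_select"})

try:
    # Assumed mistake: searching on the list itself instead of on one of its elements
    selects.find_all("option")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

The exact wording of the message depends on the installed BeautifulSoup version, but it names `ResultSet` as the object that lacks the method.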


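findAll returns a list (a ResultSet), so you cannot call another findAll on it directly. Iterate over the result instead:

from bs4 import BeautifulSoup
html = '''<select name="primary_select">
    <option></option>
    <option></option>
</select>
<select name="secondary_select">
    <option></option>
    <option></option>
</select>'''
# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(html, 'html.parser')
# Call findAll on each matching <select> tag, not on the list of tags
subject_options = [i.findAll('option') for i in soup.findAll('select', attrs={'name': 'primary_select'})]
print(subject_options)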
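Since the question asks for the options of every select, not only `primary_select`, the same iteration can be extended to all select tags. This is a minimal sketch: the option texts `a` to `d` and the name `options_by_select` are assumptions for illustration, and `find_all` is the bs4 spelling of `findAll`.

```python
from bs4 import BeautifulSoup

html = """<select name="primary_select">
    <option>a</option>
    <option>b</option>
</select>
<select name="secondary_select">
    <option>c</option>
    <option>d</option>
</select>"""

soup = BeautifulSoup(html, "html.parser")

# Map each select's name attribute to the text of its options
options_by_select = {
    select.get("name"): [option.get_text(strip=True) for option in select.find_all("option")]
    for select in soup.find_all("select")
}

print(options_by_select)
```

Keying the result by the `name` attribute keeps the options of each select separate, which a single flat list would not.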


Yes, a ResultSet has no findAll attribute.

This should work:

subject_options = [
    r.findAll('option')
    for r in soup.findAll('select', attrs = {'name': 'primary_select'} )
]

But why not ask for the options directly in the first place?

subject_options = soup.findAll(
    lambda t: t.name == 'option' and t.parent.attrs.get('name') == 'primary_select'
)
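For comparison, the same direct lookup can be written with a CSS selector via `select`. This is a sketch under the assumption that the options are direct children of the select, as in the question's HTML; the option texts are placeholders.

```python
from bs4 import BeautifulSoup

html = """<select name="primary_select">
    <option>a</option>
    <option>b</option>
</select>
<select name="secondary_select">
    <option>c</option>
    <option>d</option>
</select>"""

soup = BeautifulSoup(html, "html.parser")

# Function filter: keep <option> tags whose parent is the named select
by_function = soup.find_all(
    lambda t: t.name == "option" and t.parent.get("name") == "primary_select"
)

# CSS selector: direct <option> children of the named select
by_selector = soup.select('select[name="primary_select"] > option')

function_texts = [t.get_text() for t in by_function]
selector_texts = [t.get_text() for t in by_selector]
print(function_texts, selector_texts)
```

Both forms return a flat list of option tags rather than a list of lists.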


A simple change solved the problem.

I just needed to add [0], because findAll returns a list of all elements that match the condition.

Thanks for the help:

subject_options = soup.findAll('select', attrs={'name': 'primary_select'})[0].findAll("option")
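One caveat with indexing `[0]` is that it raises an IndexError when no select matches. A possible variant, sketched here with `find` (which returns the first match or None), handles that case; the name `no_such_select` is made up to show the empty result.

```python
from bs4 import BeautifulSoup

html = """<select name="primary_select">
    <option>a</option>
    <option>b</option>
</select>"""

soup = BeautifulSoup(html, "html.parser")

# find returns the first matching tag, or None when nothing matches
select = soup.find("select", attrs={"name": "primary_select"})
subject_options = select.find_all("option") if select is not None else []

# Hypothetical name that does not occur in the HTML above
missing = soup.find("select", attrs={"name": "no_such_select"})
missing_options = missing.find_all("option") if missing is not None else []

print(len(subject_options), len(missing_options))
```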


Thanks for the compact script.

To get the actual text of each option, I found that it works with the .getText function, in case anyone else wants to extend this. The .select approach suggested by @avinash raj is also more readable. The selector below targets the select on my own page, which is identified by its aria-label attribute.

Code:

subject_options = soup.select('select[aria-label=Seitenauswahl] > option')

for i in subject_options:
    print(i.getText())

max_pagination = subject_options[-1].getText()
print("Max=" + max_pagination)

Output:

1
2
3
Max=3
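If the goal is the currently selected option rather than the last one, a sketch along the following lines may help. The HTML here is invented for illustration (the `value` attributes and the `selected` flag are assumptions, not taken from the page above); `select_one` returns the first match or None.

```python
from bs4 import BeautifulSoup

# Hypothetical pagination dropdown; values and the selected flag are assumptions
html = """<select aria-label="Seitenauswahl">
    <option value="p1">1</option>
    <option value="p2" selected>2</option>
    <option value="p3">3</option>
</select>"""

soup = BeautifulSoup(html, "html.parser")

# [selected] matches options that carry the selected attribute
selected = soup.select_one('select[aria-label="Seitenauswahl"] > option[selected]')

selected_text = selected.get_text()   # visible label of the option
selected_value = selected["value"]    # value attribute that a form would submit

print(selected_text, selected_value)
```

The visible text and the `value` attribute can differ, so which one to read depends on what the scraper needs.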