Python: how to extract data with the same class name in BeautifulSoup

I am trying to extract data using the BeautifulSoup library in Python. I used zip and soup to do the extraction.

My HTML data looks like this:

<li>

    <ul class="features">

        <li>Year: <strong>2016</strong></li>

        <li>Kilometers: <strong>81,000</strong></li>

    </ul>
    <ul class="features">

        <li>Doors: <strong>2 door</strong></li>

        <li>Color: <strong>White</strong></li>

    </ul>
    <ul class="features">

    </ul>

</li>

Output:

Year: 2016
Kilometers: 81,000
Doors: 2 door
Color: White


How can I store the year, kilometers, doors, and color in separate variables?

Find each li element that contains the label text, then find the next strong tag. Declare empty lists and append the values to them.

Code:

from bs4 import BeautifulSoup

html='''<li>

    <ul class="features">

        <li>Year: <strong>2016</strong></li>

        <li>Kilometers: <strong>81,000</strong></li>

    </ul>
    <ul class="features">

        <li>Doors: <strong>2 door</strong></li>

        <li>Color: <strong>White</strong></li>

    </ul>
    <ul class="features">

    </ul>

</li>
'''
soup=BeautifulSoup(html,'html.parser')
Year=[]
KiloMeter=[]
Doors=[]
Color=[]
for year,km,dor,colr in zip(soup.select('ul.features li:contains("Year:")'),soup.select('ul.features li:contains("Kilometers:")'),soup.select('ul.features li:contains("Doors:")'),soup.select('ul.features li:contains("Color:")')):
    Year.append(year.find_next('strong').text)
    KiloMeter.append(km.find_next('strong').text)
    Doors.append(dor.find_next('strong').text)
    Color.append(colr.find_next('strong').text)

print(Year,KiloMeter,Doors,Color)
You can try:

from bs4 import BeautifulSoup as bs
from io import StringIO

data = """<li>
    <ul class="features">
        <li>Year: <strong>2016</strong></li>
        <li>Kilometers: <strong>81,000</strong></li>
    </ul>
    <ul class="features">
        <li>Doors: <strong>2 door</strong></li>
        <li>Color: <strong>White</strong></li>
    </ul>
    <ul class="features">
    </ul>
</li>"""

# Name the parser explicitly so bs4 does not warn about guessing one
soup = bs(StringIO(data), 'html.parser')
Year, Km, Doors, Color = list(map(lambda x: x.text.split(':')[1].strip(), soup.select('.features > li')))
print(Year, Km, Doors, Color)
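
A variation on the two approaches above, as a minimal sketch: rather than relying on the order of the li elements, it collects each label and its strong value into a dictionary and then reads the separate variables from that. The names features, year, km, doors, and color are illustrative and do not come from the original posts.

```python
from bs4 import BeautifulSoup

html = """<li>
    <ul class="features">
        <li>Year: <strong>2016</strong></li>
        <li>Kilometers: <strong>81,000</strong></li>
    </ul>
    <ul class="features">
        <li>Doors: <strong>2 door</strong></li>
        <li>Color: <strong>White</strong></li>
    </ul>
    <ul class="features">
    </ul>
</li>"""

soup = BeautifulSoup(html, "html.parser")

# Map each label (the text before the colon) to the text of its <strong> tag.
features = {}
for li in soup.select("ul.features > li"):
    strong = li.find("strong")
    if strong is None:
        continue  # skip items that carry no value
    label = li.get_text().partition(":")[0].strip()
    features[label] = strong.get_text(strip=True)

# dict.get returns None when a label is missing instead of raising an error.
year = features.get("Year")
km = features.get("Kilometers")
doors = features.get("Doors")
color = features.get("Color")

print(year, km, doors, color)
```

Because the lookup is by label, an empty ul or a reordered list does not shift values into the wrong variable, which can happen with positional unpacking.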
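I don't want to append to an array, so can I just do something like km = km.find_next("strong").text? Am I right?

If you don't use a list and there are multiple matching elements, the variable is overwritten on each iteration and you always end up with the last element's value.

But I am getting all the years printed.

If you only want the first value, then after the for loop ends do Year = "".join(Year[:1]) followed by print(Year), and do the same for the others.

Output of the list-based code:

['2016'] ['81,000'] ['2 door'] ['White']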
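
When a page holds several items, plain scalar variables get overwritten on each pass, as the discussion points out. One possible way around that is to build one dictionary per item. This sketch assumes a hypothetical wrapper class, listing, on each outer li; the original HTML has no such class, and the second item's values are made up purely for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two items, each wrapped in <li class="listing">.
html = """<ul>
<li class="listing">
    <ul class="features">
        <li>Year: <strong>2016</strong></li>
        <li>Kilometers: <strong>81,000</strong></li>
    </ul>
</li>
<li class="listing">
    <ul class="features">
        <li>Year: <strong>2018</strong></li>
        <li>Kilometers: <strong>40,000</strong></li>
    </ul>
</li>
</ul>"""

soup = BeautifulSoup(html, "html.parser")

records = []
for listing in soup.select("li.listing"):
    record = {}
    # Only look at feature items inside the current listing.
    for li in listing.select("ul.features > li"):
        strong = li.find("strong")
        if strong is None:
            continue
        label = li.get_text().partition(":")[0].strip()
        record[label] = strong.get_text(strip=True)
    records.append(record)

for record in records:
    print(record.get("Year"), record.get("Kilometers"))
```

Each record keeps the values of one item together, so nothing is lost when the loop moves on to the next item.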
    