How to select the previous element in an HTML page with Python under a certain condition (Python, BeautifulSoup)


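Hello, I am trying to get some data from a website. I then need to find, in the page, the last element I used previously, and select the elements that come before it. Please check my code; I explain the details in my example: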

Here is the sample HTML code:

<div class="post" id="7517049">
    <div class="p-head">
        <div class="p-c p-c-time"><span class="p-time" data="1554741054" title="2019-04-08 @ 21:00:54 ( Your Time )"><span class="t-n-m">45</span> <span class="t-u">mins</span></span>
        </div>
        <div class="p-c p-c-cat"><span class="p-cat c-5 c-7 "><a href="http://predb.me?cats=tv" class="c-adult">TV</a><a href="http://predb.me?cats=tv-hd" class="c-child">HD</a></span></div>
        <div class="p-c p-c-title">
            <h2><a class="p-title" href="http://predb.me?post=7517049">The.Repair.Shop.S04E02.720p.WEBRip.x264-LiGATE</a></h2>
            <a rel="nofollow" href="http://predb.me?post=7517049" class="tb tb-perma" title="Visit the permanent page for this release."></a>
        </div>
    </div>
</div>

<div class="post" id="7517048">
    <div class="p-head">
        <div class="p-c p-c-time"><span class="p-time" data="1554740951" title="2019-04-08 @ 20:59:11 ( Your Time )"><span class="t-n-m">47</span> <span class="t-u">mins</span></span>
        </div>
        <div class="p-c p-c-cat"><span class="p-cat c-24 c-25 "><a href="http://predb.me?cats=books" class="c-adult">Books</a><a href="http://predb.me?cats=books-ebooks" class="c-child">eBooks</a></span></div>
        <div class="p-c p-c-title">
            <h2><a class="p-title" href="http://predb.me?post=7517048">John.Bell.Young.Puccini.A.Listeners.Guide.Dover.Books.on.Music.and.Music.History.2016.RETAiL.ePub.eBook-VENTOLiN</a></h2>
            <a rel="nofollow" href="http://predb.me?post=7517048" class="tb tb-perma" title="Visit the permanent page for this release."></a>
        </div>
    </div>
</div>

<div class="post" id="7517047">
    <div class="p-head">
        <div class="p-c p-c-time"><span class="p-time" data="1554740927" title="2019-04-08 @ 20:58:47 ( Your Time )"><span class="t-n-m">48</span> <span class="t-u">mins</span></span>
        </div>
        <div class="p-c p-c-cat"><span class="p-cat c-5 c-6 "><a href="http://predb.me?cats=tv" class="c-adult">TV</a><a href="http://predb.me?cats=tv-sd" class="c-child">SD</a></span></div>
        <div class="p-c p-c-title">
            <h2><a class="p-title" href="http://predb.me?post=7517047">The.Repair.Shop.S04E01.WEB.h264-LiGATE</a></h2>
            <a rel="nofollow" href="http://predb.me?post=7517047" class="tb tb-perma" title="Visit the permanent page for this release."></a>
        </div>
    </div>
</div>

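My Python code can show the previous element, but I need to show the previous elements that contain an <a> tag with the value TV.

If you are trying to display the text from the 3 <div> elements for the post you searched for, you could try the following: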

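from bs4 import BeautifulSoup

search = "The.Repair.Shop.S04E01.WEB.h264-LiGATE"
# my_driver is assumed to hold the page's HTML source as a string
soup = BeautifulSoup(my_driver, "html.parser")

# Find the <a> whose text is exactly the release name
# (string= is the current name of the older text= argument)
rls = soup.find("a", string=search)
# Walk backwards to the enclosing <div class="p-head">
div_parent = rls.find_previous('div', class_='p-head')

# Print the text of each <div> inside it: time, category, title
for div in div_parent.find_all('div'):
    print(div.get_text(strip=True))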

This would display the following 3 items:

48mins
TVSD
The.Repair.Shop.S04E01.WEB.h264-LiGATE
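
The question also asks for the earlier posts whose category link reads TV. The following is only a rough sketch of one way that might be done, assuming the real page follows the same structure as the sample HTML. The helper name previous_posts_in_category and the trimmed html string are made up for illustration; find_parent and find_all_previous are standard Beautiful Soup methods.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the sample posts (assumption: the live page uses the same classes)
html = """
<div class="post" id="7517049"><div class="p-head">
  <div class="p-c p-c-cat"><span class="p-cat"><a class="c-adult">TV</a><a class="c-child">HD</a></span></div>
  <div class="p-c p-c-title"><h2><a class="p-title">The.Repair.Shop.S04E02.720p.WEBRip.x264-LiGATE</a></h2></div>
</div></div>
<div class="post" id="7517048"><div class="p-head">
  <div class="p-c p-c-cat"><span class="p-cat"><a class="c-adult">Books</a><a class="c-child">eBooks</a></span></div>
  <div class="p-c p-c-title"><h2><a class="p-title">John.Bell.Young.Puccini.A.Listeners.Guide.Dover.Books.on.Music.and.Music.History.2016.RETAiL.ePub.eBook-VENTOLiN</a></h2></div>
</div></div>
<div class="post" id="7517047"><div class="p-head">
  <div class="p-c p-c-cat"><span class="p-cat"><a class="c-adult">TV</a><a class="c-child">SD</a></span></div>
  <div class="p-c p-c-title"><h2><a class="p-title">The.Repair.Shop.S04E01.WEB.h264-LiGATE</a></h2></div>
</div></div>
"""


def previous_posts_in_category(soup, title, category):
    """Hypothetical helper: titles of the posts that appear before the post
    named `title` and whose main category link text equals `category`."""
    anchor = soup.find("a", class_="p-title", string=title)
    if anchor is None:
        return []  # the searched release is not on this page
    post = anchor.find_parent("div", class_="post")
    titles = []
    # find_all_previous walks backwards through the document, nearest post first
    for prev in post.find_all_previous("div", class_="post"):
        if prev.find("a", class_="c-adult", string=category) is not None:
            titles.append(prev.find("a", class_="p-title").get_text(strip=True))
    return titles


soup = BeautifulSoup(html, "html.parser")
search = "The.Repair.Shop.S04E01.WEB.h264-LiGATE"
for name in previous_posts_in_category(soup, search, "TV"):
    print(name)
```

With the sample data this skips the Books post and prints only the S04E02 release. If only the nearest match is wanted, taking the first item of the returned list should be enough.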

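Beautiful Soup has the concept of parent/child tags, which may be what you want, but if you post some sample Python code that you used to try to solve the problem, you will get more help (and fewer downvotes).
@Chris OK, I edited the post and added my Python code.
@Chris Please take a look at the results for what you suggested.
Please show the expected output for that HTML in the question.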
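
To illustrate the parent/child idea from the first comment, here is a small sketch, assuming one post shaped like the sample HTML. It climbs from the title link to its enclosing post with find_parent, then reads that post's children. The html string is a trimmed stand-in for the real page.

```python
from bs4 import BeautifulSoup

# One post, trimmed from the sample HTML (assumption: same structure on the live page)
html = """
<div class="post" id="7517047"><div class="p-head">
  <div class="p-c p-c-cat"><span class="p-cat"><a class="c-adult">TV</a><a class="c-child">SD</a></span></div>
  <div class="p-c p-c-title"><h2><a class="p-title">The.Repair.Shop.S04E01.WEB.h264-LiGATE</a></h2></div>
</div></div>
"""

soup = BeautifulSoup(html, "html.parser")
link = soup.find("a", class_="p-title")

# Parent direction: go up from the <a> to the <div class="post"> that contains it
post = link.find_parent("div", class_="post")
post_id = post["id"]

# Child direction: go back down into the category cell of the same post
cat_cell = post.find("div", class_="p-c-cat")
categories = [a.get_text(strip=True) for a in cat_cell.find_all("a")]

print(post_id, categories)
```

find_previous('div', class_='p-head') reaches the same container because a parent tag opens before its children in the document, but find_parent states the intent more directly.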
