HTML: if both exist, select one


html xpath web-scraping scrapy


I have something like this:

<li class="ProductPrice">
    <span class="Regular Price">80.00</span>
    <span class="Sale Price">50.00</span>
</li>
<li class="ProductPrice">
    <span class="Regular Price">100.00</span>
</li>

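Some of the items may not have a Sale Price span.

I want to extract the current retail price: the regular price, or, if both a sale price and a regular price are present, only the sale price.

I'm new to XPath, so I'm not sure how to translate an if-else.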

The XSLT 1.0 solution is a bit verbose:

<xsl:template match="li">
    <xsl:choose>
        <xsl:when test="span/@class='Sale Price'">
            <xsl:value-of select="span[@class='Sale Price']" />
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="span[@class='Regular Price']" />            
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

So, if you have XPath 2.0 or later available, you can use the following instead:

<xsl:template match="li">
    <xsl:value-of select="if (span/@class='Sale Price') then (span[@class='Sale Price']) else (span[@class='Regular Price'])" />            
</xsl:template>

Both solutions produce the same output.

The logic isn't exactly what you asked for, but it is very close.
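To try the XSLT 1.0 template from Python, one option is lxml's etree.XSLT. The following is only a sketch: the surrounding xsl:stylesheet element, the text output method, the newline after each price, and the <ul> wrapper around the sample items are assumptions added to make it self-contained.

```python
from lxml import etree

# Stylesheet wrapper is an assumption; the li template is the one from the answer.
stylesheet = etree.XML('''<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="li">
    <xsl:choose>
      <xsl:when test="span/@class='Sale Price'">
        <xsl:value-of select="span[@class='Sale Price']"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="span[@class='Regular Price']"/>
      </xsl:otherwise>
    </xsl:choose>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>''')

# The <ul> wrapper is hypothetical; it just gives the sample a single root.
doc = etree.XML('''<ul>
  <li class="ProductPrice">
    <span class="Regular Price">80.00</span>
    <span class="Sale Price">50.00</span>
  </li>
  <li class="ProductPrice">
    <span class="Regular Price">100.00</span>
  </li>
</ul>''')

transform = etree.XSLT(stylesheet)
# split() drops the whitespace that the built-in templates copy through.
prices = str(transform(doc)).split()
print(prices)
```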

You can use the lxml module in Python. It is the simplest module I have used so far.

from lxml import html

data = '''<li class="ProductPrice">
              <span class="Regular Price">80.00</span>
              <span class="Sale Price">50.00</span>
          </li>
          <li class="ProductPrice">
              <span class="Regular Price">100.00</span>
          </li>
       '''

# Parse the HTML fragment into an element tree
tree = html.fromstring(data)
li = tree.xpath('//li')  # get all the li elements

for i in li:
    # xpath() returns a list, which is empty when nothing matches
    sp = i.xpath('.//span[contains(@class,"Sale Price")]/text()')
    rp = i.xpath('.//span[contains(@class,"Regular Price")]/text()')

    if sp:
        print('Price is :', sp[0])
    elif rp:  # guard against items that have no price at all
        print('Price is :', rp[0])

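What I did is extract the sale price and check whether it exists. If it does, the program prints it; otherwise the program prints the regular price.

Note: remember to put the leading . in the XPath when querying from an individual element. The result of xpath() is always a list. If you have any questions, please leave a comment.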
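The note about the leading . can be illustrated with a small sketch. The <ul> wrapper and the variable names here are assumptions for illustration. Without the dot, // searches from the document root even when xpath() is called on a single li element.

```python
from lxml import html

# Hypothetical sample: same items as in the question, wrapped in a <ul>.
tree = html.fromstring(
    '<ul>'
    '<li class="ProductPrice">'
    '<span class="Regular Price">80.00</span>'
    '<span class="Sale Price">50.00</span>'
    '</li>'
    '<li class="ProductPrice">'
    '<span class="Regular Price">100.00</span>'
    '</li>'
    '</ul>'
)

second_li = tree.xpath('//li')[1]  # the item without a sale price

# No leading dot: searches the whole document, so it finds the first item's sale price.
absolute = second_li.xpath('//span[@class="Sale Price"]/text()')

# Leading dot: searches only inside second_li, so nothing matches.
relative = second_li.xpath('.//span[@class="Sale Price"]/text()')

print(absolute, relative)
```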

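I want to extract the current retail price: the sale price, or, if both a sale price and a regular price are present, only the sale price

If you know that the sale price always comes after the regular price, use the XPath expression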

span[@class = 'Regular Price' or @class = 'Sale Price'][last()]
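As a rough sketch of how this expression could be evaluated with lxml (the sample markup is the one from the question; the variable names are illustrative), the [last()] predicate is applied per li, so each item yields exactly one span:

```python
from lxml import html

data = '''<li class="ProductPrice">
              <span class="Regular Price">80.00</span>
              <span class="Sale Price">50.00</span>
          </li>
          <li class="ProductPrice">
              <span class="Regular Price">100.00</span>
          </li>'''

tree = html.fromstring(data)

# Relies on the sale price appearing after the regular price in each li.
expr = ("//li/span[@class = 'Regular Price' or @class = 'Sale Price']"
        "[last()]/text()")
prices = tree.xpath(expr)
print(prices)
```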

In XPath 2.0, you can use this approach even if you do not know the order:

(span[@class = 'Sale Price'], span[@class = 'Regular Price'])[1]
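lxml implements XPath 1.0 only, so the sequence expression above cannot be run there as written. The sketch below swaps in a different, XPath 1.0 technique that also does not depend on element order: take the sale span, or the regular span only when its parent has no sale span. The reversed sample markup is an assumption made to show the order independence.

```python
from lxml import html

# Hypothetical sample with the sale price listed first in the first item.
data = '''<li class="ProductPrice">
              <span class="Sale Price">50.00</span>
              <span class="Regular Price">80.00</span>
          </li>
          <li class="ProductPrice">
              <span class="Regular Price">100.00</span>
          </li>'''

tree = html.fromstring(data)

# XPath 1.0 stand-in for the if-else: sale span, else regular span
# when no sibling sale span exists under the same li.
expr = ("//li/span[@class = 'Sale Price' or "
        "(@class = 'Regular Price' and "
        "not(../span[@class = 'Sale Price']))]/text()")
prices = tree.xpath(expr)
print(prices)
```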

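When you use contains in XPath, you do not need the full compound class name such as "Regular Price". Try something like [contains(@class, "Regular")]/text() instead.

Thanks for the heads-up.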
