How to extract the text of an a href link with Scrapy (tags: css, scrapy, href)


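I want to extract the text as follows:

subtheme_selector = '.subtheme::text'
year_selector = '.year::text'

But I don't know how to extract the theme. Can you help me? My current attempt:

theme_selector = '//a[contains@href,/set/theme-]/@href'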

<div class='tags floatleft'>
    <a href='/sets/10251-1/Brick-Bank'>10251-1</a> 
    <a href='/sets/theme-Creator-Expert'>Creator Expert</a> 
    <a class='subtheme' href='/sets/theme-Creator-Expert/subtheme-Modular-Buildings'>Modular Buildings</a> 
    <a class='year' href='/sets/theme-Creator-Expert/year-2016'>2016</a> 
</div>

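You are close. Note that contains() is an XPath function, so it needs parentheses and a quoted string argument, and the path in your HTML is /sets/, not /set/. You can test this very easily, without even scraping the site: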

import scrapy

TEXT = """
<div class='tags floatleft'>
    <a href='/sets/10251-1/Brick-Bank'>10251-1</a> 
    <a href='/sets/theme-Creator-Expert'>Creator Expert</a> 
    <a class='subtheme' href='/sets/theme-Creator-Expert/subtheme-Modular-Buildings'>Modular Buildings</a> 
    <a class='year' href='/sets/theme-Creator-Expert/year-2016'>2016</a> 
</div>
"""

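s = scrapy.Selector(text=TEXT)
# The XPath matches every <a> whose href contains "/sets/theme-"
# (the theme, subtheme and year links); extract_first() returns only
# the first match, which here is the theme link.
link = s.xpath('//a[contains(@href,"/sets/theme-")]/@href').extract_first()
text = s.xpath('//a[contains(@href,"/sets/theme-")]/text()').extract_first()
print(link)
print(text)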

Output:

/sets/theme-Creator-Expert
Creator Expert

Comment: What do you mean? What was the cause? This is the final one: 'a[href*=\/sets\/theme-]::text'