How do I use CSS selectors with Python's Scrapy?

python css python-3.x scrapy web-crawler

To learn Scrapy, I am scraping all the elements of this site:

However, I don't understand how to scrape the author's URL. I tried a CSS selector:

>>> response.css('a::attr(href)').extract()
['/', '/login', '/author/Ralph-Waldo-Emerson', '/tag/life/page/1/', '/tag/regrets/page/1/', 'https://www.goodreads.com/quotes', 'https://scrapinghub.com']

Then:

However, this does not give me the author's individual URL. How can I get that URL with a CSS selector?

Update

I already know that I can do:

response.css('a::attr(href)').extract()[2]

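However, I suspect this is not robust, since it depends on the position of the link in the list. Do you know how to get the bio link?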

This might help:

>>> import os
>>> os.path.dirname(response.url)
'http://quotes.toscrape.com'

>>> response.css('a::attr(href)').extract()[2]
u'/author/Bob-Marley'

>>> os.path.dirname(response.url) + response.css('a::attr(href)').extract()[2]
u'http://quotes.toscrape.com/author/Bob-Marley'
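Picking index `[2]` and gluing strings together with `os.path.dirname` both depend on the exact page being scraped. As a rough sketch of a less position-dependent approach, the snippet below filters the hrefs by their `/author/` prefix and resolves them with the standard library's `urllib.parse.urljoin`. The list of hrefs is copied from the question's output; the page URL `http://quotes.toscrape.com/`, the helper name `author_urls`, and the `/page/2/` URL are assumptions made for illustration.

```python
from urllib.parse import urljoin

# Assumed page URL (the session above only shows its dirname).
page_url = "http://quotes.toscrape.com/"

# hrefs as returned by response.css('a::attr(href)').extract() in the question.
hrefs = [
    "/", "/login", "/author/Ralph-Waldo-Emerson",
    "/tag/life/page/1/", "/tag/regrets/page/1/",
    "https://www.goodreads.com/quotes", "https://scrapinghub.com",
]


def author_urls(page_url, hrefs):
    """Return absolute URLs for every href that points at an author page.

    Hypothetical helper: it keeps hrefs starting with "/author/" and
    resolves each one against page_url instead of concatenating strings.
    """
    return [urljoin(page_url, href) for href in hrefs if href.startswith("/author/")]


print(author_urls(page_url, hrefs))

# On a deeper page, dirname + concatenation would keep the "/page/2" part,
# while urljoin resolves the root-relative href against the site root.
print(author_urls("http://quotes.toscrape.com/page/2/", ["/author/Bob-Marley"]))
```

Inside a Scrapy callback, the same idea can probably be expressed without the helper: the attribute selector `response.css('a[href^="/author/"]::attr(href)').extract()` keeps only the author links, and `response.urljoin(href)` turns each one into an absolute URL.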
