Python: how do I parse these spaces?
Please help me get the price from eBay pages. In the script below, I fetch the price from two specific pages:
import pprint
import requests
import lxml.etree
import lxml.html
import lxml.cssselect
import re

def get_doc(url):
    try:
        req = requests.get(url)
    except Exception as exc:
        # Print the exception instance, not the Exception class itself.
        print('Error open. __', exc)
    else:
        html = req.text
        doc = lxml.html.document_fromstring(html)
        return doc

for url in ['http://www.ebay.com/itm/DW-PDP-Concept-Pearlescent-White-Maple-Drumset-/121271668104?pt=US_Drums&hash=item1c3c5acd88', 'http://www.ebay.com/itm/LOT-OF-20-DRUM-SET-TUNING-KEYS-DW-TAMA-PEARL-SABIAN-and-OTHER-UNIQUE-KEYS-/291092068092?pt=US_Drums&hash=item43c67076fc']:
    doc = get_doc(url)
    if doc is None:
        continue  # the download failed, so there is nothing to parse
    title = doc.xpath('//h1[@id="itemTitle"]/text()')
    priceUSD = doc.xpath('//span[@itemprop="price"]/text()')
    print(title, priceUSD)
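The problem is that the price on the first page contains a non-breaking space (&nbsp;), so the XPath text() call returns the wrong value. The output looks like this: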
['DW/PDP Concept Pearlescent White Maple Drumset'] ['US$1\xa0200,00']
['LOT OF 20 DRUM SET TUNING KEYS! DW! TAMA! PEARL! SABIAN! and OTHER UNIQUE KEYS!!'] ['US$6,05']
P.S. This is the incorrect price: 'US$1\xa0200,00'

Replace \xa0:
priceUSD = [t.replace('\xa0', '') for t in
doc.xpath('//span[@itemprop="price"]/text()')]
By the way, I got the following output without any modification:
['DW/PDP Concept Pearlescent White Maple Drumset'] ['US $1,200.00']
['LOT OF 20 DRUM SET TUNING KEYS! DW! TAMA! PEARL! SABIAN! and OTHER UNIQUE KEYS!!'] ['US $6.05']
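The two runs above show the same price in two number formats ('1\xa0200,00' and '1,200.00'), so stripping \xa0 alone still leaves a string whose separators depend on how the page was served. A minimal sketch of normalizing either form to a number follows. It assumes a price has either exactly two decimal digits or none, and parse_price is a hypothetical helper, not part of lxml or requests:

```python
import re
from decimal import Decimal


def parse_price(text):
    # Hypothetical helper: convert a scraped price string to a Decimal.
    # Keep only digits, commas and dots; this drops the currency
    # prefix, ordinary spaces and the non-breaking space (\xa0).
    digits = re.sub(r'[^\d.,]', '', text)
    # Assumption: a trailing comma or dot followed by exactly two
    # digits is the decimal mark; any other comma or dot is a
    # thousands separator.
    match = re.fullmatch(r'([\d.,]*?)(?:[.,](\d{2}))?', digits)
    whole = re.sub(r'[.,]', '', match.group(1))
    if not whole:
        raise ValueError('no price found in %r' % text)
    cents = match.group(2) or '00'
    return Decimal(whole + '.' + cents)


for raw in ['US$1\xa0200,00', 'US $1,200.00', 'US$6,05', 'US $6.05']:
    print(repr(raw), '->', parse_price(raw))
```

Decimal is used here instead of float so that prices compare exactly; whether that matters depends on what the prices are used for.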
'//h1[@id="itemtTitle"]/text()' => '//span[@itemprop="price"]/text()'
@Sergey, you're right. I copied the wrong line from my local copy. Thanks.