How do I scrape HTML with Nokogiri in Ruby?

Tags: ruby, screen-scraping, nokogiri

I am not able to scrape the price of the product. For each price, the output I get looks like this:

<div class="pu-final">
  <span class="fk-font-17 fk-bold">Rs. 1999</span>
</div>
You can do the following:

require 'nokogiri'

doc = Nokogiri::HTML::Document.parse <<-eotl
<div class="pu-final">

                    <span class="fk-font-17 fk-bold">Rs. 1999</span>
</div>
eotl

doc.at_css('div.pu-final > span.fk-font-17.fk-bold').class
# => Nokogiri::XML::Element
doc.at_css('div.pu-final > span.fk-font-17.fk-bold').text 
# => "Rs. 1999"
Full code:

require 'nokogiri'
require 'open-uri'

url = "http://www.flipkart.com/mens-footwear/shoes/casual-shoes/pr?sid=osp,cil,nit,e1f"
# On Ruby 3.0+, Kernel#open no longer fetches URLs; use URI.open from open-uri.
doc = Nokogiri::HTML(URI.open(url))

doc.css("div.pu-details.lastUnit").each do |dv|
  product_name = dv.at_css('div.pu-title a').text.strip
  product_price = dv.xpath("normalize-space(.//div[contains(@class,'pu-final')]/span)").to_s
  print product_name,"  <----->  ",product_price,"\n"
end
Output:

Fila Storm Zender Sneakers  <----->  Rs. 1819
Puma Future Cat M1 Big 102 O Sneakers  <----->  Rs. 3849
Fila Filamotor V4 Sneakers  <----->  Rs. 1449
Adidas Volantis Hiking Shoes  <----->  Rs. 2999
Fila Varsity Sneakers  <----->  Rs. 1249
Puma Evo Speed F1 Low BMW Sneakers  <----->  Rs. 2609
Lee Cooper Running and Walking Shoes  <----->  Rs. 1329
Lee Cooper Running and Walking Shoes  <----->  Rs. 1329
United Colors of Benetton Sneakers  <----->  Rs. 2799
United Colors of Benetton Party Wear Shoes  <----->  Rs. 2449
Timberland 6 In Premium Boots  <----->  Rs. 8490
Timberland Ek Mid Boots  <----->  Rs. 8490
Clarks Montacute Lord Boots  <----->  Rs. 3249
Clarks Latch Mast Corporate Casuals  <----->  Rs. 1999
Levi's Boots  <----->  Rs. 2999

I tried the same code with one small change and it works fine. Try it.

Change

price = item.at_css(".pu-final")

to

price = item.at_css(".pu-final").text unless item.at_css(".pu-final").nil?

Comments:

I tried it, but it returns the same value for every product. How is the price tied to a specific product so that it is printed together with the product name?

@shamshul2007 What do you mean by "the same value"? Can you post more of the relevant HTML? For the HTML you gave, my answer will work.

I want to scrape that page, and the output should look like: product name = product price.

I think this is a wrong answer, because it will match any text element contained in the div tag, not just the one with the "fk-font-17 fk-bold" class.

@shamshul2007 So you need the product name and its price, is that right? Why did you close the post? I was writing an answer.