Ruby on Rails: how to scrape multiple pages with Nokogiri, and how to scrape quickly with Rails

Tags: ruby-on-rails, nokogiri

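I am trying to scrape certain elements from a list that spans 99 pages. For the life of me I can't figure out how to do it. Here is my code: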

require 'open-uri'
require 'nokogiri'
@title = []
html_content = open("https://www.imdb.com/list/ls057823854/?sort=list_order,asc&st_dt=&mode=detail&page=1").read
doc = Nokogiri::HTML(html_content)
doc.search(".lister-item-header/a").each do |title|
@title << title.text.strip

If you want to collect all the titles, here is the scraper code:

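require 'open-uri'
require 'nokogiri'
require 'json'

@title = []
url = "https://www.imdb.com/list/ls057823854/?sort=list_order,asc&st_dt=&mode=detail&page="
# URI.open is needed on Ruby 3.0+, where Kernel#open no longer fetches URLs
html_content = URI.open(url + "1").read
doc = Nokogiri::HTML(html_content)

# The pagination text ends with "of <total>"; strip the thousands separator
total = doc.search(".pagination-range").first.text.split("of")[1].gsub(",", "").strip.to_i
# 100 titles per page, so round up to get the number of pages
max = (total / 100.0).ceil

# Page 1 is already loaded; collect its titles (standard CSS child selector)
doc.search(".lister-item-header > a").each do |title|
  @title << title.text.strip
end

# Fetch the remaining pages one by one
(2..max).each do |i|
  html_content = URI.open(url + i.to_s).read
  doc = Nokogiri::HTML(html_content)

  doc.search(".lister-item-header > a").each do |title|
    @title << title.text.strip
  end
  sleep(1) # pause between requests so the server is not hammered
end

# Save all collected titles as a JSON array
File.open("imdb-titles.json", "w") do |f|
  f.write(JSON.pretty_generate(@title))
end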
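
The page-count step is the easiest part to get subtly wrong, so here is a minimal sketch of it in isolation. The helper name page_count and the sample pagination strings are assumptions made for illustration; only the "of <total>" shape and the 100 titles per page come from the code above. Rounding up avoids requesting an extra, empty page when the total is an exact multiple of 100.

```ruby
# Hypothetical helper: derive the number of pages from the pagination text.
# Assumes the text ends with "of <total>", where <total> may contain commas.
def page_count(pagination_text, per_page = 100)
  total = pagination_text.split("of")[1].gsub(",", "").strip.to_i
  # Float division plus ceil rounds a partial last page up to a full page
  (total.to_f / per_page).ceil
end

# Sample strings are made up for this sketch
partial_last_page = page_count("1 - 100 of 9,876")
exact_multiple    = page_count("1 - 100 of 200")
puts partial_last_page
puts exact_multiple
```

With this helper the loop bound in the scraper could be written as max = page_count(doc.search(".pagination-range").first.text).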

The CSS selector is wrong, and your block needs a closing end.