Ruby: how to find specific text between quotes
I'm trying to write a Ruby script that takes the Flickr BBCode for an image and finds only the actual image link, ignoring everything else. Flickr's BBCode looks like this:
<a href="http://www.flickr.com/photos/user/9049969465/" title="Wiggle Wiggle by Anonymous, on Flickr"><img src="https://farm3.staticflickr.com/2864/92917419471_248187_c.jpg" width="800" height="526" alt="Wiggle Wiggle"></a>
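I need to know how to scan the input and find only the part that starts with https and ends at the closing quote. Thanks for any help.
Don't try to do it that way.
Instead, use an HTML parser, such as Nokogiri:
require 'nokogiri'
doc = Nokogiri::HTML.parse '<a href="http://www.flickr.com/photos/user/9049969465/" title="Wiggle Wiggle by Anonymous, on Flickr"><img src="https://farm3.staticflickr.com/2864/92917419471_248187_c.jpg" width="800" height="526" alt="Wiggle Wiggle"></a>'
# Print the href of every <a> element in the document
doc.css('a').each do |link|
  puts link.attr(:href)
end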
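If you're trying to parse HTML, you should use a proper HTML parser.
For example, this is trivial with Nokogiri:
require 'nokogiri'
bbcode = %Q[<a href="http://www.flickr.com/photos/user/9049969465/" title="Wiggle Wiggle by Anonymous, on Flickr"><img src="https://farm3.staticflickr.com/2864/92917419471_248187_c.jpg" width="800" height="526" alt="Wiggle Wiggle"></a>]
Nokogiri::HTML(bbcode).css('a')[0]['href']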
# => "http://www.flickr.com/photos/user/9049969465/"
Obviously you'll have to add some error checking, but that's the basic idea.
require 'nokogiri'
doc = Nokogiri::HTML(<<-eol)
<a href="http://www.flickr.com/photos/user/9049969465/" title="Wiggle Wiggle by Anonymous, on Flickr"><img src="https://farm3.staticflickr.com/2864/92917419471_248187_c.jpg" width="800" height="526" alt="Wiggle Wiggle"></a>
eol
doc.at_css("a")['href']
# => "http://www.flickr.com/photos/user/9049969465/"
doc.at("a")['href']
# => "http://www.flickr.com/photos/user/9049969465/"
Of course, css('a')[0] can be simplified to at_css('a').
@theTinMan Yes, I did that as well. :)
+1.
OK, thanks. I didn't even know about that. Going back to fix my code. Thanks for the help!