Ruby on Rails: trying to sanitize an HTML fragment in Rails to get only what's inside the image tag so I can display it as an image
I'm using Feedjira to parse some RSS feeds, and the data I get back looks like this:
<table border="0" cellpadding="2" cellspacing="7" style="vertical-align:top;"><tr><td width="80" align="center" valign="top"><font style="font-size:85%;font-family:arial,sans-serif"><a href="http://news.google.com/news/url?sa=t&fd=R&ct2=us&usg=AFQjCNEnMLee_eB0lY7hCtIqJCf8Iy2StQ&clid=c3a7d30bb8a4878e06b80cf16b898331&cid=52778768548994&ei=xaUHVaj4GcLBmQLyjIDIDw&url=http://www.foxnews.com/weather/2015/03/15/cyclone-pam-vanuatu/?intcmp%3Dlatestnews"><img src="//t1.gstatic.com/images?q=tbn:ANd9GcTHyV7D2Zf-QfzLZ-7qJlk0mE3nU7qM3-mnENtJPURJTk8o9Kh-Iqc_focHCHAALYhnRuY1Nop6" alt="" border="1" width="80" height="80"><br><font size="-2">Fox News</font></a></font></td><td valign="top" class="j"><font style="font-size:85%;font-family:arial,sans-serif"><br><div style="padding-top:0.8em;"><img alt="" height="1" width="1"></div><div class="lh"><a href="http://news.google.com/news/url?sa=t&fd=R&ct2=us&usg=AFQjCNHTAYRk1bcvBCJxvZ4M0OUUrXTXQg&clid=c3a7d30bb8a4878e06b80cf16b898331&cid=52778768548994&ei=xaUHVaj4GcLBmQLyjIDIDw&url=http://www.dailymail.co.uk/wires/reuters/article-2997951/Aid-agencies-begin-helicopter-flights-cyclone-stricken-Vanuatu.html"><b>Aid agencies begin...
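I tried to sanitize it down to just the image in my view:
<img class="media-object" src="<%= … ['img'], :attributes => {'img' => ['src']}) %>" alt="..." style="width:72px;height:72px">
However, this inserts the following into my HTML: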
<img class="media-object" src="<img src="//t1.gstatic.com/images?q=tbn:ANd9GcTHyV7D2Zf-QfzLZ-7qJlk0mE3nU7qM3-mnENtJPURJTk8o9Kh-Iqc_focHCHAALYhnRuY1Nop6">Fox News <img> Aid agencies begin flights to cyclone-stricken Vanuatu, official toll lowered Daily Mail TANNA, March 17 (Reuters) - International aid agencies began emergency flights on Tuesday to some of the remote outer islands of Vanuatu, which they fear have been devastated by a monster cyclone that tore through the South Pacific island nation. Relief, hardship as Cyclone Pam survivors battle onBangkok Post UN says 24 dead in Vanuatu after Cyclone Pam7Online WSVN-TV Fears for food supplies in Vanuatu as capital cleans upThe Star Online Xinhua -MSNBC -Bloomberg all 4,389 news articles » " alt="..." style="width:72px;height:72px">
Thanks for your help!

For HTML parsing this simple, a regular expression is easy and reliable. For example:
feedjira_output =~ /src="([^"]+)"/
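This puts the source URL in the regex's first capture group (accessible via the $1 variable).

Thanks! This almost works, but it adds an extra " after src=, like this: src=""//t1.gstatic.com/images?q=tbn:ANd9GcTHyV7D2Zf-QfzLZ-7qJlk0mE3nU7qM3-mnENtJPURJTk8o9Kh-Iqc_focHCHAALYhnRuY1Nop6"

Cool, glad it worked! Please consider upvoting and marking my answer as accepted.

Actually, it works exactly the way you described, my bad. Thanks a lot. I ended up doing it like this: …" alt="..." style="width:72px;height:72px">

Glad it works. As a next refactoring step, you could consider moving the code to the controller and assigning the result to an @ variable that the view uses.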
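
For anyone who wants the regex approach in a reusable form, here is a minimal sketch in plain Ruby. The method name first_image_src and the sample string are hypothetical and not from the thread. The pattern is also slightly stricter than the one in the answer: it assumes you only want a src that belongs to an img tag, and it uses String#[] with a capture index instead of the global $1.

```ruby
# Hypothetical helper name; not part of the original question or answer.
# Returns the src of the first <img> tag that has one, or nil if none is found.
def first_image_src(html)
  # String#[] with a regex and a capture index returns that capture group,
  # or nil when nothing matches, so there is no need to read $1 afterwards.
  # to_s guards against a nil summary from the feed.
  html.to_s[/<img[^>]*\ssrc="([^"]+)"/, 1]
end

# Made-up sample input, shaped like the feed HTML in the question.
summary = '<td><a href="http://example.com/story">' \
          '<img src="//t1.gstatic.com/images?q=tbn:abc" alt="" width="80"></a></td>'

puts first_image_src(summary)
# prints //t1.gstatic.com/images?q=tbn:abc
```

Because the method returns only the URL, the result can go straight into the src attribute of the view's own img tag. In a Rails app, a method like this could live in a helper or be called from the controller to set the @ variable mentioned in the last comment; that placement is a suggestion, not something the thread shows.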