Ruby on Rails: parsing HTML with HTTParty

I'm looking for a way to extract specific content from a website whose markup is fairly well-formed, but not quite valid, XML:

<html>
  <head>
    <title>title</title>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
    <META HTTP-EQUIV="expires" CONTENT="now">
    <meta http-equiv=refresh content=300>
  </head>
<body bgcolor="#FFFFFF">
<p><font face="Arial, Helvetica, sans-serif" size="2"><img src="pict.gif" width="503" height="43"><br></font></p>
<p><font face="Arial, Helvetica, sans-serif" size="2">Please Note: ...<br></font></p>
<font face="Arial, Helvetica, sans-serif" size="3"><B>The Schedule</B></font><p></p>
<table border=0 width="100%">
  <tr> 
    <td><font face="Arial, Helvetica, sans-serif" size="2"><B>CONTENT A</B></font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td>
        <td><font face="Arial, Helvetica, sans-serif" size="2"><B>CONTENT B</B></font>&nbsp;&nbsp;</td>
        <td><font face="Arial, Helvetica, sans-serif" size="2"><B>CONTENT C</B></font>&nbsp;&nbsp;</td>
        <td><font face="Arial, Helvetica, sans-serif" size="2"><B>CONTENT D</B></font>&nbsp;&nbsp;</td>
    <td><font face="Arial, Helvetica, sans-serif" size="2"><B>CONTENT E</B></font></td>
  </tr>
...
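With the HTTParty code I have (format :xml), the page won't parse as XML, and if I switch to format :html I don't seem to get any of the benefits of parsing. Am I on the right track here?

Thanks,
Peter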

I know this was asked a long time ago, but I think parsing the page with Nokogiri and XPath would work better.
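# Code from the question (require and class wrapper added; the class name is hypothetical)
require "httparty"

class ScheduleFetcher
  include HTTParty
  base_uri "http://website.com/"
  basic_auth "name", "pw"
  format :xml # forces HTTParty to parse the response body as XML

  def download_and_process_index_file
    s = self.class.get("theurl.html")
    thehtml = s.parsed_response
    # print CONTENT A
    puts thehtml['html']['body']['table']['tr']['td']['font']['b']
  end
end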