ASP.NET MVC 4: selecting a div fails
I'm trying to parse the information in each div class="base shortstory" with Html Agility Pack:
<div id="dle-content">
<div class="base shortstory">
<h3 class="btl"><a href="http://someurl.com/htc-jetstream.html">HTC Jetstream</a></h3>
</div>
<div class="base shortstory">
<h3 class="btl"><a href="http://someurl.com/samsung.html">Samsung S4</a></h3>
</div>
<div class="base shortstory">
<h3 class="btl"><a href="http://someurl.com/dell.html">Dell Streak</a></h3>
</div>
</div>
Here is the code:
const string url = "http://someurl.com/catalogue";
const string rootUrl = "http://someurl.com";
HtmlWeb hw = new HtmlWeb();
HtmlDocument doc = hw.Load(url);
int dealsCount = 0;
HtmlNode root = doc.DocumentNode.SelectSingleNode("//div[@id='dle-content']");
int i = 1;
//this is for the default page
while (i <= 10)
{
    try
    {
        string node = String.Format("//div[{0}]", i);
        var link = doc.DocumentNode.SelectSingleNode(node);
        var href = link.SelectSingleNode("//div[@class='mlink']//span[@class='argmore']//a[@href]").Attributes["href"].Value;
        string title = link.SelectSingleNode("//h3[@class='btl']//a[@href]").InnerText.Trim();
        string description = link.SelectSingleNode("//div[@class='maincont']//div[1]").InnerText.Replace("\n", " ").Replace("\r", "").Replace("\t", "").Trim();
        description = RemoveHTMLComments(description);
        var imageURL = link.SelectSingleNode("//div[@class='maincont']//div[1]//a//img").Attributes["src"].Value;
        var price = link.SelectSingleNode("//div[@class='mlink']//span[3]//font").InnerText.Trim();
        price = Regex.Match(price, @"\d+").Value;
        var partnerdealID = href;
        //no information
        var isActivesStr = link.SelectSingleNode("//div[@class='mlink']//span[2]/font").InnerText.Trim();
        bool isActive;
        if (isActivesStr.Contains("Нет в наличии"))
        {
            isActive = false;
        }
        else
        {
            isActive = true;
        }
        var dealUrl = href; //requires login - show the page itself
    }
    catch (Exception)
    {
    }
    i += 1;
}
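All of your XPath expressions start with "//", which means "search recursively from the root of the document". So when you do this:
link.SelectSingleNode("//div[@class='mlink']//span[@class='argmore']//a[@href]")
the search does not start from link but from the document's root. You probably want this instead:
link.SelectSingleNode("div[@class='mlink']...etc...")
which is equivalent to
link.SelectSingleNode("./div[@class='mlink']...etc...")
"." means the current node. A single "/" means search only direct children, not recursively (use ".//" to search recursively below the current node).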
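To make the difference concrete, here is a minimal sketch of the loop rewritten with relative XPath, based only on the sample markup from the question. The class name ShortStoryParser and the method ParseItems are hypothetical names introduced for illustration, and the markup is passed in as a string rather than downloaded. The sketch also assumes you want one result per "base shortstory" block, so it iterates over SelectNodes instead of building "//div[{0}]" by index.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper for illustration; not part of the original question.
public static class ShortStoryParser
{
    // Returns one (title, href) pair per <div class="base shortstory">.
    public static List<KeyValuePair<string, string>> ParseItems(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var items = new List<KeyValuePair<string, string>>();

        // An absolute path is fine here: we want every matching block in the document.
        HtmlNodeCollection blocks = doc.DocumentNode.SelectNodes(
            "//div[@id='dle-content']/div[@class='base shortstory']");
        if (blocks == null)
        {
            return items; // SelectNodes returns null when nothing matches
        }

        foreach (HtmlNode block in blocks)
        {
            // Relative path: "./" starts at the current block, not at the document root.
            HtmlNode anchor = block.SelectSingleNode("./h3[@class='btl']/a[@href]");
            if (anchor == null)
            {
                continue; // skip blocks without the expected heading link
            }

            string title = anchor.InnerText.Trim();
            string href = anchor.GetAttributeValue("href", string.Empty);
            items.Add(new KeyValuePair<string, string>(title, href));
        }

        return items;
    }

    public static void Main()
    {
        // Trimmed copy of the sample markup from the question.
        const string html =
            "<div id=\"dle-content\">" +
            "<div class=\"base shortstory\"><h3 class=\"btl\"><a href=\"http://someurl.com/htc-jetstream.html\">HTC Jetstream</a></h3></div>" +
            "<div class=\"base shortstory\"><h3 class=\"btl\"><a href=\"http://someurl.com/samsung.html\">Samsung S4</a></h3></div>" +
            "</div>";

        foreach (KeyValuePair<string, string> item in ParseItems(html))
        {
            Console.WriteLine(item.Key + " -> " + item.Value);
        }
    }
}
```

With the original "//h3[@class='btl']//a[@href]" inside the loop, every iteration would return the first link in the whole document. With the relative path, each block yields its own link. The null checks also replace the empty catch block, so a missing node is skipped explicitly instead of being silently swallowed.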