ASP.NET MVC 4: how to get elements with a specific value in HtmlAgilityPack
I have an ASP.NET MVC 4 project and am trying to parse an HTML document with HtmlAgilityPack. I have the following HTML:
<td class="pl22">
<p class='pb10 pt10 t_grey'>Experience:</p>
<p class='bold'>any</p>
</td>
<td class='pb10 pl20'>
<p class='t_grey pb10 pt10'>Education:</p>
<p class='bold'>any</p>
</td>
<td class='pb10 pl20'>
<p class='pb10 pt10 t_grey'>Schedule:</p>
<p class='bold'>part-time</p>
<p class='text_12'>2/2 (day/night)</p>
</td>
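But it gives me different elements, ones located at the top of the page. Experience, Education and Schedule are static values, while any, any, part-time and 2/2 (day/night) are dynamic values. Can anyone help me?
If you want to stick with XPath, you can do it like this: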
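// Requires: using System; using System.Text; using HtmlAgilityPack;
var html = "<td class='pl22'><p class='pb10 pt10 t_grey'>Experience:</p><p class='bold'>any</p></td><td class='pb10 pl20'><p class='t_grey pb10 pt10'>Education:</p><p class='bold'>any</p></td><td class='pb10 pl20'><p class='pb10 pt10 t_grey'>Schedule:</p><p class='bold'>part-time</p><p class='text_12'>2/2 (day/night)</p></td> ";
var doc = new HtmlDocument
{
    // Only relevant when loading from a stream; LoadHtml below takes a string.
    OptionDefaultStreamEncoding = Encoding.UTF8
};
doc.LoadHtml(html);
// @class='...' compares the whole attribute string, so the class order must match exactly.
// Experience value: the bold <p> inside the 'pl22' cell.
var part1 = doc.DocumentNode.SelectSingleNode("//td[@class='pl22']/p[@class='bold']");
// Education and Schedule values: the bold <p> in each 'pb10 pl20' cell.
var part2 = doc.DocumentNode.SelectNodes("//td[@class='pb10 pl20']/p[@class='bold']");
foreach (var item in part2)
{
    Console.WriteLine(item.InnerText);
}
// Schedule detail: the 'text_12' <p> in a 'pb10 pl20' cell.
var part3 = doc.DocumentNode.SelectSingleNode("//td[@class='pb10 pl20']/p[@class='text_12']");
Console.WriteLine(part1.InnerText);
Console.WriteLine(part3.InnerText);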
Here is an alternative that keys on the table headers (Experience, Education and Schedule) rather than on the node classes:
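// Requires: using System.Collections.Generic; using System.Linq; using HtmlAgilityPack;
private static List<string> GetValues(HtmlDocument doc, string header) {
    // SelectNodes returns null when nothing matches, so fall back to an empty list.
    var nodes = doc.DocumentNode.SelectNodes(string.Format("//p[contains(text(), '{0}')]/following-sibling::p", header));
    return nodes == null ? new List<string>() : nodes.Select(x => x.InnerText).ToList();
}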
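I can't get the values with your approach; I get different values.
Thank you, your example is more agile.
Output of the first snippet (the class-based XPath queries):
any
part-time
any
2/2 (day/night)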
// Usage: call GetValues once per header, then print each list of values.
var experiences = GetValues(doc, "Experience");
var educations = GetValues(doc, "Education");
var schedules = GetValues(doc, "Schedule");
experiences.ForEach(Console.WriteLine);
educations.ForEach(Console.WriteLine);
schedules.ForEach(Console.WriteLine);
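The header-based idea can also be sketched so that every header is collected in one pass instead of being looked up by name. This is only an illustration, not part of the original answers. The class name HeaderValueDemo and the method GetAllValues are hypothetical. It assumes that every header paragraph carries the t_grey class token, as in the HTML from the question. The class test is token-based, so it matches both 'pb10 pt10 t_grey' and 't_grey pb10 pt10', which an exact @class='...' comparison would not.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class HeaderValueDemo
{
    // Hypothetical helper: maps each header paragraph (class token "t_grey")
    // to the text of the <p> siblings that follow it in the same cell.
    public static Dictionary<string, List<string>> GetAllValues(HtmlDocument doc)
    {
        var result = new Dictionary<string, List<string>>();

        // Token-based class test: pads the attribute with spaces and looks for " t_grey ",
        // so the order of the class names inside the attribute does not matter.
        var headers = doc.DocumentNode.SelectNodes(
            "//td/p[contains(concat(' ', normalize-space(@class), ' '), ' t_grey ')]");
        if (headers == null)
        {
            return result; // SelectNodes returns null when nothing matches
        }

        foreach (var header in headers)
        {
            // "Experience:" becomes "Experience"
            var key = header.InnerText.Trim().TrimEnd(':');

            // Relative XPath: only the <p> siblings after this header
            var siblings = header.SelectNodes("following-sibling::p");
            result[key] = siblings == null
                ? new List<string>()
                : siblings.Select(p => p.InnerText.Trim()).ToList();
        }

        return result;
    }

    public static void Main()
    {
        // Same markup as in the question
        var html = "<td class='pl22'><p class='pb10 pt10 t_grey'>Experience:</p><p class='bold'>any</p></td>"
                 + "<td class='pb10 pl20'><p class='t_grey pb10 pt10'>Education:</p><p class='bold'>any</p></td>"
                 + "<td class='pb10 pl20'><p class='pb10 pt10 t_grey'>Schedule:</p><p class='bold'>part-time</p>"
                 + "<p class='text_12'>2/2 (day/night)</p></td>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        foreach (var pair in GetAllValues(doc))
        {
            Console.WriteLine(pair.Key + ": " + string.Join(", ", pair.Value));
        }
    }
}
```

On a real page the leading //td would still search the whole document. If the query picks up unrelated elements near the top of the page, a likely remedy is to select the containing table first and run the XPath relative to that node.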