C# HtmlAgilityPack: getting two nodes at the same time
I am trying to parse an HTML page, and I want to get pairs of nodes from this markup:
<li class="classli">
<div class="element">element1</div>
<div class="description">description1</div>
</li>
<li class="classli">
<div class="element">element2</div>
<div class="description">description2</div>
</li>
<li class="classli">
<div class="xxxelementclass">element3</div>
<div class="description">description3</div>
</li>
<li class="classli">
<div class="element">element4</div>
<div class="xxxclass">description4</div>
</li>
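In the HTML page, not all li tags contain the same child tags. I want to get the description and the element only when both are present.

Make the XPath in your for-each look like this: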
//li[contains(@class,'classli') and ./div[contains(@class,'element')] and ./div[contains(@class,'description')]]
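This considers only li elements that have divs with the given classes as child nodes. Also note that inside each iteration you need to start looking for the descendant nodes from the li node, so the inner queries need a leading ./ for children or .// for descendants, for clauses such as:
./div[contains(@class,'element')]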
./div[contains(@class,'description')]
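To illustrate why the inner lookups need the leading ./, here is a minimal sketch. It assumes the HtmlAgilityPack NuGet package is referenced and that the markup is available as a string; the names PairExtractor and ExtractPairs are made up for this example and are not part of the library.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class PairExtractor
{
    // Hypothetical helper: returns (element, description) text pairs
    // for every li that has both kinds of child div.
    public static List<(string Element, string Description)> ExtractPairs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var pairs = new List<(string Element, string Description)>();

        // The predicates keep only li nodes that have both child divs.
        var items = doc.DocumentNode.SelectNodes(
            "//li[contains(@class,'classli')" +
            " and ./div[contains(@class,'element')]" +
            " and ./div[contains(@class,'description')]]");

        // SelectNodes returns null, not an empty collection, when nothing matches.
        if (items == null) return pairs;

        foreach (var li in items)
        {
            // "./div" is evaluated relative to the current li. A query that
            // starts with "//div" would search the whole document again and
            // keep returning the first match on the page.
            var element = li.SelectSingleNode("./div[contains(@class,'element')]");
            var description = li.SelectSingleNode("./div[contains(@class,'description')]");
            pairs.Add((element.InnerText, description.InnerText));
        }

        return pairs;
    }
}
```

On the markup from the question this skips the fourth li, which has no div whose class contains "description". Because contains() is a substring test, the third li (class xxxelementclass) is still included.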
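A correct XPath expression for matching by CSS class is a bit more complicated. Taking the milder approach, i.e. the second snippet posted, the XPath for this task is the query string in the snippet further down that defines query (split across lines for readability).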
Thanks everyone for the help. This is how I solved it:
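// Assumes doc is an HtmlAgilityPack HtmlDocument and that using System.Linq is in scope.
foreach (var node in doc.DocumentNode.SelectNodes("//li[contains(@class,'classli')]"))
{
    // Look up each child div by its exact class value.
    var element = node.ChildNodes.FirstOrDefault(o => o.GetAttributeValue("class", "") == "element");
    var description = node.ChildNodes.FirstOrDefault(o => o.GetAttributeValue("class", "") == "description");
    // Add the pair only when both children exist, so the two lists stay aligned.
    if (element != null && description != null)
    {
        listelement.Add(element.InnerHtml);
        listdescription.Add(description.InnerHtml);
    }
}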
// XPath that matches whole class names, split across lines for readability.
// Assumes doc is an HtmlAgilityPack HtmlDocument.
var query = @"//li[contains(concat(' ', @class, ' '), ' classli ')]
                  [div[contains(concat(' ', @class, ' '), ' element ')]]
                  [div[contains(concat(' ', @class, ' '), ' description ')]]";
foreach (var node in doc.DocumentNode.SelectNodes(query))
{
    // Both lookups are relative to the current li node.
    var elementQuery = "div[contains(concat(' ', @class, ' '), ' element ')]";
    listelement.Add(node.SelectSingleNode(elementQuery).InnerText);
    var descriptionQuery = "div[contains(concat(' ', @class, ' '), ' description ')]";
    listdescription.Add(node.SelectSingleNode(descriptionQuery).InnerText);
}
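The concat(' ', @class, ' ') wrapping is what makes the test match whole class names instead of substrings. The following is a small sketch of the difference, assuming the HtmlAgilityPack NuGet package is referenced; ClassMatchDemo, Substring, Token and CountMatches are hypothetical names used only for this illustration.

```csharp
using HtmlAgilityPack;

public static class ClassMatchDemo
{
    // Substring test: also matches class="xxxelementclass".
    public const string Substring = "//li/div[contains(@class,'element')]";

    // Token test: pads the attribute with spaces so that only the
    // whole class name "element" matches.
    public const string Token =
        "//li/div[contains(concat(' ', @class, ' '), ' element ')]";

    // Hypothetical helper: counts the nodes an XPath query selects in the given markup.
    public static int CountMatches(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var nodes = doc.DocumentNode.SelectNodes(xpath);
        // SelectNodes returns null when nothing matches.
        return nodes == null ? 0 : nodes.Count;
    }
}
```

Applied to the four li blocks from the question, the substring query would also select the div with class xxxelementclass (element3), while the token query selects only the divs whose class is exactly element.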