C# 比如html agility pack中的语句或删除尾随空格?
我正在尝试将数据从网站下载到数据表中。问题是我无法访问正确的节点,因为似乎存在blank空格。以下是我目前的代码:C# 比如html agility pack中的语句或删除尾随空格?,c#,html-agility-pack,C#,Html Agility Pack,我正在尝试将数据从网站下载到数据表中。问题是我无法访问正确的节点,因为似乎存在blank空格。以下是我目前的代码: public static DataTable downloadtable() { DataTable dt = new DataTable(); string htmlCode = ""; using (WebClient client = new WebClient()) {
public static DataTable downloadtable()
{
DataTable dt = new DataTable();
string htmlCode = "";
using (WebClient client = new WebClient())
{
client.Headers.Add(HttpRequestHeader.UserAgent, "AvoidError");
htmlCode = client.DownloadString("https://www.eex.com/en/Market%20Data/Trading%20Data/Power/Hour%20Contracts%20%7C%20Spot%20Hourly%20Auction/Area%20Prices/spot-hours-area-table/2013-08-22");
}
//this is just to check the file structure from text file
System.IO.StreamWriter file = new System.IO.StreamWriter("c:\\temp\\test.txt");
file.WriteLine(htmlCode);
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(htmlCode);
dt = new DataTable();
foreach (HtmlNode table in doc.DocumentNode.SelectNodes("//table[@class='list electricity']/tr/th[@class='title'][.='Market Area']"))
{
//This is the problem name where I get the error
foreach (HtmlNode row in table.SelectNodes("//td[@class='title'][.=' 00-01 ']"))
{
foreach (var cell in row.SelectNodes("//td"))
{
//this is to check for correct result, final result would be to dump it into datatable
Console.WriteLine(cell.InnerText);
}
}
}
return dt;
}
我试图从代码中的链接下载小时价格,但似乎失败了,因为后面有空格(我想)。
节点的名称是否有like语句?或者您可以删除尾随空格吗?我相信您的问题是,您试图从
td
节点内部检索td
,该节点显然没有更多的td
<tr>
<td class="title"> 00-01 </td>
<td class="spacer"></td>
<td class="r">€/MWh</td>
<td class="spacer"></td>
<td>35.34</td>
<td class="spacer"></td>
<td>34.02</td>
<td class="spacer"></td>
<td>34.02</td>
</tr>
如果只需要00-01行,可以使用此行:
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(htmlCode);
foreach (HtmlNode row in doc2.DocumentNode.SelectNodes("//td[@class='title'][(normalize-space(.)='00-01')]/ancestor::table"))
{
foreach (var cell in row.SelectNodes("./tr/td"))
{
if (string.IsNullOrEmpty(cell.InnerText.Trim()))
continue;
Console.WriteLine(cell.InnerText.Trim());
}
}
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(htmlCode);
foreach (HtmlNode row in doc.DocumentNode.SelectNodes("//td[@class='title']"))
{
if (row.InnerText.Trim() == "00-01")
{
foreach (var cell in row.ParentNode.ChildNodes)
{
if (string.IsNullOrEmpty(cell.InnerText.Trim()))
continue;
Console.WriteLine(cell.InnerText.Trim());
}
}
}
或者您可以将其用作:
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(htmlCode);
foreach (HtmlNode row in doc2.DocumentNode.SelectNodes("//td[@class='title'][(normalize-space(.)='00-01')]"))
{
foreach (var cell in row.ParentNode.ChildNodes)
{
if (string.IsNullOrEmpty(cell.InnerText.Trim()))
continue;
Console.WriteLine(cell.InnerText.Trim());
}
}