C#: How do I get a link's title and href values separately using Html Agility Pack?

c# .net

I am trying to download a page that contains a table like the following:

<table id="content-table">
  <tbody>
    <tr>
      <th id="name">Name</th>
      <th id="link">link</th>
    </tr>

    <tr class="tt_row">

      <td class="ttr_name">
       <a title="name_of_the_movie" href="#"><b>name_of_the_movie</b></a>
       <br>
       <span class="pre">message</span>
      </td>

      <td class="td_dl">
        <a href="download_link"><img alt="Download" src="#"></a>
      </td>

    </tr>

    <tr class="tt_row"> .... </tr>
    <tr class="tt_row"> .... </tr>
  </tbody>
</table>

At the moment I don't know how to inspect nameNode and linkNode and extract the data from them.

Any help would be appreciated.

Regards

nameNode.Attributes["title"]
linkNode.Attributes["href"]

This assumes you have already selected the right nodes.

I can't test it right now, but it should be something along these lines:

    string name = nameNode.Element("a").Element("b").InnerText;
    string url = linkNode.Element("a").GetAttributeValue("href", "unknown");
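
To make the node selection concrete, here is a minimal sketch that walks the rows of the table from the question and reads the title and href separately. It assumes the HtmlAgilityPack NuGet package is referenced; the class name `LinkExtractor`, the method `Extract`, and the inline `SampleHtml` are hypothetical names introduced for illustration, and here nameNode and linkNode are taken to be the `<a>` elements themselves.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class LinkExtractor
{
    // Hypothetical sample, trimmed from the table in the question.
    public const string SampleHtml = @"
<table id=""content-table"">
  <tbody>
    <tr class=""tt_row"">
      <td class=""ttr_name"">
        <a title=""name_of_the_movie"" href=""#""><b>name_of_the_movie</b></a>
      </td>
      <td class=""td_dl"">
        <a href=""download_link""><img alt=""Download"" src=""#""></a>
      </td>
    </tr>
  </tbody>
</table>";

    public static List<(string Title, string Href)> Extract(string html)
    {
        var results = new List<(string Title, string Href)>();

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var rows = doc.DocumentNode.SelectNodes(
            "//table[@id='content-table']//tr[@class='tt_row']");
        if (rows == null) return results;

        foreach (HtmlNode row in rows)
        {
            // Assumption: the first <a> in each cell is the one we want.
            HtmlNode nameNode = row.SelectSingleNode("./td[@class='ttr_name']/a");
            HtmlNode linkNode = row.SelectSingleNode("./td[@class='td_dl']/a");
            if (nameNode == null || linkNode == null) continue; // skip incomplete rows

            string title = nameNode.GetAttributeValue("title", "");
            string href = linkNode.GetAttributeValue("href", "");
            results.Add((title, href));
        }
        return results;
    }

    static void Main()
    {
        foreach (var (title, href) in Extract(SampleHtml))
            Console.WriteLine($"{title} -> {href}");
    }
}
```

Using `GetAttributeValue` with a default avoids a null reference when an attribute is missing, which `Attributes["title"].Value` would not.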



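Here is how to extract all href URLs. I use this regular expression in one of my projects; you can modify it to fit your needs, or rewrite it to match the title as well. I think matching them in bulk would be more convenient.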


I need to follow the link (href) from code and get some data from the page it points to. The question is how to do that.

    public const string UrlExtractor = @"(?: href\s*=)(?:[\s""']*)(?!#|mailto|location.|javascript|.*css|.*this\.)(?<url>.*?)(?:[\s>""'])";

    public static Match GetMatchRegEx(string text)
    {
        return new Regex(UrlExtractor, RegexOptions.IgnoreCase).Match(text);
    }
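
Since `Match` returns only the first hit, collecting every URL needs `Regex.Matches` instead. The sketch below reuses the pattern above unchanged; the class name `HrefExtractor` and the method `GetAllUrls` are hypothetical. As far as I can tell, a fragment-only link such as `href="#"` is not skipped by the pattern but can come back with an empty `url` group, so the sketch drops empty values.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class HrefExtractor
{
    // Same pattern as in the answer above, copied verbatim.
    public const string UrlExtractor = @"(?: href\s*=)(?:[\s""']*)(?!#|mailto|location.|javascript|.*css|.*this\.)(?<url>.*?)(?:[\s>""'])";

    public static List<string> GetAllUrls(string text)
    {
        var urls = new List<string>();
        foreach (Match m in Regex.Matches(text, UrlExtractor, RegexOptions.IgnoreCase))
        {
            string url = m.Groups["url"].Value;
            if (url.Length > 0) // drop empty captures, e.g. from href="#"
                urls.Add(url);
        }
        return urls;
    }

    static void Main()
    {
        const string html =
            "<a title=\"t\" href=\"#\"><b>t</b></a>" +
            "<a href=\"download_link\"><img alt=\"Download\" src=\"#\"></a>";

        foreach (string url in GetAllUrls(html))
            Console.WriteLine(url);
    }
}
```

Note that the pattern expects a space before `href` and its lookahead rejects a link when `css` appears later on the same line, so an HTML parser remains the more robust choice for real pages.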