C#: Speeding up parsing in Html Agility Pack

c# web-crawler

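This is the method I use with Html Agility Pack to pick out certain tags. I use it to compute Google Local rankings. It seems to take quite a lot of time and to be memory-intensive. Does anyone have suggestions for making it better?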

    private void findGoogleLocal(HtmlNode node) {
        String name = String.Empty;

        if (node.Attributes["id"] != null) {
            if (node.Attributes["id"].Value.ToString().Contains("panel_") && node.Attributes["id"].Value.ToString() != "panel__") {
                GoogleLocalResults.Add(new Result(URLGoogleLocal, Listing, node, SearchEngine.Google, SearchType.Local, ResultType.GooglePlaces));
            }
        }

        if (node.HasChildNodes) {
            foreach (HtmlNode children in node.ChildNodes) {
                findGoogleLocal(children);
            }
        }
    }

Fizzler: a CSS selector engine for HAP
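The thread only names Fizzler without showing it in use, so here is a minimal sketch of what the same filter might look like with CSS selectors. It assumes the Fizzler NuGet package for HAP, whose `Fizzler.Systems.HtmlAgilityPack` namespace adds a `QuerySelectorAll` extension method to `HtmlNode`. The class name `FizzlerSketch` and the method `PanelIds` are hypothetical names chosen for illustration.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;
using Fizzler.Systems.HtmlAgilityPack; // assumption: Fizzler's HAP integration package is installed

public static class FizzlerSketch
{
    // Returns the id of every element whose id contains "panel_",
    // excluding the element whose id is exactly "panel__".
    public static string[] PanelIds(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
                  // [id*='panel_'] is the CSS substring-match attribute selector.
                  .QuerySelectorAll("*[id*='panel_']")
                  // The exact-match exclusion is done in LINQ to keep the selector simple.
                  .Where(n => n.GetAttributeValue("id", "") != "panel__")
                  .Select(n => n.GetAttributeValue("id", ""))
                  .ToArray();
    }
}
```

Calling `FizzlerSketch.PanelIds(html)` on a page string would give the matching ids; in the question's code each matching node would instead be wrapped in a `Result`.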

Why does this method have to be recursive? Just get all the nodes in one pass, for example with the LINQ support in HAP:

    var results = node.Descendants()
                      .Where(x => x.Attributes["id"] != null &&
                                  x.Attributes["id"].Value.Contains("panel_") &&
                                  x.Attributes["id"].Value != "panel__")
                      .Select(x => new Result(URLGoogleLocal, Listing, x, SearchEngine.Google, SearchType.Local, ResultType.GooglePlaces));

Thanks, this works perfectly! foreach (Result x in results) { GoogleLocalResults.Add(x); }

You can simplify this even further, because HtmlNode has an Id property available by default. So x.Id.Contains("panel_") && x.Id != "panel__" is enough; there is no need to check x.Id == null.

I just want to add another clean, simple and fast solution: use XPath.

    var matches = node.SelectNodes(@"//*[contains(@id, 'panel_') and @id != 'panel__']");
    // By default SelectNodes returns null, not an empty collection, when nothing matches.
    var results = (matches ?? Enumerable.Empty<HtmlNode>())
                    .Select(x => new Result(URLGoogleLocal, Listing, x, SearchEngine.Google, SearchType.Local, ResultType.GooglePlaces));
    foreach (var result in results)
        GoogleLocalResults.Add(result);
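
To compare the two approaches side by side, the following is a minimal self-contained sketch rather than the posters' exact code. The class `PanelFinder` and its method names are hypothetical. The LINQ version relies on the comment's claim that `HtmlNode.Id` is never null; if your HAP version behaves differently, `GetAttributeValue("id", "")` is a defensive alternative. The XPath version uses `.//*` instead of `//*`, because a leading `//` searches the whole document rather than only the descendants of the given node.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class PanelFinder
{
    // LINQ version: a single flat pass over all descendants, no recursion.
    // Assumption (from the comment in the thread): Id is an empty string,
    // not null, for nodes without an id attribute.
    public static List<HtmlNode> FindWithLinq(HtmlNode root)
    {
        return root.Descendants()
                   .Where(x => x.Id.Contains("panel_") && x.Id != "panel__")
                   .ToList();
    }

    // XPath version: the leading "." scopes the query to root's descendants.
    public static List<HtmlNode> FindWithXPath(HtmlNode root)
    {
        var matches = root.SelectNodes(".//*[contains(@id, 'panel_') and @id != 'panel__']");
        // By default SelectNodes returns null when nothing matches.
        return matches == null ? new List<HtmlNode>() : matches.ToList();
    }
}
```

Either method can be called with `doc.DocumentNode` after `doc.LoadHtml(...)`; each returned node would then be wrapped in a `Result` and added to `GoogleLocalResults` as in the question.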