C#: How to scrape data from a web page using ASP.NET
I want to extract values from my HTML page. I tried to do this with HttpWebRequest, but so far I have not been able to. Please help.
<div class="container">
<div class="one-third column">
<ol start="181">
<li><a href="/lyrics/hindi-lyrics-of-Aaye%20Din%20Bahar%20Ke.html">Aaye Din Bahar Ke</a>
</li><li><a href="/lyrics/hindi-lyrics-of-Aayega%20Aane%20Wala.html">Aayega Aane Wala</a>
</li><li><a href="/lyrics/hindi-lyrics-of-Aayi%20Milan%20Ki%20Raat.html">Aayi Milan Ki Raat</a>
</li><li><a href="/lyrics/hindi-lyrics-of-Aiyyaa.html">Aiyyaa</a>
</li><li><a href="/lyrics/hindi-lyrics-of-Ajab%20Gazabb%20Love.html">Ajab Gazabb Love</a>
</li></ol>
</div>
<div class="sixteen columns">
<hr>
More Pages:
<a href="hindi-songs-starting-A.html">1</a> : <a href="hindi-songs-starting-A-page-2.html">2</a> : 3 : <a href="hindi-songs-starting-A-page-4.html">4</a> : <a href="hindi-songs-starting-A-page-5.html">5</a> : <a href="hindi-songs-starting-A-page-6.html">6</a> :
<hr>
<center>
<h4>Hindi Lyrics By Movie Title</h4>
<p>
<a href="/lyrics/hindi-songs-starting-0.html">0-9</a>
<a href="/lyrics/hindi-songs-starting-A.html">A</a>
<a href="/lyrics/hindi-songs-starting-B.html">B</a>
<a href="/lyrics/hindi-songs-starting-W.html">W</a>
X
<a href="/lyrics/hindi-songs-starting-Y.html">Y</a>
<a href="/lyrics/hindi-songs-starting-Z.html">Z</a>
| <a href="http://www.hindilyrics.net/songs/">Top Songs</a>
</p>
</center>
</div>
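This is my HTML, and I want to get all the links.
You can use the System.Net.Http.HttpClient class and its GetAsync() method. HttpClient has good support for downloading web pages asynchronously. Alternatively, you can use the WebRequest class, which is a very basic approach.
We can use HtmlAgilityPack to scrape the site. You can download it from here.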
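The HttpClient and GetAsync() suggestion above could be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the class name PageDownloader, the method name DownloadHtmlAsync, and the fallback URL http://example.com/ are assumptions made for the example and do not come from the original answer. It only downloads the HTML as a string; extracting the links still needs a parser such as HtmlAgilityPack.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class PageDownloader
{
    // A single HttpClient is meant to be reused rather than created per request.
    private static readonly HttpClient Client = new HttpClient();

    // Hypothetical helper: downloads the page at `url` and returns its body as text.
    public static async Task<string> DownloadHtmlAsync(HttpClient client, string url)
    {
        using (HttpResponseMessage response = await client.GetAsync(url))
        {
            // Throws HttpRequestException for non-success status codes (4xx/5xx).
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }

    public static async Task Main(string[] args)
    {
        // Placeholder URL; pass the real page address as the first argument.
        string url = args.Length > 0 ? args[0] : "http://example.com/";
        string html = await DownloadHtmlAsync(Client, url);
        Console.WriteLine("Downloaded " + html.Length + " characters.");
    }
}
```

The returned string can then be handed to an HTML parser to pick out the anchor elements. Passing the HttpClient in as a parameter is a design choice that keeps the helper easy to exercise with a stubbed message handler.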
What have you tried? Please share your code. (C#)
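using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;
using HtmlAgilityPack;

string urls = "your web page";
string result = string.Empty;

// Download the page, sending a browser-like User-Agent header.
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urls);
request.UserAgent = @"Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5";
using (var response = request.GetResponse())
using (var stream = response.GetResponseStream())
using (var reader = new StreamReader(stream, Encoding.UTF8))
{
    result = reader.ReadToEnd();
}

// Parse the downloaded HTML with HtmlAgilityPack.
var doc = new HtmlDocument();
doc.LoadHtml(result);

var links = new List<string>();
// SelectNodes returns null when nothing matches, so guard against that.
var elements = doc.DocumentNode.SelectNodes("//div[@class='one-third column']");
if (elements != null)
{
    foreach (HtmlNode item in elements)
    {
        // The leading "." keeps the search inside the current node;
        // a bare "//a" would always return the first link in the whole document.
        var anchors = item.SelectNodes(".//li//a[@href]");
        if (anchors == null) continue;
        foreach (HtmlNode a in anchors)
        {
            links.Add(a.Attributes["href"].Value); // your link
        }
    }
}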