C#: WebClient downloads the HTML source of a page instead of the actual resource
I adapted this code from an MSDN blog and added a WebClient to download the resource:

string formUrl = "My login url";
string formParams = string.Format("userName={0}&password={1}&x={2}&y={3}&login={4}", "user", "password", "0", "0", "login");
string cookieHeader;
WebRequest req = WebRequest.Create(formUrl);
req.ContentType = "application/x-www-form-urlencoded";
req.Method = "POST";
byte[] bytes = Encoding.ASCII.GetBytes(formParams);
req.ContentLength = bytes.Length;
using (Stream os = req.GetRequestStream())
{
    os.Write(bytes, 0, bytes.Length);
}
WebResponse resp = req.GetResponse();
cookieHeader = resp.Headers["Set-cookie"];
string pageSource;
string getUrl = "Resource url";
WebRequest getRequest = WebRequest.Create(getUrl);
getRequest.Headers.Add("Cookie", cookieHeader);
WebResponse getResponse = getRequest.GetResponse();
using (StreamReader sr = new StreamReader(getResponse.GetResponseStream()))
{
    pageSource = sr.ReadToEnd();
    System.Console.WriteLine(sr.ToString());
}
WebClient wc = new WebClient();
wc.Headers["Content-Type"] = "application/x-www-form-urlencoded";
wc.DownloadFile("Resource url", "C:\\abc.tgz");
Console.Read();

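But abc.tgz is not what it should be. When I opened it in Notepad, I noticed that it was the page source of "My login url".
What am I doing wrong?
Is there any property of WebClient that can be used to see the error, e.g. the base address?

Let's simplify, shall we:
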
using System;
using System.Collections.Specialized;
using System.Net;

// A WebClient that keeps cookies between requests, so the login session
// established by the first request is reused by the download.
public class CookiesAwareWebClient : WebClient
{
    public CookieContainer CookieContainer { get; private set; }

    public CookiesAwareWebClient()
    {
        CookieContainer = new CookieContainer();
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        var request = base.GetWebRequest(address);
        ((HttpWebRequest)request).CookieContainer = CookieContainer;
        return request;
    }
}

class Program
{
    static void Main()
    {
        using (var client = new CookiesAwareWebClient())
        {
            var values = new NameValueCollection
            {
                { "userName", "user" },
                { "password", "password" },
                { "x", "0" }, // <- I doubt the server cares about the x position of where the user clicked on the image submit button :-)
                { "y", "0" }, // <- I doubt the server cares about the y position of where the user clicked on the image submit button :-)
                { "login", "login" },
            };

            // We authenticate first
            client.UploadValues("http://example.com/login", values);

            // Now we can download
            client.DownloadFile("http://example.com/abc.tgz", @"c:\abc.tgz");
        }
    }
}
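
On the question of whether WebClient exposes anything for seeing what went wrong: a minimal sketch, assuming you only want to know where the request actually ended up. DiagnosticWebClient, LastResponseUri and WasRedirected are hypothetical names made up for this illustration, not part of the answer above. The sketch overrides GetWebResponse to remember the final ResponseUri, which can then be compared with the URL that was requested. It only covers the synchronous WebClient methods such as DownloadFile.

```csharp
using System;
using System.Net;

// Hypothetical helper: a cookie-aware WebClient that also records
// the URI of the last response it received.
public class DiagnosticWebClient : WebClient
{
    public CookieContainer CookieContainer { get; } = new CookieContainer();

    // Final URI of the most recent response (differs from the requested
    // URI when the server redirected us, e.g. back to a login page).
    public Uri LastResponseUri { get; private set; }

    protected override WebRequest GetWebRequest(Uri address)
    {
        var request = base.GetWebRequest(address);
        if (request is HttpWebRequest http)
        {
            http.CookieContainer = CookieContainer;
        }
        return request;
    }

    protected override WebResponse GetWebResponse(WebRequest request)
    {
        var response = base.GetWebResponse(request);
        LastResponseUri = response?.ResponseUri;
        return response;
    }
}

public static class DiagnosticDemo
{
    // True when the response came from a different URI than the one requested.
    public static bool WasRedirected(Uri requested, Uri actual)
    {
        return actual != null && !requested.Equals(actual);
    }

    // Example usage; the URL and path are placeholders supplied by the caller.
    public static void Download(string url, string path)
    {
        using (var client = new DiagnosticWebClient())
        {
            var requested = new Uri(url);
            client.DownloadFile(requested, path);
            Console.WriteLine("Final URI:    " + client.LastResponseUri);
            Console.WriteLine("Content-Type: " + client.ResponseHeaders["Content-Type"]);
            if (WasRedirected(requested, client.LastResponseUri))
            {
                Console.WriteLine("Redirected; the session was probably not authenticated.");
            }
        }
    }
}
```

If the final URI is the login URL, or the Content-Type is text/html where an archive was expected, the download request most likely went out without a valid session cookie.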
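
Are you sure all the form fields are named exactly right?

It isn't downloading "source code"; it's serving you the login page HTML, that's all.

In most cases, secured pages like this aren't meant to be scraped. You have checked whether this is allowed by the site's terms and conditions, right?

getUrl is different from formUrl. Are they supposed to be two separate requests?

Sounds like your code is doing exactly what it's supposed to: it's downloading the page it is being redirected to.

Thanks a lot, Darin!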
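
As a follow-up to the comments about getting the login page HTML back: a small sketch, assuming the resource really is a gzip-compressed tarball as the .tgz name suggests. A gzip stream begins with the two bytes 0x1F 0x8B, so checking the start of the saved file tells you right away whether you got an archive or an HTML page. DownloadCheck and its methods are hypothetical names used only for this illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper for sanity-checking a downloaded .tgz file.
public static class DownloadCheck
{
    // A gzip stream starts with the magic bytes 0x1F 0x8B.
    public static bool LooksLikeGzip(byte[] head)
    {
        return head != null
            && head.Length >= 2
            && head[0] == 0x1F
            && head[1] == 0x8B;
    }

    // Reads the first two bytes of the file and checks them.
    public static bool FileLooksLikeGzip(string path)
    {
        var head = new byte[2];
        using (var fs = File.OpenRead(path))
        {
            int read = fs.Read(head, 0, head.Length);
            return read == 2 && LooksLikeGzip(head);
        }
    }
}
```

Calling DownloadCheck.FileLooksLikeGzip(@"c:\abc.tgz") after the download would return false for a file that actually contains the login page.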