C#: extracting a website's plain HTML
I am trying to access a website's content using the following code:
HttpClient httpClient = new HttpClient();
string htmlresult = "";
var response = await httpClient.GetAsync(url);
if (response.IsSuccessStatusCode)
{
htmlresult = await response.Content.ReadAsStringAsync();
}
return htmlresult;
This works, except for https://www.yahoo.com, which gives me what looks like an encrypted string rather than plain HTML.
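Yahoo serves a compressed response (the exchange uses Accept-Encoding: gzip, deflate, br), so the content in your case is gzipped and you are reading the compressed bytes as text. Quick fix: enable automatic decompression on the handler: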
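// Requires: using System.Net; using System.Net.Http; using System.Threading.Tasks;
private async Task<string> GetUrl(string url)
{
    // Let the handler decompress gzip/deflate response bodies transparently.
    HttpClientHandler handler = new HttpClientHandler()
    {
        AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
    };
    HttpClient httpClient = new HttpClient(handler);
    string htmlresult = "";
    var response = await httpClient.GetAsync(url);
    if (response.IsSuccessStatusCode)
    {
        // The body is already decompressed by the time it is read here.
        htmlresult = await response.Content.ReadAsStringAsync();
    }
    return htmlresult;
}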
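To see why the undecompressed body looked "encrypted", here is a minimal offline sketch that needs no network access. The class name GzipDemo and its helper methods are made up for illustration and are not part of the original answer. It gzips a small HTML string to stand in for a compressed response body, shows that decoding those bytes directly as text does not give the HTML back, and then recovers the text with GZipStream.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical demo class; stands in for what a server and client do with gzip.
public static class GzipDemo
{
    // Compress text with gzip, imitating a server's compressed response body.
    public static byte[] Compress(string text)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                byte[] raw = Encoding.UTF8.GetBytes(text);
                gzip.Write(raw, 0, raw.Length);
            }
            // ToArray still works after the GZipStream has closed the MemoryStream.
            return output.ToArray();
        }
    }

    // Decompress gzip bytes back to text, as automatic decompression would.
    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        const string html = "<html><body>Hello</body></html>";
        byte[] body = Compress(html);

        // Gzip data always starts with the magic bytes 0x1F 0x8B.
        Console.WriteLine(body[0] == 0x1F && body[1] == 0x8B); // True

        // Reading the compressed bytes as text does not yield the HTML.
        Console.WriteLine(Encoding.UTF8.GetString(body) == html); // False

        // Decompressing first recovers the original markup.
        Console.WriteLine(Decompress(body) == html); // True
    }
}
```

If a response still looks garbled after enabling AutomaticDecompression, checking the first two bytes of the raw body for 0x1F 0x8B is a quick way to tell whether it is gzip data that was never decompressed.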
Are you sure it isn't just compressed or encoded? Yahoo uses Accept-Encoding: gzip, deflate, br
This is the correct answer.