VB.NET HttpWebRequest returns strange characters instead of HTML code
I'm trying to crawl some websites, and it works like a charm. But there is one big problem: on some pages (not many) I get strange characters instead of the HTML code. It looks like this:
;�
The data you are downloading is GZip-compressed, so you need to decompress it. Change your function to:
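Imports System.IO
Imports System.Net

Dim request As HttpWebRequest
Dim response As HttpWebResponse

Public Function GetRequest(ByVal Params() As Object) As String()
    Dim url As String = CStr(Params(0))
    Dim mycookie As String = CStr(Params(1))
    'request.AllowAutoRedirect = True
    request = CType(WebRequest.Create(url), HttpWebRequest)
    request.CookieContainer = New CookieContainer()
    ' Send the raw Cookie header only when a cookie was actually supplied
    If Not String.IsNullOrEmpty(mycookie) AndAlso Not mycookie Like "nocookie" Then
        request.Headers("Cookie") = mycookie
    End If
    ' Let the framework decompress GZip responses transparently
    request.AutomaticDecompression = DecompressionMethods.GZip
    response = CType(request.GetResponse(), HttpWebResponse)
    Dim html(1) As String
    ' html(0): the URI that actually responded; html(1): the decompressed body
    html(0) = request.Address.ToString()
    Using reader As New StreamReader(response.GetResponseStream())
        html(1) = reader.ReadToEnd()
    End Using
    response.Close()
    Return html
End Function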
Usage:
Dim params(1) As Object
params(0) = url
params(1) = "nocookie" ' no cookie to send
Dim page As String = GetRequest(params)(1)
Comments:
Set the encoding of the response to UTF-8.
But.. if I'm also crawling other websites, isn't that a problem?
Well, you are getting a response, aren't you? The problem looks like an encoding issue. Can you share the URL you are trying to download?
Sure. It looks like I have to look up the encoding of each site, right?
UTF-8 doesn't work :( I get the same characters as above.
Apparently this can be done automatically:
@AndrewMorton, yes, I think so, both approaches would solve his problem.
Thank you very much! I really appreciate your help!!
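On the encoding question raised in the comments: once the GZip issue is handled, a crawler that visits many sites may still want to pick the text encoding per response rather than hard-coding UTF-8. The following is only a rough sketch of that idea, not something from the thread. `ResolveEncoding` and `ReadHtml` are hypothetical helper names, the fallback to UTF-8 is an assumption, and on .NET Core / .NET 5+ some legacy charsets may additionally need the code pages encoding provider to be registered.

```vbnet
Imports System.IO
Imports System.Net
Imports System.Text

Module EncodingSketch
    ' Hypothetical helper: map the charset reported by the server to an Encoding,
    ' falling back to UTF-8 (an assumption) when it is missing or unknown.
    Function ResolveEncoding(ByVal charset As String) As Encoding
        If String.IsNullOrWhiteSpace(charset) Then Return Encoding.UTF8
        Try
            ' Some servers wrap the charset in quotes, so strip them first
            Return Encoding.GetEncoding(charset.Trim().Trim(""""c))
        Catch ex As ArgumentException
            Return Encoding.UTF8
        End Try
    End Function

    ' Hypothetical helper: download a page, decompress it, and decode it
    ' with the encoding taken from the response's Content-Type header.
    Function ReadHtml(ByVal url As String) As String
        Dim request = CType(WebRequest.Create(url), HttpWebRequest)
        request.AutomaticDecompression = DecompressionMethods.GZip Or DecompressionMethods.Deflate
        Using response = CType(request.GetResponse(), HttpWebResponse)
            Dim enc As Encoding = ResolveEncoding(response.CharacterSet)
            Using reader As New StreamReader(response.GetResponseStream(), enc)
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function
End Module
```

Pages that declare their charset only in a `<meta>` tag would not be covered by this; that case would need the HTML itself to be inspected.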