Getting an HTMLDocument from HttpWebRequest without using HtmlAgilityPack
I am trying to write a function that returns an "HtmlDocument" using "HttpWebRequest" instead of a browser, but I am stuck on transferring the InnerHtml. I don't understand how to set the value of "mWebPage", because VB does not accept "New" for HTMLDocument.

I know I could use "HtmlAgilityPack", but I want to test my current code by changing only the web request, without changing all the parsing code. (For that I need an HtmlDocument.) After this test I will try changing the parsing code.
Function mWebRe(ByVal mUrl As String) As HTMLDocument
    Dim request As HttpWebRequest = CType(WebRequest.Create(mUrl), HttpWebRequest)
    ' Set some reasonable limits on resources used by this request
    request.MaximumAutomaticRedirections = 4
    request.MaximumResponseHeadersLength = 4
    ' Set credentials to use for this request.
    request.Credentials = CredentialCache.DefaultCredentials
    ' Here I've tried many types
    Dim mWebPage As HTMLDocument
    Try
        Dim request2 As HttpWebRequest = WebRequest.Create(mUrl)
        Dim response2 As HttpWebResponse = request2.GetResponse()
        Dim reader2 As StreamReader = New StreamReader(response2.GetResponseStream())
        Dim WebContent As String = reader2.ReadToEnd()
        ' This is my last attempt
        ' This gives a NullReferenceException (mWebPage was never assigned)
        mWebPage.Body.InnerHtml = WebContent
    Catch ex As Exception
        MsgBox(ex.ToString)
    End Try
    Return mWebPage
End Function
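I have tried many approaches (including importing the HTML Object Library), but none of them worked :(

OK, this is turning more and more into a hack, but it should work. First, you need a WebBrowser control declared at class level: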
Private m_objWebBrowser As WebBrowser
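Next, add an event handler for the DocumentCompleted event; this handler holds all of the HTML parsing. You can use the OpenNew method of the WebBrowser control's Document to obtain an HtmlDocument instance: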
Private Sub HandleParsing(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs)
    'Use your code for generating WebContent.
    Dim WebContent As String = "<html></html>"
    Dim mWebPage As HtmlDocument = DirectCast(sender, WebBrowser).Document.OpenNew(True)
    mWebPage.Write(WebContent)
End Sub
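The answer describes the wiring but does not show it, so here is a minimal sketch of how the pieces might fit together. The form name ParserForm and the use of OnLoad are assumptions for illustration only, not part of the original answer. It also assumes a Windows Forms application (STA thread with a running message loop). The control's Document is typically Nothing until a navigation has completed, which is why the sketch navigates to about:blank before touching it.

```vbnet
Imports System
Imports System.Windows.Forms

' Hypothetical host form; only the WebBrowser wiring matters here.
Public Class ParserForm
    Inherits Form

    ' Class-level WebBrowser. It is never added to Controls, so it is not shown.
    Private m_objWebBrowser As WebBrowser

    Protected Overrides Sub OnLoad(ByVal e As EventArgs)
        MyBase.OnLoad(e)
        m_objWebBrowser = New WebBrowser()
        AddHandler m_objWebBrowser.DocumentCompleted, AddressOf HandleParsing
        ' Load an empty page so that Document is available inside the handler.
        m_objWebBrowser.Navigate("about:blank")
    End Sub

    Private Sub HandleParsing(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs)
        Dim browser As WebBrowser = DirectCast(sender, WebBrowser)
        ' Detach first so that writing a new document cannot re-enter this handler.
        RemoveHandler browser.DocumentCompleted, AddressOf HandleParsing

        ' Placeholder markup; replace with the string downloaded via HttpWebRequest.
        Dim WebContent As String = "<html><body><p>test</p></body></html>"
        Dim mWebPage As HtmlDocument = browser.Document.OpenNew(True)
        mWebPage.Write(WebContent)
        ' mWebPage can now be passed to the existing parsing code.
    End Sub
End Class
```

Because the document only becomes available inside the event handler, this approach does not fit a synchronous function that returns the document directly; the parsing code has to be started from the handler instead.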
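I found a solution online and modified my code as shown below. For it to work, you must enable the reference to the "Microsoft HTML Object Library" (under the COM references). It is dated, but it seems to be the only way to produce an HTML document without using a WebBrowser. I hope it helps someone.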
' Needs Imports System.IO, System.Net and System.Windows.Forms,
' plus a COM reference to the "Microsoft HTML Object Library" (MSHTML).
Function mWebRe(ByVal mUrl As String) As MSHTML.HTMLDocument
    Dim request As HttpWebRequest = CType(WebRequest.Create(mUrl), HttpWebRequest)
    Dim doc As MSHTML.IHTMLDocument2 = DirectCast(New MSHTML.HTMLDocument(), MSHTML.IHTMLDocument2)
    ' Set some reasonable limits on resources used by this request
    request.MaximumAutomaticRedirections = 4
    request.MaximumResponseHeadersLength = 4
    ' Set credentials to use for this request.
    request.Credentials = CredentialCache.DefaultCredentials
    Try
        Dim WebContent As String
        ' Dispose of the response and the reader as soon as the body has been read.
        Using response As HttpWebResponse = CType(request.GetResponse(), HttpWebResponse)
            Using reader As New StreamReader(response.GetResponseStream())
                WebContent = reader.ReadToEnd()
            End Using
        End Using
        doc.clear()
        doc.write(WebContent)
        doc.close()
        ' Make sure that the data is fully loaded.
        While (doc.readyState <> "complete")
            ' Uncomment for more waiting (if needed)
            'System.Threading.Thread.Sleep(1000)
            Application.DoEvents()
        End While
    Catch ex As Exception
        MsgBox(ex.ToString())
    End Try
    Return DirectCast(doc, MSHTML.HTMLDocument)
End Function
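Sorry, this also gives me a null reference exception.

The problem is that your HtmlDocument is a null reference. Get an instance from the WebBrowser control. I have edited the answer to show how.

That way I get a null reference on Dim mWebPage As HtmlDocument = webBrowser1.Document.OpenNew(true)

Edited to show the proper way to get an instance of the HtmlDocument type from the WebBrowser control.

I really appreciate your effort (if I have learned to write code, it is only thanks to help from people like you), but your suggestion is not what I need. I found a solution online and posted it as an answer: it is rusty, but it is what I was looking for.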