Getting an HTMLDocument from HttpWebRequest without using HtmlAgilityPack

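I am trying to write a function that returns an "HtmlDocument" using "HttpWebRequest" instead of a browser, but I am stuck on transferring the inner HTML.

I don't understand how to set the value of "mWebPage", because VB does not accept "New" for HTMLDocument.

I know I could use "HtmlAgilityPack", but I want to test my current code by changing only the web request, without changing all of the parsing code. (For that, I need an HtmlDocument.)

After this test, I will try to change the parsing code.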

Function mWebRe(ByVal mUrl As String) As HTMLDocument
    Dim request As HttpWebRequest = CType(WebRequest.Create(mUrl), HttpWebRequest)

    ' Set some reasonable limits on resources used by this request
    request.MaximumAutomaticRedirections = 4
    request.MaximumResponseHeadersLength = 4

    ' Set credentials to use for this request.
    request.Credentials = CredentialCache.DefaultCredentials

    ' Here I've tried many types
    Dim mWebPage As HTMLDocument
    Try
        Dim request2 As HttpWebRequest = WebRequest.Create(mUrl)
        Dim response2 As HttpWebResponse = request2.GetResponse()
        Dim reader2 As StreamReader = New StreamReader(response2.GetResponseStream())
        Dim WebContent As String = reader2.ReadToEnd()

        'This is my last attempt
        'This gives Null Reference Exception
        mWebPage.Body.InnerHtml = WebContent


    Catch ex As Exception
        MsgBox(ex.ToString) 
    End Try

    Return mWebPage
End Function

I tried many approaches (including importing the HTML Object Library), but none of them worked :(

OK, this is becoming more and more of a hack, but it should work.

First, you need a WebBrowser control declared at class level:

Private m_objWebBrowser As WebBrowser
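
Next, add an event handler for the DocumentCompleted event; this handler holds all of the HTML parsing code. You can get an HtmlDocument instance by calling the OpenNew method on the WebBrowser control's Document: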

Private Sub HandleParsing(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs)

    'Use your code for generating WebContent.
    Dim WebContent As String = "<html></html>"

    Dim mWebPage As HtmlDocument = DirectCast(sender, WebBrowser).Document.OpenNew(True)

    mWebPage.Write(WebContent)

End Sub
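
The handler above only runs once it has been attached and the control has loaded something, and the control's Document property is Nothing until a page has been loaded. The following is a minimal sketch of one way to wire this up, assuming a Windows Forms application with a running message loop. The class name PageParser, the Start method, and the placeholder markup are hypothetical and are not part of the original answer.

```vbnet
Imports System.Windows.Forms

Public Class PageParser
    ' Class-level WebBrowser, as in the answer above.
    Private m_objWebBrowser As WebBrowser

    ' Hypothetical entry point; call it from the UI (STA) thread.
    Public Sub Start()
        m_objWebBrowser = New WebBrowser()
        AddHandler m_objWebBrowser.DocumentCompleted, AddressOf HandleParsing

        ' Load a blank page so that Document is populated
        ' by the time DocumentCompleted fires.
        m_objWebBrowser.Navigate("about:blank")
    End Sub

    Private Sub HandleParsing(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs)
        Dim browser As WebBrowser = DirectCast(sender, WebBrowser)

        ' Detach the handler so the parsing code runs only once.
        RemoveHandler browser.DocumentCompleted, AddressOf HandleParsing

        ' Placeholder markup (assumption); substitute the string read from HttpWebRequest.
        Dim WebContent As String = "<html><body><p>placeholder</p></body></html>"

        Dim mWebPage As HtmlDocument = browser.Document.OpenNew(True)
        mWebPage.Write(WebContent)

        ' The existing parsing code can work with mWebPage from here.
    End Sub
End Class
```

Navigating to "about:blank" first is a design choice intended to avoid a null Document when OpenNew is called; whether it is needed depends on how and when the control is created in your application.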

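I found a solution online and modified my code as shown below. For it to work, you must add a reference to the "Microsoft HTML Object Library" (under the COM references).

It is outdated, but it seems to be the only way to generate an HTML document without using a WebBrowser.

I hope it helps someone.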

Function mWebRe(ByVal mUrl As String) As MSHTML.HTMLDocument
    Dim request As HttpWebRequest = WebRequest.Create(mUrl)
    Dim doc As MSHTML.IHTMLDocument2 = New MSHTML.HTMLDocument

    ' Set some reasonable limits on resources used by this request
    request.MaximumAutomaticRedirections = 4
    request.MaximumResponseHeadersLength = 4

    ' Set credentials to use for this request.
    request.Credentials = CredentialCache.DefaultCredentials

    Try
        Dim response As HttpWebResponse = request.GetResponse()
        Dim reader As StreamReader = New StreamReader(response.GetResponseStream())
        Dim WebContent As String = reader.ReadToEnd()

        doc.clear()
        doc.write(WebContent)
        doc.close()

        ' Wait until the document has finished loading.
        While (doc.readyState <> "complete")
            ' Uncomment to wait longer between checks (if needed)
            'System.Threading.Thread.Sleep(1000)
            Application.DoEvents()
        End While
    Catch ex As Exception
        MsgBox(ex.ToString)
    End Try

    Return doc
End Function

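Sorry, this also gives me a null reference exception.

The problem is that your HtmlDocument is a null reference. Get an instance from a WebBrowser control. I edited the answer to show how.

That way, I get a null reference on
Dim mWebPage As HtmlDocument = webBrowser1.Document.OpenNew(True)

Edited to show the correct way to get an instance of the HtmlDocument type from a WebBrowser control.

I really appreciate your effort (if I have learned to write code, it is only thanks to help from people like you), but your suggestion is not what I need. I found a solution online and posted it as an answer: it is rusty, but it is what I was looking for.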