Need help with web scraping via MSXML2.XMLHTTP
I have code that scrapes data correctly:
Sub Phase3()
    ' Requires references to "Microsoft HTML Object Library" and "Microsoft XML"
    Dim Doc As HTMLDocument
    Dim ar(1 To 250, 1 To 3) As String
    Dim URL As String
    Dim BaseURL As String
    Dim NumberofStocksymbols As Variant
    Dim objHTTP As Object
    Dim RowCounter As Long
    Dim TotalStocksToLoad As Long
    Dim StockCount As Long
    '
    Set Doc = New HTMLDocument
    Set objHTTP = New MSXML2.XMLHTTP
    '
    RowCounter = 5 ' First worksheet row that holds a stock symbol
    TotalStocksToLoad = 30
    '
    BaseURL = "https://www.barchart.com/stocks/quotes/"
    '
    ' Read the symbols from column B into a one-dimensional array
    NumberofStocksymbols = WorksheetFunction.Transpose(Range("B5:B" & TotalStocksToLoad + RowCounter - 1).Value)
    '
    For StockCount = 1 To UBound(NumberofStocksymbols)
        '
        DoEvents
        '
        Application.StatusBar = "Processing " & StockCount & " - " & NumberofStocksymbols(StockCount)
        '
        URL = BaseURL & NumberofStocksymbols(StockCount) & "/analyst-ratings"
        '
        objHTTP.Open "GET", URL, False
        objHTTP.send
        '
        If objHTTP.Status = 200 Then
            Doc.body.innerHTML = objHTTP.responseText
        Else
            'There has been an error
            Doc.body.innerHTML = ""
        End If
        '
        ' Skip any element that is missing from the page instead of stopping
        On Error Resume Next
        '
        ar(StockCount, 1) = Doc.getElementsByClassName("block__colored-header")(3).innerText
        ar(StockCount, 2) = Doc.getElementsByClassName("block__average_value")(3).innerText
        ar(StockCount, 3) = Doc.getElementsByClassName("bold")(3).innerText
    Next
    '
    Range("P5").Resize(TotalStocksToLoad, 3).Value = ar
End Sub
For some reason, the following code does not work:
Sub Phase1()
    ' Requires references to "Microsoft HTML Object Library" and "Microsoft XML"
    Dim Doc As HTMLDocument
    Dim ar(1 To 250, 1 To 8) As String
    Dim URL As String
    Dim Stocksymbols As Variant
    Dim objHTTP As Object
    Dim CellCounter As Long
    Dim RowCounter As Long
    Dim TotalStocksToLoad As Long
    Dim StockCount As Long
    '
    Set Doc = New HTMLDocument
    Set objHTTP = New MSXML2.XMLHTTP
    '
    CellCounter = 0 ' Left to right cell counter on the web page that is being scraped
    RowCounter = 5
    TotalStocksToLoad = 30
    '
    URL = "https://www.barchart.com/stocks/performance/percent-change/declines?timeFrame=3m&viewName=main"
    '
    For StockCount = 1 To TotalStocksToLoad
        DoEvents
        '
        Application.StatusBar = "Processing " & StockCount
        '
        objHTTP.Open "GET", URL, False
        objHTTP.send
        '
        If objHTTP.Status = 200 Then
            Doc.body.innerHTML = objHTTP.responseText
        Else
            'There has been an error
            Doc.body.innerHTML = ""
        End If
        '
        ' Skip any cell that is missing from the page instead of stopping
        On Error Resume Next
        '
        ar(StockCount, 1) = Trim(Doc.getElementsByTagName("td")(CellCounter).innerText)
        ar(StockCount, 2) = Trim(Doc.getElementsByTagName("td")(CellCounter + 1).innerText)
        ar(StockCount, 3) = ""
        ar(StockCount, 4) = Doc.getElementsByTagName("td")(CellCounter + 3).innerText
        ar(StockCount, 5) = Doc.getElementsByTagName("td")(CellCounter + 4).innerText
        ar(StockCount, 6) = Doc.getElementsByTagName("td")(CellCounter + 5).innerText
        ar(StockCount, 7) = Doc.getElementsByTagName("td")(CellCounter + 7).innerText
        ar(StockCount, 8) = Doc.getElementsByTagName("td")(CellCounter + 8).innerText
        '
        CellCounter = CellCounter + 12 ' Advance to next row on URL page
    Next
    Range("B5").Resize(TotalStocksToLoad, 8).Value = ar
    Application.StatusBar = "Done"
End Sub
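The second procedure runs without errors; the problem is that this scraping method returns no results. If I use InternetExplorer for the scraping instead, the same parsing code works fine. Any ideas would be greatly appreciated. Sorry everyone, I have corrected my initial post to separate the two code listings and finish my thought so that it is an actual question.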
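One check that might narrow this down: because On Error Resume Next hides every failed lookup, it is hard to tell whether the request failed or whether the response simply contains no table cells. The sketch below is a diagnostic only, not a fix. It assumes (unconfirmed) that the table may be filled in by script after the page loads, which would explain why InternetExplorer sees the cells and XMLHTTP does not. The procedure name CheckStaticHtml is made up for illustration, and the count is a rough text match that would also pick up "<td" inside inline scripts.

```vba
Sub CheckStaticHtml()
    ' Diagnostic sketch (hypothetical helper, not part of the original code):
    ' reports whether the raw response from XMLHTTP contains any <td> markup.
    Dim objHTTP As Object
    Dim html As String
    Dim tdCount As Long
    Dim URL As String

    URL = "https://www.barchart.com/stocks/performance/percent-change/declines?timeFrame=3m&viewName=main"

    ' Late binding, so no library reference is needed for this check
    Set objHTTP = CreateObject("MSXML2.XMLHTTP")
    objHTTP.Open "GET", URL, False
    objHTTP.send

    Debug.Print "Status: " & objHTTP.Status

    If objHTTP.Status = 200 Then
        html = objHTTP.responseText
        ' Count "<td" occurrences by comparing the length before and after removing them
        tdCount = (Len(html) - Len(Replace(html, "<td", "", , , vbTextCompare))) / Len("<td")
        Debug.Print "Response length: " & Len(html)
        Debug.Print "<td occurrences: " & tdCount
    End If
End Sub
```

If the Immediate window shows a 200 status but zero or very few "<td" occurrences, the data is presumably not in the static HTML, and indexing getElementsByTagName("td") has nothing to return.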