Excel: scraping a website with VBA, but it doesn't work. What should I do?

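I have this website:

I have written some code, but it doesn't work even for the first page. My goal is to extract the following establishment details from each page, for example: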

Column 1: 103 West Lounge (Food Service Inspections)
Column 2: 103 WEST PACES FERRY RD ATLANTA, GA 30318
(Skip this detail) View inspections:
Column 3: July 10, 2012 Score: 92, Grade: A 
Column 4: July 26, 2013 Score: 90, Grade: A
Column 5: February 19, 2014 Score: 98, Grade: A
Column 6: December 12, 2014 Score: 100, Grade: A
Column 7: November 13, 2015 Score: 99, Grade: A

At the moment the code only pulls URLs from somewhere on the page, without any of the details. I need help checking what to change or what is wrong:

Sub Test()
    Dim IE As New InternetExplorer
    Dim html As HTMLDocument
    Dim link As Object
    Dim ws As Worksheet

    Set ws = Sheets("Sheet1")

    Application.ScreenUpdating = False
    Set IE = New InternetExplorer

    ' Test 2 pages (page 2 and page 3) starting from page 2. So far so good.
    For i = 2 To 4 Step 2

        myurl = "http://ga.healthinspections.us/georgia/search.cfm?start=" & i & "1&1=1&f=s&r=ANY&s=&inspectionType=Food&sd=03/26/2016&ed=04/25/2016&useDate=NO&county=Fulton&"
        IE.Visible = False
        IE.navigate myurl
        Do
            DoEvents
        Loop Until IE.readyState = READYSTATE_COMPLETE

        Set html = IE.document
        ' I assume the problem is here, because I still need code to find these details.
        Set link = html.getElementsByTagName("a")

        ' This part was intended to test whether I can extract at least one detail.
        For m = 1 To 2
            For Each myurl In link
                Cells(m, 1) = link

            Next
        Next m
    Next i
    ' I also tried testing with MsgBox, but no luck either
    'MsgBox link

    IE.quit
    Set IE = Nothing
    Application.StatusBar = ""
    Application.ScreenUpdating = True

End Sub

Maybe something is messed up, or I simply lack the knowledge. Any help would be appreciated.

Do you have the references set for Microsoft Internet Controls and Microsoft HTML Object Library? If so, try using this in place of the corresponding part of your code:

Dim IE As New InternetExplorer
Dim html As MSHTML.HTMLDocument
Dim link As Object
Dim ws As Worksheet

Set ws = Sheets("Sheet1")

Application.ScreenUpdating = False
Set IE = New InternetExplorer
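
Separately from the references, note that the inner loop in the question writes the collection variable `link` to the cell on every pass instead of the element the loop is currently visiting. A minimal sketch of reading each anchor individually is below. The helper name `LinkTexts`, the sample markup, and the idea that the wanted details sit inside `<a>` elements are assumptions for illustration, not taken from the real page. The demo parses a hard-coded HTML string through a late-bound `htmlfile` object rather than driving Internet Explorer.

```vba
' Hypothetical helper: returns the visible text of every <a> element in doc,
' one collection entry per link.
Function LinkTexts(ByVal doc As Object) As Collection
    Dim result As New Collection
    Dim anchorEl As Object

    For Each anchorEl In doc.getElementsByTagName("a")
        result.Add Trim$(anchorEl.innerText)
    Next anchorEl

    Set LinkTexts = result
End Function

' Demo on made-up markup; the real page layout may differ.
Sub DemoLinkTexts()
    Dim doc As Object
    Dim texts As Collection
    Dim r As Long

    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = "<a href='a.cfm'>103 West Lounge</a>" & _
                         "<a href='b.cfm'>July 10, 2012 Score: 92, Grade: A</a>"

    Set texts = LinkTexts(doc)

    ' One link per row in column A.
    For r = 1 To texts.Count
        ThisWorkbook.Worksheets("Sheet1").Cells(r, 1).Value = texts(r)
    Next r
End Sub
```

In the question's own loop, the equivalent change would be to pass `IE.document` to the helper, or to write `myurl.innerText` instead of `link` and advance the row counter on each pass.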

You can use the method below to get the innerText:

Sub DumpData()
    Dim IE As Object
    Dim itm As Object
    Dim URL As String
    Dim RowCount As Long

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True

    URL = "http://ga.healthinspections.us/georgia/search.cfm?start=1&1=1&f=s&r=ANY&s=&inspectionType=Food&sd=03/26/2016&ed=04/25/2016&useDate=NO&county=Fulton&"

    'Wait for site to fully load (readyState 4 = READYSTATE_COMPLETE)
    IE.Navigate2 URL
    Do While IE.Busy Or IE.readyState <> 4
       DoEvents
    Loop

    'Dump tag name, id, class and the first 1024 characters of text for every element
    With Sheets("Sheet1")
       .Cells.ClearContents
       RowCount = 1
       For Each itm In IE.Document.all
          .Range("A" & RowCount) = itm.tagName
          .Range("B" & RowCount) = itm.ID
          .Range("C" & RowCount) = itm.className
          .Range("D" & RowCount) = Left(itm.innerText, 1024)

          RowCount = RowCount + 1
       Next itm
    End With
End Sub
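I got this from a nice guy named Joel. He is a genius at this stuff.

Once the data is in the worksheet, do some simple cleanup to strip out the excess and you should be good to go.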
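
The cleanup step could be partly automated. Here is a rough sketch that flags the rows of the dump whose text looks like one of the inspection lines from the question ("July 10, 2012 Score: 92, Grade: A"). The names `IsInspectionLine` and `FlagInspectionRows`, the `Like` pattern, and the use of column E for the flag are my own assumptions; the pattern is guessed from the sample lines in the question, so it may need adjusting against the real page.

```vba
' Hypothetical filter: True when lineText is a single line shaped like
' "July 10, 2012 Score: 92, Grade: A".
Function IsInspectionLine(ByVal lineText As String) As Boolean
    Dim s As String
    s = Trim$(lineText)

    ' Reject container elements whose text spans several lines.
    If InStr(s, vbCr) > 0 Or InStr(s, vbLf) > 0 Then Exit Function

    IsInspectionLine = s Like "* #*, #### Score: #*, Grade: ?"
End Function

' Writes True/False into column E for every dumped row, based on the text
' in column D, so the sheet can be filtered or sorted afterwards.
Sub FlagInspectionRows()
    Dim ws As Worksheet
    Dim lastRow As Long
    Dim r As Long

    Set ws = ThisWorkbook.Worksheets("Sheet1")
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    For r = 1 To lastRow
        ws.Cells(r, "E").Value = IsInspectionLine(CStr(ws.Cells(r, "D").Value))
    Next r
End Sub
```

Because the dump lists every element, a parent and its child can carry the same text, so the same inspection line may be flagged more than once. Removing duplicates would still be a separate step.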

Of course I have enabled both libraries, but no luck. I also changed Dim html to MSHTML.HTMLDocument. The code itself runs without errors, but it pulls URLs from somewhere, and that is not what I am looking for. All I know is that the details can't be extracted, either because of Set link = html.getElementsByTagName("a") or for some other reason.

Thank you, Joel and you. It works, and at least it gives me something, but I am not going to clean up 893 pages by hand. Too messy. :)
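
For the many-pages problem, one possible starting point is to build the search URL from the `start` value, so that the same routine can be called once per page. This is only a sketch: `SearchUrl`, `ListSearchUrls`, and `PAGE_STEP` are hypothetical names, and the step of 20 between pages is a guess based on the values 21 and 41 used in the question's loop, not something confirmed against the site.

```vba
' Assumed distance between the start values of consecutive result pages.
Const PAGE_STEP As Long = 20

' Builds the search URL from this thread for a given start value.
Function SearchUrl(ByVal startIndex As Long) As String
    SearchUrl = "http://ga.healthinspections.us/georgia/search.cfm?start=" & startIndex & _
                "&1=1&f=s&r=ANY&s=&inspectionType=Food&sd=03/26/2016" & _
                "&ed=04/25/2016&useDate=NO&county=Fulton&"
End Function

' Prints the URLs for the first three assumed pages to the Immediate window.
Sub ListSearchUrls()
    Dim startIndex As Long

    For startIndex = 1 To 41 Step PAGE_STEP
        Debug.Print SearchUrl(startIndex)
    Next startIndex
End Sub
```

Each URL could then be passed to whatever routine navigates and extracts, instead of hard-coding a single address.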