Extracting data from an embedded web page with Excel VBA
I am trying to use VBA in Excel to access data in a web page that is embedded within another web page. I know how to do this when the table is on a non-embedded page, and I also know how to navigate to the product's page using VBA. I can't simply navigate to the embedded page, because a product ID lookup converts the part number to an ID, and I don't have access to that database.
The page is on the Siemens support site; its URL is built in the code below.
For clarity I would post a picture of the elements, but I don't have 10 reputation points.
The table I need to get information from is the Product Life Cycle table.
If you save the page as an HTMLDocument in VBA using the following code, you can see the correct URL in the attribute named src under the corresponding item:
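' ie is assumed to be an InternetExplorer instance created earlier
For Each cell In Selection
    link = "http://support.automation.siemens.com/US/llisapi.dll?func=cslib.csinfo&lang=en&objid=" & cell & "&caller=view"
    ie.navigate link
    ' Wait until the page has finished loading
    Do
        DoEvents
    Loop Until ie.readyState = READYSTATE_COMPLETE
    Dim doc As HTMLDocument
    Set doc = ie.document
Next cell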
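Is there a way to index into this table using VBA, or do I have to contact the company and try to get access to the product IDs so that I can navigate directly to that page?
Regarding my comment below, here is the code generated by the recorded macro: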
ActiveCell.FormulaR1C1 = _
    "http://support.automation.siemens.com/WW/llisapi.dll?func=cslib.csinfo&lang=en&objid=6ES7194-1AA01-0XA0&caller=view"
Range("F9").Select
With ActiveSheet.QueryTables.Add(Connection:= _
    "URL;http://support.automation.siemens.com/WW/llisapi.dll?func=ll&objid=7224052&nodeid0=10997566&caller=view&lang=en&siteid=cseus&aktprim=0&objaction=csopen&extranet=standard&viewreg=WW" _
    , Destination:=Range("$F$9"))
    .FieldNames = True
    .RowNumbers = False
    ' The rest of the recorded properties are not shown here
    .Refresh BackgroundQuery:=False
End With
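I know where to find the string URL;http://support.automation.siemens.com/WW/llisapi.dll?func=ll&objid=7224052&nodeid0=10997566&caller=view&lang=en&siteid=cseus&aktprim=0&objaction=csopen&extranet=standard&viewreg=WW, but I don't know how to save it to a variable.

I'm not sure I fully understand your question, but here is some code that gets the source code behind the table of interest. You can then extract the data you want with functions such as InStr and Mid.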
' open IE, navigate to the website of interest and loop until fully loaded
Set ie = CreateObject("InternetExplorer.Application")
With ie
    .Visible = True
    .navigate "http://support.automation.siemens.com/WW/view/en/7224052"
    .Top = 50
    .Left = 530
    .Height = 400
    .Width = 400
    Do Until Not ie.Busy And ie.ReadyState = 4
        DoEvents
    Loop
End With

' Assign the source code behind the page to a variable
my_var = ie.document.frames(3).document.DocumentElement.innerhtml

' Extract the url for the "Product life cycle" table
pos_1 = InStr(1, my_var, "product life cycle", vbTextCompare)
pos_2 = InStr(pos_1, my_var, "/WW/llisapi", vbTextCompare)
pos_3 = InStr(pos_2, my_var, """><", vbTextCompare)
pos_4 = InStr(pos_3, my_var, """/>", vbTextCompare)
table_url = Mid(my_var, pos_2, pos_3 - pos_2)
' Turn the HTML-escaped "&amp;" back into a plain "&"
table_url = Replace(table_url, "amp;", "", 1, -1, vbTextCompare)
table_url = "http://support.automation.siemens.com" & table_url

' navigate to the table url
ie.navigate table_url
Do Until Not ie.Busy And ie.ReadyState = 4
    DoEvents
Loop

' assign the source code for this page to a variable and extract the desired information
my_var2 = ie.document.body.innerhtml
pos_1 = InStr(1, my_var2, "ET 200X, basic modules,", vbTextCompare)

' close ie
ie.Quit
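The InStr/Mid pattern used above fails with a runtime error as soon as one marker is missing, because InStr then returns 0 and the next call starts from position 0. A minimal sketch of a safer helper follows. TextBetween and DemoTextBetween are hypothetical names that do not come from the answer, and the HTML string in the demo is invented sample data, not markup from the Siemens page.

```vba
Option Explicit

' Hypothetical helper (not part of the original answer): returns the text
' found between startMarker and endMarker, or "" when either is missing.
Function TextBetween(ByVal sourceText As String, ByVal startMarker As String, _
                     ByVal endMarker As String) As String
    Dim startPos As Long
    Dim endPos As Long

    startPos = InStr(1, sourceText, startMarker, vbTextCompare)
    If startPos = 0 Then Exit Function          ' start marker not found
    startPos = startPos + Len(startMarker)      ' skip past the marker itself

    endPos = InStr(startPos, sourceText, endMarker, vbTextCompare)
    If endPos = 0 Then Exit Function            ' end marker not found

    TextBetween = Mid$(sourceText, startPos, endPos - startPos)
End Function

Sub DemoTextBetween()
    ' Invented sample markup, only to show the call
    Dim sampleHtml As String
    sampleHtml = "<td>Status</td><td>Active</td>"
    Debug.Print TextBetween(sampleHtml, "<td>Status</td><td>", "</td>")   ' prints Active
End Sub
```

An empty return value then tells you that the page layout did not match, so you can skip the row rather than crash halfway through a selection of part numbers.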
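I had trouble getting ron's code to work, I think because IE does not handle frames easily. Below is some code that extracts some data from the table you mentioned. So far it does not handle the chart.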
' Requires references to "Microsoft HTML Object Library" and
' "Microsoft Internet Controls" (Tools > References)
Sub FrameStrip()
    Dim oFrames As Object
    Dim tdelements As Object
    Dim tdElement As Object
    Dim oFrame As MSHTML.HTMLFrameElement
    Dim oElement As Object
    Dim sString As String
    Dim myVar As Variant
    Dim sLinks() As String
    Dim i As Integer
    Dim bfound As Boolean
    Dim url As String
    Dim oIE As InternetExplorer

    Set oIE = New InternetExplorer
    url = "http://support.automation.siemens.com/WW/view/en/7224052"

    'Set address for use with relative source names
    myVar = Split(url, "/")
    sString = myVar(0) & "//" & myVar(2)

    oIE.navigate url
    oIE.Visible = True
    Do Until (oIE.readyState = 4 And Not oIE.Busy)
        DoEvents
    Loop

    Set oFrames = oIE.document.getElementsByTagName("frame")
    ReDim sLinks(oFrames.Length)

    'Get the source locations for each frame
    i = 0
    For Each oFrame In oFrames
        sLinks(i) = sString & (oFrame.getAttribute("src"))
        i = i + 1
    Next oFrame

    'Go through each frame to find the table
    i = 0
    bfound = False
    Do While i < UBound(sLinks) And bfound = False
        oIE.navigate sLinks(i)
        Do Until (oIE.readyState = 4 And Not oIE.Busy)
            DoEvents
        Loop
        Set oElement = oIE.document.getElementById("produktangaben")
        bfound = IsSet(oElement)
        i = i + 1
    Loop

    'Stop here if no frame contained the table
    If Not bfound Then Exit Sub

    Set tdelements = oElement.getElementsByTagName("td")

    'Display information about table
    sString = ""
    For Each tdElement In tdelements
        Debug.Print tdElement.innerText
        sString = sString & tdElement.innerText
    Next tdElement
End Sub
' Returns True when oElement is a usable element, False when it is Nothing
Function IsSet(ByRef oElement As Object) As Boolean
    Dim tdelements As Object
    Dim bSet As Boolean

    bSet = True
    On Error GoTo ErrorSet
    Set tdelements = oElement.getElementsByTagName("td")
    On Error GoTo 0

Cleanup:
    On Error Resume Next
    Set tdelements = Nothing
    On Error GoTo 0
    IsSet = bSet
    Exit Function

ErrorSet:
    bSet = False
    'Resume (rather than GoTo) clears the active error before continuing
    Resume Cleanup
End Function
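I just tried recording a macro to do this. It helped, because it showed the correct index of the embedded page. However, the embedded page is entered in the macro as a string. That doesn't help me, because I know where to find this string, but not the code to get it...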
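One way to connect the two pieces is to pass the extracted address to QueryTables.Add instead of the hard-coded string from the recorded macro. The following is only a sketch under assumptions: ImportTableFromUrl is a hypothetical name, tableUrl is assumed to hold an address such as the table_url value built in the first answer, and legacy web queries are assumed to be available in your Excel version.

```vba
' Sketch: load a web table into the sheet from a URL held in a variable.
' ImportTableFromUrl is a hypothetical name; tableUrl is assumed to be the
' address extracted earlier (for example table_url in the first answer).
Sub ImportTableFromUrl(ByVal tableUrl As String, ByVal target As Range)
    With target.Worksheet.QueryTables.Add( _
            Connection:="URL;" & tableUrl, _
            Destination:=target)
        .FieldNames = True
        .RowNumbers = False
        .WebSelectionType = xlAllTables   ' import every table on the page
        .Refresh BackgroundQuery:=False   ' wait until the data has loaded
    End With
End Sub
```

A call would then look like ImportTableFromUrl table_url, Range("F9"), which mirrors the destination cell used by the recorded macro.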