How to skip rows with a missing HTML tag in Excel using VBA

excel vba web-scraping

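This website lists 15 objects, and each object has a link below its photo, except the sixth, which has none. When I extract and transfer the content with my code, the missing href is not skipped: in Excel the 14 hrefs are listed one below the other, so the links shift up by one row. The 6th cell should stay empty or show "no data", but instead it is the last cell that ends up empty (plus an error, because there are 14 links but 15 items). Unfortunately I have to keep my code structure and only need a loop or condition to handle this. Does anyone have an idea? Thanks.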

My incomplete code:

Public Sub GetData()

    Dim html As New HTMLDocument
    Dim elmt01 As Object, elmt02 As Object
    Dim y As Long
    Dim xURL As String

    Set html = New MSHTML.HTMLDocument
    xURL = "https://immobilienpool.de/suche/immobilien?page=1"
    
With CreateObject("MSXML2.XMLHTTP.6.0")
    .Open "GET", xURL, False
    .send
     html.body.innerHTML = .responseText
End With

Set elmt01 = html.querySelectorAll("li[class*='contentBox']")    '15 items
Set elmt02 = html.querySelectorAll("li a[title*='zusätzliche']") '14 hrefs

For y = 0 To elmt01.Length - 1

  If InStr(elmt02, "pdf") Then  'better: If elmt02 exists in elmt01 then...
    ActiveSheet.Cells(y + 1, 2) = elmt02.Item(y).href
  Else
    ActiveSheet.Cells(y + 1, 2) = "No document"
  End If

Next

End Sub

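The script below should solve the problem you are running into. I had to modify your code so that items without a link are handled: instead of indexing two separate node lists (15 items versus 14 links), it looks for the link inside each item. I hope the current version works for you: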

Public Sub GetData()
    ' Requires a reference to "Microsoft HTML Object Library".
    Dim Html As HTMLDocument, HTMLDoc As HTMLDocument
    Dim oPdfLink As Object, xURL As String, I As Long

    Set Html = New MSHTML.HTMLDocument      ' holds the full page
    Set HTMLDoc = New MSHTML.HTMLDocument   ' holds one list item at a time
    
    xURL = "https://immobilienpool.de/suche/immobilien?page=1"
    
    With CreateObject("MSXML2.XMLHTTP.6.0")
        .Open "GET", xURL, False
        .send
        Html.body.innerHTML = .responseText
    End With

    With Html.querySelectorAll("li[class*='contentBox']")
        For I = 0 To .Length - 1
            ' Load this item's markup on its own, so the link lookup is scoped to it.
            HTMLDoc.body.innerHTML = .item(I).outerHTML
            Set oPdfLink = HTMLDoc.querySelector("a[title*='zusätzliche']")
            
            ' querySelector returns Nothing when the item has no matching link.
            If Not oPdfLink Is Nothing Then
                ActiveSheet.Cells(I + 1, 2) = oPdfLink.href
            Else
                ActiveSheet.Cells(I + 1, 2) = "No document"
            End If
        Next I
    End With
End Sub
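
The per-item check above could also be factored into a small helper, which keeps the main loop short. This is only a sketch under the same assumptions as the answer (early binding via the "Microsoft HTML Object Library" reference, and the same selectors and URL). The names ExtractPdfHref and GetDataWithHelper are made up for illustration and are not part of the original code.

```vba
' Hypothetical helper: returns the PDF link found inside one list item's
' markup, or "No document" when the item contains no matching link.
Private Function ExtractPdfHref(ByVal itemHtml As String) As String
    Dim fragment As MSHTML.HTMLDocument
    Dim link As Object

    ' Parse only this item's markup so the selector cannot match
    ' a link that belongs to a different item.
    Set fragment = New MSHTML.HTMLDocument
    fragment.body.innerHTML = itemHtml

    Set link = fragment.querySelector("a[title*='zusätzliche']")

    If link Is Nothing Then
        ExtractPdfHref = "No document"
    Else
        ExtractPdfHref = link.href
    End If
End Function

' Hypothetical variant of GetData that delegates the lookup to the helper.
Public Sub GetDataWithHelper()
    Dim html As MSHTML.HTMLDocument
    Dim items As Object
    Dim i As Long

    Set html = New MSHTML.HTMLDocument

    With CreateObject("MSXML2.XMLHTTP.6.0")
        .Open "GET", "https://immobilienpool.de/suche/immobilien?page=1", False
        .send
        html.body.innerHTML = .responseText
    End With

    Set items = html.querySelectorAll("li[class*='contentBox']")

    ' One row per list item, so row numbers always match item positions.
    For i = 0 To items.Length - 1
        ActiveSheet.Cells(i + 1, 2).Value = ExtractPdfHref(items.Item(i).outerHTML)
    Next i
End Sub
```

Creating a fresh HTMLDocument for every item is slightly more work than reusing a single one as the answer does, but it keeps the helper free of shared state. For a page with 15 items the difference should not matter.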

Thanks a lot @SIM, it works great!