
Excel: storing multiple items in a dictionary to print them later


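I've written a script in VBA to scrape different categories of coffee-shop data from a webpage. The categories I'm trying to parse are shopname, address and phone. I've already defined the selectors within my script. The problem I'm facing is that I can't store them in a dictionary in order to print them later.

If it were only two items, I could handle them the way I've already shown. I get stuck when another item, such as phone (currently commented out below), comes into play.

How can I store three items in a dictionary and print them?

References to add in order to execute the above script: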

Microsoft XML, v6.0
Microsoft HTML Object Library
I would like to learn how to store multiple items in a dictionary in order to print them later.

Expected output:


It seems I can achieve the result as follows. If a better approach comes along, I'll withdraw my answer:

For Each post In Html.getElementsByClassName("info")
    shopName = post.querySelector(".business-name span").innerText
    address = post.querySelector(".adr").innerText
    phone = post.querySelector(".phones").innerText
    idic(shopName & "|" & address & "|" & phone) = 1
Next post

For Each key In idic.keys
    R = R + 1: Cells(R, 1) = Split(key, "|")(0)
    Cells(R, 2) = Split(key, "|")(1)
    Cells(R, 3) = Split(key, "|")(2)
Next key
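
One small variation on the loop above, offered as a sketch rather than a correction: Split can be called once per key and the resulting array written to the row in a single assignment. The sample data, the sub name DemoDelimitedKeys and the late-bound Scripting.Dictionary are assumptions made so that the snippet runs without the web page. Note that the delimited-key approach relies on "|" never appearing in the scraped text.

```vba
Sub DemoDelimitedKeys()
    Dim idic As Object, key As Variant, parts() As String, R As Long

    Set idic = CreateObject("Scripting.Dictionary")

    ' Hypothetical sample rows standing in for the scraped values
    idic("Cafe A" & "|" & "1 Main St" & "|" & "(415) 555-0100") = 1
    idic("Cafe B" & "|" & "2 Oak Ave" & "|" & "(415) 555-0101") = 1

    For Each key In idic.Keys
        R = R + 1
        ' Split returns a zero-based String array: name, address, phone
        parts = Split(key, "|")
        ' A one-dimensional array fills a single row from left to right
        ActiveSheet.Cells(R, 1).Resize(1, UBound(parts) + 1).Value = parts
    Next key
End Sub
```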

I like the answer already given (+). You can also load an array into the item:

For Each post In Html.getElementsByClassName("info")
    shopName = post.querySelector(".business-name span").innerText
    address = post.querySelector(".adr").innerText
    phone = post.querySelector(".phones").innerText
    idic(post) = Array(shopName, address, phone)
Next post

For Each key In idic.keys
    R = R + 1: ActiveSheet.Cells(R, 1) = idic(key)(0)
    ActiveSheet.Cells(R, 2) = idic(key)(1)
    ActiveSheet.Cells(R, 3) = idic(key)(2)
Next key
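
To illustrate why an array works as a dictionary item, here is a minimal sketch that runs without the web page. The sample shops, the sub name DemoArrayItems and the late-bound Scripting.Dictionary are assumptions for illustration. The key here is the shop name rather than the HTML element, so a record can also be looked up by name.

```vba
Sub DemoArrayItems()
    Dim idic As Object, key As Variant, rec As Variant

    Set idic = CreateObject("Scripting.Dictionary")

    ' Hypothetical sample data: key = shop name, item = Array(address, phone)
    idic("Cafe A") = Array("1 Main St", "(415) 555-0100")
    idic("Cafe B") = Array("2 Oak Ave", "(415) 555-0101")

    ' Print every record; Array() items are zero-based unless Option Base 1 is set
    For Each key In idic.Keys
        rec = idic(key)
        Debug.Print key, rec(0), rec(1)
    Next key

    ' Look up a single record by its key
    If idic.Exists("Cafe B") Then
        Debug.Print "Phone of Cafe B: " & idic("Cafe B")(1)
    End If
End Sub
```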

You could also just use arrays, which should be fast:

Dim list As Object, arr(), post As Object, index As Long
Set list = Html.getElementsByClassName("info")
ReDim arr(1 To list.Length)

For Each post In list
    index = index + 1
    shopName = post.querySelector(".business-name span").innerText
    address = post.querySelector(".adr").innerText
    phone = post.querySelector(".phones").innerText
    arr(index) = Array(shopName, address, phone)
Next
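For index = LBound(arr) To UBound(arr)
    ' Array(...) is zero-based by default, so the column count is UBound - LBound + 1
    ActiveSheet.Cells(index, 1).Resize(1, UBound(arr(index)) - LBound(arr(index)) + 1) = arr(index)
Next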

I would, however, try to load Html.getElementsByClassName("info") into a variable and use that variable in both cases.
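
As a sketch of the array-only route taken one step further: a two-dimensional array with one row per shop and three columns can be written to the sheet in a single assignment, which avoids one write per row. The sample values and the sub name DemoTwoDimArray are assumptions; with real data the array would be filled inside the loop over the elements.

```vba
Sub DemoTwoDimArray()
    ' Two hypothetical records, three fields each: name, address, phone
    Dim data(1 To 2, 1 To 3) As Variant

    data(1, 1) = "Cafe A": data(1, 2) = "1 Main St": data(1, 3) = "(415) 555-0100"
    data(2, 1) = "Cafe B": data(2, 2) = "2 Oak Ave": data(2, 3) = "(415) 555-0101"

    ' Size the target range to match the array, then write everything at once
    ActiveSheet.Range("A1").Resize(UBound(data, 1), UBound(data, 2)).Value = data
End Sub
```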


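In addition, the data is present as a JSON string in a script tag, so if you use a JSON parser, for example the JsonConverter module used below, you could also do the following: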

Dim json As Object, item As Object, results(), i As Long
Set json = JsonConverter.ParseJson(Html.querySelectorAll("script[type='application/ld+json']").item(1).innerHTML)
ReDim results(1 To json.Count)
i = 1
For Each item In json
    results(i) = Array(item("name"), Join$(item("address").Items, ", "), item("telephone"))
    i = i + 1
Next
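
For readers unfamiliar with the parser output: assuming the JsonConverter above is the VBA-JSON module, a JSON array comes back as a Collection and a JSON object as a Scripting.Dictionary. The sketch below builds such a structure by hand and runs the same kind of loop over it, so it can be tried without the library or the page. The sample values and the sub name DemoParsedShape are assumptions, and the real page data will contain more fields.

```vba
Sub DemoParsedShape()
    Dim json As Collection, shop As Object, addr As Object
    Dim entry As Object, results() As Variant, i As Long

    ' Hand-built stand-in for the parser output (hypothetical values)
    Set addr = CreateObject("Scripting.Dictionary")
    addr.Add "streetAddress", "1 Main St"
    addr.Add "addressLocality", "San Francisco"

    Set shop = CreateObject("Scripting.Dictionary")
    shop.Add "name", "Cafe A"
    shop.Add "address", addr
    shop.Add "telephone", "(415) 555-0100"

    Set json = New Collection
    json.Add shop

    ' Same loop shape as above: one Array(name, address, phone) per entry
    ReDim results(1 To json.Count)
    For Each entry In json
        i = i + 1
        results(i) = Array(entry("name"), Join(entry("address").Items, ", "), entry("telephone"))
    Next entry

    Debug.Print results(1)(0), results(1)(1), results(1)(2)
End Sub
```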

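Another possibility is to create a simple class for the data and then add instances of that class to the dictionary. Two further classes, WebData and InfoDataCollection, help to separate the code and improve readability.

The code is split into four parts: the GetDictItems method and the class modules WebData, InfoData and InfoDataCollection. The listings follow the comments below.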


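idic(shopName) = Array(address, phone) etc., then Cells(R, 2) = idic(key)(0): Cells(R, 3) = idic(key)(1)

Sorry @Tim for any misunderstanding. What if there are 5 items? I've updated the post. Perhaps you can see what I mean now. Thanks.

The point of a dictionary is that you enter a word in English and get its translation in French. Your need is more like a phone book: name, address and phone number. That's fine, except that you mention printing a list; you don't mention look-ups. For printing, a simple 2D array would be enough, like Dim Arr(1 To 5, 1 To 3). Arr(1, 1) = "Outerlands", Arr(1, 2) = "4001 Judah St", Arr(1, 3) = 4156616140. Each element in the array has 3 parts.

I haven't come across one of your solutions in a long time, @dee. Nice implementation, as always.

Although the question already has an accepted answer, I wanted to add some more code that shows a further possibility: how to divide the code into classes.

To be honest, I would need a user manual to run your script, @dee. I'm still a novice in VBA. Any guidance on how to use it would be much appreciated. Thanks a lot.

No problem, have a look. It just separates the concerns so that each class handles part of the problem.

Because HtmlLevel doesn't support querySelector. Updated both versions.

Some element types do support it (the test case I used did). Now that I have a URL I can look into it further.

This is exactly the logic that does the trick: idic(post) = Array(shopName, address, phone). That's just awesome. Although Tim Williams suggested this first in his comment, it wasn't clear enough compared with a real demonstration. Thanks QHarr, you made my day.

Sorry to bother you, @QHarr. If I follow this logic there is a serious problem, because it produces duplicate values, whereas a dictionary never stores duplicates. Do you think this is a valid approach (what a dictionary is actually meant for)? Thanks.

A dictionary can store duplicate values, but not duplicate keys. Are you saying you have duplicate keys? If so, add a counter variable and use that as the key.

You are, I think, using an object as the key. The objects may be of the same type, but technically they are not the same object.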
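
The counter suggestion from the comments can be sketched as follows: using a running number as the key keeps every record, even when two shops share a name, and an Array item can hold as many fields as needed (five here). The sample fields, the sub name DemoCounterKeys and the late-bound dictionary are assumptions for illustration.

```vba
Sub DemoCounterKeys()
    Dim idic As Object, key As Variant, rec As Variant, n As Long

    Set idic = CreateObject("Scripting.Dictionary")

    ' Hypothetical records; the first two deliberately share a shop name
    n = n + 1
    idic(n) = Array("Cafe A", "1 Main St", "(415) 555-0100", "94105", "Open")
    n = n + 1
    idic(n) = Array("Cafe A", "9 Pine St", "(415) 555-0102", "94108", "Closed")
    n = n + 1
    idic(n) = Array("Cafe B", "2 Oak Ave", "(415) 555-0101", "94117", "Open")

    ' The counter doubles as the row number; each array fills one row
    For Each key In idic.Keys
        rec = idic(key)
        ActiveSheet.Cells(key, 1).Resize(1, UBound(rec) - LBound(rec) + 1).Value = rec
    Next key
End Sub
```

With the shop name as the key, the second "Cafe A" record would have overwritten the first; with the counter, all three rows are kept.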

GetDictItems method

Const url = "https://www.yellowpages.com/search?search_terms=Coffee%20Shops&geo_location_terms=San%20Francisco%2C%20CA&page=2"

Sub GetDictItems()
    With New WebData
        .Load url
        .PrintToExcel
    End With
End Sub

WebData class module

Private m_html As HTMLDocument
Private m_data As InfoDataCollection

Private Sub Class_Initialize()
    Set m_html = New HTMLDocument
    Set m_data = New InfoDataCollection
End Sub

Public Sub Load(url As String)
    With New XMLHTTP60
        .Open "GET", url, False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .send
        m_html.body.innerHTML = .responseText
    End With
    m_data.Add m_html
End Sub

Public Sub PrintToExcel()
    Dim key As Variant
    Dim R As Long
    Dim info As InfoData

    For Each key In m_data.Keys
        R = R + 1
        Set info = m_data.Items(key)
        Cells(R, 1) = info.ShopName
        Cells(R, 2) = info.Address
        Cells(R, 3) = info.Phone
    Next key
End Sub

InfoData class module

Private m_shopName As String
Private m_address As String
Private m_phone As String

Public Property Get ShopName() As String
    ShopName = m_shopName
End Property

Public Property Let ShopName(ByVal vNewValue As String)
    m_shopName = vNewValue
End Property

Public Property Get Address() As String
    Address = m_address
End Property

Public Property Let Address(ByVal vNewValue As String)
    m_address = vNewValue
End Property

Public Property Get Phone() As String
    Phone = m_phone
End Property

Public Property Let Phone(ByVal vNewValue As String)
    m_phone = vNewValue
End Property

InfoDataCollection class module

Private m_dictionary As Object

Private Sub Class_Initialize()
    Set m_dictionary = CreateObject("Scripting.Dictionary")
End Sub

Public Sub Add(html As HTMLDocument)
    Dim info As InfoData
    Dim post As HTMLDivElement

    m_dictionary.RemoveAll
    For Each post In html.getElementsByClassName("info")
        Set info = New InfoData
        info.ShopName = post.querySelector(".business-name span").innerText
        info.Address = post.querySelector(".adr").innerText
        info.Phone = post.querySelector(".phones").innerText
        Set m_dictionary(info.ShopName) = info
    Next post
End Sub

Public Property Get Keys() As Variant()
    Keys = m_dictionary.Keys
End Property

Public Property Get Items() As Object
    Set Items = m_dictionary
End Property
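
For anyone wanting to try the class approach: in VBA each class listing has to live in a class module whose name matches the class (WebData, InfoData, InfoDataCollection), while GetDictItems goes into a standard module, and the early-bound types HTMLDocument and XMLHTTP60 need the two references listed in the question. The minimal sketch below assumes only that the InfoData class module exists and shows class instances being stored in a dictionary without touching the web page. The sample values and the sub name DemoInfoDataInDictionary are made up for illustration.

```vba
Sub DemoInfoDataInDictionary()
    Dim dict As Object, info As InfoData, key As Variant

    Set dict = CreateObject("Scripting.Dictionary")

    ' Hypothetical record stored as an object instead of an array
    Set info = New InfoData
    info.ShopName = "Cafe A"
    info.Address = "1 Main St"
    info.Phone = "(415) 555-0100"
    dict.Add info.ShopName, info

    ' Items are objects, so Set is required when reading them back
    For Each key In dict.Keys
        Set info = dict(key)
        Debug.Print info.ShopName, info.Address, info.Phone
    Next key
End Sub
```

Compared with an array item, the named properties make it harder to mix up field positions, at the cost of an extra class module.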