Excel: How to pull data from an online retailer's web page using getElementsByClassName in VBA?


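I wrote a macro that pulls product information from a retailer's web page. It runs without errors, but it does not write any results to my worksheet, and I'm struggling to understand why. I type "sale" into the search input box, which leads to the following URL: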

I want to show the product name, former price, and current price in my worksheet. The HTML for these elements looks like this:

<div class="subCatName">
            <a href="/girls-clothing/colored-jeggings/6611358/651?pageSort=W3sidHlwZSI6InJlbGV2YW5jZSIsInZhbCI6IiJ9XQ==&amp;productOrigin=search%20page&amp;productGridPlacement=1-1" id="anchor2_6611358" class="auxSubmit">Colored Jeggings</a>
        </div>
<div class="cat-list-price subCatPrice">
            <div class="priceContainer">
                <span class="mobile-was-price">
                            was 
                            $26.90</span>
                       <span class="mobile-now-price">
                           now 
                           $10.49</span>
                    </div>

            <div class="price_description">
                        <span class="mobile-extra">
                            Extra 30% off clearance!</span>
                    </div>              
                </div>
The code is as follows:

Sub test2()

    Dim RowCount, erow As Long
    Dim sht As Object
    Dim ele As IHTMLElement
    Dim eles As IHTMLElementCollection
    Dim doc As HTMLDocument

    Set sht = Sheets("JUSTICESALE")
    RowCount = 1
    sht.Range("A" & RowCount) = "Clothing Item"
    sht.Range("B" & RowCount) = "SKU"
    sht.Range("C" & RowCount) = "Former Price"
    sht.Range("D" & RowCount) = "Sale Price"

    Set ie = CreateObject("InternetExplorer.application")
    searchterm = InputBox("ENTER SEARCH TERM")

    Application.StatusBar = "LOADING JUSTICE SEARCH"
    With ie
        .Visible = True
        .navigate "http://www.shopjustice.com/"

        Do While .busy Or _
                 .readystate <> 4
            DoEvents
        Loop

        Set doc = ie.document

        doc.getelementsbyname("q").Item.innertext = searchterm
        doc.getElementsByClassName("searchbtn").Item.Click

        Application.StatusBar = "EXTRACTING PRODUCT DATA"

        Set eles = doc.getElementsByClassName("subCatName")
        For Each ele In eles
            If ele.className = "subCatName" Then
                erow = sht.Cells(Rows.count, 1).End(xlUp).Offset(1, 0).Row
                Cells(erow, 1) = doc.getElementsByClassName("auxSubmit")(RowCount).innertext
                Cells(erow, 2) = doc.getElementsByClassName("mobile-was-price")(RowCount).innertext
                RowCount = RowCount + 1
            End If
        Next ele

    End With

    Set ie = Nothing

    Application.StatusBar = ""

End Sub
Any help would be greatly appreciated.

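EDIT: Hi Peter, I appreciate your insight. It certainly heads off some problems. However, after adding the following code before the loop that was edited to account for missing class names, it still does not write anything to Excel: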

Do While ie.readyState <> READYSTATE_COMPLETE
DoEvents
Loop
What am I missing?

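I also have an alternative approach for a different retailer's web page, shown below; the concept is the same. What do you think of this approach? My only problem is that the Select Case line raises a "Permission denied" error (run-time error 70).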

Sub test5()

    Dim erow As Long
    Dim ele As Object

    Set sht = Sheets("CARTERS")
    RowCount = 1
    sht.Range("A" & RowCount) = "Clothing Item"
    sht.Range("B" & RowCount) = "SKU"
    sht.Range("C" & RowCount) = "Former Price"
    sht.Range("D" & RowCount) = "Sale Price"

    erow = Sheet1.Cells(Rows.count, 1).End(xlUp).Offset(1, 0).Row

    Set objIE = CreateObject("Internetexplorer.application")

    searchterm = InputBox("ENTER CARTER'S SEARCH TERM")

    With objIE
        .Visible = True
        .navigate "http://www.carters.com/"

        Do While .Busy Or _
                 .readyState <> 4
            DoEvents
        Loop

        .document.getElementsByName("q").Item.innerText = searchterm
        .document.getElementsByClassName("btn_search").Item.Click

        Do While .readyState <> READYSTATE_COMPLETE
            DoEvents
        Loop

        For Each ele In .document.all
            Select Case ele.className

                Case "product-name"
                    RowCount = RowCount + 1
                    sht.Range("A" & RowCount) = ele.innerText

                Case "product-standard-price"
                    sht.Range("B" & RowCount) = ele.innerText

                Case "product-sales-price"
                    sht.Range("C" & RowCount) = ele.innerText

            End Select
        Next ele
    End With

    Set objIE = Nothing

End Sub

Thanks again for your help.

Your code works fine, with two caveats.

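First, after you "click" the search button on the home page, your code does not wait for the results page to load. The loop that looks up each item therefore fails, because there is nothing there (yet).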
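As a rough sketch of that waiting step, the loop from the question can be wrapped in a small helper and called again after the click. The name WaitForPage and the 30-second timeout are assumptions made for illustration; they do not come from the original post.

```vba
' Hypothetical helper (not part of the original post): wait until Internet
' Explorer reports that the current page has finished loading.
' Returns True if the page loaded, False if the timeout elapsed first.
Function WaitForPage(ByVal ie As Object, _
                     Optional ByVal timeoutSeconds As Double = 30) As Boolean
    Dim startTime As Double
    startTime = Timer                     ' seconds since midnight

    ' 4 is the numeric value of READYSTATE_COMPLETE; the literal is used so the
    ' code does not depend on a reference to Microsoft Internet Controls.
    Do While ie.Busy Or ie.readyState <> 4
        DoEvents
        If Timer - startTime > timeoutSeconds Then Exit Function   ' returns False
    Loop

    WaitForPage = True
End Function
```

A possible use is to call `WaitForPage ie` right after the `.Click` line and then run `Set doc = ie.document` again, since the earlier `doc` still refers to the home page rather than the results page. If the page fills in its results with script after loading, this check alone may not be enough.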

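Second, when you parse the HTML for specific fields, you need some error handling to cope with cases where those fields are missing. For example, take a look at the code here and adapt it to your situation: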

For Each ele In eles

    If ele.className = "subCatName" Then
        erow = sht.Cells(Rows.Count, 1).End(xlUp).Offset(1, 0).Row
        On Error Resume Next
        Cells(erow, 1) = doc.getElementsByClassName("auxSubmit")(RowCount).innerText
        If Err.Number <> 0 Then
            Cells(erow, 1) = "ERR: 'auxSubmit' classname not found!"
            Err.Clear
        End If
        Cells(erow, 2) = doc.getElementsByClassName("mobile-was-price")(RowCount).innerText
        If Err.Number <> 0 Then
            Cells(erow, 2) = "ERR: 'mobile-was-price' classname not found!"
            Err.Clear
        End If
        On Error GoTo 0
        RowCount = RowCount + 1
    End If
Next ele

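I made some edits to my original question above. Do you have any further insight you can share? Thank you very much for your help.

You get the "Permission denied" error because one of the elements in the loop has no .className property (remember, the ele variable is an untyped Object in your code). So you still need to account for this with the same kind of error handling. It also looks like you want to loop through every element on the page. That's fine, but remember to detect whether an element has child elements, in which case you must loop through those children as well. (And don't forget that there are further pages of items to scrape.)

Pete wrote "but remember to detect whether an element has child elements". Could you post how to detect whether an element has child elements? Thanks.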
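One possible way to check for children, offered as a sketch and not as the commenter's own method: elements in the Internet Explorer DOM expose a `children` collection whose `length` gives the number of child elements. The procedure name WalkElements and the Debug.Print output are assumptions for illustration. The guarded read of className follows the error-handling advice above.

```vba
' Hypothetical helper (not from the original thread): visit an element and,
' recursively, every one of its child elements.
Sub WalkElements(ByVal ele As Object, Optional ByVal depth As Long = 0)
    Dim child As Object
    Dim cls As String

    ' Some objects may not expose className, so read it defensively.
    cls = ""
    On Error Resume Next
    cls = ele.className
    On Error GoTo 0

    ' Print one indented line per element to the Immediate window.
    Debug.Print String(depth * 2, " ") & ele.tagName & " class='" & cls & "'"

    ' An element has child elements when its children collection is non-empty.
    If ele.Children.Length > 0 Then
        For Each child In ele.Children
            WalkElements child, depth + 1
        Next child
    End If
End Sub
```

It could be started from the page body, for example with `WalkElements objIE.document.body`, once the page has finished loading.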