Extracting text from an HTML link with VBA

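I'm using VBA to extract data from a web page that contains several unordered list items like this:
  • 2015/16 ICD-10-CM S82.311D Torus fracture of lower end of right tibia, subsequent encounter for fracture with routine healing

I'm able to get the "ICD-10-CM S82.311D" value, but I need the "Torus fracture…" value to the right of the link. How do I do this?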

Here's my code:

    Public Function convertICD(ByVal icdCode As String)


    End Function

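Calling DOC.getElementsByTagName("li"), ignoring the first 7 items, and then processing lnk.innerText for the rest got me what I needed. Both the code and the details are in the innerText; I just need to parse it. Consider this solved, though I'd be happy to see a more elegant solution.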
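The parsing step mentioned above could look roughly like the following. This is only a sketch: the function name `SplitIcdItem` is made up for illustration, and it assumes each item's innerText has the form "year ICD-10-CM code description", with a space after the code, as in the example from the question.

```vba
Option Explicit

' Splits the text of one list item into the ICD code and its description.
' Assumption: the text looks like "<year> ICD-10-CM <code> <description>".
' Returns False (and leaves the outputs untouched) for any other layout.
Public Function SplitIcdItem(ByVal itemText As String, _
                             ByRef icdCode As String, _
                             ByRef description As String) As Boolean
    Const MARKER As String = "ICD-10-CM "
    Dim markerPos As Long, codeStart As Long, spacePos As Long

    markerPos = InStr(1, itemText, MARKER, vbTextCompare)
    If markerPos = 0 Then Exit Function          ' marker missing: not an ICD item

    codeStart = markerPos + Len(MARKER)
    spacePos = InStr(codeStart, itemText, " ")   ' the code ends at the next space
    If spacePos = 0 Then Exit Function           ' nothing follows the code

    icdCode = Mid$(itemText, codeStart, spacePos - codeStart)
    description = Trim$(Mid$(itemText, spacePos + 1))
    SplitIcdItem = True
End Function

Public Sub DemoSplitIcdItem()
    Dim code As String, desc As String
    If SplitIcdItem("2015/16 ICD-10-CM S82.311D Torus fracture of lower end of right tibia", code, desc) Then
        Debug.Print code    ' S82.311D
        Debug.Print desc    ' Torus fracture of lower end of right tibia
    End If
End Sub
```

Inside the loop over the "li" elements, each innerText (after the first 7 are skipped) could be passed to this function, and items for which it returns False could simply be ignored.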

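You can use a browserless XHR request and grab all the information faster by class name and index. I've put a single ICD code in the array ICDs; you can extend this.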


    VBA:

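    Option Explicit
    ' Requires a reference to "Microsoft HTML Object Library" (Tools > References).
    Public Sub GetInfo()
        Dim sResponse As String, HTML As New HTMLDocument
        Const BASE_URL As String = "https://www.icd10data.com/Convert/"
        Dim ICDs(), currICD As Long, startPos As Long
        ICDs = Array("S92.311D")    ' add further codes to this array as needed

        With CreateObject("MSXML2.XMLHTTP")
            For currICD = LBound(ICDs) To UBound(ICDs)
                ' Synchronous GET of the conversion page for the current code
                .Open "GET", BASE_URL & ICDs(currICD), False
                .send
                sResponse = StrConv(.responseBody, vbUnicode)
                ' Drop anything that precedes the doctype declaration
                startPos = InStr(1, sResponse, "<!DOCTYPE ")
                If startPos > 0 Then sResponse = Mid$(sResponse, startPos)

                With HTML
                    .body.innerHTML = sResponse
                    ' Heading and conversion blurb, selected by class name and index
                    Debug.Print .getElementsByClassName("pageHeading")(0).innerText
                    Debug.Print .getElementsByClassName("contentBlurbConversion")(0).innerText
                End With
            Next currICD
        End With
    End Sub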
    
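I'm sure the people who know best how to do this will groan, but if you don't get any other responses: you could search the page text for a string like "converts approximately to" (or some other string that is always returned), then move forward to the 8th ">". The string you're looking for starts at the character after it and runs up to the next "<".

getElementsByClassName("img externalIcon") might be useful too.

Thanks, guys. I dug into "view source" and tried getting elements by the tag name "li". Taking the innerText of each one gave me the code and the details, which I then just had to parse. There may be a better way, but this is sufficient for my needs. Thanks again.

This worked great. I was able to cross-reference all the ICD-10 codes to their corresponding ICD-9 values. There may be a better way, but it got through 70K lookups in time for my colleague to give a presentation today.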
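For reference, the string-search idea from the comments (find a marker, skip a fixed number of ">" characters, then read on from there) might be sketched as below. The function name `TextAfterNthTagClose` and the sample markup are hypothetical, and the sketch assumes the text ends at the next "<". It is brittle, since any change to the page layout changes the count.

```vba
Option Explicit

' Returns the text between the n-th ">" after a marker string and the next "<".
' Returns an empty string if the marker or either delimiter is not found.
Public Function TextAfterNthTagClose(ByVal pageText As String, _
                                     ByVal marker As String, _
                                     ByVal n As Long) As String
    Dim pos As Long, i As Long, endPos As Long

    pos = InStr(1, pageText, marker, vbTextCompare)
    If pos = 0 Then Exit Function

    ' Step forward through n occurrences of ">"
    For i = 1 To n
        pos = InStr(pos + 1, pageText, ">")
        If pos = 0 Then Exit Function
    Next i

    endPos = InStr(pos + 1, pageText, "<")
    If endPos = 0 Then Exit Function

    TextAfterNthTagClose = Trim$(Mid$(pageText, pos + 1, endPos - pos - 1))
End Function

Public Sub DemoTextAfterNthTagClose()
    ' Made-up markup for illustration only; the real page differs.
    Const SAMPLE As String = _
        "X converts to: <ul><li><a href=""#"">CODE</a> description text</li></ul>"
    Debug.Print TextAfterNthTagClose(SAMPLE, "converts to", 3)   ' CODE
    Debug.Print TextAfterNthTagClose(SAMPLE, "converts to", 4)   ' description text
End Sub
```

Selecting elements by tag or class name, as in the answer above, is generally less fragile than counting delimiters.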