Extract key/value pairs from a JSON string and preserve special characters in the values

Tags: json, regex, vb.net

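I have an HTML page that contains various JSON arrays. I use HTML Agility Pack to get the innerText from the page, which isolates some leftover text and the JSON arrays (the page contains many complex objects). I then pass the text to the regex shown below, which parses the key/value pairs; however, it stops at the apostrophe. I need the apostrophe, and I want to preserve special characters to support other functionality.

I got the regex from the internet, and I'm sure it needs a tweak to allow special characters. I've tried various things, but not being a regex expert, I can't come up with a solution. Does anyone have suggestions on how to fix the regex?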

Dim some_json As String = """{""request"":""Over the last 25 years, I've worked with most of the world’s leading selling strategy systems and built sales training used by companies on six continents. Two years ago, I teamed up with other sales strategy experts to merge our combined experience, wisdom and knowledge into an artificial intelligence system. We worked with expert neuroscientists, behavioral economists, psychologists, and AI programmers to develop JOY, the world’s first emotionally intelligent and sales-savvy artificial intelligence system for sales.  Now I focus on helping companies implement JOY to instantly increase sales and dominate markets.  \n "",""status"":200}"""

        some_json = some_json.Replace("\n", " ")

        Dim r As Regex = New Regex("""(?<Key>[\w]*)"":""?(?<Value>([\s\w\d\.\\\-/:_\+]+(,[,\s\w\d\.\\\-/:_\+]*)?)*)""?")
        Dim mc As MatchCollection = r.Matches(some_json)

        'regex returns summary: Over the last 25 years, I
        'how do I return the entire value with the apostrophe's, special characters?

        For Each k As Match In mc
            Try
                If (k.Groups("Value").Value.Length > 0 And k.Groups("Key").Value = "request") Then
                    m = m & k.Groups("Key").Value & ":" & k.Groups("Value").Value.ToString & "<br/><br/>"
                End If

            Catch ex As Exception
                Dim se As String = ex.Message
            End Try
        Next
        Response.Write(m)
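
One possible adjustment, offered as a sketch rather than a drop-in fix: the Value group in the pattern above only admits the characters listed in its character class, and the apostrophe is not among them, so the match ends at "I". A pattern that accepts anything up to the next unescaped double quote keeps apostrophes and other punctuation. The module name and the shortened input string below are made up for illustration, and the pattern only captures string values, so numeric ones such as status are skipped.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexSketch
    Sub Main()
        ' Shortened, hypothetical stand-in for the input string in the question.
        Dim some_json As String = "{""request"":""Over the last 25 years, I've worked with most systems. \n "",""status"":200}"

        ' Value = any run of characters that are neither a double quote nor a
        ' backslash, or a backslash followed by one escaped character.
        Dim r As New Regex("""(?<Key>\w+)""\s*:\s*""(?<Value>(?:[^""\\]|\\.)*)""")

        For Each k As Match In r.Matches(some_json)
            Console.WriteLine(k.Groups("Key").Value & ": " & k.Groups("Value").Value)
        Next
    End Sub
End Module
```

The captured value still contains JSON escapes such as \n verbatim, because a regex does not decode them.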
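I solved this by throwing out the regex and using a recursive function that extracts anything from any depth of the JSON. It outputs name/value pairs. For a nested object, it stacks the key names (e.g. depth4.depth3.depth2) and then follows them with the value, so a simple string comparison on the combined key is enough to pull out a value.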

' Recursively flattens a JToken tree into "dotted.key.path" -> value pairs.
' Values that share the same path (e.g. array elements) are joined with "|".
Private Shared Function ParseJson(ByVal token As JToken, ByVal nodes As Dictionary(Of String, String), Optional ByVal parentLocation As String = "") As Boolean
    If token.HasValues Then

        ' A property contributes its name to the path of everything beneath it.
        If token.Type = JTokenType.[Property] Then
            Dim name As String = CType(token, JProperty).Name

            If parentLocation = "" Then
                parentLocation = name
            Else
                parentLocation &= "." & name
            End If
        End If

        For Each child As JToken In token.Children()
            ParseJson(child, nodes, parentLocation)
        Next

        Return True
    Else

        ' Leaf value (or empty container): store it under the accumulated path.
        If nodes.ContainsKey(parentLocation) Then
            nodes(parentLocation) &= "|" & token.ToString()
        Else
            nodes.Add(parentLocation, token.ToString())
        End If

        Return False
    End If
End Function
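
To make the stacked key names more concrete, here is a small sketch that prints a combined key for every leaf value. It does not call the function above; it swaps in Json.NET's own Descendants() and Path members, which track a similar dotted location for each token. The module name and the sample JSON are invented for illustration.

```vbnet
Imports System
Imports System.Linq
Imports Newtonsoft.Json.Linq

Module FlattenSketch
    Sub Main()
        ' Hypothetical sample input, not taken from the original page.
        Dim json As String = "{""included"":{""summary"":""I've worked with many systems"",""tags"":[""a"",""b""]},""status"":200}"
        Dim root As JObject = JObject.Parse(json)

        ' Descendants() walks the whole tree; JValue tokens are the leaves.
        For Each leaf As JValue In root.Descendants().OfType(Of JValue)()
            ' Path is the dotted location Json.NET keeps for every token.
            Console.WriteLine(leaf.Path & " = " & leaf.ToString())
        Next
    End Sub
End Module
```

One difference worth noting: Path gives array elements indexed keys such as included.tags[0], whereas the recursive function stores them under one key and joins the values with "|".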
Usage of ParseJson:

Private Sub Test_Load(sender As Object, e As EventArgs) Handles Me.Load

    Dim ServerPath As String = HttpRuntime.AppDomainAppPath
    Dim left_overs As String = String.Empty

    'download web page
    Dim html = New HtmlDocument()
    html.LoadHtml(New WebClient().DownloadString(ServerPath & "files/somefile.htm"))

    Dim txt As String = html.DocumentNode.InnerText, m As String = String.Empty

    Try
        Dim ndes As Array = html.DocumentNode.SelectNodes("//cde").ToArray
        For Each item As HtmlNode In ndes

            Dim s As String = item.InnerText.Trim
            'Response.Write(s)
            Try

                Dim nodes As Dictionary(Of String, String) = New Dictionary(Of String, String)()
                Dim rootObject As JObject = JObject.Parse(s)
                ParseJson(rootObject, nodes)

                For Each key As String In nodes.Keys
                    If key = "included.summary" Then

                        left_overs = AlphaNumericOnly(nodes(key))
                        m = m & key & " = " & left_overs & "<br/><br/>"
                        Response.Write(m)
                    End If
                Next

            Catch ex As Exception
                Dim err As String = ex.Message
            End Try

        Next
    Catch ex As Exception
    End Try

End Sub
Public Shared Function AlphaNumericOnly(strSource As String) As String
    Dim i As Integer
    Dim strResult As String = String.Empty

    For i = 1 To Len(strSource)
        Select Case Asc(Mid(strSource, i, 1))
            Case 32 To 91, 93 To 126 'printable ASCII, space (32) included; only the backslash (92) is skipped
                strResult = strResult & Mid(strSource, i, 1)
        End Select
    Next
    AlphaNumericOnly = strResult
End Function
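
For a flat object like the one in the question, a recursive walk may not even be needed. A minimal sketch, assuming the text is the bare JSON object (without the extra quotes that wrap the string in the question) and that Json.NET is referenced; the module name and the shortened sample text are placeholders:

```vbnet
Imports System
Imports Newtonsoft.Json.Linq

Module DirectLookupSketch
    Sub Main()
        ' Shortened, hypothetical stand-in for the string in the question.
        Dim some_json As String = "{""request"":""Over the last 25 years, I've worked with most systems. \n "",""status"":200}"

        Dim obj As JObject = JObject.Parse(some_json)

        ' The parser decodes escapes such as \n and leaves apostrophes untouched.
        Dim request As String = obj.Value(Of String)("request")
        Console.WriteLine(request.Trim())
    End Sub
End Module
```

Because the parser handles quoting and escaping itself, there is no character class to keep extending as new special characters turn up.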