Python: regex to extract the form JSON from an HTML source

Tags: python, json, regex

I've run into a problem: I can't extract a JSON value from an HTML source with a regex.

The HTML source looks like this:

<script data-csp-hash=""> window.__webpack_public_path__='https://renderer-assets.typeform.com/';
window.__webpack_nonce__='3088edaa602c001b5f6e1f31e3179422';
window.rendererAssets='["https://renderer-assets.typeform.com/vendors~libphonenumber~submission.c94d30638908af997673.js","https://renderer-assets.typeform.com/country-data.526012987a7e72182726.js","https://renderer-assets.typeform.com/form-container.98c74a2ac320736bdb16.js","https://renderer-assets.typeform.com/renderer.8282fd35106b77e43e2f.js","https://renderer-assets.typeform.com/submission.5d9a15e294b33a20ea2e.js","https://renderer-assets.typeform.com/vendors~form-container.b5fb128466f604baadba.js","https://renderer-assets.typeform.com/vendors~video.aa830e76dcc8735c9936.js","https://renderer-assets.typeform.com/video.45eca666f47b245e8fdb.js"]';
window.rendererData= {
    rootDomNode: 'root',
    form:     {
            "id":"Z3PvTW",
            "title":"Testing",
            "welcome_screens":[ {
                "ref":"a13820db-af60-40eb-823d-86cf0f20299b",
                "title":"Yessir!",
                "properties": {
                    "show_button": true, "button_text": "Start"
                }
            }
            ],
            "thankyou_screens":[ {
                "ref":"default_tys",
                "title":"Done! Your information was sent perfectly.",
                "properties": {
                    "show_button": false, "share_icons": false
                }
            }
            ],
            "fields":[ {
                "id":"kxWycKljdtBq",
                "title":"FIRST NAME",
                "ref":"27f403f7-8c5b-4e18-b19d-1501e8f137ee",
                "validations": {
                    "required": true
                }
                ,
                "type":"short_text"
            }
            ,
            {
                "id":"WEXCnZ7EAFjN",
                "title":"LAST NAME",
                "ref":"a6bf6d83-ee37-4870-b6c5-779822290cde",
                "validations": {
                    "required": true
                }
                ,
                "type":"short_text"
            }
            ,
            {
                "id":"ButwoV1bTge5",
                "title":"EMAIL ADDRESS",
                "ref":"8860a4cf-71ec-4bfa-a2c7-934fd405f200",
                "properties": {
                    "description": "Note for stackoverflow!"
                }
                ,
                "validations": {
                    "required": true
                }
                ,
                "type":"email"
            }
            ],
            "_links": {
                "display": "link.com"
            }
        }
    ,
    messages: {
        "a11y.file-upload.remove":"Remove uploaded file",
    }
    ,
    trackingInfo: {
        "segmentKey": "9at6spGDYXelHDdz4r0cP73b3wV1f0ri", "accountId": 12587347, "accountLimitName": "Essentials", "userId": 12586030
    }
    ,
    stripe: null,
    showBranding: true,
    accessScheduling: {
        "closeScreenData": {
            "title":"This typeform isn't accepting new responses",
            "description":"",
            "brandingMottoText":"How you ask is everything",
            "brandingButtonText":"Create a *typeform*",
            "attachment": {}
            ,
            "textColor": "#3D3D3D", "showBranding": true, "brandingButtonColor": "#000000", "buttonRedirectLink": "https:\u002F\u002Fwww.typeform.com\u002Fsignup?utm_campaign=undefined&utm_source=typeform.com-12587347-Essentials&utm_medium=typeform&utm_content=typeform-closescreen&utm_term=EN"
        }
    }
    ,
    featureFlags: {
        "always-inject-new-relic": false, "beta-testers": false, "sb-3671-inline-submit-flow": "out-of-experiment", "sb-3671-new-submit-flow": false
    }
}

;
window.rendererTheme= {
    color: '#3D3D3D',
    backgroundColor: {
        red: '255', green: '255', blue: '255'
    }
}

;
I can almost scrape it with this:

(?sm)^\s*form:\s*{(.*?)\n}$ #Not quite sure if this would work in Python however.

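My problem, however, is that it keeps scraping what comes after the form value, such as messages, trackingInfo, stripe and so on. I only want to grab the form JSON and nothing else.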
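To see why the pattern runs past the form value, the sketch below applies it to a shortened stand-in for the page source (the `SAMPLE` string is an assumption made for illustration, not the real page). The lazy group can only stop at a `}` that sits at the very start of a line, and the first such brace is the one that closes `window.rendererData`.

```python
import re

# Hypothetical, shortened stand-in for the page source shown above.
SAMPLE = """window.rendererData= {
    rootDomNode: 'root',
    form:     {
            "id":"Z3PvTW",
            "title":"Testing"
        }
    ,
    messages: {
        "a11y.file-upload.remove":"Remove uploaded file",
    }
    ,
    stripe: null
}

;
"""

# The pattern from the question, unchanged.
pattern = re.compile(r"(?sm)^\s*form:\s*{(.*?)\n}$")
captured = pattern.search(SAMPLE).group(1)

# Every inner "}" is indented, so "\n}" first matches at the brace
# that closes rendererData, and the keys after form are captured too.
print("messages" in captured)  # True
print("stripe" in captured)    # True
```

So the pattern does work in Python, but its end anchor is tied to the outer object rather than to the form object.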


How can I write a regex that grabs only the form: JSON value?

You can try the following:

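import re

data = '''....'''
# Grab everything from "form:" through the word "messages".
data = re.findall(r"form:[\S\s]*messages", data)[0]
# Drop the leading "form:" key.
data = re.sub(r"^form:", "", data)
# Remove every comma-newline pair and every whitespace-prefixed "messages".
data = re.sub(r",\n|\smessages", "", data)
print(data)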
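As an alternative to trimming the string with substitutions, here is a minimal sketch that lets the standard `json` module find the end of the form object. It assumes the value after `form:` is strict JSON, which holds for the sample in the question; `SAMPLE` and the helper name `extract_form` are made up for illustration.

```python
import json
import re

# Hypothetical, shortened stand-in for the page source from the question.
SAMPLE = """window.rendererData= {
    rootDomNode: 'root',
    form:     {
            "id":"Z3PvTW",
            "title":"Testing",
            "fields":[ {
                "id":"kxWycKljdtBq",
                "type":"short_text"
            }
            ]
        }
    ,
    messages: {
        "a11y.file-upload.remove":"Remove uploaded file",
    }
}
;
"""


def extract_form(source):
    """Return the object that follows the `form:` key as a Python dict."""
    match = re.search(r"\bform:\s*", source)
    if match is None:
        raise ValueError("no 'form:' key found")
    # raw_decode parses one JSON value at the start of the string and
    # ignores whatever follows it, so no end marker is needed.
    form, _end = json.JSONDecoder().raw_decode(source[match.end():])
    return form


form = extract_form(SAMPLE)
print(form["id"])
print([field["type"] for field in form["fields"]])
```

Because the decoder tracks nested braces itself, this does not depend on which key happens to follow `form` in the page.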

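It seems to get the form right, but it also prints the values that come after the form JSON, while skipping the messages JSON. In this case it prints form along with trackingInfo, stripe, showBranding, window.rendererTheme and so on. So basically, if there were a way to stop at data = re.sub("\,\n|\smessages","",data) and keep only what comes before the messages value, that might solve the problem.

Then try data = re.sub("\,\n|\smessages]*","",data)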
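The idea of keeping only what comes before the `messages` key can also be sketched as a single pattern. This assumes `messages:` always directly follows the form object, as it does in the question's source; `SAMPLE` and `FORM_RE` are names invented for the example.

```python
import json
import re

# Hypothetical, shortened stand-in for the page source from the question.
SAMPLE = """window.rendererData= {
    rootDomNode: 'root',
    form:     {
            "id":"Z3PvTW",
            "title":"Testing",
            "_links": {
                "display": "link.com"
            }
        }
    ,
    messages: {
        "a11y.file-upload.remove":"Remove uploaded file",
    }
    ,
    trackingInfo: {
        "accountId": 12587347
    }
}
;
"""

# Match the braces after "form:" lazily, but only accept a closing "}"
# that is followed by ", messages:" (whitespace allowed in between).
FORM_RE = re.compile(r"form:\s*(\{.*?\})\s*,\s*messages:", re.DOTALL)

match = FORM_RE.search(SAMPLE)
form = json.loads(match.group(1))
print(form["title"])
print("trackingInfo" in match.group(1))
```

If the page ever reorders its keys, the pattern stops matching and `match` is `None`, so a real script would want to check for that before calling `group`.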
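Hey @NicolasGrenié, oh wow, I didn't know that. This will actually make things 1000 times easier for me. Thank you very much! The reason is that once you can turn it into JSON with a dedicated parser, you can send requests to the typeform, so my idea is to make a script that can fill in and submit the typeform for me :) if you know what I mean?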