Python regular expression to find strings inside multi-line curly braces

I have a string like the one below. How can I create a dictionary with First-tags as the key and everything after it as the value?

test_string = """###Some Comment 
First-tags : 
{
  "tag1": {
    "tagKey1": "tagValue1",
    "tagKey2": "tagValue2"
  },
  "tag2": {
    "tagKey1": "tagValue1",
    "tagKey2": "tagValue2"
  }
  so on .....
} 
"""
For example, the key would be First-tags and the value would be:

{
  "tag1": {
    "tagKey1": "tagValue1",
    "tagKey2": "tagValue2"
  },
  "tag2": {
    "tagKey1": "tagValue1",
    "tagKey2": "tagValue2"
  }
  so on .....
} 
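[Edit: the string data is in a file. The problem is to read from the file and create a dictionary in which the keys are the comments and the values are the JSON data.]

For example, the file would contain: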

###Some Comment 
    First-tags : 
    {
      "tag1": {
        "tagKey1": "tagValue1",
        "tagKey2": "tagValue2"
      },
      "tag2": {
        "tagKey1": "tagValue1",
        "tagKey2": "tagValue2"
      }
      so on .....
    } 


###2nd Comment 
    Second-tags : 
    {
      "tag1": {
        "tagKey1": "tagValue1",
        "tagKey2": "tagValue2"
      },
      "tag2": {
        "tagKey1": "tagValue1",
        "tagKey2": "tagValue2"
      }
      so on .....
    } 

###Some other Comment 
    someother-tags : 
    {
      "tag1": {
        "tagKey1": "tagValue1",
        "tagKey2": "tagValue2"
      },
      "tag2": {
        "tagKey1": "tagValue1",
        "tagKey2": "tagValue2"
      }
      so on .....
    } 

Here I convert the string to JSON.

For that to work, the string must contain JSON and nothing else.

So I find the first { and take the string from there onwards:

import json

my_str = '''
First-tags : 
{
  "tag1": {
    "tagKey1": "tagValue1",
    "tagKey2": "tagValue2"
  },
  "tag2": {
    "tagKey1": "tagValue1",
    "tagKey2": "tagValue2"
  }
  }
  '''
# find the first {
i = my_str.index('{')
my_str = my_str[i:]  # trim the string so that only the JSON object is left
my_dict = json.loads(my_str)  # json.loads already returns a dict for a JSON object
print(my_dict)
If needed, you can also find the end of the JSON and trim the string there (look for the closing }).
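One way to avoid trimming the end by hand is json.JSONDecoder.raw_decode from the standard library, which parses one JSON document starting at a given index and reports where it stopped. The sketch below is only an illustration of that idea; the function name extract_first_json and the sample text are made up for this example and are not part of the original answer.

```python
import json


def extract_first_json(text):
    """Parse the first JSON object in text, ignoring anything around it.

    Hypothetical helper for illustration. Returns the parsed object and
    whatever text follows the end of the JSON document.
    """
    start = text.index("{")  # raises ValueError if there is no "{"
    # raw_decode returns the parsed value and the index where parsing stopped
    obj, end = json.JSONDecoder().raw_decode(text, start)
    return obj, text[end:]


sample = '''
First-tags :
{
  "tag1": {"tagKey1": "tagValue1"}
}
trailing text that is not JSON
'''

parsed, remainder = extract_first_json(sample)
print(parsed)
```

Because raw_decode stops at the end of the first complete JSON value, text after the closing brace does not cause an error; the JSON itself still has to be valid.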

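Updated solution, based on the update to the question

You can use this regular expression. It captures the last run of word characters (including -) before the colon (:) in group 1, and then captures everything after the colon, up to the next comment (###) or the end of the string, in group 2: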

([\w-]+)\s*:\s*(.*?)(?=\s*###|$)
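For readers who find the one-line pattern hard to scan, the same expression can be laid out with re.VERBOSE and annotated piece by piece. This is only a restatement of the pattern above, not a different technique; the names TAG_BLOCK and sample are illustrative. Note that in verbose mode a bare # starts a comment, so the literal ### has to be escaped.

```python
import re

# Same pattern as above, spread out with re.VERBOSE (re.X) so each part
# can carry a comment. re.S lets "." match newlines as well.
TAG_BLOCK = re.compile(
    r"""
    ([\w-]+)        # group 1: the tag name, e.g. First-tags
    \s*:\s*         # the colon, with optional whitespace on both sides
    (.*?)           # group 2: everything after it, as little as possible
    (?=\s*\#{3}|$)  # stop before the next ### comment or at end of string
    """,
    re.S | re.X,
)

# Small made-up input, just to show what the two groups capture.
sample = '###A comment\nFirst-tags :\n{"a": 1}\n\n###Another\nSecond-tags :\n{"b": 2}\n'
matches = [(m.group(1), m.group(2)) for m in TAG_BLOCK.finditer(sample)]
print(matches)
```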
You can then build the dictionary by iterating over the matches in the string and using the two groups of each match:

import re

test_string = """
###Some Comment 
    First-tags : 
    {
      "tag1": {
        "tagKey1": "tagValue1",
        "tagKey2": "tagValue2"
      },
      "tag2": {
        "tagKey1": "tagValue1",
        "tagKey2": "tagValue2"
      }
      so on .....
    } 


###2nd Comment 
    Second-tags : 
    {
      "tag1": {
        "tagKey1": "tagValue1",
        "tagKey2": "tagValue2"
      },
      "tag2": {
        "tagKey1": "tagValue1",
        "tagKey2": "tagValue2"
      }
      so on .....
    } 

###Some other Comment 
    someother-tags : 
    {
      "tag1": {
        "tagKey1": "tagValue1",
        "tagKey2": "tagValue2"
      },
      "tag2": {
        "tagKey1": "tagValue1",
        "tagKey2": "tagValue2"
      }
      so on .....
    }
"""
res = {}
for match in re.finditer(r'([\w-]+)\s*:\s*(.*?)(?=\s*###|$)', test_string, re.S):
    res[match.group(1)] = match.group(2)

print(res)
Output:

{
 'First-tags': '{\n      "tag1": {\n        "tagKey1": "tagValue1",\n        "tagKey2": "tagValue2"\n      },\n      "tag2": {\n        "tagKey1": "tagValue1",\n        "tagKey2": "tagValue2"\n      }\n      so on .....\n    }',
 'Second-tags': '{\n      "tag1": {\n        "tagKey1": "tagValue1",\n        "tagKey2": "tagValue2"\n      },\n      "tag2": {\n        "tagKey1": "tagValue1",\n        "tagKey2": "tagValue2"\n      }\n      so on .....\n    }',
 'someother-tags': '{\n      "tag1": {\n        "tagKey1": "tagValue1",\n        "tagKey2": "tagValue2"\n      },\n      "tag2": {\n        "tagKey1": "tagValue1",\n        "tagKey2": "tagValue2"\n      }\n      so on .....\n    }'
}
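Note that the values in this result are still strings. If real dictionaries are wanted, each captured block could be passed through json.loads, as in the sketch below. It assumes the blocks are valid JSON, which the sample above is not because of the "so on ....." placeholder, so the sketch falls back to the raw text when parsing fails. The names PATTERN, parse_tagged_json and sample are hypothetical.

```python
import json
import re

PATTERN = re.compile(r'([\w-]+)\s*:\s*(.*?)(?=\s*###|$)', re.S)


def parse_tagged_json(text):
    """Map each tag name to its parsed JSON, or to the raw text if invalid."""
    result = {}
    for match in PATTERN.finditer(text):
        try:
            result[match.group(1)] = json.loads(match.group(2))
        except json.JSONDecodeError:
            # Keep the unparsed string rather than losing the block
            result[match.group(1)] = match.group(2)
    return result


sample = """
###Some Comment
    First-tags :
    {
      "tag1": {"tagKey1": "tagValue1"}
    }

###2nd Comment
    Second-tags :
    {
      "tag2": {"tagKey2": "tagValue2"}
    } so on .....
"""

result = parse_tagged_json(sample)
print(result["First-tags"])
```

In this sample the first block parses into a dict, while the second stays a string because of the trailing placeholder text.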
Update

If you also want to capture the comments, you can use the following code:

res = {}
for match in re.finditer(r'###([^\n]+)\s*([\w-]+)\s*:\s*(.*?)(?=\s*###|$)', test_string, re.S):
    res[match.group(1)] = { match.group(2) : match.group(3) }

print(res)
Output:

{
 'Some Comment ': {
   'First-tags': '{\n      "tag1": {\n        "tagKey1": "tagValue1",\n        "tagKey2": "tagValue2"\n      },\n      "tag2": {\n        "tagKey1": "tagValue1",\n        "tagKey2": "tagValue2"\n      }\n      so on .....\n    }'
 },
'2nd Comment ': {
   'Second-tags': '{\n      "tag1": {\n        "tagKey1": "tagValue1",\n        "tagKey2": "tagValue2"\n      },\n      "tag2": {\n        "tagKey1": "tagValue1",\n        "tagKey2": "tagValue2"\n      }\n      so on .....\n    }'
 },
 'Some other Comment ': {
  'someother-tags': '{\n      "tag1": {\n        "tagKey1": "tagValue1",\n        "tagKey2": "tagValue2"\n      },\n      "tag2": {\n        "tagKey1": "tagValue1",\n        "tagKey2": "tagValue2"\n      }\n      so on .....\n    }'
 }
}
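Since the edited question reads the data from a file, the pieces above could be combined roughly as follows. This is a sketch under stated assumptions: the file name tags.txt and the function load_commented_json are invented for the example, every block is assumed to be valid JSON (json.loads raises otherwise), and the comment keys are stripped of the trailing space visible in the output above.

```python
import json
import re
import tempfile
from pathlib import Path

BLOCK = re.compile(r'###([^\n]+)\s*([\w-]+)\s*:\s*(.*?)(?=\s*###|$)', re.S)


def load_commented_json(path):
    """Return {comment: {tag: parsed_json}} for a file of ###-commented blocks."""
    text = Path(path).read_text(encoding="utf-8")
    result = {}
    for match in BLOCK.finditer(text):
        comment = match.group(1).strip()  # drop the trailing space after the comment
        # json.loads raises json.JSONDecodeError if a block is not valid JSON
        result.setdefault(comment, {})[match.group(2)] = json.loads(match.group(3))
    return result


# Write a small made-up file so the sketch can be run end to end.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "tags.txt"
    sample_path.write_text(
        '###Some Comment \n    First-tags : \n    {"tag1": {"tagKey1": "tagValue1"}} \n\n'
        '###2nd Comment \n    Second-tags : \n    {"tag2": {"tagKey2": "tagValue2"}} \n',
        encoding="utf-8",
    )
    loaded = load_commented_json(sample_path)

print(loaded)
```

Using setdefault means several tags under the same comment would end up in the same inner dictionary, which matches the res[comment][tag] shape asked for in the comments below.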

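Comments:

Are you asking how to parse a json string that contains comments into a dictionary?

This seems a bit more complicated than that; does the string always contain First-tags?

Yes, but the file is structured like this: a comment, JSON data, another comment, JSON data, and so on. First-tags is just an example; it could be anything. Thanks.

This solution would solve my earlier problem, but I have edited the problem statement. Thanks for your help.

Hi Kuldeep, thanks for your help. I have accepted the other solution, which uses a regex. Your approach is good too, but for my use case I found the regex more helpful.

Absolutely no problem, I'm glad your problem is solved!

Thanks a lot. Will it work if I have a file in which this occurs multiple times? I have edited the problem statement.

@HarshKumar please see my edit. It uses re.finditer to find all the matches in the string.

Is it also possible to get the comment, i.e. print Some Comment as well?

@HarshKumar you keep moving the goalposts... How do you want the comment returned?

Sorry about that, I should have been clearer. Something like this: res[comment][match.group(1)] = match.group(2)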