Python: how do I split a string into a list, but not inside brackets?

python regex

Sorry about the title, I didn't know how to phrase it. Anyway, I'm writing a markup language compiler in Python that compiles to HTML. For example:

-(a){href:"http://www.google.com"}["Click me!"]

compiles to:

<a href="http://www.google.com">Click me!</a>

and nested markup should produce:

<title>Nested tags</title>
<h1>Nested tags</h1>
<p>Paragraph</p>

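Each line is interpreted as a separate command, so my regexps

\((.+)\)\[(.+)\]

and

\((.+)\)\{(.+)\}\[(.+)\]

do not match things like

-(head)[

I think the best solution is to split on "-" unless it is inside a command body, which would give: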

(head)[-(title)[Nested tags]
(body)[-(div)[-(h1)[Nested tags]-(p)[Paragraph]]]

and then run the same code on each command inside each block.
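To make the failure concrete, here is a small check, assuming the two patterns read `\((.+)\)\[(.+)\]` and `\((.+)\)\{(.+)\}\[(.+)\]` (the names `SIMPLE` and `WITH_ATTRS` are made up for this illustration). The single-line command matches, but a line that only opens a block has no closing `]`, so nothing matches:

```python
import re

# Assumed forms of the two single-line patterns from the question.
SIMPLE = re.compile(r'\((.+)\)\[(.+)\]')
WITH_ATTRS = re.compile(r'\((.+)\)\{(.+)\}\[(.+)\]')

flat = '-(a){href:"http://www.google.com"}["Click me!"]'
opening_only = '-(head)['

# The whole command sits on one line, so all three groups are captured.
print(WITH_ATTRS.search(flat).groups())

# The body continues on later lines, so there is no "]" to match against.
print(SIMPLE.search(opening_only))  # None
```

Both patterns need the matching `]` on the same line, which is why line-by-line matching cannot handle nested blocks.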

TL;DR: take the input:

"-abc-def[-ignore-me]-ghi"

and produce:

["abc", "def", "[-ignore-me]", "ghi"]

Any help is much appreciated.

I believe this hacky code does something close to what you want:

import re

re_name = re.compile(r'-\(([^)]+)\)')
re_args = re.compile(r'{([^}]+)}')

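def parse(chars, history=None):
  # Use a fresh list per top-level call; nested calls share it explicitly
  # instead of relying on a mutable default argument.
  if history is None:
    history = []
  body = ''
  while chars:
    char = chars.pop(0)
    if char == '[':
      # body holds the header read so far, e.g. -(a){href:"..."}
      name = re_name.search(body).group(1)
      args = re_args.search(body)
      start = '<'+name
      if args:
        for arg in args.group(1).split(','):
          # Replace only the first ':' so values such as URLs stay intact.
          start += ' '+arg.strip().replace(':', '=', 1)
      start += '>'
      end = '</'+name+'>'
      history.append(start)
      # The nested call appends inner tags to history and returns the text.
      history.append(parse(chars, history))
      history.append(end)
      body = ''
      continue
    if char == ']':
      return body.strip()
    body += char
  return history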

code = '''
-(head)[
    -(title)[Nested tags]
    -(random)[Stuff]
]

-(body){class:"nav", other:stuff, random:tests}[
    -(div)[
        -(h1)[Nested tags]
        -(p)[Paragraph]
    ]
]
'''

parsed = parse(list(code))
print(''.join(parsed))
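For the literal TL;DR case, a minimal sketch that tracks bracket depth may be enough. The function name `split_top_level` and its parameters are hypothetical, not part of the question or of any library; the idea is simply to treat "-" as a separator only while outside any brackets:

```python
def split_top_level(text, sep='-', open_ch='[', close_ch=']'):
    """Split text on sep, keeping each bracketed group as a single item."""
    parts = []
    current = []
    depth = 0  # how many brackets are currently open
    for ch in text:
        if ch == open_ch:
            # A top-level "[" ends the item collected before it.
            if depth == 0 and current:
                parts.append(''.join(current))
                current = []
            depth += 1
            current.append(ch)
        elif ch == close_ch and depth > 0:
            depth -= 1
            current.append(ch)
            # Closing the outermost bracket completes the group.
            if depth == 0:
                parts.append(''.join(current))
                current = []
        elif ch == sep and depth == 0:
            # Separators only count outside brackets.
            if current:
                parts.append(''.join(current))
                current = []
        else:
            current.append(ch)
    if current:
        parts.append(''.join(current))
    return parts

print(split_top_level("-abc-def[-ignore-me]-ghi"))
# ['abc', 'def', '[-ignore-me]', 'ghi']
```

Because it counts depth rather than matching patterns, nested brackets such as `[b[-c]-d]` also stay in one piece.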

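if code == "-abc-def[-ignore-me]-ghi": return ["abc", "def", "[-ignore-me]", "ghi"] :)

@Scorpion_God At first I thought you had given me a correct answer :)

I don't think any regex can work, because your markup is not a regular language. You will probably need to use a parser library.

So is the syntax -(tag){attributes}[content]? Where [content] is optional and attributes looks like name:value, name:value...?

@Blckknght You are right, of course, but why don't we try writing our own?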
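In the spirit of writing our own, here is a rough recursive-descent sketch that uses a position index instead of a shared list. It assumes the syntax is -(tag){attributes}[content] with optional attributes, as guessed in the comments above; `HEADER` and `render` are hypothetical names, and quoted content such as ["Click me!"] would keep its quotes:

```python
import re

# One command header: -(name), an optional {attributes} part, then "[".
HEADER = re.compile(r'-\(([^)]+)\)(?:\{([^}]*)\})?\[')

def render(text, pos=0):
    """Return (html, next_pos), reading from pos up to an unmatched ']'."""
    out = []
    while pos < len(text):
        match = HEADER.match(text, pos)
        if match:
            name, attrs = match.group(1), match.group(2)
            # Recurse for the body; the call returns where the body ended.
            inner, pos = render(text, match.end())
            tag = '<' + name
            if attrs:
                for pair in attrs.split(','):
                    # Split on the first ':' only, so URLs stay intact.
                    key, _, value = pair.strip().partition(':')
                    tag += ' ' + key + '=' + value
            out.append(tag + '>' + inner.strip() + '</' + name + '>')
        elif text[pos] == ']':
            # End of the enclosing body: hand control back to the caller.
            return ''.join(out), pos + 1
        else:
            out.append(text[pos])
            pos += 1
    return ''.join(out), pos

html, _ = render('-(div)[-(h1)[Nested tags]-(p)[Paragraph]]')
print(html)  # <div><h1>Nested tags</h1><p>Paragraph</p></div>
```

Each nested body is handled by a nested call, which is the part a single regex cannot express. Attribute values containing commas or braces are not handled here.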
