Multiple regular expressions in Python


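I'm creating a programming language. For this language, I'm writing a program that compiles it to Python. I don't need a lexer, because most of the syntax can be converted to Python with regular expressions.

Here is what I have so far: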

import re

infile = input()

output = open(infile + ".py","w")
input = open(infile + ".hlx")
# I'm aware the .hlx extension is already taken, but it doesn't really matter.

for line in input:
    output.write(re.sub(r'function (\S+) (\S+) =', r'def \1(\2):', line))

for line in input:
    output.write(re.sub(r'print(.+)', r'print(\1)', line))

for line in input:
    output.write(re.sub(r'call (\S+) (\S+)', r'\1(\2)', line))

# More regexes go here, eventually.

input.close()
output.close()

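I had to put each regex in a separate for statement, because if I put them all together, it would replace every line three times.

The problem is that only one of the regexes is executed: the first one. The order doesn't matter here, but I still need the program to execute all of them. How can I do that?

By the way, here is the code in my language that I want to convert: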

function hello input =
    print "Hello, ", input, "!"
hello "world"

And here is the Python code I want it converted to:

def hello(input):
    print("Hello, " + input + "!")
hello("world")

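Perform all the substitutions one after another inside a single loop. I would also suggest keeping the regexes and their replacements in a separate data structure, which makes further extension easier: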

conversions = (
  (r'function (\S+) (\S+) =', r'def \1(\2):'),
  (r'print(.+)',              r'print(\1)'  ),
  (r'call (\S+) (\S+)',       r'\1(\2)'     ),
)

for line in input:
    for (pattern, sub) in conversions:
        line = re.sub(pattern, sub, line)
    output.write(line)
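
For anyone who wants to try the loop above without creating files, here is a minimal self-contained sketch. It reuses the three rules from the answer; the names CONVERSIONS, translate_line and translate are made up for illustration, and a list of strings stands in for the opened .hlx file.

```python
import re

# Each rule is (pattern, replacement); rules are applied from top to bottom.
CONVERSIONS = (
    (r'function (\S+) (\S+) =', r'def \1(\2):'),
    (r'print(.+)',              r'print(\1)'),
    (r'call (\S+) (\S+)',       r'\1(\2)'),
)

def translate_line(line, conversions=CONVERSIONS):
    """Apply every rule, in order, to one line and return the result."""
    for pattern, replacement in conversions:
        line = re.sub(pattern, replacement, line)
    return line

def translate(lines, conversions=CONVERSIONS):
    """Translate an iterable of lines; each line is processed exactly once."""
    return [translate_line(line, conversions) for line in lines]

# Hypothetical input; the last line uses the `call` keyword the third rule expects.
source = [
    'function hello input =\n',
    '    print "Hello, ", input, "!"\n',
    'call hello "world"\n',
]
translated = translate(source)
print(''.join(translated), end='')
```

With these rules the second line comes out as print( "Hello, ", input, "!"), because (.+) also captures the space after print. A line such as hello "world" without the call keyword is left unchanged, since none of the three patterns match it.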

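If you want to iterate over an open file more than once, you need to seek() back to the beginning of the file. Also, why not assign the output of each re.sub call to a variable, so that every re.sub can be applied to the same line before writing?

Do you also know how to run the regexes in a specific order? For example: do this regex, then this one, then this one…

They are applied in the order in which they appear in the tuple.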
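
The point about seek() can be checked with a small sketch. It assumes io.StringIO behaves like the opened text file for this purpose; the sample contents are made up.

```python
import io

# io.StringIO stands in for the opened .hlx file (an assumption for this demo).
source = io.StringIO('line one\nline two\n')

first_pass = list(source)    # reads both lines
second_pass = list(source)   # the file position is at the end, so nothing is read

source.seek(0)               # rewind to the beginning of the file
third_pass = list(source)    # reads both lines again

print(first_pass)
print(second_pass)
print(third_pass)
```

This is why only the first for loop in the question appears to do anything: the second and third loops start at the end of the file. Note that rewinding alone would not fix the original program, since each loop would then write every line to the output again. Applying all substitutions to a line before writing it avoids that.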