Python/Pyparsing - Multiline quotes


I am trying to use pyparsing to match a multiline string that can be continued across lines in a Python-like way:

Test = "This is a long " \
       "string"
I can't find a way to get pyparsing to recognize this. Here is what I have tried so far:

import pyparsing as pp

src1 = '''
Test("This is a long string")
'''

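# The backslash is doubled so that it stays in the string as a literal
# character instead of acting as a line continuation inside the literal.
src2 = '''
Test("This is a long " \\
     "string")
'''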

_lp = pp.Suppress('(')
_rp = pp.Suppress(')')
_str = pp.QuotedString('"', multiline=True, unquoteResults=False)
func = pp.Word(pp.alphas)

function = func + _lp + _str + _rp
print(src1)
print(function.parseString(src1))
print('-------------------------')
print(src2)
print(function.parseString(src2))

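The problem is that a multiline quoted string does not do what you want. A multiline quoted string is literally a single string that contains newlines: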

import pyparsing as pp

src0 = '''
"Hello
 World
 Goodbye and go"
'''

pat = pp.QuotedString('"', multiline=True)
print(pat.parseString(src0))
Parsing this string produces:
['Hello\n World\n Goodbye and go']
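To make the distinction concrete, here is a small sketch contrasting QuotedString with and without multiline=True on the same input. The variable names (single_line, multi_line, single_line_matched, multi_line_tokens) are illustrative and not part of the original answer. It assumes a pyparsing version that still accepts the parseString and asList spellings used elsewhere in this thread.

```python
import pyparsing as pp

src0 = '''
"Hello
 World
 Goodbye and go"
'''

# Default QuotedString: the closing quote must be on the same line.
single_line = pp.QuotedString('"')
# multiline=True: newlines are allowed inside one pair of quotes.
multi_line = pp.QuotedString('"', multiline=True)

try:
    single_line.parseString(src0)
    single_line_matched = True
except pp.ParseException:
    single_line_matched = False

multi_line_tokens = multi_line.parseString(src0).asList()

print(single_line_matched)
print(multi_line_tokens)
```

In both cases there is exactly one pair of quotes. Neither variant joins several separately quoted pieces, which is what the question actually needs.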

As far as I know, if you want a string that behaves like Python's adjacent string literals, you have to define it yourself:

import pyparsing as pp

src1 = '''
Test("This is a long string")
'''

src2 = '''
Test("This is a long"
    "string")
'''

src3 = '''

Test("This is a long" \\
     "string")
'''

_lp = pp.Suppress('(')
_rp = pp.Suppress(')')
_str = pp.QuotedString('"')
_slash = pp.Suppress(pp.Optional("\\"))
_multiline_str = pp.Combine(pp.OneOrMore(_str + _slash), adjacent=False)

func = pp.Word(pp.alphas)

function = func + _lp + _multiline_str + _rp

print(src1)
print(function.parseString(src1))
print('-------------------------')
print(src2)
print(function.parseString(src2))
print('-------------------------')
print(src3)
print(function.parseString(src3))
This produces the following output:

Test("This is a long string")

['Test', 'This is a long string']
-------------------------

Test("This is a long"
    "string")

['Test', 'This is a longstring']
-------------------------

Test("This is a long" \
     "string")

['Test', 'This is a longstring']

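Note: the Combine class merges the individual quoted strings into a single unit, so that they appear as one string in the output list. The backslash is suppressed so that it does not become part of the combined output string. Passing adjacent=False lets Combine accept pieces that are separated by whitespace or newlines.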
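The effect of Combine can be seen by parsing the same input with and without it. This is a minimal sketch that reuses the _str and _slash definitions from the answer; the names pieces, combined, separate_tokens and single_token are hypothetical and introduced only for illustration.

```python
import pyparsing as pp

_str = pp.QuotedString('"')
_slash = pp.Suppress(pp.Optional("\\"))

# Without Combine: each quoted piece comes back as its own token.
pieces = pp.OneOrMore(_str + _slash)
# With Combine: the pieces are joined into one token.
# adjacent=False allows whitespace and newlines between the pieces.
combined = pp.Combine(pieces, adjacent=False)

# Two quoted pieces joined by a literal backslash and a newline.
src = '"This is a long " \\\n     "string"'

separate_tokens = pieces.parseString(src).asList()
single_token = combined.parseString(src).asList()

print(separate_tokens)
print(single_token)
```

Because the Optional backslash is wrapped in Suppress, it never reaches the token list, so Combine has nothing to join except the unquoted string contents.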

Thanks, this is exactly what I was hoping for!