How to build a "LeadBy" equivalent of the FollowedBy subclass in pyparsing

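I am trying to clean up some code by removing leading or trailing whitespace characters with pyparsing. Removing the leading whitespace was quite easy, because I could use the FollowedBy subclass, which matches a string but does not include it in the result. Now I need the same thing for whatever comes after my identifying string.

Here is a small example: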

from pyparsing import *

insource = """
annotation (Documentation(info="  
  <html>  
<b>FOO</b>
</html>  
 "));
"""
# Working replacement:
HTMLStartref = OneOrMore(White(' \t\n')) + (FollowedBy(CaselessLiteral('<html>')))

## Not working because of non-existing "LeadBy" 
# HTMLEndref = LeadBy(CaselessLiteral('</html>')) + OneOrMore(White(' \t\n')) + FollowedBy('"')

out = Suppress(HTMLStartref).transformString(insource)
out2 = Suppress(HTMLEndref).transformString(out)

As output, we get:

>>> print out
annotation (Documentation(info="<html>
<b>FOO</b>
</html>
 "));

whereas the result should be:

>>> print out2
annotation (Documentation(info="<html>
<b>FOO</b>
</html>"));

I looked, but could not find a "LeadBy" equivalent of FollowedBy, nor a way to implement one.
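What you are asking for is something like "lookbehind", that is, matching only if something is preceded by a particular pattern. I don't have an explicit class for that at the moment, but for what you want to do you can still transform left to right: keep the leading part in the output instead of suppressing it, and suppress only the whitespace.

Here are a couple of ways to address your problem: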

# define expressions to match leading and trailing
# html tags, and just suppress the leading or trailing whitespace
opener = White().suppress() + Literal("<html>")
closer = Literal("</html>") + White().suppress()

# define a single expression to match either opener
# or closer - have to add leaveWhitespace() call so that
# we catch the leading whitespace in opener
either = opener|closer
either.leaveWhitespace()

print either.transformString(insource)

# alternative, if you know what the tag will look like:
# match 'info=<some double quoted string>', and use a parse
# action to extract the contents within the quoted string,
# call strip() to remove leading and trailing whitespace,
# and then restore the original '"' characters (which are
# auto-stripped by the QuotedString class by default)
infovalue = QuotedString('"', multiline=True)
infovalue.setParseAction(lambda t: '"' + t[0].strip() + '"')
infoattr = "info=" + infovalue

print infoattr.transformString(insource)
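
For readers on a newer pyparsing: to the best of my knowledge, releases from 2.3.0 onward include a PrecededBy class, which is a lookbehind counterpart to FollowedBy and so plays the role of the "LeadBy" the question asks for. The following is only a sketch under that assumption; it uses Python 3 print syntax, and the names html_start and html_end are illustrative rather than taken from the original post.

```python
from pyparsing import CaselessLiteral, FollowedBy, PrecededBy, Suppress, White

insource = (
    'annotation (Documentation(info="  \n'
    '  <html>  \n'
    '<b>FOO</b>\n'
    '</html>  \n'
    ' "));\n'
)

# Leading whitespace: matched only when "<html>" comes right after it
# (same idea as HTMLStartref in the question).
html_start = White(" \t\n") + FollowedBy(CaselessLiteral("<html>"))

# Trailing whitespace: matched only when "</html>" comes right before it
# and a double quote comes right after it.
# Assumption: the installed pyparsing provides PrecededBy.
html_end = PrecededBy(CaselessLiteral("</html>")) + White(" \t\n") + FollowedBy('"')

# Two passes, as in the question: strip the leading run, then the trailing run.
out = Suppress(html_start).transformString(insource)
out2 = Suppress(html_end).transformString(out)
print(out2)
```

Neither lookaround expression consumes input, so only the whitespace run itself is removed and the tags stay in place.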

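Thank you, Paul! That is exactly what I was looking for. Since the actual problem is more complex, I will stick with the first solution (although I really like the second implementation and will try to keep it in mind).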