Python: regex lexer for G-code with Pygments


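I am trying to create a very simple lexer for Pygments, so that I can gain some Python experience while doing something useful before moving on to a more complex one. The lexer is for G-code, and so far I am able to:

highlight line comments (but not block comments)
highlight M and G commands (but not the other commands: X, Y, Z, etc.)

This is gcodelexer.py: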

from pygments.lexer import RegexLexer
from pygments.token import *

__all__ = ['gcodeLexer']

class gcodeLexer(RegexLexer):
    name = 'g-code'
    aliases = ['gcode']
    filenames = ['*.gcode']

    tokens = {
        'root': [
            (r' .*\n', Text),
            (r';.*$', Comment),
            (r'^[gmtGMT]\d{1,4}\s',Name.Builtin), # M or G commands
            (r'[^gGmM][+-]?\d*[.]?\d+', Keyword), # other commands
            # (r'\+.*\n', Generic.Inserted),
            # (r'-.*\n', Generic.Deleted),
            # (r'@.*\n', Generic.Subheading),
            # (r'Index.*\n', Generic.Heading),
            # (r'=.*\n', Generic.Heading),
            (r'.*\n', Text),
        ]
    }

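Basically, the "other commands" rule only finds the first of the two or three parameters on each line, and I don't understand why. I also tried to find a description of each token type (Keyword, Name, Operator, etc.) without success. Are their names supposed to be self-explanatory?

Thanks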
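A likely cause, offered as a reading of the rule list rather than a confirmed diagnosis: RegexLexer tries the rules of a state in order at the current position and takes the first one that matches. After the first parameter the text continues with a space, so the first rule, r' .*\n', matches and consumes the rest of the line as Text. The sketch below imitates that first-match-wins loop with the standard re module. The scan function and the string labels are stand-ins for illustration, not Pygments code.

```python
import re

# Rules copied from the 'root' state of the question, in the same order.
# The labels are plain strings standing in for the Pygments token types.
RULES = [
    (r' .*\n', 'Text'),
    (r';.*$', 'Comment'),
    (r'^[gmtGMT]\d{1,4}\s', 'Name.Builtin'),
    (r'[^gGmM][+-]?\d*[.]?\d+', 'Keyword'),
    (r'.*\n', 'Text'),
]
COMPILED = [(re.compile(pattern, re.MULTILINE), label) for pattern, label in RULES]


def scan(text):
    """Try the rules in order at each position; the first match wins."""
    pos, out = 0, []
    while pos < len(text):
        for regex, label in COMPILED:
            match = regex.match(text, pos)
            if match and match.end() > pos:
                out.append((label, match.group()))
                pos = match.end()
                break
        else:
            # No rule matched: record the character and move on.
            out.append(('Error', text[pos]))
            pos += 1
    return out


tokens = scan('G1 X48.936 Y20.744 E0.02939\n')
for label, value in tokens:
    print(label, repr(value))
```

In this simulation the Y and E words never reach the Keyword rule, because the leading-space rule claims them first. Removing that rule and handling whitespace with its own short pattern, as the updated version does, avoids the problem.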

Update: current version

from pygments.lexer import RegexLexer
from pygments.token import *

__all__ = ['gcodeLexer']

class gcodeLexer(RegexLexer):
    name = 'g-code'
    aliases = ['gcode']
    filenames = ['*.gcode']

    tokens = {
        'root': [
            (r'^;.*$', Comment),
            (r'\s;.*', Comment.Multiline, 'blockcomment'),
            (r'^[gmtGMT]\d{1,4}\s',Name.Builtin), # M or G commands
            (r'[^gGmM][+-]?\d*[.]?\d+', Keyword),
            (r'\s', Text.Whitespace),
            (r'.*\n', Text),
        ],
        'blockcomment': [
            (r'.*;.*$', Comment.Multiline, '#pop'),
            (r'^.*\n', Comment.Multiline),
            (r'.', Comment.Multiline),
        ]
    }
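One possible simplification of this version is sketched below. It assumes that a `;` comment always runs to the end of its line, so no separate 'blockcomment' state is needed, and it accepts a letter followed by a number anywhere on the line instead of only at the start. The class name GcodeSketchLexer, the alias 'gcode-sketch', and the choice of token types are illustrative assumptions, not part of the original code.

```python
from pygments.lexer import RegexLexer
from pygments.token import Comment, Keyword, Name, Text, Whitespace


class GcodeSketchLexer(RegexLexer):
    """Illustrative G-code lexer; names and token choices are assumptions."""
    name = 'g-code (sketch)'
    aliases = ['gcode-sketch']
    filenames = ['*.gcode']

    tokens = {
        'root': [
            # Assumption: a ';' comment ends at the end of the line.
            (r';.*', Comment.Single),
            # G, M and T words, e.g. G1, M109, T0.
            (r'[GMTgmt]\d+', Name.Builtin),
            # Any other letter followed by a signed or decimal number.
            (r'[A-Za-z][+-]?\d*\.?\d+', Keyword),
            (r'\s+', Whitespace),
            # Fallback so that unexpected characters do not become errors.
            (r'.', Text),
        ]
    }


tokens = list(GcodeSketchLexer().get_tokens('G1 X48.936 Y20.744 E0.02939 ;move\n'))
for token_type, value in tokens:
    print(token_type, repr(value))
```

Rule order still matters here: the comment rule comes first so that letters and digits inside a comment are not picked up by the later rules.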


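For anyone who needs a G-code lexer: here it is, written with help from @Xander. If you want to contribute to its improvement, it has an official GitHub repository.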

Actually, I just wrote a regular expression to parse G-code. It handles all G and M codes as well as the X, Y, Z, I, J, K and F codes. Here is the regex I used:

(G|M|X|Y|Z|I|J|K|F)(-?\d*\.?\d+)

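You can check whether the first group is G or M, and the second group then gives you the specific code. If you can post some sample G-code, I will see whether I can edit it to work for you as well.

Thanks Xander, but I think there are many more codes (from A to Z), although some are used less often than the ones you list. The problem I have is with the first Text regex. I will use the current version. Would you suggest any modifications? Maybe different tokens? Thanks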
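The two-group idea can be tried with the standard re module. This is a minimal sketch that reads the pattern as (G|M|X|Y|Z|I|J|K|F)(-?\d*\.?\d+); the variable names and the sample line are chosen here for illustration.

```python
import re

# Two capturing groups: the code letter, then its numeric value.
CODE = re.compile(r'(G|M|X|Y|Z|I|J|K|F)(-?\d*\.?\d+)')

line = 'G1 F1800 X48.687 Y21.229 E0.01913'
pairs = CODE.findall(line)
print(pairs)

# Group 1 tells commands (G or M) apart from parameters.
commands = [(letter, value) for letter, value in pairs if letter in ('G', 'M')]
parameters = [(letter, value) for letter, value in pairs if letter not in ('G', 'M')]
print(commands)
print(parameters)
```

The E word is not reported because E is not in the letter list, which is the limitation raised in the reply about codes running from A to Z.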

[^gm0-9](?:-?\d*\.?\d+)

should be what you are looking for to detect commands that are not G or M codes. To detect any command, you can use:

([A-z])(-?\d*\.?\d+)
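Two details of these patterns are worth checking before relying on them. The sketch below assumes they are used with Python's re module. Without re.IGNORECASE, [^gm0-9] still accepts an uppercase G or M, and the range A-z also covers the ASCII punctuation between Z and a, such as the underscore. The stricter [A-Za-z] variant and the test strings are suggestions made here, not part of the original answer.

```python
import re

NUMBER = r'-?\d*\.?\d+'

# "Not G or M" pattern, with and without case-insensitive matching.
non_gm_sensitive = re.compile(r'[^gm0-9](?:' + NUMBER + r')')
non_gm_insensitive = re.compile(r'[^gm0-9](?:' + NUMBER + r')', re.IGNORECASE)

line = 'G1 X48.936 Y20.744'
sensitive_hits = non_gm_sensitive.findall(line)
insensitive_hits = non_gm_insensitive.findall(line)
print(sensitive_hits)
print(insensitive_hits)

# "Any command" pattern as given, next to a letters-only variant.
any_code = re.compile(r'([A-z])(' + NUMBER + r')')
letters_only = re.compile(r'([A-Za-z])(' + NUMBER + r')')

odd_line = 'X1.5 _7'
loose_hits = any_code.findall(odd_line)
strict_hits = letters_only.findall(odd_line)
print(loose_hits)
print(strict_hits)
```

For well-formed G-code the two letter classes give the same result; the difference only shows up on stray characters.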

Sample G-code:

M190 S50.000000
M109 S250.000000
;Sliced at: Sun 03-07-2016 17:55:50
;Basic settings: Layer height: 0.3 Walls: 1.2 Fill: 20
;Print time: 1 hour 9 minutes
;Filament used: 2.584m 20.0g
;Filament cost: 0.37
;M190 S50 ;Uncomment to add your own bed temperature line
;M109 S250 ;Uncomment to add your own temperature line
G21        ;metric values
G90        ;absolute positioning
M82        ;set extruder to absolute mode
G28 X0 Y0  ;move X/Y to min endstops
G0 X100 Y100
G28 Z0     ;move Z to min endstops
G29
G1 Z15.0 F100 ;move the platform down 15mm
G92 E0                  ;zero the extruded length
G1 F200 E3              ;extrude 3mm of feed stock
G92 E0                  ;zero the extruded length again
G1 F10800
;Put printing message on LCD screen
;?IF_EXT0?M109 T0 S?TEMP0?
M117 Printing...

;Layer count: 19
;LAYER:0
M107
G0 F10800 X48.217 Y22.131 Z0.300
;TYPE:SKIRT
G1 F1800 X48.687 Y21.229 E0.01913
G1 X48.936 Y20.744 E0.02939
G1 X49.723 Y19.693 E0.05409
G1 X50.013 Y19.303 E0.06323
G1 X51.064 Y18.293 E0.09065
G1 X51.455 Y17.957 E0.10034
