Why is the comma "," being counted in a [.]-type expression in an ANTLR lexer?


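I am writing a grammar for bash scripts and have run into a problem tokenizing the ',' symbol. My grammar tokenizes it as BLOB, whereas I want it tokenized as OTHER.

However, if I set BLOB to
[a-zA-Z0-9@!$^%*&+.-]+
then it is tokenized as OTHER.

I don't understand why this happens. In the former case, the characters ':' and '/' are tokenized as OTHER, so I don't see why ',' should be tokenized as BLOB.

The input I am tokenizing is:
wget -o --quiet https,://www.google.com

The output I get with the grammar mentioned above: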

[@0,0:3='wget',<'wget'>,1:0]
[@1,4:4=' ',<OTHER>,1:4]
[@2,5:5='-',<BLOB>,1:5]
[@3,6:6='o',<BLOB>,1:6]
[@4,7:7=' ',<OTHER>,1:7]
[@5,8:8='-',<BLOB>,1:8]
[@6,9:9='-',<BLOB>,1:9]
[@7,10:10='q',<BLOB>,1:10]
[@8,11:11='u',<BLOB>,1:11]
[@9,12:12='i',<BLOB>,1:12]
[@10,13:13='e',<BLOB>,1:13]
[@11,14:14='t',<BLOB>,1:14]
[@12,15:15=' ',<OTHER>,1:15]
[@13,16:16='h',<BLOB>,1:16]
[@14,17:17='t',<BLOB>,1:17]
[@15,18:18='t',<BLOB>,1:18]
[@16,19:19='p',<BLOB>,1:19]
[@17,20:20='s',<BLOB>,1:20]
[@18,21:21=',',<BLOB>,1:21]
[@19,22:22=':',<OTHER>,1:22]
[@20,23:23='/',<OTHER>,1:23]
[@21,24:24='/',<OTHER>,1:24]
[@22,25:25='w',<BLOB>,1:25]
[@23,26:26='w',<BLOB>,1:26]
[@24,27:27='w',<BLOB>,1:27]
[@25,28:28='.',<BLOB>,1:28]
[@26,29:29='g',<BLOB>,1:29]
[@27,30:30='o',<BLOB>,1:30]
[@28,31:31='o',<BLOB>,1:31]
[@29,32:32='g',<BLOB>,1:32]
[@30,33:33='l',<BLOB>,1:33]
[@31,34:34='e',<BLOB>,1:34]
[@32,35:35='.',<BLOB>,1:35]
[@33,36:36='c',<BLOB>,1:36]
[@34,37:37='o',<BLOB>,1:37]
[@35,38:38='m',<BLOB>,1:38]
[@36,39:39='\n',<'
'>,1:39]
[@37,40:39='<EOF>',<EOF>,2:0]
line 1:4 extraneous input ' ' expecting BLOB
line 1:7 extraneous input ' ' expecting {<EOF>, '
', BLOB}
line 1:15 extraneous input ' ' expecting {<EOF>, '
', BLOB}
line 1:22 extraneous input ':' expecting {<EOF>, '
', BLOB}

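As mentioned in the comments, the '-' in '+-.' inside the character class is interpreted as the range operator, and ',' falls within that range. Escape it like this:
[a-zA-Z0-9@!$^%*&+\-.]+?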
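
The effect of the escape can be checked outside ANTLR. The following is a minimal sketch using Python's re module, which treats an unescaped '-' between two characters in a character class as a range operator in the same way. It is an analogy rather than ANTLR itself, and the helper name class_matches is made up for illustration.

```python
import re

def class_matches(char_class: str, ch: str) -> bool:
    """Return True if the single character ch is matched by the character class."""
    return re.fullmatch(char_class, ch) is not None

unescaped = r"[a-zA-Z0-9@!$^%*&+-.]"  # '+-.' forms a range from '+' to '.'
escaped = r"[a-zA-Z0-9@!$^%*&+\-.]"   # '\-' is a literal minus
trailing = r"[a-zA-Z0-9@!$^%*&+.-]"   # a '-' at the very end is also literal

# For each class, report whether it accepts ',' and whether it accepts '-'.
for name, pattern in [("unescaped", unescaped), ("escaped", escaped), ("trailing", trailing)]:
    print(name, class_matches(pattern, ","), class_matches(pattern, "-"))
```

Only the unescaped class accepts the comma. All three still accept the minus sign.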


Also, a trailing […]+? at the end of a lexer rule always matches a single character. So
[a-zA-Z0-9@$^%*&+\-.]+?
can just as well be written as
[a-zA-Z0-9@$^%*&+\-.]
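
This is also why every BLOB token in the question's output is one character long: a non-greedy quantifier with nothing after it stops as soon as it can. Python's re module behaves the same way in this respect, so a rough illustration (again an analogy, not ANTLR code) might look like this:

```python
import re

text = "quiet"

# Non-greedy with nothing following it: stops after the first character.
lazy = re.match(r"[a-z]+?", text)

# Greedy: consumes the whole run of matching characters.
greedy = re.match(r"[a-z]+", text)

print(lazy.group())    # q
print(greedy.group())  # quiet
```

If multi-character tokens are wanted, the greedy form [...]+ is presumably the better fit for the rule.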

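In a regular expression, [a-d] is a range of four characters, whereas [ad-] makes the minus just a minus, so it matches those three characters. In your case, [+-.] is a range of the following four characters: '+' (ASCII code 43), ',' (ASCII code 44), '-' (ASCII code 45) and '.' (ASCII code 46), which is why the comma is accepted. You can tell what a range will accept by looking at an ASCII chart.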
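
As an alternative to reading a chart, the contents of a range can be listed programmatically. A small sketch, where expand_range is a hypothetical helper written for this illustration:

```python
def expand_range(first: str, last: str) -> list[tuple[str, int]]:
    """List every character from first to last inclusive, with its code point."""
    return [(chr(cp), cp) for cp in range(ord(first), ord(last) + 1)]

# The range written as '+-.' inside a character class runs from '+' to '.'.
for ch, cp in expand_range("+", "."):
    print(repr(ch), cp)
```

This prints the four characters '+', ',', '-' and '.' with code points 43 to 46, matching the list above.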