Parsing .torrent files with ANTLR

regex parsing antlr

I am trying to parse the metainfo in a .torrent file using ANTLR4.

All data in a metainfo file is bencoded. The bencoding specification:

dictionary = "d" 1*(string anytype) "e" ; non-empty dictionary
list       = "l" 1*anytype "e"          ; non-empty list
integer    = "i" signumber "e"
string     = number ":" <number long sequence of any CHAR>
anytype    = dictionary / list / integer / string
signumber  = "-" number / number
number     = 1*DIGIT
CHAR       = %x00-FF                    ; any 8-bit character
DIGIT      = "0" / "1" / "2" / "3" / "4" /
            "5" / "6" / "7" / "8" / "9"

However, I have run into the following problem. Take this example:

d3:one3:twoe

The parser does not recognize the second string as "two" but as "twoe", so it never finds the end of the dictionary.

Similarly, another example,

d3:onel4:testee

is not recognized, because the first string is read as "onel" rather than "one".

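How can I fix my grammar so that it handles these cases correctly?

Also, never mind that the string rule does not cover %x00-FF; this grammar is a draft and contains other minor errors.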
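The question does not show the lexer rule for strings, so the following is only an illustration under an assumption: that the STRING token is defined as something like a run of letters. The hypothetical pattern `[a-z]+` below stands in for such a rule. It shows why a length-unaware rule swallows the trailing "e" or "l", while taking exactly as many characters as the prefix announces gives the intended result.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyTokenDemo {
    // Hypothetical stand-in for a length-unaware STRING lexer rule (an assumption,
    // the real rule is not shown in the question).
    private static final Pattern STRING = Pattern.compile("[a-z]+");

    /** What a greedy letter-run rule matches, starting right after the colon. */
    static String greedyAfterColon(String input, int colonIndex) {
        Matcher m = STRING.matcher(input);
        m.region(colonIndex + 1, input.length());
        return m.lookingAt() ? m.group() : "";
    }

    /** Exactly `length` characters after the colon, as the bencode prefix dictates. */
    static String countedAfterColon(String input, int colonIndex, int length) {
        return input.substring(colonIndex + 1, colonIndex + 1 + length);
    }

    public static void main(String[] args) {
        String first = "d3:one3:twoe";
        int colon = first.lastIndexOf(':');
        System.out.println(greedyAfterColon(first, colon));     // twoe
        System.out.println(countedAfterColon(first, colon, 3)); // two

        String second = "d3:onel4:testee";
        colon = second.indexOf(':');
        System.out.println(greedyAfterColon(second, colon));     // onel
        System.out.println(countedAfterColon(second, colon, 3)); // one
    }
}
```

The letters that close a dictionary or open a list are ordinary string characters as far as such a pattern is concerned, so only the numeric prefix can tell where a string stops.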

This seems to go beyond context-free languages. To parse it, you need to capture the character count that precedes the string and use it to identify the end of the string. I am not sure whether ANTLR supports parsing this case.

Hi nhahtdh. I tried the following definition of bstring:

bstring
    : INT {chars = Integer.parseInt($INT.text); System.out.println(chars);}
      ':'
      ({count = getCurrentToken().getStopIndex() - getCurrentToken().getStartIndex() + 1;} STRING)
      {if (chars != count) throw new RuntimeException();}
    ;

This example works; the length is greater than the string length. But that is not correct.
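The idea raised in the comments, that the length prefix has to decide how many characters belong to a string, can be sketched without ANTLR as a small hand-written recursive-descent reader. This is not an ANTLR grammar and not the asker's code; the class name `BencodeSketch` and its methods are made up for illustration. It also accepts empty dictionaries and lists, which is more lenient than the specification quoted in the question.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical hand-written reader; every name here is illustrative.
final class BencodeSketch {
    private final byte[] data;
    private int pos = 0;

    private BencodeSketch(byte[] data) { this.data = data; }

    /** Decodes one bencoded value and rejects trailing bytes. */
    static Object decode(byte[] input) {
        BencodeSketch reader = new BencodeSketch(input);
        Object value = reader.readValue();
        if (reader.pos != input.length) {
            throw new IllegalArgumentException("trailing data at offset " + reader.pos);
        }
        return value;
    }

    private Object readValue() {
        char c = peek();
        if (c == 'd') {                       // dictionary: "d" (string anytype)* "e"
            pos++;
            Map<String, Object> dict = new LinkedHashMap<>();
            while (peek() != 'e') {
                String key = readString();
                dict.put(key, readValue());
            }
            pos++;                            // consume the closing "e"
            return dict;
        }
        if (c == 'l') {                       // list: "l" anytype* "e"
            pos++;
            List<Object> list = new ArrayList<>();
            while (peek() != 'e') {
                list.add(readValue());
            }
            pos++;
            return list;
        }
        if (c == 'i') {                       // integer: "i" signumber "e"
            pos++;
            return readNumber('e');
        }
        return readString();                  // anything else must be number ":" bytes
    }

    /** Reads the length prefix, then takes exactly that many bytes. */
    private String readString() {
        long length = readNumber(':');
        if (length < 0 || length > data.length - pos) {
            throw new IllegalArgumentException("bad string length " + length);
        }
        // ISO-8859-1 maps each byte 0x00-0xFF to one char, matching CHAR in the spec.
        String s = new String(data, pos, (int) length, StandardCharsets.ISO_8859_1);
        pos += (int) length;
        return s;
    }

    /** Reads digits (with optional sign) up to the terminator and consumes the terminator. */
    private long readNumber(char terminator) {
        int start = pos;
        while (peek() != terminator) {
            pos++;
        }
        String digits = new String(data, start, pos - start, StandardCharsets.US_ASCII);
        pos++;
        return Long.parseLong(digits);
    }

    private char peek() {
        if (pos >= data.length) {
            throw new IllegalArgumentException("unexpected end of input");
        }
        return (char) (data[pos] & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(decode("d3:one3:twoe".getBytes(StandardCharsets.ISO_8859_1)));    // {one=two}
        System.out.println(decode("d3:onel4:testee".getBytes(StandardCharsets.ISO_8859_1))); // {one=[test]}
    }
}
```

Because `readString` never inspects the string's content, a payload that happens to contain "e", "l", digits, or colons cannot be mistaken for structure, which is the property a purely token-based grammar has trouble expressing.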

d3:one3:twoe