ArnoldC lexer in Ruby
I am trying to write a simple lexer for ArnoldC in Ruby. I would like a method to be definable as:
LISTEN TO ME VERY CAREFULLY mymethod
TALK TO THE HAND "Hello, world!"
HASTA LA VISTA, BABY
I have the following code:
class ArnoldLexer
  KEYWORDS = ["LISTEN TO ME VERY CAREFULLY", "TALK TO THE HAND", "HASTA LA VISTA, BABY"]

  def tokenize(code)
    # Clean up the code by removing the trailing line break
    # (non-destructive, so frozen strings are accepted too).
    code = code.chomp
    # Current character position
    i = 0
    # Collection of all parsed tokens in the form [:TOKEN_TYPE, value]
    tokens = []
    # Implement a very simple scanner.
    # Scan one character at a time until there is something to parse.
    while i < code.size
      chunk = code[i..-1]
      # Keywords are tagged with their own name, so "TALK TO THE HAND"
      # results in a [:"TALK TO THE HAND", "TALK TO THE HAND"] token.
      if keyword = KEYWORDS.find { |k| chunk.start_with?(k) }
        tokens << [keyword.to_sym, keyword]
        # Skip what was just parsed.
        i += keyword.size
      # Use + rather than * so an empty match cannot stall the loop.
      elsif identifier = chunk[/\A([a-z]+)/, 1]
        tokens << [:IDENTIFIER, identifier]
        i += identifier.size
      # Matching class names and constants starting with a capital letter.
      elsif constant = chunk[/\A([A-Z]\w*)/, 1]
        tokens << [:CONSTANT, constant]
        i += constant.size
      elsif chunk[/\A\n/]
        tokens << [:NEWLINE, "\n"]
        i += 1
      elsif number = chunk[/\A([0-9]+)/, 1]
        tokens << [:NUMBER, number.to_i]
        i += number.size
      elsif string = chunk[/\A"(.*?)"/, 1]
        tokens << [:STRING, string]
        # Skip the string plus its two surrounding quotes.
        i += string.size + 2
      else
        # Skip spaces and any character no rule matches,
        # so the scanner always advances.
        i += 1
      end
    end
    tokens
  end
end
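As a point of comparison, the same scanning loop can be sketched with `StringScanner` from Ruby's standard `strscan` library. This is a different technique from the manual index arithmetic above, not the original code. The names `ARNOLD_KEYWORDS` and `scan_arnold` are hypothetical, and only the three keywords from the question are covered. The scanner tracks the position itself, and the keyword rule is tried before the identifier rule so multi-word keywords are not split.

```ruby
require "strscan"

# Hypothetical keyword list: only the three keywords used in the question.
ARNOLD_KEYWORDS = [
  "LISTEN TO ME VERY CAREFULLY",
  "TALK TO THE HAND",
  "HASTA LA VISTA, BABY"
].freeze

# Hypothetical helper: returns tokens in the form [:TOKEN_TYPE, value].
def scan_arnold(code)
  scanner = StringScanner.new(code)
  tokens = []
  # Try longer keywords first; Regexp.union escapes the literal strings.
  keyword_pattern = Regexp.union(ARNOLD_KEYWORDS.sort_by { |k| -k.size })

  until scanner.eos?
    if scanner.skip(/[ \t]+/)
      next # spaces and tabs only separate tokens
    elsif (keyword = scanner.scan(keyword_pattern))
      tokens << [keyword.to_sym, keyword]
    elsif scanner.scan(/"(.*?)"/)
      tokens << [:STRING, scanner[1]] # capture group 1 excludes the quotes
    elsif (number = scanner.scan(/\d+/))
      tokens << [:NUMBER, number.to_i]
    elsif (identifier = scanner.scan(/[a-z]\w*/))
      tokens << [:IDENTIFIER, identifier]
    elsif scanner.scan(/\n/)
      tokens << [:NEWLINE, "\n"]
    else
      scanner.getch # skip one unrecognized character so the loop advances
    end
  end
  tokens
end
```

Calling `scan_arnold` on the three-line program from the question should yield a keyword token, an identifier, a newline, a keyword, a string, a newline, and a final keyword token. Every branch either consumes input or falls through to `getch`, so the loop cannot stall on unexpected characters.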