Scala: a parboiled2 parser for extracting tokens and fixed text
I need to extract tokens and fixed text. For example, the input

"Hello {token1} today's date is {token2} would you like to call {token3}"

should return:

FixedPart("Hello")
TokenPart(token1)
FixedPart("today's date is")
TokenPart(token2)
FixedPart("would you like to call")
TokenPart(token3)

Here is my naive implementation:

import org.parboiled2.ParserInput
import org.parboiled2.Parser
import org.parboiled2.CharPredicate

sealed trait Part
case class TokenPart(tokenName: String) extends Part
case class FixedPart(text: String) extends Part

class MyParser(val input: ParserInput) extends Parser {
  def Token = rule { '{' ~ capture(TokenName) ~> (TokenPart(_)) ~ '}' }
  // How should this be implemented?
  def NotToken = rule { capture(!Token) ~> (FixedPart(_)) }
  def TokenName = rule { CharPredicate.Alpha ~ oneOrMore(CharPredicate.AlphaNum) }
  // This does not work
  def TokenNotToken = rule { (Token | NotToken) }
  def InputLine = rule { zeroOrMore(TokenNotToken) }
}

object MyParser {
  def main(args: Array[String]): Unit = {
    val res = new MyParser("Hello {token1} today's date is {token2} would you like to call {token3}").InputLine.run() // Success
    println(res)
  }
}

Is there any other way to implement this?

Hi, I modified your code and added some comments (I hope they help). It now works and, I believe, does what you wanted:

import org.parboiled2.ParserInput
import org.parboiled2.Parser
import org.parboiled2.CharPredicate

sealed trait Token
case class TokenPart(tokenName: String) extends Token
case class StringToken(text: String) extends Token

// I moved the pre-evaluated char predicates to the companion object.
// You may leave them inside the class if you want.
// I also moved literals like startToken and endToken here.
object TokenExtractor {
  val AlphaChar = CharPredicate.Alpha
  val AlphaNum = CharPredicate.AlphaNum
  val startToken = "{"
  val endToken = "}"
}

class TokenExtractor(val input: ParserInput) extends Parser {
  import TokenExtractor._

  // Maybe you wanted zero or more? Either way, the shortcut syntax
  // works nicely here. If you want to stick with oneOrMore,
  // you can use AlphaNum.+ instead.
  def TokenName = rule {
    AlphaChar ~ AlphaNum.*
  }

  // There is a shortcut for the extraction syntax. If you are extracting
  // data into a case class and the values produced by the rule match the
  // parameters of the case class's apply method, you can simply give the
  // name of the case class. Here the extraction operator '~>' is placed
  // at the end of the rule.
  //
  // From the official documentation (https://github.com/sirthias/parboiled2):
  //
  //   One more very useful feature is special support for
  //   case class instance creation:
  //
  //   case class Person(name: String, age: Int)
  //   (foo: Rule2[String, Int]) ~> Person
  //
  def Token = rule {
    startToken ~ capture(TokenName) ~ endToken ~> TokenPart
  }

  // Text runs until the parser meets an opening '{' character.
  // A closing '}' is not excluded, so it may appear in plain text.
  def Text = rule {
    oneOrMore(noneOf(startToken))
  }

  // Here we capture the input matched by a
  // previously defined rule (in our case Text).
  def TextString = rule {
    capture(Text) ~> StringToken
  }

  def TextPart = rule {
    TextString | Token
  }

  // EOI is mandatory: it tells the parser that the whole input must
  // be consumed, so add it at the end of the top-level rule.
  def InputLine = rule {
    zeroOrMore(TextPart) ~ EOI
  }
}

object Main {
  def main(args: Array[String]): Unit = {
    val example =
      "Hello {token1} today's date is {token2} would you like to call {token3}"
    // A String is implicitly converted to ParserInput,
    // so it can be passed straight to the constructor.
    val result = new TokenExtractor(example).InputLine.run()
    println(result)
  }
}

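With the default delivery scheme, `run()` returns a `scala.util.Try`, so the `println` above prints either a `Success(...)` wrapping the parsed parts or a `Failure(...)`. The following is only a sketch of how the result could be unpacked, under the assumption that a readable error message is wanted. The helper `parseLine` and the object `ResultHandling` are hypothetical names introduced for illustration, and the parser is a condensed copy of the one above so that the block compiles on its own.

```scala
import org.parboiled2._
import scala.util.{Failure, Success}

sealed trait Token
case class TokenPart(tokenName: String) extends Token
case class StringToken(text: String) extends Token

object TokenExtractor {
  // Predicates are created once, outside the rules.
  val AlphaChar = CharPredicate.Alpha
  val AlphaNum  = CharPredicate.AlphaNum
}

// Condensed version of the parser shown above.
class TokenExtractor(val input: ParserInput) extends Parser {
  import TokenExtractor._

  def TokenName = rule { AlphaChar ~ zeroOrMore(AlphaNum) }
  def Token     = rule { "{" ~ capture(TokenName) ~ "}" ~> TokenPart }
  def Text      = rule { capture(oneOrMore(noneOf("{"))) ~> StringToken }
  def InputLine = rule { zeroOrMore(Text | Token) ~ EOI }
}

object ResultHandling {
  // Hypothetical helper: Right(parts) on success, Left(message) on failure.
  def parseLine(line: String): Either[String, Seq[Token]] = {
    val parser = new TokenExtractor(line)
    parser.InputLine.run() match {
      case Success(parts)         => Right(parts)
      // formatError renders the error position and the expected input.
      case Failure(e: ParseError) => Left(parser.formatError(e))
      case Failure(other)         => Left(other.toString)
    }
  }
}
```

Calling `ResultHandling.parseLine("Hello {token1")` would yield a `Left`, because the unclosed brace prevents the parser from reaching `EOI`.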
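Please don't call CharPredicate.NAME inside a rule. Create a value and assign the predicate to it instead. In your code, CharPredicate.NAME is evaluated every time the parser runs the rule, which hurts performance.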
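For the "any other way" part of the question, one option that stays closer to the original `NotToken` idea is a negative lookahead followed by `ANY`. `!Token` on its own consumes no input, which is why capturing it yields nothing useful; pairing it with `ANY` consumes one character at every position where no token starts. The sketch below is an assumption-laden variant, not the answer's method: the class name `LookaheadParser` and the predicate names are made up for illustration, and it reuses the `Part` types from the question.

```scala
import org.parboiled2._

sealed trait Part
case class TokenPart(tokenName: String) extends Part
case class FixedPart(text: String) extends Part

object LookaheadParser {
  // Predicates are created once here rather than inside the rules.
  val NameStart = CharPredicate.Alpha
  val NameRest  = CharPredicate.AlphaNum
}

class LookaheadParser(val input: ParserInput) extends Parser {
  import LookaheadParser._

  def TokenName = rule { NameStart ~ zeroOrMore(NameRest) }

  def Token = rule { '{' ~ capture(TokenName) ~ '}' ~> TokenPart }

  // !Token succeeds only where no token starts and consumes nothing;
  // ANY then consumes exactly one character. Repeating the pair
  // collects everything up to the next token or the end of input.
  def NotToken = rule { capture(oneOrMore(!Token ~ ANY)) ~> FixedPart }

  def InputLine = rule { zeroOrMore(Token | NotToken) ~ EOI }
}
```

Unlike the `noneOf("{")` version, this variant treats a `{` that does not open a well-formed token as ordinary text. The trade-off is that `Token` is retried at every character of fixed text, so it does more work per character.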