Scala: a parboiled2 parser for extracting tokens and fixed text


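I need to extract tokens and fixed text. Example: "Hello {token1} today's date is {token2} would you like to call {token3}"

should return

  • FixedPart("Hello")
  • TokenPart(token1)
  • FixedPart("today's date is")
  • TokenPart(token2)
  • FixedPart("would you like to call")
  • TokenPart(token3)
Here is a naive implementation: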

import org.parboiled2.ParserInput
import org.parboiled2.Parser
import org.parboiled2.CharPredicate
sealed trait Part 
case class TokenPart(tokenName : String ) extends Part
case class FixedPart( text : String ) extends Part 
class MyParser(val input: ParserInput) extends Parser {
  def Token = rule { '{' ~ capture(TokenName) ~>  (TokenPart(_)) ~'}'     }

  //how this should be implemented?? 
  def NotToken = rule { capture (!Token) ~>(FixedPart(_) )} 
  def TokenName = rule { CharPredicate.Alpha ~ oneOrMore (CharPredicate.AlphaNum) }

  // This would not work 
  def TokenNotToken = rule { (Token|NotToken)  }
  def InputLine = rule { zeroOrMore (TokenNotToken) }

}
object MyParser {
  def main(args: Array[String]) {
    val res = new MyParser("Hello {token1} today's date is {token2} would you like to call {token3}").InputLine.run() // Success
    println( res )     
  }
}

Is there any other way to implement this?

Hi, I modified your code and added some comments (I hope they help), so it now works and, I believe, does what you wanted:

import org.parboiled2.ParserInput
import org.parboiled2.Parser
import org.parboiled2.CharPredicate

sealed trait Token
case class TokenPart(tokenName : String) extends Token
case class StringToken(text: String) extends Token

// I moved pre-evaluated char predicates to the companion
// you may leave them inside the class if you want.
// I also moved literals like startToken and endToken here
object TokenExtractor {
  val AlphaChar = CharPredicate.Alpha
  val AlphaNum = CharPredicate.AlphaNum

  val startToken = "{"
  val endToken   = "}"
}


class TokenExtractor(val input: ParserInput) extends Parser {
  import TokenExtractor._

  // Maybe you wanted zero or more? Either way, the postfix
  // shortcut works nicely here. If you want to stick with
  // oneOrMore, you can use AlphaNum.+ instead.
  def TokenName = rule {
    AlphaChar ~ AlphaNum.*
  }

  // There is a shortcut for the extraction syntax. If you are extracting
  // data into a case class and the values produced by the rule match the
  // parameters of the case class's apply method, you can simply give
  // the name of the case class. Here the extraction operator '~>' is
  // placed at the end of the rule.
  // From the official documentation:
  // https://github.com/sirthias/parboiled2
  // One more very useful feature is special support for
  // case class instance creation:
  //
  // case class Person(name: String, age: Int)
  // (foo: Rule2[String, Int]) ~> Person
  //
  def Token = rule {
    startToken ~ capture(TokenName) ~ endToken ~> TokenPart
  }

  // Text runs until the parser meets the opening '{' character.
  // A closing '}' is not treated specially here :)
  def Text = rule {
    oneOrMore(noneOf(startToken))
  }

  // Here we capture the input matched by a
  // previously defined rule (in our case Text)
  def TextString = rule {
    capture(Text) ~> StringToken
  }


  def TextPart = rule {
    TextString | Token
  }


  // EOI is mandatory. Without it the rule can succeed after matching
  // only a prefix of the input; EOI requires the parser to consume
  // everything up to the end of the input.
  def InputLine = rule {
    zeroOrMore(TextPart) ~ EOI
  }
}


object Main {
  def main(args: Array[String]): Unit = {
    val example =
      "Hello {token1} today's date is {token2} would you like to call {token3}"

    // the parser input can be a String, so pass it straight to the constructor
    val result = new TokenExtractor(example).InputLine.run()
    println(result)
  }
}

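Please do not call CharPredicate.NAME inside a rule. Create a variable and assign the predicate to it. In your code, CharPredicate.NAME is evaluated every time the parser runs the rule, which hurts performance.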
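On the original question of whether there is another way to do this: for input this simple, a regular expression from the Scala standard library may be enough. The following is only a sketch under assumptions, not part of the answer above. `RegexSplitter` and `split` are hypothetical names, the token pattern mirrors the TokenName rule (an ASCII letter followed by ASCII letters or digits), and, unlike the parser, malformed braces are kept as fixed text instead of causing a failure.

```scala
import scala.util.matching.Regex

sealed trait Part
final case class TokenPart(tokenName: String) extends Part
final case class FixedPart(text: String) extends Part

// Hypothetical helper, not part of parboiled2.
object RegexSplitter {
  // '{', then a letter followed by letters or digits, then '}'.
  // Group 1 holds the token name without the braces.
  private val TokenPattern: Regex = """\{([A-Za-z][A-Za-z0-9]*)\}""".r

  def split(input: String): List[Part] = {
    val parts = List.newBuilder[Part]
    var last = 0 // index just past the previous token match
    for (m <- TokenPattern.findAllMatchIn(input)) {
      // Text between the previous match and this one is a fixed part.
      if (m.start > last) parts += FixedPart(input.substring(last, m.start))
      parts += TokenPart(m.group(1))
      last = m.end
    }
    // Trailing text after the last token, if any.
    if (last < input.length) parts += FixedPart(input.substring(last))
    parts.result()
  }
}
```

Calling `RegexSplitter.split` on the example sentence should give alternating FixedPart and TokenPart values in input order, with the surrounding spaces preserved in the fixed parts. If you need proper error reporting for malformed input, the parboiled2 version is the better fit.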