Scala: a parboiled2 parser for extracting tokens and fixed text


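I need to extract tokens and fixed text from a string. For example, the input "Hello {token1} today's date is {token2} would you like to call {token3}"

should return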

```
FixedPart("Hello")
TokenPart(token1)
FixedPart("today's date is")
TokenPart(token2)
FixedPart("would you like to call")
TokenPart(token3)
```

Here is my naive implementation:

```scala
import org.parboiled2.ParserInput
import org.parboiled2.Parser
import org.parboiled2.CharPredicate

sealed trait Part
case class TokenPart(tokenName: String) extends Part
case class FixedPart(text: String) extends Part

class MyParser(val input: ParserInput) extends Parser {
  def Token = rule { '{' ~ capture(TokenName) ~> (TokenPart(_)) ~ '}' }

  // How should this be implemented?
  def NotToken = rule { capture(!Token) ~> (FixedPart(_)) }
  def TokenName = rule { CharPredicate.Alpha ~ oneOrMore(CharPredicate.AlphaNum) }

  // This would not work
  def TokenNotToken = rule { (Token | NotToken) }
  def InputLine = rule { zeroOrMore(TokenNotToken) }
}

object MyParser {
  def main(args: Array[String]): Unit = {
    val res = new MyParser("Hello {token1} today's date is {token2} would you like to call {token3}").InputLine.run() // Success
    println(res)
  }
}
```

Is there any other way to implement this?

Hi, I modified your code and added some comments (I hope they help), so that it works and (I guess) does what you wanted it to do:

```scala
import org.parboiled2.ParserInput
import org.parboiled2.Parser
import org.parboiled2.CharPredicate

sealed trait Token
case class TokenPart(tokenName: String) extends Token
case class StringToken(text: String) extends Token

// I moved the pre-evaluated char predicates to the companion object;
// you may leave them inside the class if you prefer.
// I also moved literals such as startToken and endToken here.
object TokenExtractor {
  val AlphaChar = CharPredicate.Alpha
  val AlphaNum = CharPredicate.AlphaNum

  val startToken = "{"
  val endToken   = "}"
}

class TokenExtractor(val input: ParserInput) extends Parser {
  import TokenExtractor._

  // Maybe you wanted zero or more? In that case the `.*` shortcut
  // plays nicely here. If you want to stick with oneOrMore,
  // you can use AlphaNum.+ instead.
  def TokenName = rule {
    AlphaChar ~ AlphaNum.*
  }

  // There is a shortcut for the extraction syntax. If you are extracting
  // data into a case class and the values produced by the rule match the
  // parameters of the case class's apply method, you can simply give the
  // name of the case class. The extraction operator '~>' should be located
  // at the end of the rule.
  // From the official documentation (https://github.com/sirthias/parboiled2):
  //   One more very useful feature is special support for
  //   case class instance creation:
  //
  //   case class Person(name: String, age: Int)
  //   (foo: Rule2[String, Int]) ~> Person
  def Token = rule {
    startToken ~ capture(TokenName) ~ endToken ~> TokenPart
  }

  // Plain text runs until the parser meets an opening '{' character;
  // a '}' on its own is treated as ordinary text.
  def Text = rule {
    oneOrMore(noneOf(startToken))
  }

  // Here we capture the input matched by a
  // pre-defined rule (in our case, Text).
  def TextString = rule {
    capture(Text) ~> StringToken
  }

  def TextPart = rule {
    TextString | Token
  }

  // EOI is mandatory: it tells the parser that the whole input must be
  // consumed. Without it, zeroOrMore can succeed after matching only a
  // prefix of the input.
  def InputLine = rule {
    zeroOrMore(TextPart) ~ EOI
  }
}

object Main {
  def main(args: Array[String]): Unit = {
    val example =
      "Hello {token1} today's date is {token2} would you like to call {token3}"

    // The parser input can be a plain String, so pass it to the constructor.
    val result = new TokenExtractor(example).InputLine.run()
    println(result)
  }
}
```
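Since `run()` returns a `Try` by default, printing it directly hides the details of a failed parse. The sketch below shows one way to unpack the result; it restates the grammar above in condensed form so that it is self-contained, and the `Demo` object and `describe` helper are hypothetical names introduced only for illustration.

```scala
import org.parboiled2._
import scala.util.{Failure, Success}

sealed trait Token
case class TokenPart(tokenName: String) extends Token
case class StringToken(text: String) extends Token

object TokenExtractor {
  // Predicates are created once, outside the rule bodies.
  val AlphaChar = CharPredicate.Alpha
  val AlphaNum  = CharPredicate.AlphaNum
}

// Condensed restatement of the grammar above.
class TokenExtractor(val input: ParserInput) extends Parser {
  import TokenExtractor._

  def TokenName  = rule { AlphaChar ~ zeroOrMore(AlphaNum) }
  def Token      = rule { '{' ~ capture(TokenName) ~ '}' ~> TokenPart }
  def TextString = rule { capture(oneOrMore(noneOf("{"))) ~> StringToken }
  def TextPart   = rule { TextString | Token }
  def InputLine  = rule { zeroOrMore(TextPart) ~ EOI }
}

// Hypothetical helper: turns the Try returned by run() into a readable string.
object Demo {
  def describe(text: String): String = {
    val parser = new TokenExtractor(text)
    parser.InputLine.run() match {
      case Success(parts)         => parts.mkString(", ")
      // ParseError carries the position; formatError renders it for humans.
      case Failure(e: ParseError) => "Parse error: " + parser.formatError(e)
      case Failure(e)             => "Unexpected error: " + e.getMessage
    }
  }
}
```

With this helper, an input with an unclosed brace such as `"Hi {a"` should end up in the `ParseError` branch, because `EOI` cannot match while unparsed input remains.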

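Please do not call `CharPredicate.NAME` inside a rule. Create a variable and assign the predicate to it instead. In your code, `CharPredicate.NAME` is evaluated every time the parser encounters the rule, which degrades performance.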
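As a minimal sketch of that advice, the parser below builds its combined predicates once in the companion object and only refers to them from the rule. It assumes, as the note above suggests, that an expression written inside a rule body is evaluated again whenever the rule runs. `IdentParser`, `IdentStart`, `IdentChar` and `Ident` are hypothetical names, not part of the original code.

```scala
import org.parboiled2._

object IdentParser {
  // Built once, when the companion object is initialised.
  val IdentStart: CharPredicate = CharPredicate.Alpha ++ CharPredicate('_')
  val IdentChar: CharPredicate  = CharPredicate.AlphaNum ++ CharPredicate('_')
}

class IdentParser(val input: ParserInput) extends Parser {
  import IdentParser._

  // The rule body only references the prebuilt vals; no `++` happens here.
  def Ident = rule { capture(IdentStart ~ zeroOrMore(IdentChar)) ~ EOI }

  // Variant to avoid: writing `CharPredicate.Alpha ++ CharPredicate('_')`
  // directly inside the rule body would place the `++` call in the rule itself.
}
```

The same pattern applies to string literals and other constants that a rule needs: defining them as `val`s in the companion keeps the rule bodies limited to matching.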