Regex: Help with a Scala regular expression on a UCI dataset


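Hi everyone, I am trying to parse some data with a Scala regular expression.

Here is the text I am trying to process:

But I get the following error:

scala.MatchError: [Ljava.lang.String;@62f8fff1 (of class [Ljava.lang.String;)

However, the pattern seems to work in an online regex builder: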


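Any ideas?

I have never programmed in Scala before, but from what I have seen, you have to escape twice, for example for digits.


So \d becomes \\d in Scala, and so on.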
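Whether the backslash has to be doubled depends on the kind of string literal. The following is a minimal sketch of the difference; the object and value names (`EscapeDemo`, `fromPlain`, `fromRaw`, `doubledInRaw`) are made up for illustration and do not appear in the original thread.

```scala
object EscapeDemo {
  // Ordinary string literal: the compiler consumes one level of backslashes,
  // so "\\d+" reaches the regex engine as \d+.
  val fromPlain = "\\d+".r

  // Triple-quoted literal: backslashes are passed through untouched,
  // so a single backslash is enough.
  val fromRaw = """\d+""".r

  // Doubling the backslash inside triple quotes produces the regex \\d+,
  // which looks for a literal backslash followed by one or more 'd' characters.
  val doubledInRaw = """\\d+""".r

  def main(args: Array[String]): Unit = {
    println(fromPlain.findFirstIn("lines: 15"))    // Some(15)
    println(fromRaw.findFirstIn("lines: 15"))      // Some(15)
    println(doubledInRaw.findFirstIn("lines: 15")) // None
  }
}
```

So the "escape twice" rule holds for ordinary double-quoted strings, while a pattern written between triple quotes should keep single backslashes.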

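Thanks! I tried your suggestion and used [\\s]+\\lines:\\d*[\\s]*\\n\\n[\\s\\s]*.r but that does not seem to work either. I believe the error is in \lines.

Are you sure you want to escape that?

I tried it, but it still seems to fail: val docParser=[\\s]+lines:\\d*[\\s]*\\n\\n[\\s\\s]*.r scala.MatchError: [Ljava.lang.String;@3d650371 (of class [Ljava.lang.String;)

Please do not escape twice and try again; the triple quotes probably denote a literal string in Scala. Sorry that I cannot test it; I am just giving you some rough hints: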
val inputData = """xref: cantaloupe.srv.cs.cmu.edu alt.atheism:51121 soc.motss:139944 rec.scouting:5318
newsgroups: alt.atheism,soc.motss,rec.scouting
path: cantaloupe.srv.cs.cmu.edu!crabapple.srv.cs.cmu.edu!fs7.ece.cmu.edu!europa.eng.gtefsd.com!howland.reston.ans.net!wupost!uunet!newsgate.watson.ibm.com!yktnews.watson.ibm.com!watson!watson.ibm.com!strom
from: strom@watson.ibm.com (rob strom)
subject: re: [soc.motss, et al.] "princeton axes matching funds for boy scouts"
sender: @watson.ibm.com
message-id: <1993apr05.180116.43346@watson.ibm.com>
date: mon, 05 apr 93 18:01:16 gmt
distribution: usa
references: <c47efs.3q47@austin.ibm.com> <1993mar22.033150.17345@cbnewsl.cb.att.com> <n4hy.93apr5120934@harder.ccr-p.ida.org>
organization: ibm research
lines: 15

in article <n4hy.93apr5120934@harder.ccr-p.ida.org>, n4hy@harder.ccr-p.ida.org (bob mcgwier) writes:

|> [1] however, i hate economic terrorism and political correctness
|> worse than i hate this policy.  


|> [2] a more effective approach is to stop donating
|> to any organizating that directly or indirectly supports gay rights issues
|> until they end the boycott on funding of scouts.  

can somebody reconcile the apparent contradiction between [1] and [2]?

-- 
rob strom, strom@watson.ibm.com, (914) 784-7641
ibm research, 30 saw mill river road, p.o. box 704, yorktown heights, ny  10598"""
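// Inside triple quotes backslashes are not escape characters, so write \s, \d and \n once.
val docParser = """([\s\S]+lines: \d*)([\s\S]*\n\n)([\s\S]*)""".r
// Throws scala.MatchError if inputData does not match the whole pattern.
val docParser(metadata, content, footer) = inputData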
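
The extractor pattern on the last line throws scala.MatchError whenever the value does not match. Note also that `[Ljava.lang.String;` in the error message is the JVM's name for an array of strings, which suggests (this is an assumption, since the thread does not show how the data was loaded) that the value being matched was an `Array[String]` rather than a single `String`. A minimal sketch that covers both cases, using single backslashes inside triple quotes, could look like this; `DocParserSketch`, `parse`, and the sample text are hypothetical names and data introduced only for illustration.

```scala
object DocParserSketch {
  // Header up to "lines: N", then the body up to the last blank line, then the footer.
  val docParser = """([\s\S]+lines: \d*)([\s\S]*\n\n)([\s\S]*)""".r

  // Hypothetical helper: returns None instead of throwing scala.MatchError.
  def parse(doc: String): Option[(String, String, String)] = doc match {
    case docParser(metadata, content, footer) => Some((metadata, content, footer))
    case _                                    => None
  }

  def main(args: Array[String]): Unit = {
    // Small made-up document with the same overall layout as the post above.
    val sample = "from: someone\nlines: 2\n\nbody line\n\n-- \nsignature"

    // If the documents arrive as an Array[String], parse each element separately.
    val docs: Array[String] = Array(sample, "no header here")
    docs.map(parse).foreach {
      case Some((metadata, content, footer)) =>
        println(s"metadata=[$metadata] content=[$content] footer=[$footer]")
      case None =>
        println("no match")
    }
  }
}
```

Returning an `Option` keeps a single malformed document from aborting the whole run, which matters when iterating over many files of a dataset.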