Ruby: using a regular expression to parse data out of text


I'm using Ruby 2.2 to parse the following text:

[key1: this is a bunch of text that can 
span multiple lines. 
key2: foo 
key2: bar
key3: this can span multiple lines 
as well 
]
into an array of hashes like this:

[
    key1: "this is a bunch of text that can span multiple lines."
    key2: ["foo", "bar"]
    key3: "this can span multiple lines as well"
]
My first goal is to parse out the key/value pairs with a regular expression, and this is where I'm stuck:

/\[((key1|key2|key3): (.+?))+(?=(?:key1:|key2:|key3:|\]))/m
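It doesn't work, because the lookahead I use to find the next key or the closing bracket seems to consume the text it matches. My understanding was that a lookahead does not do that.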
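As a small check of that understanding, the sketch below shows two behaviours that may be relevant here: a lookahead is zero-width, and a capture group that is repeated with `+` keeps only its last iteration. The shortened input `sample` and the keys `k1`/`k2` are hypothetical stand-ins for the real text, so treat this as an illustration rather than a diagnosis of the pattern above.

```ruby
# A lookahead asserts what comes next but is zero-width: it consumes nothing.
m = "ab".match(/a(?=b)/)
consumed  = m[0]          # only the "a" is part of the match
remainder = m.post_match  # the "b" is still available after the match

# A capture group repeated with + keeps only its last iteration.
# Hypothetical shortened input with the same shape as the pattern above.
sample = "[k1: a k2: b ]"
r = sample.match(/\[((k1|k2): (.+?))+(?=(?:k1:|k2:|\]))/m)
whole_match = r[0]  # spans both pairs, up to but not including "]"
last_key    = r[2]  # the key of the final iteration only
last_value  = r[3]  # the value of the final iteration only
```

If the same thing happens with the full pattern, the earlier pairs would be matched but not available through the capture groups, which can look as if the lookahead had swallowed them.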


Any suggestions would be appreciated. Thanks.
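One possible way to get at every pair, sketched here under assumptions, is to drop the repeated group and let `String#scan` apply a single key/value pattern repeatedly. The pattern `key\d+` assumes every key is the word `key` followed by digits, and the approach assumes no value contains text that looks like a key or a `]`. The names `text`, `pairs` and `result` are illustrative only.

```ruby
text = "[key1: this is a bunch of text that can \n" \
       "span multiple lines. \n" \
       "key2: foo \n" \
       "key2: bar\n" \
       "key3: this can span multiple lines \n" \
       "as well \n" \
       "]"

# Each scan hit is one key plus its lazily matched value. The lookahead only
# checks for the next key or the closing bracket and leaves it unconsumed,
# so the next hit can start right there.
pairs = text.scan(/(key\d+): (.+?)(?=key\d+:|\])/m)

result = {}
pairs.each do |key, raw|
  value = raw.split.join(" ")  # collapse line breaks and extra spaces
  if result.key?(key)
    # A repeated key turns the stored value into an array of values.
    result[key] = Array(result[key]) << value
  else
    result[key] = value
  end
end

p result
```

The result here is a single hash keyed by strings; wrapping it in an array, or converting the keys to symbols, would be a small extra step if the exact target shape matters.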

One thing to change is to make the second group of keys non-capturing:
\[((key1|key2|key3):(.+))(?=(?:key1:|key2:|key3:|\]))
No problem! You could also make the outer group non-capturing (unless you need the whole "key: whatever text" match):
\[(?:(key1|key2|key3):(.+))(?=(?:key1:|key2:|key3:|\]))
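Sounds good, but either way, Rubular is a good site for testing regexps.
Would this be of interest?
@小心点, I assumed the closing bracket is on a line of its own. With a slight change:
I like the simplicity of this solution. Thanks.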
data = %Q|[key1: this is a bunch of text that can 
span multiple lines. 
key2: foo 
key2: bar
key3: this can span multiple lines 
as well 
]|

p data[1..-2] #Remove square brackets [...] 
  .split(/(key\d):\s+/)[1..-1] #regexp out keys and values. (And get rid of initial empty string)
  .each_slice(2) #Group into key-value lists
  .group_by(&:shift) # Group by first values
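As far as I can tell, the chain above yields a hash whose values are nested one-element arrays that still contain the original line breaks. A minimal sketch of one way to flatten that into the shape the question asks for follows; the names `grouped` and `result` are illustrative, and the code sticks to methods available in Ruby 2.2 (so no `transform_values`).

```ruby
data = "[key1: this is a bunch of text that can \n" \
       "span multiple lines. \n" \
       "key2: foo \n" \
       "key2: bar\n" \
       "key3: this can span multiple lines \n" \
       "as well \n" \
       "]"

# Same chain as above, kept in a variable instead of printed.
grouped = data[1..-2]
  .split(/(key\d):\s+/)[1..-1]
  .each_slice(2)
  .group_by(&:shift)

result = {}
grouped.each do |key, values|
  # Each value arrives as a one-element array; flatten, then squeeze whitespace.
  cleaned = values.flatten.map { |v| v.split.join(" ") }
  # Keep a bare string for single values and an array for repeated keys.
  result[key] = cleaned.size == 1 ? cleaned.first : cleaned
end

p result
```

Whether a single value should stay a bare string or always be wrapped in an array is a design choice; always using arrays makes the result more uniform for callers.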