Parsing data out of text with a Ruby regular expression

ruby regex

I'm using Ruby 2.2 to parse the following text:

[key1: this is a bunch of text that can 
span multiple lines. 
key2: foo 
key2: bar
key3: this can span multiple lines 
as well 
]

into an array of hashes that looks like this:

[
    key1: "this is a bunch of text that can span multiple lines."
    key2: ["foo", "bar"]
    key3: "this can span multiple lines as well"
]

My first goal is to parse out the key/value pairs with a regular expression, and this is where I'm stuck:

/\[((key1|key2|key3): (.+?))+(?=(?:key1:|key2:|key3:|\]))/m

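It doesn't work, because the lookahead I'm using to find the next key or the closing bracket seems to be matching (consuming) text. My understanding is that a lookahead isn't supposed to do that.

Any suggestions would be appreciated. Thanks.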

One thing to change is to make the second group of keys (the one inside the lookahead) non-capturing:

\[((key1|key2|key3): (.+?))+(?=(?:key1:|key2:|key3:|\]))

No problem! You could also make the outer group non-capturing (unless you need the whole "key: whatever text" match):

\[(?:(key1|key2|key3): (.+?))+(?=(?:key1:|key2:|key3:|\]))
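
For what it's worth, a lookahead is zero-width, so it is probably not what consumes the text. In the pattern from the question, the repeated group `(...)+` appears to be the culprit: each repetition swallows the next pair, and a repeated capture group only keeps its last iteration. A minimal sketch, assuming the three key names from the question and that no value contains something that looks like a key, matches one pair at a time with `String#scan` instead (the constant name `PAIR` is made up for illustration):

```ruby
# Sample input copied from the question.
text = <<TEXT
[key1: this is a bunch of text that can
span multiple lines.
key2: foo
key2: bar
key3: this can span multiple lines
as well
]
TEXT

# One key/value pair per match: a key, a colon and a space, then as little
# text as possible up to (but not including) the next key or the closing
# bracket. The lookahead only checks what follows; it consumes nothing.
PAIR = /(key1|key2|key3): (.+?)(?=key1:|key2:|key3:|\])/m

# scan returns one [key, value] array per match.
pairs = text.scan(PAIR)

pairs.each { |key, value| puts "#{key} => #{value.strip.inspect}" }
```

Because there is no quantifier around the pair, every key and value ends up in its own match rather than being overwritten by a later repetition.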

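Sounds good. Either way, Rubular is a great site for testing regexps. :)
Is this of interest to you?
@小心点, I'm assuming the closing bracket is on its own line.
With one small change, though: I like the simplicity of this solution. Thanks.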

data = %Q|[key1: this is a bunch of text that can 
span multiple lines. 
key2: foo 
key2: bar
key3: this can span multiple lines 
as well 
]|

p data[1..-2]                    # Remove the surrounding square brackets [...]
  .split(/(key\d):\s+/)[1..-1]   # Split into keys and values; drop the initial empty string
  .each_slice(2)                 # Group into [key, value] pairs
  .group_by(&:shift)             # Group by key (shift removes the key from each pair)
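
The grouped result above still carries the original line breaks and wraps every value in its own array. A possible follow-up, sketched under the assumption that string keys are acceptable and that a key seen more than once should collect its values in an array (the variable names `pairs`, `result` and `clean` are only illustrative):

```ruby
# Same sample input as in the question.
data = %Q|[key1: this is a bunch of text that can
span multiple lines.
key2: foo
key2: bar
key3: this can span multiple lines
as well
]|

# Strip the brackets, split on the keys (the capture group keeps them in the
# result), drop the leading empty string, and pair each key with its value.
pairs = data[1..-2].split(/(key\d):\s+/)[1..-1].each_slice(2)

result = {}
pairs.each do |key, value|
  clean = value.split.join(" ") # collapse newlines and repeated spaces
  if result.key?(key)
    # Second or later occurrence: turn the stored value into an array.
    result[key] = Array(result[key]) << clean
  else
    result[key] = clean
  end
end

p result
```

Keeping single values as plain strings and repeated ones as arrays mirrors the shape asked for in the question; always storing arrays would be simpler to consume if that mixed shape is not required.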