Ruby: split text at single quotes that are not part of a word


I want a regular expression that extracts text surrounded by single quotes into an array. For example, this regex extracts the text between parentheses:

string = "(Well!) thought Alice to herself, (after such a fall as this, I shall think nothing of tumbling down stairs! How brave they'll all think me at home! Why, I wouldn't say anything about it, even if I fell off the top of the house!)"
string.scan(/\((?>[^\(\)\\]+|\\{2}|\\.)*\)/)
# => ["(Well!)", "(after such a fall as this, I shall think nothing of tumbling down stairs! How brave they'll all think me at home! Why, I wouldn't say anything about it, even if I fell off the top of the house!)"] 
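I want to do the same thing with single quotes. I need to ignore any single quote that has a character in the range a-z or A-Z immediately before and after it, i.e. when it is part of a contraction rather than being used as a quotation mark. The desired result: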

string = "'Well!' thought Alice to herself, 'after such a fall as this, I shall think nothing of tumbling down stairs! How brave they'll all think me at home! Why, I wouldn't say anything about it, even if I fell off the top of the house!'"
# => ["'Well!'", "'after such a fall as this, I shall think nothing of tumbling down stairs! How brave they'll all think me at home! Why, I wouldn't say anything about it, even if I fell off the top of the house!'"] 
I tried this, but it splits at the apostrophes inside the contractions:

string.scan(/'(?>[^'\\]+|\\{2}|\\.)*'/)
# => ["'Well!'", "'after such a fall as this, I shall think nothing of tumbling down stairs! How brave they'", "'t say anything about it, even if I fell off the top of the house!'"] 

I suggest you use this regex:

'[^']+'

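To split the text at parentheses, could you not simplify the regex to string.scan(/\([^\)]*\)/)? In future, I suggest you pare your code down to the essentials and write it so that readers do not have to scroll horizontally. Here, you could have made your point with a much shorter string.

I assume this would be a problem: "It's Buzz' guitar".

@CarySwoveland Yes, that is an edge case. Any thoughts on how to solve it?

I can't see any solution. It is not only proper nouns ending in 'z' or 's' that are a problem: "Tess' gonna be mad 'cause there's no beer in the fridge", or "Hey bro, Sheila and Trish' are the cat's meow, ain't they?". Note that you cannot tell whether a word is a proper noun by checking whether it is capitalized, because it may be at the start of a sentence. Tess can confirm. My advice: hand this project to the new guy.

Thanks! Could you explain what each part of your regex does, for my education?

Your regex will fail on the following example: Let's go get Bob's bag. => [Let's go get Bob]. How can it be changed to check that the first and last single quotes are also not part of a contraction?

@diasks2, I'll explain. (?<!...) and (?!...) are a negative lookbehind and a negative lookahead, respectively. They ensure that the opening single quote is not immediately preceded by a letter and that the closing one is not immediately followed by a letter. (?:...) is a non-capturing group, repeated zero or more times by *. It must match either any character other than a single quote, [^'], or a single quote followed by a letter, and the whole match must be enclosed in single quotes.

Nice. I didn't know that \B matches a non-word boundary.

@diasks2, what is the ? for? It makes .* non-greedy, so the .* match stops as soon as it reaches the next characters in the regex, '\B. Without the ?, the default greedy mode applies, so .* gobbles up characters until it reaches the last instance of '\B, which is at the end of the string and not what you want.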
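The lookbehind/lookahead explanation in the comments can be sketched roughly as follows. The answer's exact regex is not preserved on this page, so the pattern below (named QUOTED here purely for illustration) is a reconstruction from that description and only an assumption, not the original. It also uses a shorter sample string.

```ruby
# Hypothetical pattern rebuilt from the description in the comments.
# It is an assumption, not the regex from the original answer.
QUOTED = /
  (?<![a-zA-Z])     # the opening quote must not directly follow a letter
  '
  (?:
    [^']            # any character other than a single quote
    |
    '(?=[a-zA-Z])   # or a quote directly followed by a letter
  )*
  '
  (?![a-zA-Z])      # the closing quote must not be directly followed by a letter
/x

string = "'Well!' thought Alice, 'How brave they'll all think me! I wouldn't say anything!'"
result = string.scan(QUOTED)
p result
# => ["'Well!'", "'How brave they'll all think me! I wouldn't say anything!'"]

# The example raised in the comments stays in one piece:
bob = "'Let's go get Bob's bag.'".scan(QUOTED)
p bob
# => ["'Let's go get Bob's bag.'"]
```

Because the two alternatives inside the group can never match at the same position, the pattern does not backtrack heavily. It still cannot tell a trailing possessive apostrophe (Buzz') from a closing quotation mark.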
string = "'Well!' thought Alice to herself, 'after such a fall as this, I shall think nothing of tumbling down stairs! How brave they'll all think me at home! Why, I wouldn't say anything about it, even if I fell off the top of the house!'"

p string.scan(/\B'.*?'\B/) #=> ["'Well!'", "'after such a fall as this, I shall think nothing of tumbling down stairs! How brave they'll all think me at home! Why, I wouldn't say anything about it, even if I fell off the top of the house!'"]
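To make the greedy versus non-greedy point concrete, here is a small sketch using a shorter sample string (chosen here for illustration). It compares the two quantifiers and then shows the trailing-possessive edge case from the comments, which the \B approach does not handle.

```ruby
string = "'Well!' said Alice, 'they'll think I wouldn't say it!'"

# Lazy .*? stops at the first quote that is followed by a non-word boundary.
lazy = string.scan(/\B'.*?'\B/)
p lazy
# => ["'Well!'", "'they'll think I wouldn't say it!'"]

# Greedy .* runs on to the last such quote, so everything becomes one match.
greedy = string.scan(/\B'.*'\B/)
p greedy
# => ["'Well!' said Alice, 'they'll think I wouldn't say it!'"]

# Edge case: the apostrophe after Buzz is followed by a space,
# so it is mistaken for a closing quotation mark.
edge = "'It's Buzz' guitar,' he said".scan(/\B'.*?'\B/)
p edge
# => ["'It's Buzz'"]
```

Note that \B treats digits and underscores as word characters too, and that . does not match a newline unless the m flag is given, so quotations that span several lines would need /m.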