Ruby parsing and regular expressions

ruby regex string

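I recently started learning Ruby and have been playing around with it. I'd like to learn how to use regular expressions or other Ruby techniques to check a given line of text for particular words, whitespace characters, valid formatting, and so on.

Suppose I have a list of orders whose format is strictly the following: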

cost: 50 items: book,lamp

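There is one space after each semicolon, no space after each comma, no trailing whitespace at the end, and so on. How can I use Ruby to check for errors in this format? For example, this line should fail my check: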

cost:     60 items:shoes,football

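My plan was to split the string on " " and check whether the first word is "cost:", whether the second word is a number, and so on, but I realized that splitting on " " doesn't help me detect extra spaces, because it simply swallows them. It doesn't help me detect trailing whitespace either. How can I do this?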
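To make the problem concrete, here is a small sketch of why the split approach loses the information. The sample line (with two trailing spaces added) is only an illustration.

```ruby
line = "cost:     60 items:shoes,football  "

# String#split with a single-space argument treats any run of whitespace
# as one separator and ignores leading and trailing whitespace.
words = line.split(" ")
p words  # => ["cost:", "60", "items:shoes,football"]

# A line with single spaces and no trailing whitespace splits identically,
# so the resulting array cannot tell the two lines apart.
same = "cost: 60 items:shoes,football".split(" ") == words
p same   # => true
```

Because both inputs produce the same array, any check on the words alone cannot see the extra or trailing spaces.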

You can use the following regular expression:

r = /
    \A                # match beginning of string     
    cost:\s           # match "cost:" followed by one whitespace character
    \d+\s             # match one or more digits followed by one whitespace character
    items:\s          # match "items:" followed by one whitespace character
    [[:alpha:]]+      # match > 0 lowercase or uppercase letters
    (?:,[[:alpha:]]+) # match a comma followed by > 0 lowercase or uppercase 
                      # letters in a non-capture group (?: ... )
    *                 # perform the match on non-capture group >= 0 times
    \z                # match the end of the string
    /x                # free-spacing regex definition mode

"cost: 50 items: book,lamp"         =~ r #=> 0   (a match, beginning at index 0)
"cost: 50 items: book,lamp,table"   =~ r #=> 0   (a match, beginning at index 0)
"cost:     60 items:shoes,football" =~ r #=> nil (no match)
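As a rough sketch of how this might be used in a program, the pattern can be wrapped in a predicate. The names ORDER_LINE and valid_order? are hypothetical, chosen only for this illustration. Note that \s accepts any whitespace character, so a tab between fields would also pass; a literal space in the pattern is stricter.

```ruby
# Hypothetical constant and helper names, introduced only for illustration.
ORDER_LINE = /\Acost:\s\d+\sitems:\s[[:alpha:]]+(?:,[[:alpha:]]+)*\z/

# Returns true when the whole line matches the order format.
def valid_order?(line)
  ORDER_LINE.match?(line)
end

[
  "cost: 50 items: book,lamp",          # well formed
  "cost:     60 items:shoes,football",  # extra spaces, missing space after "items:"
  "cost: 50 items: book,lamp ",         # trailing space
  "cost: 50 items: book, lamp"          # space after a comma
].each do |line|
  puts "#{line.inspect} => #{valid_order?(line)}"
end
```

Only the first line should report true. Regexp#match? (Ruby 2.4 and later) returns a plain boolean, which reads more naturally in a validity check than the index returned by =~.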

The regular expression can of course be written in the conventional way:

r = /\Acost:\s\d+\sitems:\s[[:alpha:]]+(?:,[[:alpha:]]+)*\z/

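or, with literal spaces in place of \s:

r = /\Acost: \d+ items: [[:alpha:]]+(?:,[[:alpha:]]+)*\z/

although a literal space cannot stand in for the whitespace escape (\s) in a free-spacing (/x) definition, because unescaped whitespace in the pattern is ignored there.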
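A short sketch of that last point, using a cut-down pattern that only checks the "cost:" part. The variable names are made up for this example.

```ruby
# In free-spacing mode the literal spaces in the pattern are ignored,
# so this is equivalent to /\Acost:\d+\z/.
ignored = /\A cost: \d+ \z/x
p ignored.match?("cost:50")   # the space in the pattern did not count
p ignored.match?("cost: 50")  # so a real space in the input is not allowed

# Three ways to require exactly one space character under /x.
escaped   = /\A cost: \  \d+ \z/x      # backslash-escaped space
bracketed = /\A cost: [ ] \d+ \z/x     # space inside a character class
codepoint = /\A cost: \u0020 \d+ \z/x  # Unicode escape for U+0020

[escaped, bracketed, codepoint].each do |re|
  p re.match?("cost: 50")
end
```

The first pattern should accept "cost:50" and reject "cost: 50", while the three explicit forms should each accept "cost: 50".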

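However you do it, this is a very good example for improving your TDD (test-driven development) and testing skills. Following up on the comments, I suggest you keep experimenting with minitest and your problem.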
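As a starting point, a minimal minitest file for this problem might look like the sketch below. It assumes the regular expression from the other answer; ORDER_LINE and the test names are hypothetical.

```ruby
require "minitest/autorun"

# ORDER_LINE is a hypothetical constant name for the regex from the other answer.
ORDER_LINE = /\Acost:\s\d+\sitems:\s[[:alpha:]]+(?:,[[:alpha:]]+)*\z/

class OrderLineTest < Minitest::Test
  def test_accepts_well_formed_line
    assert_match ORDER_LINE, "cost: 50 items: book,lamp"
  end

  def test_rejects_extra_spaces_after_cost
    refute_match ORDER_LINE, "cost:     60 items:shoes,football"
  end

  def test_rejects_trailing_space
    refute_match ORDER_LINE, "cost: 50 items: book,lamp "
  end

  def test_rejects_space_after_comma
    refute_match ORDER_LINE, "cost: 50 items: book, lamp"
  end
end
```

Running the file with ruby executes the tests. Adding one failing test per new format rule before changing the regex is one way to practise the TDD cycle here.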

":" is a colon; ";" is a semicolon.

To match a space explicitly in free-spacing mode, you can use \  (a backslash followed by a space), \u0020, or [ ].

Thanks @CarySwoveland. Taking this a step further, is there a way to allow underscores, digits, and hyphens (but no spaces) in the item list? Say an item is "version2" rather than something simple like "lamp". Would adding "\w" after [:alpha:] work?

Just change [[:alpha:]]+ to \w+ in both places in the regex (\w is a "word character": an uppercase or lowercase letter, a digit, or an underscore).
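One caveat: \w does not match a hyphen, so if hyphenated item names are needed, a character class such as [\w\-] is one option. The sketch below assumes that choice; ITEM and ORDER_LINE are hypothetical names, and the item names are invented examples.

```ruby
# ITEM is a hypothetical name for the per-item sub-pattern.
# [\w\-] accepts letters, digits, underscores and hyphens, but no spaces.
ITEM = /[\w\-]+/
ORDER_LINE = /\Acost:\s\d+\sitems:\s#{ITEM}(?:,#{ITEM})*\z/

p ORDER_LINE.match?("cost: 50 items: book,lamp")
p ORDER_LINE.match?("cost: 50 items: version-2,desk_lamp")
p ORDER_LINE.match?("cost: 50 items: version 2")  # space inside an item
```

The first two lines should match and the last should not. This class would also accept an item made only of hyphens or underscores, so a stricter item pattern may be worth considering if that matters.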