Ruby regular expression to match substrings between a character pattern and a newline

ruby regex string

I have data formatted as a single string:

"1. Enloe Medical Center - 2,000 
2. CSU Chico - 1,805 
3. Walmart Distribution Center - 1,350 
4. Pacific Coast Producers (Agribusiness) - 1,200 
5. Marysville School District - 1,000 
6. Feather River Hospital - 865 
7. Sunsweet Growers (Agriculture) - 600 
8. YRC (Freight Services) - 500 
9. Sierra Pacific Industries (Lumber Products) - 500 
10. Colusa Casino Resort - 500"

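In a Ruby application, I want to create two arrays: one holding the substring between each numbered list marker and the dash, and one holding the number between the dash and the newline (as an integer), like this: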

labels = ["Enloe Medical Center","CSU Chico","Walmart Distribution Center","Pacific Coast Producers (Agribusiness)","Marysville School District","Feather River Hospital","Sunsweet Growers (Agriculture)","YRC (Freight Services)","Sierra Pacific Industries (Lumber Products)","Colusa Casino Resort"]

numbers = [2000, 1805, 1350, 1200, 1000, 865, 600, 500, 500, 500]

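I'm not very good with regular expressions; I know how to do substitutions and matches, but I'm not sure where to start here. Can anyone help?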

You can do it like this:

s = "1. Enloe Medical Center - 2,000 
 2. CSU Chico - 1,805 
 3. Walmart Distribution Center - 1,350 
 4. Pacific Coast Producers (Agribusiness) - 1,200 
 5. Marysville School District - 1,000 
 6. Feather River Hospital - 865 
 7. Sunsweet Growers (Agriculture) - 600 
 8. YRC (Freight Services) - 500 
 9. Sierra Pacific Industries (Lumber Products) - 500 
10. Colusa Casino Resort - 500"

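# Numbers: capture everything after "- ", strip non-digits, convert to Integer.
numbers = s.each_line.map { |line|
  line.match(/- (.*)/)[1].gsub(/[^0-9]/, '').to_i
}

# Labels: capture the text between the "N. " list marker and the " - " separator.
labels = s.each_line.map { |line|
  line.match(/\d+\. (.*) - (.*)/)[1]
}

puts numbers
puts labels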
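
As a variation on the answer above, here is a minimal sketch that fills both arrays in a single pass and skips lines that do not match, since calling `[1]` on the `nil` returned by a failed `match` raises `NoMethodError`. The sample text and the exact pattern are assumptions chosen for illustration, not part of the original answer.

```ruby
# Shortened sample input (assumed); includes a blank line to show the guard.
s = <<~TEXT
  1. Enloe Medical Center - 2,000
   2. CSU Chico - 1,805

  6. Feather River Hospital - 865
  10. Colusa Casino Resort - 500
TEXT

labels  = []
numbers = []

s.each_line do |line|
  # \d+\.     the list marker
  # (.*)      the label (greedy, so it runs up to the last " - ")
  # ([\d,]+)  the count, digits with optional thousands separators
  m = line.match(/\d+\.\s+(.*)\s+-\s+([\d,]+)/)
  next unless m # skip blank or malformed lines instead of raising

  labels  << m[1]
  numbers << m[2].delete(",").to_i
end

p labels
p numbers
```

The `next unless m` guard is the main design choice: it trades a hard failure on unexpected input for silently ignoring it, which may or may not be what you want.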

You can do it like this:

rawlines = <<EOF
1. Enloe Medical Center - 2,000 
2. CSU Chico - 1,805 
3. Walmart Distribution Center - 1,350 
4. Pacific Coast Producers (Agribusiness) - 1,200 
5. Marysville School District - 1,000 
6. Feather River Hospital - 865 
7. Sunsweet Growers (Agriculture) - 600 
8. YRC (Freight Services) - 500 
9. Sierra Pacific Industries (Lumber Products) - 500 
10. Colusa Casino Resort - 500
EOF
labels = []
numbers = []
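# Group 1 is the label, group 2 the count with optional thousands separators.
rawlines.scan(/^[0-9]+\. ([^-]+) - ([1-9][0-9]{0,2}(?>,[0-9]{3})*)/) do |label, number|
  labels << label
  # Remove the commas and convert to Integer, as the question asks.
  numbers << number.delete(",").to_i
end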
puts labels
puts numbers
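
Because the `scan` pattern above is dense, it may help to see it spelled out piece by piece. The sketch below writes the same pattern in Ruby's extended (`/x`) mode with comments; the constant name `ENTRY`, the shortened input, and the `map`/`transpose` chaining are assumptions added for illustration.

```ruby
# Shortened sample input (assumed for illustration).
rawlines = <<~EOF
  1. Enloe Medical Center - 2,000
  4. Pacific Coast Producers (Agribusiness) - 1,200
  6. Feather River Hospital - 865
  10. Colusa Casino Resort - 500
EOF

# In x mode, whitespace in the pattern is ignored, so a literal space is
# written as [ ].
ENTRY = /
  ^[0-9]+\.[ ]         # list marker at the start of a line, such as "10. "
  ([^-]+)              # group 1: the label, any run of characters except a hyphen
  [ ]-[ ]              # the " - " separator
  (                    # group 2: the count
    [1-9][0-9]{0,2}    #   one to three leading digits, no leading zero
    (?>,[0-9]{3})*     #   then zero or more ",ddd" groups (atomic group)
  )
/x

# scan yields [label, number] pairs; transpose turns them into two arrays.
labels, numbers = rawlines.scan(ENTRY)
                          .map { |label, number| [label, number.delete(",").to_i] }
                          .transpose

p labels
p numbers
```

One limitation follows directly from `[^-]+`: a label that itself contains a hyphen would not match.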

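Two things make this simple:
/pat/m - in Ruby, the m flag makes . match newline characters as well (it has no effect on the patterns below, since neither uses an unescaped .)
The other thing is grouping (see the example in the second part).
Write the regexp for a single line and it applies to the whole string: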
r1 = /\d+\,\d+\s*$/m
str.scan r1
["2,000 ", "1,805 ", "1,350 ", "1,200 ", "1,000 "]

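$
matches the end of a line

\d
a digit

+
how many times: one or more

\s*
whitespace, zero or more times

Note that r1 requires a comma, so 865, 600 and 500 are not matched; that is why only five results appear.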
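
The r1 pattern above only matches counts that contain a comma. A minimal sketch of a looser variant follows; the name `r1_loose`, the lookahead, and the shortened input are assumptions, not part of the original answer.

```ruby
# Shortened sample input (assumed for illustration).
str = <<~TEXT
  1. Enloe Medical Center - 2,000
  6. Feather River Hospital - 865
  10. Colusa Casino Resort - 500
TEXT

# The pattern from the answer: a comma is required.
r1_strict = /\d+\,\d+\s*$/

# Assumed variant: a digit followed by any mix of digits and commas,
# provided only whitespace remains before the end of the line.
r1_loose = /\d[\d,]*(?=\s*$)/

strict = str.scan(r1_strict)
loose  = str.scan(r1_loose).map { |n| n.delete(",").to_i }

p strict
p loose
```

The lookahead `(?=\s*$)` keeps trailing whitespace out of the match, and it stops the list markers ("1.", "10.") from matching because they are followed by a dot rather than the end of the line.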

Also, since you already know how to do substitutions, I did not convert the results to numbers.
r2 = /\d+\.\s*([\w\s]+)\s*\-/m
 str.scan(r2).flatten

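\d+
- matches a digit one or more times

\.
- matches a literal dot
- it must be escaped, because an unescaped . matches any character

\s*
- whitespace, zero or more times

[\w\s]+
- any word character or whitespace, one or more times

( )
- a capture group: an easy way to say "I want this part, which is surrounded by the rest of the pattern"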
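
One caveat with r2: `[\w\s]+` does not match parentheses, so labels such as "Pacific Coast Producers (Agribusiness)" are skipped, and the captured labels keep a trailing space. The sketch below compares r2 with an assumed alternative (`r2_any`, a name invented here) that uses a lazy `.+?` instead; the two-line input is also an assumption.

```ruby
# Two sample lines (assumed): one plain label, one with parentheses.
str = <<~TEXT
  1. Enloe Medical Center - 2,000
  4. Pacific Coast Producers (Agribusiness) - 1,200
TEXT

# The pattern from the answer (the m flag is dropped; it has no effect here).
r2 = /\d+\.\s*([\w\s]+)\s*\-/

# Assumed variant: lazily capture anything up to the " - " separator.
r2_any = /^\s*\d+\.\s*(.+?)\s+-\s/

original = str.scan(r2).flatten
relaxed  = str.scan(r2_any).flatten

p original
p relaxed
```

Since `.` does not match a newline without the m flag, `(.+?)` cannot run past the end of a line.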
I like your approach a lot; here it is as a one-liner with a shorter regex :-)
labels, numbers = str.scan(/\d+.\s+(.+)\s-\s(\d.*)/).map { |label, number| [label, number.gsub(",", "").to_i] }.transpose

@bjhaid I think you can write less with something like this: /\$(?<dollars>\d+)\.(?<cents>\d+)/ =~ "$3.67"; #=> 0; dollars #=> "3"

@DarekNędza I'm confused; your example isn't related to the question being discussed.

Thank you very much for the in-depth explanation; really helpful.