C#: Find the index of the first matching string from a string array or list

I have a string of the form

var dummyString = $@"SIGNED APPLICATION AND AFFIDAVIT REQUIRED  LOCATION:  BLK 99, LOT 9 AND BLK 100 LOT 9, 10, 11, 12 & 13 RT 38 EAST HAINESPORT, NJ  BASED ON:  VACANT LAND";
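What I want to do is extract the location/address from this string. I can easily find the index of LOCATION:, but I can't think of an efficient way to find the index at which the extracted string should end. The simplest option is to iterate over a list of state codes and look up the index of each one, but that is not a very efficient way to handle it.

I think the way to solve this is to take a list of US state codes and then, after the index of LOCATION:, find the index of the first occurrence of any state code as a substring with surrounding spaces, so that I match the complete state code and get its index: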

// A List<string> cannot be declared const in C#; static readonly is the closest equivalent.
public static readonly List<string> USStateCodes = new List<string> { "AL", "AK", "AS", "AZ", "AR", "CA", "CO", "CT", "DE", "DC", "FM", "FL", "GA", "GU", "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MH", "MD", "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ", "NM", "NY", "NC", "ND", "MP", "OH", "OK", "OR", "PW", "PA", "PR", "RI", "SC", "SD", "TN", "TX", "UT", "VT", "VI", "VA", "WA", "WV", "WI", "WY" };
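For reference, the loop-based approach described above might look like the following minimal sketch. The names `NaiveLocationExtractor` and `FindLocationEnd` are hypothetical, the state-code list is shortened for brevity, and the sketch assumes the state code is surrounded by single spaces, so a code at the very end of the string would be missed.

```csharp
using System;
using System.Collections.Generic;

public static class NaiveLocationExtractor
{
    private const string Marker = "LOCATION:";

    // Hypothetical helper: returns the index just past the earliest state code that
    // appears as " XX " after "LOCATION:", or -1 when nothing is found.
    public static int FindLocationEnd(string input, IEnumerable<string> stateCodes)
    {
        int markerIndex = input.IndexOf(Marker, StringComparison.Ordinal);
        if (markerIndex < 0) return -1;
        int searchFrom = markerIndex + Marker.Length;

        int bestEnd = -1;
        foreach (string code in stateCodes)
        {
            // One IndexOf scan per state code: this is the per-item iteration
            // the question would like to avoid.
            int idx = input.IndexOf(" " + code + " ", searchFrom, StringComparison.Ordinal);
            if (idx < 0) continue;

            int end = idx + 1 + code.Length; // skip the leading space, keep the code
            if (bestEnd < 0 || end < bestEnd) bestEnd = end;
        }
        return bestEnd;
    }

    public static void Main()
    {
        string input = "SIGNED APPLICATION AND AFFIDAVIT REQUIRED  LOCATION:  BLK 99, LOT 9 AND BLK 100 LOT 9, 10, 11, 12 & 13 RT 38 EAST HAINESPORT, NJ  BASED ON:  VACANT LAND";
        var codes = new List<string> { "NY", "NJ", "PA" }; // shortened list, for illustration only

        int end = FindLocationEnd(input, codes);
        if (end >= 0)
        {
            int start = input.IndexOf(Marker, StringComparison.Ordinal) + Marker.Length;
            Console.WriteLine(input.Substring(start, end - start).Trim());
        }
    }
}
```

This works on the sample input, but it scans the string once per state code.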
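Any idea how to proceed from here?

My desired output is:

BLK 99, LOT 9 AND BLK 100 LOT 9, 10, 11, 12 & 13 RT 38 EAST HAINESPORT, NJ

The problem described here is part of a larger piece of logic in which I use a regex to find the index of the ZIP code (5 digits) as the terminator, but in some cases the ZIP code may be missing from the address (user error). I still need to extract the address.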

You can use

var dummyString = @"SIGNED APPLICATION AND AFFIDAVIT REQUIRED  LOCATION:  BLK 99, LOT 9 AND BLK 100 LOT 9, 10, 11, 12 & 13 RT 38 EAST HAINESPORT, NJ  BASED ON:  VACANT LAND";
var USStateCodes = new List<string> { "AL", "AK", "AS", "AZ", "AR", "CA", "CO", "CT", "DE", "DC", "FM", "FL", "GA", "GU", "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MH", "MD", "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ", "NM", "NY", "NC", "ND", "MP", "OH", "OK", "OR", "PW", "PA", "PR", "RI", "SC", "SD", "TN", "TX", "UT", "VT", "VI", "VA", "WA", "WV", "WI", "WY" };
var result = Regex.Match(dummyString, $@"LOCATION:\s*(.*?\b(?:{string.Join("|", USStateCodes)}))\b")?.Groups[1].Value;

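Details

  • LOCATION: - a fixed starting string
  • \s* - 0+ whitespace characters
  • (.*?\b(?:{string.Join("|", USStateCodes)})) - Group 1 (the result is captured in this group):
    • .*? - any 0 or more characters other than a newline (use RegexOptions.Singleline to also match newlines), as few as possible
    • \b - a word boundary
    • (?:{string.Join("|", USStateCodes)}) - builds an alternation group from the state codes, like (?:AL|AK|AS|...|WY), and matches any one of the alternatives
  • \b - a word boundary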
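Since the question asks for an index as well as the text, the pattern above can be wrapped as in the following sketch. `LocationRegexDemo` and `TryExtractLocation` are hypothetical names, not part of the original answer, and the state-code list is shortened for brevity. The `Regex.Escape` call is an added precaution in case the list ever contains regex metacharacters; it has no effect on plain two-letter codes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LocationRegexDemo
{
    // Hypothetical helper: returns true and fills in the captured address plus the
    // index just past it; returns false when the pattern does not match.
    public static bool TryExtractLocation(
        string input, IEnumerable<string> stateCodes, out string location, out int endIndex)
    {
        string alternation = string.Join("|", stateCodes.Select(c => Regex.Escape(c)));
        string pattern = $@"LOCATION:\s*(.*?\b(?:{alternation}))\b";

        Match match = Regex.Match(input, pattern);
        if (!match.Success)
        {
            location = string.Empty;
            endIndex = -1;
            return false;
        }

        Group group = match.Groups[1];
        location = group.Value;
        endIndex = group.Index + group.Length; // index where the address ends
        return true;
    }

    public static void Main()
    {
        string input = "SIGNED APPLICATION AND AFFIDAVIT REQUIRED  LOCATION:  BLK 99, LOT 9 AND BLK 100 LOT 9, 10, 11, 12 & 13 RT 38 EAST HAINESPORT, NJ  BASED ON:  VACANT LAND";
        var codes = new List<string> { "NY", "NJ", "PA" }; // shortened list, for illustration only

        if (TryExtractLocation(input, codes, out string location, out int endIndex))
        {
            Console.WriteLine(location);
            Console.WriteLine(endIndex);
        }
    }
}
```

Checking `Match.Success` matters because `Regex.Match` returns a non-null, unsuccessful match when nothing is found, in which case `Groups[1].Value` is an empty string. Note also that `\b` treats any standalone two-letter word such as IN, OR or ME as a state code, so an address that contains one of these earlier on would be cut short.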

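Comments:

"But I can't think of an efficient solution for the index at which the string should terminate." The index of BASED ON:?

I could run a for loop over every state code in the list and try to find the index of that code in the string... but that requires iterating over every item... not very elegant. A regex is probably needed.

Are "SIGNED APPLICATION AND AFFIDAVIT REQUIRED LOCATION:" and "BASED ON: VACANT LAND" always the same in the input string?

No... that is a variable value... it keeps changing, except for LOCATION:, which is always present.

Very close, but it seems to be missing the state code.

No, it is there. See the demo.

You are a lifesaver!

The pattern with the full state-code list expanded: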
LOCATION:\s*(.*?\b(?:AL|AK|AS|AZ|AR|CA|CO|CT|DE|DC|FM|FL|GA|GU|HI|ID|IL|IN|IA|KS|KY|LA|ME|MH|MD|MA|MI|MN|MS|MO|MT|NE|NV|NH|NJ|NM|NY|NC|ND|MP|OH|OK|OR|PW|PA|PR|RI|SC|SD|TN|TX|UT|VT|VI|VA|WA|WV|WI|WY))\b
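The question mentions that the ZIP code (5 digits) is normally the terminator but is sometimes missing. One possible way to handle both cases with a single pattern is to make the ZIP code an optional tail after the state code. This is only a sketch under that assumption: the optional `(?:\s+\d{5}\b)?` part, the class name `LocationWithZipDemo`, and the ZIP value `12345` in the sample input are all illustrative and not taken from the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LocationWithZipDemo
{
    // Shortened alternation for illustration; the full state-code list can be substituted.
    // The trailing optional group picks up a 5-digit ZIP code when one follows the state code.
    private const string Pattern = @"LOCATION:\s*(.*?\b(?:NY|NJ|PA)\b(?:\s+\d{5}\b)?)";

    // Returns the captured address, or an empty string when the pattern does not match.
    public static string Extract(string input)
    {
        Match match = Regex.Match(input, Pattern);
        return match.Success ? match.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        // Made-up inputs: one with a ZIP code after the state code, one without.
        Console.WriteLine(Extract("LOCATION:  RT 38 EAST HAINESPORT, NJ 12345  BASED ON:  VACANT LAND"));
        Console.WriteLine(Extract("LOCATION:  RT 38 EAST HAINESPORT, NJ  BASED ON:  VACANT LAND"));
    }
}
```

Because the optional group is greedy, the ZIP code is included whenever it is present, and the match still ends at the state code when it is not.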