C# regex: how to match a user's input against an array of words/phrases

c# arrays regex

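I have an array containing different words and phrases. The user will enter a spam message, and I am supposed to check whether it matches any of the words and phrases already in the array. Each match adds 1 to the score, and if the score is greater than 5 the message is probably spam.

But my score never increases, and I don't know why.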

        string[] spam = new string[] {"-different words and phrases provided by programmer"};

        Console.Write("Key in an email message: ");
        string email = Console.ReadLine();
        int score = 0;

        string pattern = "^\\[a-zA-Z]";
        Regex expression = new Regex(pattern);
        var regexp = new System.Text.RegularExpressions.Regex(pattern);

        if (!regexp.IsMatch(email))
        {
            score += 1;
        }

You can use Linq to solve this:

  // HashSet<String> is for better performance
  HashSet<String> spamWords = new HashSet<String>(
    "different words and phrases provided by programmer"
      .Split(new Char[] {' '}, StringSplitOptions.RemoveEmptyEntries)
      .Select(word => word.ToUpper()));

  ...

  String eMail = "phrases, not words and letters zzz";

  ... 

  // score == 3: "phrases" + "words" + "and"
  int score = Regex
    .Matches(eMail, @"\w+")
    .OfType<Match>()
    .Select(match => match.Value.ToUpper())
    .Sum(word => spamWords.Contains(word) ? 1 : 0);


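In this implementation I look for spam words in a case-insensitive manner, so "And", "and" and "AND" are all counted as spam words. To take plurals and other inflected forms (e.g. "word", "wording") into account, you have to use a stemmer.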
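
As an alternative to upper-casing every word, the same case-insensitive lookup can be sketched with a comparer on the set. This is only a minimal sketch: the class name CaseInsensitiveSpamScore and the method Score are hypothetical names introduced for illustration, and it still matches single words only (no stemming, no multi-word phrases).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class CaseInsensitiveSpamScore
{
    // Counts how many words of the e-mail appear in spamWords,
    // ignoring case. Repeated spam words are counted each time.
    public static int Score(string email, IEnumerable<string> spamWords)
    {
        // The comparer makes Contains case-insensitive,
        // so no ToUpper() call is needed on either side.
        var set = new HashSet<string>(spamWords, StringComparer.OrdinalIgnoreCase);

        return Regex.Matches(email, @"\w+")
            .Cast<Match>()
            .Count(match => set.Contains(match.Value));
    }
}
```

Calling Score with the e-mail text and the split spam sentence from the answer above should give the same count as the ToUpper version.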

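If you want to count non-letters, the pattern must be string pattern = "[^a-zA-Z]", or even \P{L}. Alternatively, use Char.IsLetter().

You declared the spam variable but never use it. Are the words in it what you need to test against? As written, the code can add at most 1 to the score, and spam is not used at all. Are you sure that is what you need?

You don't need a regex; just check whether the user's string contains any of the words.

The spam array is declared but never used anywhere in the code? Also, expression and regexp are the same.

I think you meant to write string[] spam = new string[] {"different", "words", "and", "phrases", "provided", "by", "programmer"}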
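
The suggestion to skip the regex and simply test whether the input contains each entry could look roughly like the sketch below. The class SpamScorer, the method Score and the sample spam entries are assumptions made for illustration; they do not come from the question. Note that this is plain substring matching, so a short entry such as "win" would also match inside "winner".

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the original question or answer.
public static class SpamScorer
{
    // Adds 1 for every spam word or phrase that occurs somewhere
    // in the e-mail (case-insensitive substring search).
    public static int Score(string email, string[] spamPhrases)
    {
        return spamPhrases.Count(phrase =>
            email.IndexOf(phrase, StringComparison.OrdinalIgnoreCase) >= 0);
    }

    public static void Main()
    {
        // Illustrative entries only; the real list is up to the programmer.
        string[] spam = { "free money", "winner", "click here" };

        Console.Write("Key in an email message: ");
        string email = Console.ReadLine() ?? "";

        int score = Score(email, spam);
        Console.WriteLine(score > 5 ? "Probably spam" : "Probably not spam");
    }
}
```

Unlike the word-by-word lookup, this handles multi-word phrases, at the cost of scanning the e-mail once per entry.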