C#: How to count the occurrences of a word in a paragraph
I am trying to write a program in which the user gives the system a word and a paragraph, and the system's task is to count how many times that word occurs. How can I count the occurrences of the word in C#?

One approach does the following:
- Examine each character (test.Select), replace every punctuation character with a space, and join the characters back into another string (string.Concat)
- Split the string into substrings separated by spaces (.Split)
- Filter, keeping only the entries that match the search string
- Count them (Count())
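The steps above can be sketched roughly as follows. This is a minimal illustration rather than the original answer's code: the class name PunctuationSplitCounter and the method CountWord are hypothetical, and the case-insensitive comparison is an assumption added so that "Data" and "data" are treated as the same word.

```csharp
using System;
using System.Linq;

static class PunctuationSplitCounter
{
    // Hypothetical helper: counts whole-word matches of `search` in `test`.
    public static int CountWord(string test, string search)
    {
        // Steps 1 and 2: replace each punctuation character with a space,
        // then rebuild a single string from the resulting characters.
        string cleaned = string.Concat(test.Select(c => char.IsPunctuation(c) ? ' ' : c));

        // Steps 3 to 5: split on spaces, keep only the entries equal to the
        // search word (ignoring case, an assumption), and count them.
        return cleaned
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Count(w => string.Equals(w, search, StringComparison.OrdinalIgnoreCase));
    }

    static void Main()
    {
        Console.WriteLine(CountWord("Data, data; and (data)!", "data")); // prints 3
    }
}
```

Note that this splits on the space character only, so words separated by tabs or line breaks would need extra separators.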
As the article says: "There is a performance cost to the Split method. If the only operation on the string is to count the words, you should consider using the Matches or IndexOf methods instead."
So if performance is a concern, you can use a while loop with IndexOf and a counter. The Split-based example from the article:

    using System;
    using System.Linq;

    class CountWords
    {
        static void Main()
        {
            string text = @"Historically, the world of data and the world of objects" +
                @" have not been well integrated. Programmers work in C# or Visual Basic" +
                @" and also in SQL or XQuery. On the one side are concepts such as classes," +
                @" objects, fields, inheritance, and .NET Framework APIs. On the other side" +
                @" are tables, columns, rows, nodes, and separate languages for dealing with" +
                @" them. Data types often require translation between the two worlds; there are" +
                @" different standard functions. Because the object world has no notion of query, a" +
                @" query can only be represented as a string without compile-time type checking or" +
                @" IntelliSense support in the IDE. Transferring data from SQL tables or XML trees to" +
                @" objects in memory is often tedious and error-prone.";

            string searchTerm = "data";

            // Convert the string into an array of words
            string[] source = text.Split(new char[] { '.', '?', '!', ' ', ';', ':', ',' }, StringSplitOptions.RemoveEmptyEntries);

            // Create the query. Use ToLowerInvariant to match "data" and "Data"
            var matchQuery = from word in source
                             where word.ToLowerInvariant() == searchTerm.ToLowerInvariant()
                             select word;

            // Count the matches, which executes the query.
            int wordCount = matchQuery.Count();
            Console.WriteLine("{0} occurrence(s) of the search term \"{1}\" were found.", wordCount, searchTerm);

            // Keep console window open in debug mode
            Console.WriteLine("Press any key to exit");
            Console.ReadKey();
        }
    }
    /* Output:
    3 occurrence(s) of the search term "data" were found.
    */
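The while loop with IndexOf and a counter is only mentioned above, not shown. A minimal sketch of what it might look like is given here. The class name IndexOfCounter and the method CountWord are hypothetical, and the boundary check (requiring a non-letter, non-digit character on both sides of a hit) is an assumption added so that the term is not counted inside longer words such as "metadata".

```csharp
using System;

static class IndexOfCounter
{
    // Hypothetical helper: counts case-insensitive whole-word occurrences
    // of `searchTerm` in `text` without splitting the string.
    public static int CountWord(string text, string searchTerm)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(searchTerm))
            return 0;

        int count = 0;
        int index = 0;

        // IndexOf returns -1 once there are no further hits.
        while ((index = text.IndexOf(searchTerm, index, StringComparison.OrdinalIgnoreCase)) >= 0)
        {
            int end = index + searchTerm.Length;

            // Assumed definition of a word boundary: start/end of the text,
            // or a neighbouring character that is neither a letter nor a digit.
            bool startsWord = index == 0 || !char.IsLetterOrDigit(text[index - 1]);
            bool endsWord = end == text.Length || !char.IsLetterOrDigit(text[end]);

            if (startsWord && endsWord)
            {
                count++;
                index = end;   // continue after the counted word
            }
            else
            {
                index++;       // hit was inside a longer word; move on by one
            }
        }

        return count;
    }

    static void Main()
    {
        Console.WriteLine(CountWord("data metadata Data (data) database", "data")); // prints 3
    }
}
```

Without the two boundary checks the loop would also count the hits inside "metadata" and "database", which is the weakness of a plain IndexOf scan.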
Use a regular expression with word-boundary anchors:

    // Requires: using System.Text.RegularExpressions;
    int wordCount = Regex.Matches(text, "\\b" + Regex.Escape(searchTerm) + "\\b", RegexOptions.IgnoreCase).Count;

Comments:
- Possible duplicate.
- Not a duplicate of that question: this is about counting a word (not a character) in a paragraph of [English] text, which has a different meaning and different solutions.
- The example you gave uses the Split() method... even though it says that is not the better approach and gives a link to the Regex Matches() method...
- The Split approach is wrong because it fails to match words in context, such as (word). IndexOf is wrong because it will match notaword.
- @user2864740 I don't know how you came to that conclusion about IndexOf.
- A simple IndexOf scan, as shown in the deleted answer (and only hinted at in this answer, not even implemented), does not account for word boundaries correctly. So my "conclusion" is based on an equally inadequate implied use of IndexOf.
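To make the (word) and notaword cases from the comments concrete, here is a small self-contained sketch that wraps the Regex.Matches call in a method. The class name RegexCounter and the method CountWord are hypothetical names chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexCounter
{
    // Hypothetical helper around the one-line Regex.Matches approach.
    public static int CountWord(string text, string searchTerm)
    {
        // Regex.Escape keeps characters such as '.' or '+' in the search
        // term from being interpreted as regex syntax; \b is a word boundary.
        string pattern = @"\b" + Regex.Escape(searchTerm) + @"\b";
        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
    }

    static void Main()
    {
        // "(word)" and "Word!" are counted; "notaword" is not.
        Console.WriteLine(CountWord("A (word) is not notaword; Word!", "word")); // prints 2
    }
}
```

One caveat: \b only marks a transition between a word character and a non-word character, so a search term that begins or ends with a symbol (for example "C#") will not be matched reliably by this pattern.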