How to list the top n most frequent words in a string array in Java?


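I need to analyze a text and find the top n most frequent words, where n is the number of frequent words to print, as specified by the user. I am using a HashMap, but right now I can only find the single most frequent word.

Suppose I have a hash map like this: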

    cat: 4
    dog: 3
    sky: 10
    blue: 1

The code that finds the most frequent word looks like this:

        int compareValue = 0;
        String compareKey = "";

        for (Map.Entry<String, Integer> set : pairs.entrySet()) {
            if (set.getValue() > compareValue) {
                compareKey = set.getKey();
                compareValue = set.getValue();
            }
        }


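Could you tell me how to modify this code so that it finds several of the most frequent words, with a variable that specifies how many frequent words are wanted?

Here is your answer: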

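        // Needs: java.util.Comparator, java.util.LinkedHashMap, java.util.Map,
        // java.util.TreeMap, java.util.stream.Collectors
        String text = "Very very very good text to compare good text with cats, cats and dogs. " +
                "Very good dogs.";

        // How many of the most frequent words to keep
        int mostFrequentWordsNumber = 5;

        Map<String, Integer> mapOfFrequentWords = new TreeMap<>();

        // Split on whitespace; punctuation stays attached to the words
        String[] words = text.split("\\s+");

        // Count the occurrences of each word
        for (String word : words) {
            if (!mapOfFrequentWords.containsKey(word)) {
                mapOfFrequentWords.put(word, 1);
            } else {
                mapOfFrequentWords.put(word, mapOfFrequentWords.get(word) + 1);
            }
        }

        // Sort by count, highest first, keep the first n entries, and collect
        // them into a LinkedHashMap so that the sorted order is preserved
        Map<String, Integer> sorted = mapOfFrequentWords
                .entrySet()
                .stream()
                .sorted(Map.Entry.comparingByValue(Comparator.reverseOrder()))
                .limit(mostFrequentWordsNumber)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, (e1, e2) -> e2,
                                LinkedHashMap::new));
        System.out.println(sorted);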


The result will be:

    {good=3, Very=2, dogs.=2, text=2, very=2}

You can change it to be case-insensitive so that Very and very are not counted as different words.
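As a rough illustration of the case-insensitive variant, here is a minimal sketch. The class name CaseInsensitiveTopWords and the helper topWords are made up for this example. It also goes one step further than the answer and strips punctuation, on the assumption that "dogs." and "dogs" should count as the same word.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class CaseInsensitiveTopWords {

    // Hypothetical helper: counts words ignoring case and punctuation,
    // then returns the n most frequent ones, highest count first.
    static Map<String, Integer> topWords(String text, int n) {
        Map<String, Integer> counts = new TreeMap<>();
        // Lower-case the text, then split on runs of characters that are
        // neither letters nor digits, so "dogs." and "cats," lose punctuation.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder()))
                .limit(n)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> b, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        String text = "Very very very good text to compare good text with cats, cats and dogs. "
                + "Very good dogs.";
        // Prints {very=4, good=3, cats=2, dogs=2, text=2}
        System.out.println(topWords(text, 5));
    }
}
```

Words with the same count come out in alphabetical order here, because the counts are kept in a TreeMap and the stream sort is stable.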

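The best way to solve "top K elements" problems is to use a heap data structure. For your problem, I would use a heap to find the top k most frequent words.

You can find sample code on this website.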
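The heap idea could look roughly like the sketch below. It is not the sample code the answer refers to. The class name HeapTopWords and the helper topK are hypothetical, and the tie-breaking rule (alphabetically earlier word wins) is an assumption. It keeps a min-heap of at most k entries, so the least frequent candidate is always the one that gets evicted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class HeapTopWords {

    // Hypothetical helper: returns the k most frequent words, most frequent first.
    static List<String> topK(Map<String, Integer> counts, int k) {
        // Min-heap ordered by count. On equal counts the alphabetically later
        // word is treated as smaller, so it is evicted first.
        Comparator<Map.Entry<String, Integer>> byCount = (a, b) -> {
            int c = Integer.compare(a.getValue(), b.getValue());
            return c != 0 ? c : b.getKey().compareTo(a.getKey());
        };
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(byCount);
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            heap.offer(entry);
            if (heap.size() > k) {
                heap.poll(); // drop the current least frequent word
            }
        }
        // The heap yields entries in ascending order, so reverse at the end.
        List<String> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            result.add(heap.poll().getKey());
        }
        Collections.reverse(result);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> pairs = new HashMap<>();
        pairs.put("cat", 4);
        pairs.put("dog", 3);
        pairs.put("sky", 10);
        pairs.put("blue", 1);
        System.out.println(topK(pairs, 2)); // prints [sky, cat]
    }
}
```

With m distinct words this takes O(m log k) time and never holds more than k + 1 entries in the heap, which is the usual argument for a heap over sorting the whole map.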

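What you need is a TreeMap. This requires you to supply your own Comparator, which should order the entries by the value of each key-value pair. You can then call putAll(hashMap) on the TreeMap (the method is putAll; TreeMap has no addAll) and then look at the TreeMap's headMap() method. Keep in mind that a TreeMap comparator only receives keys, so it has to look the counts up in the original map, and it must break ties (for example by key), otherwise words with equal counts overwrite each other.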
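A minimal sketch of the TreeMap suggestion, under a few assumptions: the class name TreeMapTopWords and the helper topN are invented for this example, the comparator reads the counts from the source map because a TreeMap comparator only ever sees keys, and ties are broken alphabetically so that words with equal counts are not treated as the same key. Instead of headMap(), which needs a boundary key, the sketch simply walks the first n keys.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TreeMapTopWords {

    // Hypothetical helper: returns the n most frequent words, most frequent first.
    static List<String> topN(Map<String, Integer> counts, int n) {
        // Higher count first; alphabetical order breaks ties so that two
        // words with the same count never compare as equal.
        Comparator<String> byCountDesc = (a, b) -> {
            int c = Integer.compare(counts.get(b), counts.get(a));
            return c != 0 ? c : a.compareTo(b);
        };
        TreeMap<String, Integer> sorted = new TreeMap<>(byCountDesc);
        sorted.putAll(counts);

        // Walk the keys in sorted order and stop after n of them.
        List<String> result = new ArrayList<>();
        for (String word : sorted.keySet()) {
            if (result.size() == n) {
                break;
            }
            result.add(word);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> pairs = new HashMap<>();
        pairs.put("cat", 4);
        pairs.put("dog", 3);
        pairs.put("sky", 10);
        pairs.put("blue", 1);
        System.out.println(topN(pairs, 3)); // prints [sky, cat, dog]
    }
}
```

The sorted TreeMap is only valid as long as the counts in the source map do not change, since the comparator depends on them.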