Java: comparing two HashMaps and counting the number of duplicate values


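I have created two HashMaps that hold the strings read from two separate txt files.

Now I am trying to compare the two HashMaps and count the number of duplicate values the files contain. For example, if file1 and file2 both contain the string "hello" twice, my console should print: hello occurs twice

This is my first HashMap: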

    List<String> word_list = new ArrayList<>();

    // Load the words into word_list
    while (INPUT_TEXT1.hasNext()) {
        String input_word = INPUT_TEXT1.next();
        word_list.add(input_word);
    }
    INPUT_TEXT1.close();

    String regexPattern = "[^a-zA-Z]";
    int index = 0;

    // Strip every non-letter character and lower-case each word in place
    for (String s : word_list) {
        word_list.set(index++, s.replaceAll(regexPattern, "").toLowerCase());
    }

    // Find the unique words in the list
    String[] uniqueWords = word_list.stream().distinct()
            .toArray(size -> new String[size]);
    Map<String, Integer> wordsMap = new HashMap<>();
    int frequency = 0;

    // Load the words into the map, each unique word as key and its frequency as value
    for (String uniqueWord : uniqueWords) {
        frequency = Collections.frequency(word_list, uniqueWord);
        System.out.println(uniqueWord + " occurred " + frequency + " times");
        wordsMap.put(uniqueWord, frequency);
    }

    // Sort the words by frequency (the map's values) in descending order
    Stream<Entry<String, Integer>> topWords = wordsMap.entrySet().stream()
            .sorted(Map.Entry.<String, Integer>comparingByValue().reversed()).limit(5);

    // Print the top 5 words to the console
    System.out.println("Top 5 Words:::");
    topWords.forEach(System.out::println);

    System.out.println("\n\n");


This is my second HashMap:

    List<String> wordList = new ArrayList<>();

    // Load the words into wordList
    while (INPUT_TEXT2.hasNext()) {
        String input_word1 = INPUT_TEXT2.next();
        wordList.add(input_word1);
    }
    INPUT_TEXT2.close();

    String regex = "[^a-zA-Z]";
    int index1 = 0;

    // Strip every non-letter character and lower-case each word in place
    for (String s : wordList) {
        wordList.set(index1++, s.replaceAll(regex, "").toLowerCase());
    }

    String[] uniqueWords1 = wordList.stream().distinct()
            .toArray(size -> new String[size]);
    Map<String, Integer> wordsMap1 = new HashMap<>();

    // Load the words into the map, each unique word as key and its frequency as value
    for (String uniqueWord : uniqueWords1) {
        frequency = Collections.frequency(wordList, uniqueWord);
        System.out.println(uniqueWord + " occurred " + frequency + " times");
        wordsMap1.put(uniqueWord, frequency);
    }

    // Sort the words by frequency (the map's values) in descending order
    Stream<Entry<String, Integer>> topWords1 = wordsMap1.entrySet().stream()
            .sorted(Map.Entry.<String, Integer>comparingByValue().reversed()).limit(5);


Here is my original approach to finding the duplicate values:

    boolean val = wordsMap.keySet().containsAll(wordsMap1.keySet());

    for (Entry<String, Integer> str : wordsMap.entrySet()) {
        System.out.println("================= " + str.getKey());

        if (wordsMap1.containsKey(str.getKey())) {
            System.out.println("Map2 Contains Map 1 Key");
        }
    }

    System.out.println("================= " + val);


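Does anyone have other suggestions for achieving this? Thanks, everyone.

Edit

How can I count the number of occurrences of each value?

I think your code works as well. If your goal is to find a better way to implement the final check, you could try the following: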

Set<String> keySetMap1 = new HashSet<String>(wordsMap.keySet());
Set<String> keySet2 = wordsMap1.keySet();
keySetMap1.retainAll(keySet2);
keySetMap1.stream().forEach(x -> System.out.println("Map2 Contains Map 1 Key: "+x));
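
The `retainAll` approach yields the shared keys but not their counts. A minimal sketch of one way to report the counts as well follows. The class name `SharedWordCounts` and the method `sharedCounts` are hypothetical names used only for illustration, and taking the smaller of the two counts is an assumption about what "occurs in both files" should mean.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class SharedWordCounts {

    // For every word present in BOTH maps, keep the smaller of its two counts.
    // Using the minimum is an assumption, not something the question specifies.
    static Map<String, Integer> sharedCounts(Map<String, Integer> first,
                                             Map<String, Integer> second) {
        Map<String, Integer> shared = new TreeMap<>(); // TreeMap gives alphabetical output
        for (Map.Entry<String, Integer> entry : first.entrySet()) {
            Integer other = second.get(entry.getKey());
            if (other != null) {
                shared.put(entry.getKey(), Math.min(entry.getValue(), other));
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        Map<String, Integer> wordsMap = new HashMap<>();
        wordsMap.put("hello", 2);
        wordsMap.put("world", 1);

        Map<String, Integer> wordsMap1 = new HashMap<>();
        wordsMap1.put("hello", 2);
        wordsMap1.put("java", 3);

        // Only "hello" is in both maps
        sharedCounts(wordsMap, wordsMap1).forEach((word, count) ->
                System.out.println(word + " occurs " + count + " times in both files"));
    }
}
```

A single pass over one map with a lookup in the other avoids copying the key set, and the same loop could print both counts side by side if the minimum is not what is wanted.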


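Why doesn't your own code work?

WOW!!! This is about the worst implementation of building a word-to-frequency map I have ever seen: one full scan of the list to get the unique words, then another full scan for each unique word. Yikes! Since you are using Java 8 streams, try using

    stream().collect(Collectors.groupingBy(w -> w, Collectors.counting()))

I focused on the last check, thinking the OP was asking how to improve it, and completely overlooked the first part. I agree with Andreas that the first part should be completely refactored.

How do I count the number of occurrences of each duplicate value?

To answer the question of how to count the occurrences of each individual value, you can refactor your code as Andreas suggested:

    Map<String, Long> wordsMap = word_list.stream().collect(Collectors.groupingBy(w -> w, Collectors.counting()));

This single line computes the word-frequency map. Note that `Collectors.counting()` produces `Long` values, so the map's value type changes from `Integer` to `Long`. I hope that answers all of your questions.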
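
To show the `groupingBy` refactoring in context, here is a minimal self-contained sketch. The class name `WordFrequency` and the helpers `countWords` and `topWords` are hypothetical, the input is a hard-coded list rather than a file, and dropping tokens that contain no letters is an added assumption (the code in the question keeps them as empty strings).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class WordFrequency {

    // Normalizes each token (letters only, lower case) and counts occurrences in one pass.
    static Map<String, Long> countWords(List<String> tokens) {
        return tokens.stream()
                .map(t -> t.replaceAll("[^a-zA-Z]", "").toLowerCase())
                .filter(t -> !t.isEmpty()) // assumption: ignore tokens with no letters
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Returns the n most frequent entries, highest count first.
    static List<Map.Entry<String, Long>> topWords(Map<String, Long> counts, int n) {
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("Hello,", "world!", "hello", "Java", "42");
        Map<String, Long> counts = countWords(tokens);

        System.out.println(counts.get("hello"));          // prints 2
        topWords(counts, 1).forEach(System.out::println); // prints hello=2
    }
}
```

Collecting the top entries into a `List` rather than keeping a `Stream` lets the result be reused, since a stream can only be consumed once.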