Java: printing from Sets and HashMaps

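I am working on a Java application that takes two text files and some other arguments, builds a HashMap for each file, and runs a few comparison methods on them. One method prints all of the unique words the two files share and then computes the Jaccard index of the two files. I would also like this method to print how many times each shared word occurs in each file, and I am wondering what the best way to do that is. I have looked through many other examples here but have not found an answer.

Below is part of the method I am using at the moment. Each HashMap contains only the unique words of one file, mapped to the frequency of each word. The method prints the words the files have in common, but I also want to see how often each of those words is used in each file.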

public double compareMaps(HashMap<String,Integer> hMap1, HashMap<String,Integer> hMap2) {

    // Sorted copies of the unique words of each document
    Set<String> mapSet1 = new TreeSet<>(hMap1.keySet());
    Set<String> mapSet2 = new TreeSet<>(hMap2.keySet());

    // Words that appear in both documents
    Set<String> intersect = new TreeSet<>(mapSet1);
    intersect.retainAll(mapSet2);

    // Words that appear in at least one document
    Set<String> union = new TreeSet<>(mapSet1);
    union.addAll(mapSet2);

    // Typed iterator, so next() returns a String rather than an Object
    Iterator<String> iterator = intersect.iterator();
    System.out.printf("%nUnique words in Document 1: %d%nUnique words in Document 2: %d%n", hMap1.size(), hMap2.size());

    System.out.println("Word\t\tCount1\t\tCount2");
    while (iterator.hasNext()) {
        System.out.println(iterator.next()); // prints only the word, not its counts
    }
    // (the rest of the method is not shown in the post)

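My current output:

Unique words in Document 1: 91
Unique words in Document 2: 122
Word        Count1      Count2
a
also
an
and

What I want:

Unique words in Document 1: 91
Unique words in Document 2: 122
Word        Count1      Count2
a           4           7
also        3           3
an          5           4
and         3           6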

Thanks in advance for your help!

Your counts are in the original maps that were passed in, so you need to fetch them from there:

// Note: iterator must be declared as Iterator<String> for this assignment to compile
while (iterator.hasNext()) {
  String word = iterator.next();
  System.out.println(word + "\t" + Integer.toString(hMap1.get(word)) + "\t" + Integer.toString(hMap2.get(word)));
}
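Combining this lookup with the method from the question, a minimal self-contained sketch might look like the one below. The class name SharedWordReport, the helper formatRow, and the sample data are made up for illustration. The Jaccard index is assumed to be the size of the intersection divided by the size of the union, since the post does not show that part of the method.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical class name, chosen only for this sketch.
class SharedWordReport {

    // Prints every word found in both maps with its count in each map,
    // then returns the Jaccard index of the two key sets
    // (assumed here to be |intersection| / |union|).
    static double compareMaps(Map<String, Integer> hMap1, Map<String, Integer> hMap2) {
        Set<String> intersect = new TreeSet<>(hMap1.keySet());
        intersect.retainAll(hMap2.keySet());

        Set<String> union = new TreeSet<>(hMap1.keySet());
        union.addAll(hMap2.keySet());

        System.out.printf("%nUnique words in Document 1: %d%nUnique words in Document 2: %d%n",
                hMap1.size(), hMap2.size());
        System.out.println("Word\t\tCount1\t\tCount2");
        for (String word : intersect) {
            System.out.println(formatRow(word, hMap1, hMap2));
        }
        // Guard against dividing by zero when both maps are empty.
        return union.isEmpty() ? 0.0 : (double) intersect.size() / union.size();
    }

    // Builds one output row; the counts are looked up in the original maps.
    static String formatRow(String word, Map<String, Integer> m1, Map<String, Integer> m2) {
        return word + "\t\t" + m1.get(word) + "\t\t" + m2.get(word);
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the two documents.
        Map<String, Integer> doc1 = new HashMap<>();
        doc1.put("a", 4);
        doc1.put("also", 3);
        doc1.put("cat", 1);
        Map<String, Integer> doc2 = new HashMap<>();
        doc2.put("a", 7);
        doc2.put("also", 3);
        doc2.put("dog", 2);

        double jaccard = compareMaps(doc1, doc2);
        System.out.println("Jaccard index: " + jaccard); // 2 shared words / 4 distinct words = 0.5
    }
}
```

Iterating the TreeSet with a for-each loop avoids the explicit Iterator altogether, and because the loop only visits words from the intersection, both get(word) calls are guaranteed to find a count.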

To get hold of each word in the file, you can use the following code:

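import java.io.BufferedReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// pattern that splits sentences into words
static final Pattern SPLIT_PATTERN = Pattern.compile("[- .:,]+");

// read the file with a BufferedReader
BufferedReader reader = Files.newBufferedReader(
        Paths.get("<add_here_the_filename>"), StandardCharsets.UTF_8);

// solution one - using groupingBy: first letter -> word length -> list of words
Map<String, Map<Integer, List<String>>> solution_1 =
        reader.lines()
              .flatMap(line -> SPLIT_PATTERN.splitAsStream(line))
              .filter(word -> !word.isEmpty()) // substring(0,1) would throw on an empty token
              .collect(Collectors.groupingBy(word -> word.substring(0,1),
                       Collectors.groupingBy(String::length)));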

Alternatively, you can use toMap() to build a map of how many times each word occurs.
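The answer does not show the toMap() variant, so the following is only one way it might look. The class name WordFrequency, the method countWords, and the sample sentences are assumptions made for this sketch; the split pattern is the one from the code above. Each word is mapped to 1, and Integer::sum merges the values whenever the same word appears again.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class name, chosen only for this sketch.
class WordFrequency {

    // Same delimiter pattern as in the answer above.
    static final Pattern SPLIT_PATTERN = Pattern.compile("[- .:,]+");

    // Counts how often each word occurs in the given lines.
    static Map<String, Integer> countWords(Stream<String> lines) {
        return lines
                .flatMap(SPLIT_PATTERN::splitAsStream)
                .filter(word -> !word.isEmpty())   // a line starting with a delimiter yields an empty token
                .collect(Collectors.toMap(
                        word -> word,              // key: the word itself
                        word -> 1,                 // value: one occurrence
                        Integer::sum,              // merge: add counts for repeated words
                        TreeMap::new));            // sorted map, so output is alphabetical
    }

    public static void main(String[] args) {
        // In-memory lines stand in for reader.lines() from a real file.
        Map<String, Integer> counts = countWords(Stream.of("the cat and the dog", "the end."));
        System.out.println(counts); // {and=1, cat=1, dog=1, end=1, the=3}
    }
}
```

With a file, the same method could be fed reader.lines() from the BufferedReader shown earlier, and the resulting maps passed to the comparison method.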

There is no need for Integer.toString, because string concatenation already converts the Integer to a String.
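As a small illustration of that comment, the sketch below builds the same row once by plain concatenation and once with String.format. The class and method names and the column widths are arbitrary choices for this example and do not come from the thread.

```java
// Hypothetical class name, chosen only for this sketch.
class RowFormatting {

    // Concatenation converts the Integer values to text implicitly.
    static String byConcatenation(String word, Integer count1, Integer count2) {
        return word + "\t" + count1 + "\t" + count2;
    }

    // String.format pads the columns: the word is left-aligned in 12 characters,
    // each count is right-aligned in 8 characters (widths picked arbitrarily).
    static String byFormat(String word, Integer count1, Integer count2) {
        return String.format("%-12s%8d%8d", word, count1, count2);
    }

    public static void main(String[] args) {
        System.out.println(byConcatenation("and", 3, 6));
        System.out.println(byFormat("and", 3, 6));
    }
}
```

Fixed-width formatting can keep the columns aligned when words differ a lot in length, which tab characters alone may not.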