Counting word occurrences with an array stream in Java 8

java arrays java-8


How can I count word frequencies in a string using an array stream? I am using Java 8.

Here is my code:

String sentence = "The cat has black fur and black eyes";
String[] bites = sentence.trim().split("\\s+");

String in = "black cat";

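I want to count how often the words "black" and "cat" occur in the sentence. The word "black" occurs 2 times and the word "cat" occurs 1 time,

so the target output is 3.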

Map<String, Long> count = Arrays.stream(bites)
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

How about:

Map<String, Long> counts = yourStringStream
    .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

This gives you a map from every word to its frequency count.

String sentence = "The cat has black fur and black eyes";
String[] bites = sentence.trim().split("\\s+");

Map<String, Long> counts = Arrays.stream(bites)
       .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
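To get from that map to the single total the question asks for (3), the counts of the requested words can be looked up and summed. This is only a minimal sketch: the class name WordCountTotal and the method totalFor are made up here for illustration, and distinct() is an added assumption so that a word repeated in the input is not counted twice.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class WordCountTotal {

    // Hypothetical helper: builds the word -> count map for the sentence,
    // then sums the counts of the (distinct) words listed in `in`.
    static long totalFor(String sentence, String in) {
        Map<String, Long> counts = Arrays.stream(sentence.trim().split("\\s+"))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        return Arrays.stream(in.trim().split("\\s+"))
                .distinct()
                .mapToLong(word -> counts.getOrDefault(word, 0L)) // 0 for words not in the sentence
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(totalFor("The cat has black fur and black eyes", "black cat")); // prints 3
    }
}
```

Building the full map is mostly worthwhile if the individual counts are needed as well; for the total alone, a plain filter is enough.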

If I understand your question correctly, you can get the expected result with this solution:

String sentence = "The cat has black fur and black eyes";
String in = "black cat";

List<String> bites  = Arrays.asList(sentence.trim().split("\\s+"));
List<String> listIn = Arrays.asList(in.split("\\s"));

long count = bites.stream().filter(listIn::contains).count();
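A self-contained version of this filter-and-count idea might look like the sketch below. It is not the exact code above: the names FilterCount and countMatches are invented for illustration, and a HashSet is swapped in for the List so that each contains lookup does not have to scan all the search words.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class FilterCount {

    // Hypothetical helper: counts how many words of `sentence`
    // are among the whitespace-separated words of `in`.
    static long countMatches(String sentence, String in) {
        // Build the lookup set once, outside the stream pipeline.
        Set<String> wanted = new HashSet<>(Arrays.asList(in.trim().split("\\s+")));

        return Arrays.stream(sentence.trim().split("\\s+"))
                .filter(wanted::contains)
                .count();
    }

    public static void main(String[] args) {
        System.out.println(countMatches("The cat has black fur and black eyes", "black cat")); // prints 3
    }
}
```

For two search words the difference between a List and a Set is negligible; the set only starts to matter when the list of search words grows.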

Another simple approach is to use the computeIfAbsent method introduced in Java 8:

Map<String, LongAdder> wordCount = new LinkedHashMap<>();
for (String word : sentence.split("\\s")) {
    wordCount.computeIfAbsent(word, k -> new LongAdder()).increment();
}

Output:

{The=1, cat=1, has=1, black=2, fur=1, and=1, eyes=1}
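For comparison, the same loop can also be written with Map.merge, another Map method added in Java 8, which stores plain Long values instead of LongAdder objects. This is an alternative sketch rather than the answer's code, and the names MergeWordCount and count are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MergeWordCount {

    // Hypothetical helper: counts words in encounter order.
    static Map<String, Long> count(String sentence) {
        Map<String, Long> wordCount = new LinkedHashMap<>();
        for (String word : sentence.trim().split("\\s+")) {
            // Insert 1 for a new word, otherwise add 1 to the existing count.
            wordCount.merge(word, 1L, Long::sum);
        }
        return wordCount;
    }

    public static void main(String[] args) {
        System.out.println(count("The cat has black fur and black eyes"));
        // prints {The=1, cat=1, has=1, black=2, fur=1, and=1, eyes=1}
    }
}
```

LongAdder is designed for counters updated by many threads; in a single-threaded loop like this one, either variant works.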

Although there are plenty of good examples showing how to do this with streams, you should not forget that Collections already has a method that does this for you:

List<String> list = Arrays.asList(bites);
System.out.println(Collections.frequency(list, "black")); // prints 2
System.out.println(Collections.frequency(list, "cat"));   // prints 1
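To arrive at the combined total of 3 from the question, the per-word results of Collections.frequency can be added up. A rough sketch, assuming a helper class FrequencyTotal and a method totalFor that exist only for this example:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class FrequencyTotal {

    // Hypothetical helper: sums Collections.frequency over the
    // distinct whitespace-separated words of `in`.
    static long totalFor(String sentence, String in) {
        List<String> words = Arrays.asList(sentence.trim().split("\\s+"));

        return Arrays.stream(in.trim().split("\\s+"))
                .distinct()
                .mapToLong(word -> Collections.frequency(words, word))
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(totalFor("The cat has black fur and black eyes", "black cat")); // prints 3
    }
}
```

Each Collections.frequency call scans the whole list, so this does one pass over the sentence per search word. That is fine for a handful of words but slower than a single grouping pass when there are many.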

final Collection<String> ins = Arrays.asList(in.split("\\s+"));

long count = Arrays.stream(bites)
    .filter(ins::contains)
    .mapToLong(bite -> 1L)
    .sum();

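@YCF_L Thanks for the hint. I edited my answer, changing String.contains to List.contains.

Repeating the Arrays.asList(in.split("\\s")) operation for every word is a waste of resources.

This is not the correct answer, because it does not compute the individual counts…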

Pattern.compile("\\b(black|cat)\\b").splitAsStream(sentence).count() - 1

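@Holger: Great! But the sentence has to be padded with a dummy word, because it does not work when the last word is cat.

The collect part can be omitted entirely. You only need to filter and then use count() to get exactly the same result.

Omit it entirely; the final aggregated result does not need any grouping. By the way, you can also store the list (Arrays.asList()) once instead of creating a new throwaway object inside the stream every time.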
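The regex one-liner from the comments undercounts when the sentence ends with a match, because splitAsStream discards trailing empty strings. One way around the dummy-word padding is to count the matches directly with a Matcher. This is a sketch of a swapped-in technique, not the commenter's code; the names RegexCount and countMatches are invented here, and Pattern.quote is an added precaution in case a search word contains regex metacharacters.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class RegexCount {

    // Hypothetical helper: counts whole-word matches of any word in `in`.
    static long countMatches(String sentence, String in) {
        // Builds a pattern of the form \b(?:word1|word2)\b from the search words.
        String regex = Arrays.stream(in.trim().split("\\s+"))
                .map(Pattern::quote)
                .collect(Collectors.joining("|", "\\b(?:", ")\\b"));

        Matcher matcher = Pattern.compile(regex).matcher(sentence);
        long count = 0;
        while (matcher.find()) { // one iteration per match, wherever it occurs
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countMatches("The cat has black fur and black eyes", "black cat")); // prints 3
        System.out.println(countMatches("The black cat", "black cat"));                        // prints 2
    }
}
```

Counting matches works the same way at the start, middle, and end of the sentence, so no padding is needed.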
