Scala map word count?
This question appears in the "Maps and Tuples" chapter:
Write a program that reads words from a file. Use a mutable map to count how often each word appears.
My attempt is:
// source file: https://www.gutenberg.org/cache/epub/35709/pg35709.txt
scala> val words = scala.io.Source.fromFile("pg35709.txt").mkString.split("\\s+")
words: Array[String] = Array(The, Project, Gutenberg, EBook, of, Making, Your, Camera, Pay,, by, Frederick, C., Davis,
This, eBook, is, for, the, use, of, anyone, anywhere, at, no, cost, and, with, almost, no, restrictions, whatsoever.,
You, may, copy, it,, give, it, away, or, re-use, it, under, the, terms, of, the, Project, Gutenberg, License, included,
with, this, eBook, or, online, at, www.gutenberg.net, Title:, Making, Your, Camera, Pay, Author:, Frederick, C., Davis,
Release, Date:, March, 29,, 2011, [EBook, #35709], Language:, English, ***, START, OF, THIS, PROJECT, GUTENBERG, EBOOK,
MAKING, YOUR, CAMERA, PAY, ***, Produced, by, The, Online, Distributed, Proofreading, Team, at, http://www.pgdp.net,
(This, file, was, produced, from, images, generously, made, available, by, The, In...
scala> val wordCount = scala.collection.mutable.HashMap[String, Int]()
wordCount: scala.collection.mutable.HashMap[String,Int] = Map()
scala> for (word <- words) {
| val count = wordCount.getOrElse(word, 0)
| wordCount(word) = count + 1
| }
scala> word
wordCount words
scala> wordCount
res1: scala.collection.mutable.HashMap[String,Int] = Map(arts -> 1, follow -> 3, request, -> 1, Lines. -> 1,
demand -> 7, 1.E.4. -> 1, PRODUCT -> 2, 470 -> 1, Chicago, -> 3, scenic -> 1, J2 -> 1, untrimmed -> 1,
photographs--not -> 1, basis. -> 1, "prints -> 1, instances. -> 1, Onion-Planter -> 1, trick -> 1,
illustrating -> 3, prefer. -> 1, detected -> 1, non-exclusive. -> 1, famous -> 1, Competition -> 2,
expense -> 1, created -> 2, renamed. -> 1, maggot -> 1, calendar-photographs, -> 1, widely-read -> 1,
Publisher, -> 1, producers -> 1, Shapes -> 1, ARTICLES -> 2, yearly -> 2, retoucher -> 1, satisfy -> 2,
agrees: -> 1, Gentleman_, -> 1, intellectual -> 2, hard -> 2, Porch. -> 1, sold.) -> 1, START -> 1, House -> 2,
welcome -> 1, Dealers' -> 1, ... -> 2, pasted -> 1, _Cosmopolitan_ -...
While I know this works, I would like to know whether there is a more Scalaesque way to achieve the same result.
You can do it like this:
val wordCount = words.groupBy(w => w).mapValues(_.size)
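The groupBy method returns a map from each result of the given function to the collection of values for which the function returned that result. In this case that is a Map[String, Array[String]]. mapValues then maps each Array[String] to its size.
If by a Scalaesque way of achieving the same result you mean one that still uses a mutable map, here is a version: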
scala> val data = Array("The", "Project", "Gutenberg", "EBook", "of", "Making", "Your", "The")
data: Array[String] = Array(The, Project, Gutenberg, EBook, of, Making, Your, The)
scala> val wordCount = scala.collection.mutable.HashMap[String, Int]().withDefaultValue(0)
wordCount: scala.collection.mutable.Map[String,Int] = Map()
scala> data.foreach(word => wordCount(word) += 1 )
scala> wordCount
res6: scala.collection.mutable.Map[String,Int] = Map(Making -> 1, of -> 1, Your -> 1, Project -> 1, Gutenberg -> 1, EBook -> 1, The -> 2)
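To make the two steps of the groupBy answer concrete, here is a minimal sketch that exposes the intermediate grouping before counting. The object and method names (GroupByWordCount, grouped, countWords) are hypothetical and chosen only for illustration. It uses map with a pattern match instead of mapValues because, as far as I know, mapValues on a Map is deprecated in Scala 2.13 and returns a lazy view there, so this form should behave the same across versions.

```scala
object GroupByWordCount {
  // Step 1: group equal words together, giving a Map from each word
  // to the sequence of its occurrences.
  def grouped(words: Seq[String]): Map[String, Seq[String]] =
    words.groupBy(identity)

  // Step 2: replace each group of occurrences by its size.
  def countWords(words: Seq[String]): Map[String, Int] =
    grouped(words).map { case (word, occurrences) => word -> occurrences.size }
}
```

For example, GroupByWordCount.countWords(Seq("The", "Project", "The")) contains The -> 2 and Project -> 1. Note that the result is an immutable Map, so this does not satisfy the exercise's requirement to use a mutable map.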
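The author wants this done with a mutable map in this chapter, so here is my mutable solution (written the Scala way, it should be less verbose).
This also works. I only chose to write out the lambda because I found it clearer when I was starting out with Scala.
Without an in.next() at the end of the while loop, this is the correct answer:
import java.util.Scanner
val wordsMap = new scala.collection.mutable.HashMap[String, Int]
val in = new Scanner(new java.io.File("/home/artemvlasenko/Desktop/myfile.txt"))
while (in.hasNext) {
  // Read exactly one token per iteration. A second in.next() at the
  // end of the loop would silently skip every other word.
  val word = in.next()
  val count = wordsMap.getOrElse(word, 0)
  wordsMap(word) = count + 1
}
in.close()
println(wordsMap.mkString(", "))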
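The question's attempt never closes the file it opens with scala.io.Source. As a rough sketch under that observation, the mutable-map approach can be wrapped so that the counting logic is separate from file handling and the source is closed in a finally block. The names MutableWordCount, count, and countFile are hypothetical, and splitting on whitespace is carried over from the question rather than being a recommendation.

```scala
import scala.collection.mutable
import scala.io.Source

object MutableWordCount {
  // Counts whitespace-separated tokens from any iterator of lines,
  // using a mutable map as the exercise asks.
  def count(lines: Iterator[String]): mutable.Map[String, Int] = {
    val counts = mutable.HashMap.empty[String, Int]
    for {
      line <- lines
      word <- line.split("\\s+") if word.nonEmpty // skip the empty token from leading whitespace
    } {
      counts(word) = counts.getOrElse(word, 0) + 1
    }
    counts
  }

  // Reads the file at `path` (a caller-supplied, hypothetical path)
  // and closes it even if counting throws.
  def countFile(path: String): mutable.Map[String, Int] = {
    val source = Source.fromFile(path)
    try count(source.getLines()) finally source.close()
  }
}
```

Calling MutableWordCount.countFile("pg35709.txt") would then give the same kind of map as the attempt in the question, assuming that file is in the working directory.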