Apache Spark: Getting the average per key in Spark


How do I calculate the average per key in Spark?

We can calculate the average per key in Spark with either combineByKey or foldByKey.

Using foldByKey, with this input data:

employee,department,salary
e1,d1,100
e2,d1,500
e5,d2,200
e6,d1,300
e7,d3,200
e7,d3,500
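The trailing 1 in each value is the count. The initial value passed to foldByKey must have the same type as the values being folded.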

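// Assumes data is an RDD[String] of the CSV lines above; the header row is skipped.
val rows = data.filter(!_.startsWith("employee")).map(_.split(','))
// Key by department; the value is (salary, 1), where 1 is the count.
val depSalary = rows.map(x => (x(1), (x(2).toInt, 1)))

// Initial value (sum, count) = (0, 0); its type matches the (Int, Int) values.
val dummy = (0, 0)
val depSalarySumCount = depSalary.foldByKey(dummy)((acc, v) => (acc._1 + v._1, acc._2 + v._2))

// Convert to Double before dividing so the average is not truncated.
val result = depSalarySumCount.map { case (dep, (sum, count)) => (dep, sum.toDouble / count) }
result.collect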
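
To sanity-check what the per-key fold computes, the same arithmetic can be reproduced on plain Scala collections, without Spark. This is only an illustrative sketch: the object name `FoldByKeyByHand` is hypothetical, and `groupBy` plus `foldLeft` stand in for Spark's shuffle and per-key fold.

```scala
object FoldByKeyByHand {
  // Same rows as the input data, without the header line.
  val lines: Seq[String] = Seq(
    "e1,d1,100", "e2,d1,500", "e5,d2,200",
    "e6,d1,300", "e7,d3,200", "e7,d3,500")

  // (department, (salary, 1)) pairs; the trailing 1 is the count.
  val pairs: Seq[(String, (Int, Int))] =
    lines.map(_.split(',')).map(x => (x(1), (x(2).toInt, 1)))

  // Fold each department's values, starting from the zero value (0, 0).
  val sumCount: Map[String, (Int, Int)] =
    pairs.groupBy(_._1).map { case (dep, kvs) =>
      val folded = kvs.map(_._2).foldLeft((0, 0)) { (acc, v) =>
        (acc._1 + v._1, acc._2 + v._2)
      }
      (dep, folded)
    }

  // Divide sum by count, as a Double, to get the average per department.
  val averages: Map[String, Double] =
    sumCount.map { case (dep, (sum, count)) => (dep, sum.toDouble / count) }

  def main(args: Array[String]): Unit =
    averages.toSeq.sortBy(_._1).foreach(println)
}
```

For the sample data, d1 accumulates (900, 3), d2 (200, 1) and d3 (700, 2), giving averages of 300.0, 200.0 and 350.0.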
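
The answer names combineByKey as an alternative but only shows foldByKey. A minimal sketch of the combineByKey variant follows, assuming the same CSV layout with the header already removed. The object and method names (`AverageByKeyCombine`, `averageByDept`) are hypothetical, and the local-mode SparkSession is there only to make the example self-contained.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object AverageByKeyCombine {
  // Expects header-free CSV lines of the form "employee,department,salary".
  def averageByDept(data: RDD[String]): RDD[(String, Double)] = {
    // Key by department; here the value is just the salary, with no count attached.
    val depSalary = data.map(_.split(',')).map(x => (x(1), x(2).toInt))

    val sumCount = depSalary.combineByKey(
      // createCombiner: first salary seen for a key in a partition
      (salary: Int) => (salary, 1),
      // mergeValue: fold another salary into the running (sum, count)
      (acc: (Int, Int), salary: Int) => (acc._1 + salary, acc._2 + 1),
      // mergeCombiners: merge (sum, count) pairs from different partitions
      (a: (Int, Int), b: (Int, Int)) => (a._1 + b._1, a._2 + b._2))

    sumCount.mapValues { case (sum, count) => sum.toDouble / count }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("AverageByKeyCombine")
      .master("local[*]") // assumption: local mode, for illustration only
      .getOrCreate()

    val data = spark.sparkContext.parallelize(Seq(
      "e1,d1,100", "e2,d1,500", "e5,d2,200",
      "e6,d1,300", "e7,d3,200", "e7,d3,500"))

    averageByDept(data).collect().sortBy(_._1).foreach(println)
    spark.stop()
  }
}
```

Unlike foldByKey, combineByKey lets the accumulator type (sum, count) differ from the value type (salary), so the count does not need to be attached to every record up front.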