Scala: how to compute a rolling covariance matrix from a Spark DataFrame
I have a Spark 2.2.0 DataFrame of currency prices, to which I add returns:
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._
val spark = SparkSession.builder.getOrCreate()
val prices = spark.read.json("prices.json")
// make a window function and convert prices to returns
val window = Window.partitionBy("currency").orderBy("time")
val lagPrice = lag(col("close"), 1).over(window)
val percentReturn = col("close") / col("lastClose") - 1d
val logReturn = log(col("close") / col("lastClose"))
val returns = prices.withColumn("lastClose", lagPrice)
  .withColumn("return", percentReturn)
  .withColumn("logReturn", logReturn)
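Now I want to use a window function to compute a rolling covariance matrix across all currencies (similar to a moving average), but I can't find any documentation or examples.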
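One possible approach, offered as a minimal sketch rather than a definitive answer: pivot the returns into a wide layout with one column per currency, then apply the built-in `covar_samp` aggregate over a row-based window for every pair of currencies. The helper name `rollingCovariance`, the `windowSize` parameter, and the `cov_<a>_<b>` output column names are hypothetical, and the sketch assumes the `time`, `currency` and `logReturn` columns from the snippet above, with one row per currency per time step.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

// Hypothetical helper: rolling sample covariance for every currency pair.
// Assumes `returns` has the columns "time", "currency" and "logReturn".
// `windowSize` is the number of rows (time steps) in each rolling window.
def rollingCovariance(returns: DataFrame, windowSize: Int): DataFrame = {
  // Collect the distinct currency names so the pivot has a fixed column order.
  val currencies: Seq[String] = returns.select("currency").distinct()
    .collect().map(_.getString(0)).sorted.toSeq

  // Wide layout: one row per time, one logReturn column per currency.
  val wide = returns.groupBy("time")
    .pivot("currency", currencies)
    .agg(first("logReturn"))

  // Frame covering the current row and the (windowSize - 1) rows before it.
  // There is no partitionBy, so Spark moves all rows into a single partition;
  // this is an assumption that only suits data that fits on one executor.
  val rolling = Window.orderBy("time").rowsBetween(-(windowSize - 1), 0)

  // One column per unordered pair (a, b), including a == b (the variance).
  val covColumns = for {
    (a, i) <- currencies.zipWithIndex
    b <- currencies.drop(i)
  } yield covar_samp(col(a), col(b)).over(rolling).as(s"cov_${a}_$b")

  wide.select(col("time") +: covColumns: _*)
}
```

Calling `rollingCovariance(returns, 30)` would give one row per time step, whose `cov_*` columns hold the upper triangle of the symmetric covariance matrix for the trailing 30 rows. Rows near the start of the series, where the window holds fewer than two paired non-null returns, do not produce a meaningful value, so they may need to be filtered out.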