Scala Spark inner join and get min()


I cannot get the join right and produce the result columns; I need to take the min() of a column after the join:

SELECT
    t.ad,
    t.DId,
    t.BY,
    t.BM,
    t.cid,
    MIN(p.PS) AS PS
FROM Tempity t
INNER JOIN ples p
    ON t.cid = p.cid
    AND p.PType = t.TeO
    AND p.pto = 'cccc'
    AND p.cid = 2
GROUP BY t.ad, t.DId, t.BY, t.BM, t.cid;
I am converting the above SQL query as:

val RS = Tempity.join(DF_PLES,
    Tempity("cid") <=> DF_PLES("cid") &&
    DF_PLES("clientid") <=> 2 &&
    Tempity("TO") <=> DF_PLES("PType") &&
    DF_PLES("pto") <=> "cccc",
  "inner")
  .select("aid", "DId", "BM", "BY", "TO", "cid")
  .groupBy("aid", "DId", "BM", "BY")
  .select("aid", "DId", "BM", "BY", "TO", "cid")
  .show
Use Tempity("cid") instead of the bare column name cid, because cid is ambiguous after the join:
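The ambiguity can be seen in a minimal sketch with made-up data (toy column names v1/v2 and a local session are assumptions, not from the question): when both sides of a join carry a column named cid, a bare reference to cid fails, while qualifying it with the source DataFrame works.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.master("local[1]").appName("ambiguous-col-demo").getOrCreate()
import spark.implicits._

// "cid" exists in BOTH DataFrames, mirroring Tempity and ples in the question.
val left  = Seq((1, "a"), (2, "b")).toDF("cid", "v1")
val right = Seq((1, 10), (2, 20)).toDF("cid", "v2")

val joined = left.join(right, left("cid") === right("cid"), "inner")

// joined.select("cid")  // would fail: Reference 'cid' is ambiguous
val ok = joined.select(left("cid"), $"v2")  // qualify with the source DataFrame

ok.show()
spark.stop()
```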

import org.apache.spark.sql.functions._ // for min()

val RS = Tempity.join(DF_PLES,
          Tempity("cid") <=> DF_PLES("cid") &&
          DF_PLES("clientid") <=> 2 &&
          Tempity("TO") <=> DF_PLES("PType") &&
          DF_PLES("pto") <=> "cccc",
        "inner"
      )
    .groupBy(Tempity("aid"), Tempity("DId"), Tempity("BM"), Tempity("BY"), Tempity("cid"))
    .agg(min(DF_PLES("PS")))

RS.show()

I get the following when running it:

:49: error: value select is not a member of org.apache.spark.sql.RelationalGroupedDataset

val RS = Tempity.join(DF_PLES, Tempity("cid") <=> DF_PLES("cid") && DF_PLES("clientid") <=> 2 && Tempity("TO") <=> DF_PLES("PType") && DF_PLES("pto") <=> "cccc", "inner").groupBy("aid", "DId", "BM", "BY", Tempity("cid")).agg(min(DF_PLES("PS"))) raises the error.
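That error message is the key: groupBy does not return a DataFrame but a RelationalGroupedDataset, which only exposes aggregation methods (agg, min, count, ...); you must aggregate first to get a DataFrame back, and only then can you select. A minimal sketch with toy data (column names k/v and the local session are assumptions for illustration):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.min

val spark = SparkSession.builder.master("local[1]").appName("groupby-demo").getOrCreate()
import spark.implicits._

val df = Seq(("x", 3), ("x", 1), ("y", 2)).toDF("k", "v")

// groupBy returns a RelationalGroupedDataset: it has agg/min/count,
// but no .select -- calling .select here is exactly the compile error above.
val grouped = df.groupBy("k")
val result  = grouped.agg(min("v").as("minV"))  // aggregation yields a DataFrame again

result.show()
spark.stop()
```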
I want to perform the operation on the DataFrames directly, not with sql("") on temp views created from the tables!! If possible, please edit the answer to add the DataFrame code.
Alternatively, register the DataFrames as temporary views and run the same SQL:

import org.apache.spark.sql.SparkSession

val spark: SparkSession = SparkSession.builder.master("local").getOrCreate()

//create tables from DataFrames
Tempity.createOrReplaceTempView("Tempity")
DF_PLES.createOrReplaceTempView("ples")

import spark.sql

//Now run the same SQL 

sql("""
    SELECT t.ad, t.DId, t.BY, t.BM, t.cid, MIN(p.PS) AS PS
      FROM Tempity t
    INNER JOIN ples p
      ON t.cid = p.cid AND p.PType = t.TeO AND p.pto = 'cccc' AND p.cid = 2
    GROUP BY t.ad, t.DId, t.BY, t.BM, t.cid
    """)
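Putting the pieces together, here is a self-contained DataFrame-only sketch of the whole query, as the asker requested. The schemas and rows are invented stand-ins for Tempity and ples (the question's own column names are inconsistent: ad/aid, BY/BYear, TeO/TO, cid/clientid; one consistent set is assumed here), so treat this as a shape to adapt, not the exact solution.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.min

val spark = SparkSession.builder.master("local[1]").appName("join-min-demo").getOrCreate()
import spark.implicits._

// Toy stand-ins for the question's Tempity and ples DataFrames.
val Tempity = Seq((1, 10, 1990, 5, 2, "cccc"), (2, 11, 1991, 6, 2, "cccc"))
  .toDF("ad", "DId", "BY", "BM", "cid", "TeO")
val DF_PLES = Seq((2, "cccc", "cccc", 7.5), (2, "cccc", "cccc", 3.0))
  .toDF("cid", "PType", "pto", "PS")

val RS = Tempity.join(DF_PLES,
    Tempity("cid") <=> DF_PLES("cid") &&
    DF_PLES("PType") <=> Tempity("TeO") &&
    DF_PLES("pto") <=> "cccc" &&
    DF_PLES("cid") <=> 2,
  "inner")
  // Qualify every grouping column with its source DataFrame to avoid ambiguity,
  // then aggregate -- only after agg() is the result a DataFrame again.
  .groupBy(Tempity("ad"), Tempity("DId"), Tempity("BY"), Tempity("BM"), Tempity("cid"))
  .agg(min(DF_PLES("PS")).as("PS"))

RS.show()
spark.stop()
```

Note that <=> is Spark's null-safe equality; plain === would also work here as long as the join keys are never null.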