Scala: filter a DataFrame by column values from another DataFrame
We have two DataFrames and need to filter the data in one of them using the column values of the other.
df1
-------------------------------
name paid_amount date_paid
-------------------------------
aaa 10 2017-10-10
aba 10 2017-01-10
aac 10 2017-10-10
daa 10 2017-16-10
df2
-----------------------------
start_date end_date
-----------------------------
2017-01-01 2018-01-01
------------------------------
We need to create a third DataFrame by checking whether the
date_paid field in df1 falls between df2(start_date) and df2(end_date):
df1.where($date_paid).isin(df2.start_date && df2.end_date)
It should be:
df1.crossJoin(df2).where($"date_paid".between($"start_date", $"end_date"))
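For reference, here is a minimal self-contained sketch of the approach above, using the sample rows from the question. The object name, app name, local master setting, and the `df3`/`df3Semi` names are assumptions made for illustration. The date columns are kept as strings, so `between` compares them lexicographically; that only matches date order because the values are in `yyyy-MM-dd` form. If you cast to a real date type instead, a value such as `2017-16-10` is not a valid date and will be handled differently.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical driver object, for illustration only.
object FilterByDateRange {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("filter-by-date-range") // assumed name
      .master("local[*]")              // assumed local run
      .getOrCreate()
    import spark.implicits._

    // Sample data copied from the question; dates are plain strings here.
    val df1 = Seq(
      ("aaa", 10, "2017-10-10"),
      ("aba", 10, "2017-01-10"),
      ("aac", 10, "2017-10-10"),
      ("daa", 10, "2017-16-10")
    ).toDF("name", "paid_amount", "date_paid")

    val df2 = Seq(("2017-01-01", "2018-01-01")).toDF("start_date", "end_date")

    // Pair every df1 row with the single df2 row, keep rows inside the range,
    // then drop the df2 columns so the result has df1's schema.
    val df3 = df1
      .crossJoin(df2)
      .where($"date_paid".between($"start_date", $"end_date"))
      .select("name", "paid_amount", "date_paid")

    // Alternative: a left semi join with the same condition returns only
    // df1's columns and does not duplicate rows if df2 holds several ranges.
    val df3Semi = df1.join(
      df2,
      $"date_paid".between($"start_date", $"end_date"),
      "left_semi"
    )

    df3.show()
    df3Semi.show()
    spark.stop()
  }
}
```

The cross join is cheap here because df2 has a single row. If df2 can contain many rows, the semi-join variant is probably the safer choice, since overlapping ranges would otherwise produce repeated df1 rows.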