Scala Spark 2.0 - splitting a DataFrame into sub-DataFrames without using createOrReplaceTempView
Tags: scala, apache-spark, apache-spark-sql

I have a DF as shown below, and I need to split it into sub-DataFrames without using createOrReplaceTempView and running subqueries in SQL, because the sub-DataFrames will be used to perform multiple left joins against a main table.

I tried listvaluesDF.createOrReplaceTempView("listvaluesDF") followed by var dfYear = spark.sql("select key, value from listvaluesDF where internal = 'year'"), but I was told that the views this creates may conflict when performing the large left joins.
+--------------------+--------------------+----------------+
| key| value| internal|
+--------------------+--------------------+----------------+
| accessories| D31 - Accessories| wmt0Department|
| annualEvent| Annual Event| wmt0Event|
| apparel| Apparel| wmt0SBU|
| automotive| D10 - Automotive| wmt0Department|
| baking| baking|wmt0DeptCategory|
| baking| Baking| seasonType|
| baking| Baking|wmt0DeptCategory|
| bathandShower|D20 - Bath and Sh...| wmt0Department|
| bedding| D22 - Bedding| wmt0Department|
| betty| Betty|wmt0DeptCategory|
| boxedCards| Boxed Cards|wmt0DeptCategory|
| boyswear| Boyswear|wmt0DeptCategory|
| boyswear| D24 - Boyswear| wmt0Department|
| buildaBasket| Build a Basket|wmt0DeptCategory|
| camerasAndSupplies|D06 - Cameras & S...| wmt0Department|
| cardsAndGifts| Cards & Gifts| seasonType|
| cardsAndGifts| Cards & Gifts|wmt0DeptCategory|
| carving| Carving|wmt0DeptCategory|
| celebration| D67 - Celebration| wmt0Department|
| chineseNewYear| Chinese New Year| seasonType|
| christmas| Christmas| seasonType|
| cookandDine| D14 - Cook and Dine| wmt0Department|
| costumes| Costumes|wmt0DeptCategory|
| crafts| D44 - Crafts| wmt0Department|
| decor| Décor|wmt0DeptCategory|
| decortiveGarland|Decorative Wreath...|wmt0DeptCategory|
| dotcomOnly| Dotcom Only| wmt0DotcomOnly|
| dressUp| Dress Up|wmt0DeptCategory|
| easter| Easter| seasonType|
| electronics| D72 - Electronics| wmt0Department|
| entertainment| Entertainment| wmt0SBU|
| fall| S3 - Fall| seasonType|
| fallGMPad| Fall GM Pad| wmt0Event|
| fallModular| Fall Modular| seasonType|
| familySocks| D27 - Family Socks| wmt0Department|
| familySocks| Family Socks|wmt0DeptCategory|
| fathersDay| Father's Day| seasonType|
| feature| Feature| wmt0Event|
| featureQ1| Feature Q1| seasonType|
| featureQ2| Feature Q2| seasonType|
| featureQ3| Feature Q3| seasonType|
| featureQ4| Feature Q4| seasonType|
| featureSeason1| Feature Season 1| seasonType|
| featureSeason2| Feature Season 2| seasonType|
| featureSeason3| Feature Season 3| seasonType|
| featureSeason4| Feature Season 4| seasonType|
| firstHalf| H1 – First Half| seasonType|
Comments:

- Have you tried dfYear = listvaluesDf.select('key, 'value).where('internal.equals("year")); dfYear.show?
- Let me try it. Meanwhile, the problem above looks like a product issue: issues.apache.org/jira/browse/SPARK-25150
- scala> val dftempyear = listvaluesDf.select('key, 'value).where('internal.equals("year")) fails with: :25: error: overloaded method value where with alternatives: (conditionExpr: String)org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] <and> (condition: org.apache.spark.sql.Column)org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] cannot be applied to (Boolean). The .equals method returns a Boolean, not a Column, so where rejects it.
- This worked: val dftempyear = listValuesDF.filter($"internal" === "year").select('key, 'value). I will try the main left joins and let you know. @chlebek
- Good luck; Spark links the new sub-DataFrame back to the original lineage.
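The working pattern from the comments can be sketched end to end as follows. This is a minimal, hedged example: it assumes a local SparkSession, rebuilds a small sample of the listValuesDF shown above, and uses a hypothetical main table factDF (not from the original post) to illustrate the left join.

```scala
import org.apache.spark.sql.SparkSession

object SplitDfExample extends App {
  // Local session for illustration only.
  val spark = SparkSession.builder()
    .appName("split-df-without-temp-view")
    .master("local[*]")
    .getOrCreate()
  import spark.implicits._

  // A small sample of the (key, value, internal) DataFrame shown above.
  val listValuesDF = Seq(
    ("fall",        "S3 - Fall",         "seasonType"),
    ("easter",      "Easter",            "seasonType"),
    ("apparel",     "Apparel",           "wmt0SBU"),
    ("electronics", "D72 - Electronics", "wmt0Department")
  ).toDF("key", "value", "internal")

  // Split into sub-DataFrames with filter + select -- no temp view or SQL needed.
  // $"internal" === "year" yields a Column, which is what where/filter expects.
  val dfSeason = listValuesDF.filter($"internal" === "seasonType").select("key", "value")
  val dfSBU    = listValuesDF.filter($"internal" === "wmt0SBU").select("key", "value")

  // Hypothetical main table to demonstrate the left join the asker plans to run.
  val factDF = Seq(("fall", 100), ("unknown", 50)).toDF("seasonKey", "qty")

  // Left join keeps all rows of factDF; unmatched keys get null value columns.
  val joined = factDF.join(dfSeason, factDF("seasonKey") === dfSeason("key"), "left")
  joined.show()

  spark.stop()
}
```

Because filter/select produce new Datasets that share the parent's lineage, each sub-DataFrame stays a lazy view of listValuesDF; if the same sub-DataFrame feeds several joins, calling .cache() on it can avoid recomputing the filter.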