Left outer join of unequal records of two DataFrames in Spark Scala [scala, apache-spark, apache-spark-sql, spark-dataframe]


I have two DataFrames. The first DataFrame:

+-------------+-------------------------+--------------+--------+----------+-----------------------+---------------------+-------------------+-----------------------+--------------------------+--------------------------+-----------+
|DataPartition|TimeStamp                |OrganizationID|SourceID|_auditorId|sr:AuditorEnumerationId|sr:AuditorOpinionCode|sr:AuditorOpinionId|sr:IsPlayingAuditorRole|sr:IsPlayingCSRAuditorRole|sr:IsPlayingTaxAdvisorRole|FFAction|!||
+-------------+-------------------------+--------------+--------+----------+-----------------------+---------------------+-------------------+-----------------------+--------------------------+--------------------------+-----------+
|Japan        |2018-05-03T09:52:48+00:00|4295876589    |195     |null      |null                   |null                 |null               |null                   |null                      |null                      |O|!|       |
|Japan        |2018-05-03T08:10:19+00:00|4295876589    |196     |null      |null                   |null                 |null               |null                   |null                      |null                      |D|!|       |
|Japan        |2018-05-03T09:52:48+00:00|4295876589    |194     |null      |null                   |null                 |null               |null                   |null                      |null                      |O|!|       |
+-------------+-------------------------+--------------+--------+----------+-----------------------+---------------------+-------------------+-----------------------+--------------------------+--------------------------+-----------+
The second DataFrame is:

    DataPartition   TimeStamp   OrganizationID  SourceID    _auditorId  sr:AuditorEnumerationId sr:AuditorOpinionCode   sr:AuditorOpinionId sr:IsPlayingAuditorRole sr:IsPlayingCSRAuditorRole  sr:IsPlayingTaxAdvisorRole  FFAction|!|
Japan   2018-05-03T08:06:06+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T08:06:06+00:00   4295876589  195 16157   1002485247  UWE 3010547 true    false   false   O|!|
Japan   2018-05-03T09:48:33+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T09:48:33+00:00   4295876589  195 16157   1002485247  UWE 3010547 true    false   false   O|!|
Japan   2018-05-03T07:27:10+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T07:27:10+00:00   4295876589  195 5937    3026578 NOP 3010543 true    false   true    O|!|
Japan   2018-05-03T07:27:10+00:00   4295876589  196 3252    3024053 ONC 3020538 true    false   true    O|!|
Japan   2018-05-03T07:35:42+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T07:35:42+00:00   4295876589  195 5937    3026578 NOP 3010543 true    false   true    O|!|
Japan   2018-05-03T07:35:42+00:00   4295876589  196 3252    3024053 ONC 3020538 true    false   true    O|!|
Japan   2018-05-03T09:34:46+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T09:34:46+00:00   4295876589  195 16157   1002485247  UWE 3010547 true    false   false   O|!|
Japan   2018-05-03T08:10:19+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T08:10:19+00:00   4295876589  195 16157   1002485247  UWE 3010547 true    false   false   O|!|
Japan   2018-05-03T07:28:16+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T07:28:16+00:00   4295876589  195 5937    3026578 NOP 3010543 true    false   true    O|!|
Japan   2018-05-03T07:28:16+00:00   4295876589  196 3252    3024053 ONC 3020538 true    false   true    O|!|
Japan   2018-05-02T09:05:04+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-02T09:05:04+00:00   4295876589  195 5937    3026578 NOP 3010543 true    false   true    O|!|
Japan   2018-05-02T09:05:04+00:00   4295876589  196 3252    3024053 ONC 3020538 true    false   true    O|!|
Japan   2018-05-03T07:31:28+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T07:31:28+00:00   4295876589  195 5937    3026578 NOP 3010543 true    false   true    O|!|
Japan   2018-05-03T07:31:28+00:00   4295876589  196 3252    3024053 ONC 3020538 true    false   true    O|!|
Japan   2018-05-03T07:22:58+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T07:22:58+00:00   4295876589  195 5937    3026578 NOP 3010543 true    false   true    O|!|
Japan   2018-05-03T07:22:58+00:00   4295876589  196 3252    3024053 ONC 3020538 true    false   true    O|!|
Japan   2018-05-03T09:45:22+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T09:45:22+00:00   4295876589  195 16157   1002485247  UWE 3010547 true    false   false   O|!|
Japan   2018-05-03T07:11:26+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T07:11:26+00:00   4295876589  195 5937    3026578 NOP 3010543 true    false   true    O|!|
Japan   2018-05-03T07:11:26+00:00   4295876589  196 3252    3024053 ONC 3020538 true    false   true    O|!|
Japan   2018-05-03T07:00:45+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T07:00:45+00:00   4295876589  195 5937    3026578 NOP 3010543 true    false   true    O|!|
Japan   2018-05-03T07:00:45+00:00   4295876589  196 3252    3024053 ONC 3020538 true    false   true    O|!|
Japan   2018-05-03T07:36:47+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T07:36:47+00:00   4295876589  195 5937    3026578 NOP 3010543 true    false   true    O|!|
Japan   2018-05-03T07:36:47+00:00   4295876589  196 3252    3024053 ONC 3020538 true    false   true    O|!|
Japan   2018-05-03T07:01:52+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T07:01:52+00:00   4295876589  195 5937    3026578 NOP 3010543 true    false   true    O|!|
Japan   2018-05-03T07:01:52+00:00   4295876589  196 3252    3024053 ONC 3020538 true    false   true    O|!|
Japan   2018-05-02T10:28:22+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-02T10:28:22+00:00   4295876589  195 5937    3026578 NOP 3010543 true    false   true    O|!|
Japan   2018-05-02T10:28:22+00:00   4295876589  196 3252    3024053 ONC 3020538 true    false   true    O|!|
Japan   2018-05-03T09:52:48+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T09:52:48+00:00   4295876589  195 16157   1002485247  UWE 3010547 true    false   false   O|!|
Japan   2018-05-03T09:41:09+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T09:41:09+00:00   4295876589  195 16157   1002485247  UWE 3010547 true    false   false   O|!|
Japan   2018-05-02T10:30:32+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-02T10:30:32+00:00   4295876589  195 5937    3026578 NOP 3010543 true    false   true    O|!|
Japan   2018-05-02T10:30:32+00:00   4295876589  196 3252    3024053 ONC 3020538 true    false   true    O|!|
Japan   2018-05-03T06:56:32+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T06:56:32+00:00   4295876589  195 5937    3026578 NOP 3010543 true    false   true    O|!|
Japan   2018-05-03T06:56:32+00:00   4295876589  196 3252    3024053 ONC 3020538 true    false   true    O|!|
Japan   2018-05-03T07:05:04+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T07:05:04+00:00   4295876589  195 5937    3026578 NOP 3010543 true    false   true    O|!|
Japan   2018-05-03T07:05:04+00:00   4295876589  196 3252    3024053 ONC 3020538 true    false   true    O|!|
Japan   2018-05-03T09:38:59+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T09:38:59+00:00   4295876589  195 16157   1002485247  UWE 3010547 true    false   false   O|!|
Japan   2018-05-03T07:08:14+00:00   4295876589  194 2719    3023331 AOP 3010542 true    false   true    O|!|
Japan   2018-05-03T07:08:14+00:00   4295876589  195 5937    3026578 NOP 3010543 true    false   true    O|!|
Japan   2018-05-03T07:08:14+00:00   4295876589  196 3252    3024053 ONC 3020538 true    false   true    O|!|
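Now I want to add the rows of DataFrame one to DataFrame two, but only those rows that differ from every row of DataFrame two in the three columns TimeStamp, OrganizationID and SourceID. So in this case the DataFrame one rows with SourceID 194 and 195 will not be added to DataFrame two, because their TimeStamp | OrganizationID | SourceID values match in both DataFrames.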

Only the one row with SourceID 196 should be added.

Will a left outer join work in this case? When I do that, I get duplicate columns.


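So, in short: records from DataFrame one that match DataFrame two on the three columns should not be added; all other records should be added to the DataFrame.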
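A minimal sketch of where the duplicate columns come from, assuming a local SparkSession and a cut-down schema (the helper name leftOuterOnKeys is hypothetical, and the column FFAction stands in for the real FFAction|!| column). Joining on a Seq of column names keeps the three key columns once, but every non-key column still appears twice, once from each side:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object LeftOuterColumns {
  // Hypothetical helper: left outer join on the three key columns.
  def leftOuterOnKeys(df1: DataFrame, df2: DataFrame): DataFrame =
    df1.join(df2, Seq("TimeStamp", "OrganizationID", "SourceID"), "left_outer")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("left-outer-columns")
      .getOrCreate()
    import spark.implicits._

    // Cut-down schema: the three keys plus one non-key column (assumption).
    val cols = Seq("TimeStamp", "OrganizationID", "SourceID", "FFAction")
    val df1 = Seq(("2018-05-03T08:10:19+00:00", 4295876589L, 196, "D|!|")).toDF(cols: _*)
    val df2 = Seq(("2018-05-03T08:10:19+00:00", 4295876589L, 195, "O|!|")).toDF(cols: _*)

    // Key columns appear once; FFAction appears once per side.
    println(leftOuterOnKeys(df1, df2).columns.mkString(", "))
    // prints: TimeStamp, OrganizationID, SourceID, FFAction, FFAction

    spark.stop()
  }
}
```

A left outer join therefore widens the rows instead of appending them, which is probably not what is wanted when the goal is to add rows.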

You can try a leftanti join, followed by a union with df2:

df1.join(df2, Seq("TimeStamp", "OrganizationID", "SourceID"), "leftanti").union(df2)
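The same idea as a runnable sketch, assuming a local SparkSession and a reduced set of columns and rows taken from the question (the helper name appendMissing is hypothetical, and FFAction stands in for FFAction|!|). One deliberate change from the one-liner: union matches columns by position, and a join on a Seq of column names may move the key columns to the front, so this sketch uses unionByName (available from Spark 2.3, as far as I know) to line the columns up by name instead:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object AntiJoinUnion {
  // Hypothetical helper: keep all of df2 and append the df1 rows
  // that have no match in df2 on the given key columns.
  def appendMissing(df1: DataFrame, df2: DataFrame, keys: Seq[String]): DataFrame = {
    val onlyInDf1 = df1.join(df2, keys, "leftanti") // df1 rows without a key match
    df2.unionByName(onlyInDf1)                      // align columns by name
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("anti-join-union")
      .getOrCreate()
    import spark.implicits._

    // Reduced schema (assumption): only a few of the original columns.
    val cols = Seq("DataPartition", "TimeStamp", "OrganizationID", "SourceID", "_auditorId", "FFAction")

    val df1 = Seq(
      ("Japan", "2018-05-03T09:52:48+00:00", 4295876589L, 195, Option.empty[Int], "O|!|"),
      ("Japan", "2018-05-03T08:10:19+00:00", 4295876589L, 196, Option.empty[Int], "D|!|"),
      ("Japan", "2018-05-03T09:52:48+00:00", 4295876589L, 194, Option.empty[Int], "O|!|")
    ).toDF(cols: _*)

    val df2 = Seq(
      ("Japan", "2018-05-03T09:52:48+00:00", 4295876589L, 194, Option(2719), "O|!|"),
      ("Japan", "2018-05-03T09:52:48+00:00", 4295876589L, 195, Option(16157), "O|!|"),
      ("Japan", "2018-05-03T08:10:19+00:00", 4295876589L, 194, Option(2719), "O|!|"),
      ("Japan", "2018-05-03T08:10:19+00:00", 4295876589L, 195, Option(16157), "O|!|")
    ).toDF(cols: _*)

    val keys = Seq("TimeStamp", "OrganizationID", "SourceID")
    // Four df2 rows plus the single df1 row with SourceID 196.
    appendMissing(df1, df2, keys).show(false)

    spark.stop()
  }
}
```

With this sample data only the SourceID 196 row from df1 survives the anti join, so the result has five rows. If the real data still shows repeated rows, it may be worth checking whether df2 itself already contains them.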

What should your final DataFrame look like?

Si ti realyl os muhc effotr ot tyep teh titel properyl?

No, if I use this, I get duplicate records.