How to diff two tables using Spark SQL?

I need to diff two tables using Spark SQL. I found the following answer for SQL Server:

(SELECT *
 FROM   table1
 EXCEPT
 SELECT *
 FROM   table2)
UNION ALL
(SELECT *
 FROM   table2
 EXCEPT
 SELECT *
 FROM   table1) 

Can someone tell me how to do the same thing in Spark SQL? (I don't care about specific columns; just use *.)

You can do it like this:

scala> val df1=sc.parallelize(Seq((1,2),(3,4))).toDF("a","b")
df1: org.apache.spark.sql.DataFrame = [a: int, b: int]

scala> val df2=sc.parallelize(Seq((1,2),(5,6))).toDF("a","b")
df2: org.apache.spark.sql.DataFrame = [a: int, b: int]

scala> df1.create
createOrReplaceTempView   createTempView

scala> df1.createTempView("table1")

scala> df2.createTempView("table2")

scala> spark.sql("select * from table1 EXCEPT select * from table2").show
+---+---+                                                                       
|  a|  b|
+---+---+
|  3|  4|
+---+---+


scala> spark.sql("(select * from table2 EXCEPT select * from table1) UNION ALL (select * from table1 EXCEPT select * from table2)").show
+---+---+                                                                       
|  a|  b|
+---+---+
|  5|  6|
|  3|  4|
+---+---+
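
The same symmetric difference can also be written with the DataFrame API instead of a SQL string. The following is a minimal sketch, not part of the original answer: the helper name `symmetricDiff` is made up for illustration, and the local `SparkSession` is only there so the snippet runs outside spark-shell (inside spark-shell, `spark` already exists).

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Local session for illustration only; spark-shell already provides `spark`.
val spark = SparkSession.builder().master("local[*]").appName("table-diff").getOrCreate()
import spark.implicits._

val df1 = Seq((1, 2), (3, 4)).toDF("a", "b")
val df2 = Seq((1, 2), (5, 6)).toDF("a", "b")

// Hypothetical helper: rows that appear in exactly one of the two DataFrames.
// except() has EXCEPT DISTINCT semantics; union() keeps duplicates, like UNION ALL.
def symmetricDiff(left: DataFrame, right: DataFrame): DataFrame =
  left.except(right).union(right.except(left))

// The result contains the rows (3, 4) and (5, 6); row order is not guaranteed.
symmetricDiff(df1, df2).show()
```

Because `except` removes duplicate rows, two tables that differ only in how many times a row is repeated would look identical here. If duplicates matter, `exceptAll` (available since Spark 2.4) is probably the closer fit.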

Note: in your case, you will have to create the DataFrames from JDBC calls, then register them as tables and run the query.
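
As a rough sketch of that JDBC step, assuming a MySQL source: the URL, user, password, and the helper names `loadTable`, `registerJdbcTable`, and `diffViews` below are all placeholders invented for illustration, and the matching JDBC driver has to be on the classpath. Only the standard `jdbc` data source options (`url`, `dbtable`, `user`, `password`) are used.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Local session for illustration only; spark-shell already provides `spark`.
val spark = SparkSession.builder().master("local[*]").appName("jdbc-diff").getOrCreate()

// Hypothetical connection settings; replace them with your own.
val jdbcUrl = "jdbc:mysql://localhost:3306/mydb"
val user = "user"
val password = "password"

// Load one database table as a DataFrame through the JDBC data source.
def loadTable(table: String): DataFrame =
  spark.read
    .format("jdbc")
    .option("url", jdbcUrl)
    .option("dbtable", table)
    .option("user", user)
    .option("password", password)
    .load()

// Register a database table under a temp view name so spark.sql can see it.
def registerJdbcTable(table: String, view: String): Unit =
  loadTable(table).createOrReplaceTempView(view)

// Run the EXCEPT / UNION ALL query from the question against two registered views.
def diffViews(v1: String, v2: String): DataFrame =
  spark.sql(
    s"""(SELECT * FROM $v1 EXCEPT SELECT * FROM $v2)
       |UNION ALL
       |(SELECT * FROM $v2 EXCEPT SELECT * FROM $v1)""".stripMargin)
```

Usage would then look like `registerJdbcTable("table1", "table1")`, `registerJdbcTable("table2", "table2")`, followed by `diffViews("table1", "table2").show()`.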

Wouldn't a join on a specific primary key be more efficient? Am I right that EXCEPT compares every row of table A with every row of table B?
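
For comparison, a key-based variant could look like the sketch below. It assumes column `a` is the primary key of both tables, which the thread does not state, and it makes no claim about which approach is faster. The final `explain()` call prints the plan Spark actually chooses for `EXCEPT`, which is one way to check how the rows are matched.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; spark-shell already provides `spark`.
val spark = SparkSession.builder().master("local[*]").appName("key-diff").getOrCreate()
import spark.implicits._

val df1 = Seq((1, 2), (3, 4)).toDF("a", "b")
val df2 = Seq((1, 2), (5, 6)).toDF("a", "b")

// Assumption: column "a" is the primary key of both tables.
// A left anti join keeps the rows of the left side whose key has no match on the right.
val onlyInDf1 = df1.join(df2, Seq("a"), "left_anti")
val onlyInDf2 = df2.join(df1, Seq("a"), "left_anti")

// Contains the rows (3, 4) and (5, 6); row order is not guaranteed.
val keyDiff = onlyInDf1.union(onlyInDf2)
keyDiff.show()

// Inspect how Spark plans EXCEPT for these inputs.
df1.except(df2).explain()
```

Note that this is not a drop-in replacement for `EXCEPT`: rows that share a key but differ in other columns are not reported by the key-only anti join.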