Apache Spark / Cassandra: join tables with a condition on an attribute of the primary RDD (a "where tableA.myValue > tableB.myOtherValue" style query)
Tags: apache-spark, cassandra, spark-cassandra-connector

Is there a way to join two tables while adding a condition on columns across the two tables? For example:
case class TableA(pkA: Int, valueA: Int)
case class TableB(pkB: Int, valueB: Int)
val rddA = sc.cassandraTable[TableA]("ks", "tableA")
rddA.joinWithCassandraTable[TableB]("ks", "tableB").where("tableB.valueB > tableA.valueA")
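Is there a way to send the where("tableB.valueB > tableA.valueA") instruction? ("tableB.valueB" is a clustering column.)

RDD.where() simply passes the predicate through to CQL. CQL is limited to fast, simple OLTP queries and has no joins, so the predicate cannot refer to a column of tableA.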
More complex queries can only be done with SparkSQL. For your case, it might look something like this:
sqlContext.read.format("org.apache.spark.sql.cassandra")
.options(Map( "table" -> "tableA", "keyspace"->"ks"))
.load().registerTempTable("tableA")
sqlContext.read.format("org.apache.spark.sql.cassandra")
.options(Map( "table" -> "tableB", "keyspace"->"ks"))
.load().registerTempTable("tableB")
sqlContext.sql("select * from tableA join tableB on tableB.valueB > tableA.valueA").show
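
To make explicit what the query above computes: a join on an inequality has no key to look up, so logically every row of one table is compared against every row of the other, which can get expensive on large tables. The following is only a minimal sketch in plain Scala collections, not what Spark executes internally. The sample rows are made up, and `NonEquiJoinSketch` and `join` are hypothetical names used for illustration.

```scala
// Plain Scala collections stand in for the two Cassandra tables.
case class TableA(pkA: Int, valueA: Int)
case class TableB(pkB: Int, valueB: Int)

object NonEquiJoinSketch {
  // Pair each row of A with each row of B and keep the pairs
  // that satisfy the condition tableB.valueB > tableA.valueA.
  def join(as: Seq[TableA], bs: Seq[TableB]): Seq[(TableA, TableB)] =
    for {
      a <- as
      b <- bs
      if b.valueB > a.valueA
    } yield (a, b)

  def main(args: Array[String]): Unit = {
    // Made-up sample data, for illustration only.
    val tableA = Seq(TableA(1, 10), TableA(2, 30))
    val tableB = Seq(TableB(1, 20), TableB(2, 40))
    join(tableA, tableB).foreach(println)
  }
}
```

With this sample data, three of the four possible pairs satisfy the condition; only the pair (valueA = 30, valueB = 20) is dropped.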