Apache Spark: checking whether two arrays of Datasets intersect

I want to check whether two small arrays of Datasets intersect, and this is how I do it:

    for (Dataset<Row> dataset1 : arrayOfDatasets1) {
        for (Dataset<Row> dataset2 : arrayOfDatasets2) {
            if (dataset1.intersect(dataset2).count() != 0)
                return true;
        }
    }
    return false;
Given that the Datasets have a schema, is the approach above appropriate? Or should I use:

    dataset1.intersect(dataset2) != sparksession.emptyDataFrame()


to compare the Datasets instead?

I did find that the count-based check works.
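One point relevant to the second variant: in Java, `!=` between two object references compares identity, not contents, so comparing the result of `intersect` with a freshly created empty DataFrame would not tell you whether the intersection has rows. The pairwise check itself can be sketched independently of Spark. The following is only an illustration, not a definitive implementation: `PairwiseIntersection`, `anyPairIntersects` and `setsIntersect` are hypothetical names, and plain `Set<Integer>` values stand in for the Datasets so that the sketch runs without a Spark cluster.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.BiPredicate;

class PairwiseIntersection {

    // Returns true as soon as one (left, right) pair satisfies the predicate,
    // so the remaining pairs are never evaluated.
    static <T> boolean anyPairIntersects(List<T> left, List<T> right,
                                         BiPredicate<T, T> intersects) {
        for (T a : left) {
            for (T b : right) {
                if (intersects.test(a, b)) {
                    return true;
                }
            }
        }
        return false;
    }

    // Stand-in predicate for plain sets (hypothetical, for this demo only).
    static boolean setsIntersect(Set<Integer> a, Set<Integer> b) {
        Set<Integer> common = new HashSet<>(a);
        common.retainAll(b); // keep only the elements present in both sets
        return !common.isEmpty();
    }

    public static void main(String[] args) {
        List<Set<Integer>> first = List.of(Set.of(1, 2), Set.of(3, 4));
        List<Set<Integer>> second = List.of(Set.of(5, 6), Set.of(4, 7));
        // {3, 4} and {4, 7} share the element 4, so this prints "true".
        System.out.println(
            anyPairIntersects(first, second, PairwiseIntersection::setsIntersect));
    }
}
```

With Spark, the predicate could be something like `(a, b) -> !a.intersect(b).isEmpty()`, assuming Spark 2.4 or later where `Dataset.isEmpty()` is available, or `(a, b) -> a.intersect(b).limit(1).count() != 0` on older versions. Either form only needs to establish whether at least one common row exists.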