Apache Spark: why do PySpark and SQL give different results?
I am trying to translate the SQL below into PySpark using two different syntaxes, but the two versions give different outputs, and neither matches the SQL output. I don't understand where the actual difference between these versions lies.
select count(*) from (
select afpo.charg as Batch_Number,
mara1.matkl as Material_Group,
mara1.zzmanu_stg as Mfg_Stage_Code,
mkpf.budat as WCB_261_Posting_Date,
mch1.hsdat as Manufacturing_Date
from
opssup_dev_wrk_sap.src_sap_afpo afpo
inner join opssup_dev_wrk_sap.src_sap_mara mara1 on afpo.matnr=mara1.matnr
inner join opssup_dev_wrk_sap.src_sap_mseg mseg on afpo.aufnr=mseg.aufnr
inner join opssup_dev_wrk_sap.src_sap_mkpf mkpf on mseg.mblnr=mkpf.mblnr
inner join opssup_dev_wrk_sap.src_sap_mara mara on mseg.matnr=mara.matnr
inner join opssup_dev_wrk_sap.src_sap_mch1 mch1 on afpo.charg=mch1.charg
where mara.zzmanu_stg='WCB'
and mseg.bwart='261')
-- returns 2505 rows
Execution plan for the SQL query above:
I converted this SQL to PySpark as follows:
afpo_df = sqlContext.table(sap_source_schema + ".src_sap_afpo").alias('afpo_df')
mara1_df = sqlContext.table(sap_source_schema + ".src_sap_mara").alias('mara1_df')
mseg_df = sqlContext.table(sap_source_schema + ".src_sap_mseg").alias('mseg_df')
mkpf_df = sqlContext.table(sap_source_schema + ".src_sap_mkpf").alias('mkpf_df')
mara_df = sqlContext.table(sap_source_schema + ".src_sap_mara").alias('mara_df')
mch1_df = sqlContext.table(sap_source_schema + ".src_sap_mch1").alias('mch1_df')
temp12_df = afpo_df \
.join(mara1_df,(afpo_df.matnr==mara1_df.matnr)) \
.join(mseg_df,(afpo_df.aufnr==mseg_df.aufnr)) \
.join(mkpf_df,(mseg_df.mblnr==mkpf_df.mblnr)) \
.join(mara_df,(mseg_df.matnr==mara_df.matnr)) \
.join(mch1_df,(afpo_df.charg==mch1_df.charg)) \
.filter("mseg_df.bwart=='261' AND mara_df.zzmanu_stg=='WCB'") \
.select(afpo_df.charg.alias('Batch_Number'),mara1_df.matkl.alias('Material_Group'),mara1_df.zzmanu_stg.alias('Mfg_Stage_Code'), \
mkpf_df.budat.alias('WCB_261_Posting_Date'),mch1_df.hsdat.alias('Manufacturing_Date'))
target_df = temp12_df
print(target_df.count())
Returns about 13L rows (13 lakh, i.e. roughly 1.3 million).
Query plan for the code above:
== Physical Plan ==
*(15) Project [charg#72 AS Batch_Number#322732, matkl#126 AS Material_Group#322733, zzmanu_stg#275 AS Mfg_Stage_Code#322734, budat#511 AS WCB_261_Posting_Date#322735, hsdat#571 AS Manufacturing_Date#322736]
+- *(15) BroadcastNestedLoopJoin BuildRight, Inner
:- *(15) Project [charg#72, matkl#126, zzmanu_stg#275, budat#511, hsdat#571]
: +- *(15) SortMergeJoin [charg#72], [charg#543], Inner
: :- *(11) Sort [charg#72 ASC NULLS FIRST], false, 0
: : +- Exchange hashpartitioning(charg#72, 200)
: : +- *(10) Project [charg#72, matkl#126, zzmanu_stg#275, budat#511]
: : +- *(10) SortMergeJoin [mblnr#313], [mblnr#505], Inner
: : :- *(7) Sort [mblnr#313 ASC NULLS FIRST], false, 0
: : : +- Exchange hashpartitioning(mblnr#313, 200)
: : : +- *(6) Project [charg#72, matkl#126, zzmanu_stg#275, mblnr#313]
: : : +- *(6) SortMergeJoin [aufnr#14, matnr#116], [aufnr#368, matnr#321], Inner
: : : :- *(3) Sort [aufnr#14 ASC NULLS FIRST, matnr#116 ASC NULLS FIRST], false, 0
: : : : +- Exchange hashpartitioning(aufnr#14, matnr#116, 200)
: : : : +- *(2) Project [aufnr#14, charg#72, matnr#116, matkl#126, zzmanu_stg#275]
: : : : +- *(2) BroadcastHashJoin [matnr#33], [matnr#116], Inner, BuildRight, false
: : : : :- *(2) Project [aufnr#14, matnr#33, charg#72]
: : : : : +- *(2) Filter ((isnotnull(matnr#33) && isnotnull(aufnr#14)) && isnotnull(charg#72))
: : : : : +- *(2) FileScan parquet opssup_dev_wrk_sap.src_sap_afpo[aufnr#14,matnr#33,charg#72] Batched: true, Format: Parquet, Location: InMemoryFileIndex[s3://amgen-edl-ois-opssup-shr-bkt/dev/west2/wrk/sap/src_sap_afpo], PartitionFilters: [], PushedFilters: [IsNotNull(matnr), IsNotNull(aufnr), IsNotNull(charg)], ReadSchema: struct<aufnr:string,matnr:string,charg:string>
: : : : +- BroadcastExchange HashedRelationBroadcastMode(ArrayBuffer(input[0, string, true]))
: : : : +- *(1) Project [matnr#116, matkl#126, zzmanu_stg#275]
: : : : +- *(1) Filter isnotnull(matnr#116)
: : : : +- *(1) FileScan parquet opssup_dev_wrk_sap.src_sap_mara[matnr#116,matkl#126,zzmanu_stg#275] Batched: true, Format: Parquet, Location: InMemoryFileIndex[s3://amgen-edl-ois-opssup-shr-bkt/dev/west2/wrk/sap/src_sap_mara], PartitionFilters: [], PushedFilters: [IsNotNull(matnr)], ReadSchema: struct<matnr:string,matkl:string,zzmanu_stg:string>
: : : +- *(5) Sort [aufnr#368 ASC NULLS FIRST, matnr#321 ASC NULLS FIRST], false, 0
: : : +- Exchange hashpartitioning(aufnr#368, matnr#321, 200)
: : : +- *(4) Project [mblnr#313, matnr#321, aufnr#368]
: : : +- *(4) Filter ((((isnotnull(bwart#319) && (bwart#319 = 261)) && isnotnull(matnr#321)) && isnotnull(aufnr#368)) && isnotnull(mblnr#313))
: : : +- *(4) FileScan parquet opssup_dev_wrk_sap.src_sap_mseg[mblnr#313,bwart#319,matnr#321,aufnr#368] Batched: true, Format: Parquet, Location: InMemoryFileIndex[s3://amgen-edl-ois-opssup-shr-bkt/dev/west2/wrk/sap/src_sap_mseg], PartitionFilters: [], PushedFilters: [IsNotNull(bwart), EqualTo(bwart,261), IsNotNull(matnr), IsNotNull(aufnr), IsNotNull(mblnr)], ReadSchema: struct<mblnr:string,bwart:string,matnr:string,aufnr:string>
: : +- *(9) Sort [mblnr#505 ASC NULLS FIRST], false, 0
: : +- Exchange hashpartitioning(mblnr#505, 200)
: : +- *(8) Project [mblnr#505, budat#511]
: : +- *(8) Filter isnotnull(mblnr#505)
: : +- *(8) FileScan parquet opssup_dev_wrk_sap.src_sap_mkpf[mblnr#505,budat#511] Batched: true, Format: Parquet, Location: InMemoryFileIndex[s3://amgen-edl-ois-opssup-shr-bkt/dev/west2/wrk/sap/src_sap_mkpf], PartitionFilters: [], PushedFilters: [IsNotNull(mblnr)], ReadSchema: struct<mblnr:string,budat:string>
: +- *(13) Sort [charg#543 ASC NULLS FIRST], false, 0
: +- Exchange hashpartitioning(charg#543, 200)
: +- *(12) Project [charg#543, hsdat#571]
: +- *(12) Filter isnotnull(charg#543)
: +- *(12) FileScan parquet opssup_dev_wrk_sap.src_sap_mch1[charg#543,hsdat#571] Batched: true, Format: Parquet, Location: InMemoryFileIndex[s3://amgen-edl-ois-opssup-shr-bkt/dev/west2/wrk/sap/src_sap_mch1], PartitionFilters: [], PushedFilters: [IsNotNull(charg)], ReadSchema: struct<charg:string,hsdat:string>
+- BroadcastExchange IdentityBroadcastMode
+- *(14) Project
+- *(14) Filter (isnotnull(zzmanu_stg#318210) && (zzmanu_stg#318210 = WCB))
+- *(14) FileScan parquet opssup_dev_wrk_sap.src_sap_mara[zzmanu_stg#318210] Batched: true, Format: Parquet, Location: InMemoryFileIndex[s3://amgen-edl-ois-opssup-shr-bkt/dev/west2/wrk/sap/src_sap_mara], PartitionFilters: [], PushedFilters: [IsNotNull(zzmanu_stg), EqualTo(zzmanu_stg,WCB)], ReadSchema: struct<zzmanu_stg:string>
The second PySpark version, which joins on column names, returns 1804 rows.
Its execution plan:
== Physical Plan ==
*(15) Project [charg#72 AS Batch_Number#301751, matkl#126 AS Material_Group#301752, zzmanu_stg#275 AS Mfg_Stage_Code#301753, budat#511 AS WCB_261_Posting_Date#301754, hsdat#571 AS Manufacturing_Date#301755]
+- *(15) SortMergeJoin [charg#72], [charg#543], Inner
:- *(12) Sort [charg#72 ASC NULLS FIRST], false, 0
: +- Exchange hashpartitioning(charg#72, 200)
: +- *(11) Project [charg#72, matkl#126, zzmanu_stg#275, budat#511]
: +- *(11) BroadcastHashJoin [matnr#33], [matnr#300069], Inner, BuildRight, false
: :- *(11) Project [matnr#33, charg#72, matkl#126, zzmanu_stg#275, budat#511]
: : +- *(11) SortMergeJoin [mblnr#313], [mblnr#505], Inner
: : :- *(7) Sort [mblnr#313 ASC NULLS FIRST], false, 0
: : : +- Exchange hashpartitioning(mblnr#313, 200)
: : : +- *(6) Project [matnr#33, charg#72, matkl#126, zzmanu_stg#275, mblnr#313]
: : : +- *(6) SortMergeJoin [aufnr#14], [aufnr#368], Inner
: : : :- *(3) Sort [aufnr#14 ASC NULLS FIRST], false, 0
: : : : +- Exchange hashpartitioning(aufnr#14, 200)
: : : : +- *(2) Project [matnr#33, aufnr#14, charg#72, matkl#126, zzmanu_stg#275]
: : : : +- *(2) BroadcastHashJoin [matnr#33], [matnr#116], Inner, BuildRight, false
: : : : :- *(2) Project [aufnr#14, matnr#33, charg#72]
: : : : : +- *(2) Filter ((isnotnull(matnr#33) && isnotnull(aufnr#14)) && isnotnull(charg#72))
: : : : : +- *(2) FileScan parquet opssup_dev_wrk_sap.src_sap_afpo[aufnr#14,matnr#33,charg#72] Batched: true, Format: Parquet, Location: InMemoryFileIndex[s3://amgen-edl-ois-opssup-shr-bkt/dev/west2/wrk/sap/src_sap_afpo], PartitionFilters: [], PushedFilters: [IsNotNull(matnr), IsNotNull(aufnr), IsNotNull(charg)], ReadSchema: struct<aufnr:string,matnr:string,charg:string>
: : : : +- BroadcastExchange HashedRelationBroadcastMode(ArrayBuffer(input[0, string, true]))
: : : : +- *(1) Project [matnr#116, matkl#126, zzmanu_stg#275]
: : : : +- *(1) Filter isnotnull(matnr#116)
: : : : +- *(1) FileScan parquet opssup_dev_wrk_sap.src_sap_mara[matnr#116,matkl#126,zzmanu_stg#275] Batched: true, Format: Parquet, Location: InMemoryFileIndex[s3://amgen-edl-ois-opssup-shr-bkt/dev/west2/wrk/sap/src_sap_mara], PartitionFilters: [], PushedFilters: [IsNotNull(matnr)], ReadSchema: struct<matnr:string,matkl:string,zzmanu_stg:string>
: : : +- *(5) Sort [aufnr#368 ASC NULLS FIRST], false, 0
: : : +- Exchange hashpartitioning(aufnr#368, 200)
: : : +- *(4) Project [mblnr#313, aufnr#368]
: : : +- *(4) Filter (((isnotnull(bwart#319) && (bwart#319 = 261)) && isnotnull(aufnr#368)) && isnotnull(mblnr#313))
: : : +- *(4) FileScan parquet opssup_dev_wrk_sap.src_sap_mseg[mblnr#313,bwart#319,aufnr#368] Batched: true, Format: Parquet, Location: InMemoryFileIndex[s3://amgen-edl-ois-opssup-shr-bkt/dev/west2/wrk/sap/src_sap_mseg], PartitionFilters: [], PushedFilters: [IsNotNull(bwart), EqualTo(bwart,261), IsNotNull(aufnr), IsNotNull(mblnr)], ReadSchema: struct<mblnr:string,bwart:string,aufnr:string>
: : +- *(9) Sort [mblnr#505 ASC NULLS FIRST], false, 0
: : +- Exchange hashpartitioning(mblnr#505, 200)
: : +- *(8) Project [mblnr#505, budat#511]
: : +- *(8) Filter isnotnull(mblnr#505)
: : +- *(8) FileScan parquet opssup_dev_wrk_sap.src_sap_mkpf[mblnr#505,budat#511] Batched: true, Format: Parquet, Location: InMemoryFileIndex[s3://amgen-edl-ois-opssup-shr-bkt/dev/west2/wrk/sap/src_sap_mkpf], PartitionFilters: [], PushedFilters: [IsNotNull(mblnr)], ReadSchema: struct<mblnr:string,budat:string>
: +- BroadcastExchange HashedRelationBroadcastMode(ArrayBuffer(input[0, string, true]))
: +- *(10) Project [matnr#300069]
: +- *(10) Filter ((isnotnull(zzmanu_stg#300228) && (zzmanu_stg#300228 = WCB)) && isnotnull(matnr#300069))
: +- *(10) FileScan parquet opssup_dev_wrk_sap.src_sap_mara[matnr#300069,zzmanu_stg#300228] Batched: true, Format: Parquet, Location: InMemoryFileIndex[s3://amgen-edl-ois-opssup-shr-bkt/dev/west2/wrk/sap/src_sap_mara], PartitionFilters: [], PushedFilters: [IsNotNull(zzmanu_stg), EqualTo(zzmanu_stg,WCB), IsNotNull(matnr)], ReadSchema: struct<matnr:string,zzmanu_stg:string>
+- *(14) Sort [charg#543 ASC NULLS FIRST], false, 0
+- Exchange hashpartitioning(charg#543, 200)
+- *(13) Project [charg#543, hsdat#571]
+- *(13) Filter isnotnull(charg#543)
+- *(13) FileScan parquet opssup_dev_wrk_sap.src_sap_mch1[charg#543,hsdat#571] Batched: true, Format: Parquet, Location: InMemoryFileIndex[s3://amgen-edl-ois-opssup-shr-bkt/dev/west2/wrk/sap/src_sap_mch1], PartitionFilters: [], PushedFilters: [IsNotNull(charg)], ReadSchema: struct<charg:string,hsdat:string>
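Why does this happen, and what is the best way to translate the SQL query above into PySpark?
Comments: Please include each piece of code. As requested, I have updated the execution plans.
The second PySpark version, joining on column names: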
afpo_df = sqlContext.table(sap_source_schema + ".src_sap_afpo").alias('afpo_df')
mara1_df = sqlContext.table(sap_source_schema + ".src_sap_mara").alias('mara1_df')
mseg_df = sqlContext.table(sap_source_schema + ".src_sap_mseg").alias('mseg_df')
mkpf_df = sqlContext.table(sap_source_schema + ".src_sap_mkpf").alias('mkpf_df')
mara_df = sqlContext.table(sap_source_schema + ".src_sap_mara").alias('mara_df')
mch1_df = sqlContext.table(sap_source_schema + ".src_sap_mch1").alias('mch1_df')
temp12_df = afpo_df \
.join(mara1_df,"matnr") \
.join(mseg_df,"aufnr") \
.join(mkpf_df,"mblnr") \
.join(mara_df,"matnr") \
.join(mch1_df,"charg") \
.filter("mseg_df.bwart=='261' AND mara_df.zzmanu_stg=='WCB'") \
.select(afpo_df.charg.alias('Batch_Number'),mara1_df.matkl.alias('Material_Group'),mara1_df.zzmanu_stg.alias('Mfg_Stage_Code'), \
mkpf_df.budat.alias('WCB_261_Posting_Date'),mch1_df.hsdat.alias('Manufacturing_Date'))
target_df = temp12_df
print(target_df.count())