SQL Server to PySpark: translating MSSQL code with inner joins, a CASE statement, and WHERE clauses


I'm trying to take code I wrote in MSSQL and convert it to PySpark. I'm a complete beginner with PySpark.

The query contains inner joins, an embedded CASE WHEN expression, and a set of WHERE conditions for filtering:

SELECT        Table1.Part, Table1.Serial, Table1.AIRCRAFT_NUMBER, Table1.date_removed,
                         Table2.dbo.E15.TIME, Table2.dbo.E15.TSO, data.dbo.EE18.Allowable_Time,
                         CASE WHEN (data.dbo.EE18.Allowable_Time > 0)
                         THEN data.dbo.EE18.Allowable_Time - Table2.dbo.E15.TSO END AS CAL
FROM            Table1 INNER JOIN
                         Table2.dbo.E15 ON Table1.SEQ_ID = Table2.dbo.E15.SEQ_ID AND
                         Table1.Part = Table2.dbo.E15.Part AND
                         Table1.Serial = Table2.dbo.E15.Serial AND
                         Table1.DATE_REMOVED_DESCENDING = Table2.dbo.E15.DATE_REMOVED_DESCENDING INNER JOIN
                         data.dbo.EE18 ON Table2.dbo.E15.Part = data.dbo.EE18.PART_NUMBER AND
                         Table2.dbo.E15.TIME = data.dbo.EE18.TIME
WHERE        (Table1.Part LIKE '18%') AND (Table2.dbo.E15.TIME = 'I') AND
                         (data.dbo.EE18.Allowable_Time > 0) AND (Table2.dbo.E15.TSO <= 2) OR
                         (Table1.Part LIKE '18%') AND (Table2.dbo.E15.TIME = 'T') AND
                         (data.dbo.EE18.Allowable_Time > 0) AND (Table2.dbo.E15.TSO <= 20) OR
                         (Table1.Part LIKE '18%') AND (Table2.dbo.E15.TIME = 'L') AND
                         (data.dbo.EE18.Allowable_Time > 0) AND (Table2.dbo.E15.TSO <= 8)
ORDER BY Table1.date_removed DESC

This isn't really an answer to your question, but it shows how some formatting can make the query easier to read. I also reworked the WHERE predicates slightly to remove the redundancy, and fixed a logic issue with the parentheses.

SELECT Table1.Part
    , Table1.Serial
    , Table1.AIRCRAFT_NUMBER
    , Table1.date_removed
    , Table2.dbo.E15.TIME
    , Table2.dbo.E15.TSO
    , data.dbo.EE18.Allowable_Time
    , CASE WHEN (data.dbo.EE18.Allowable_Time > 0) THEN data.dbo.EE18.Allowable_Time - Table2.dbo.E15.TSO END AS CAL
FROM Table1
INNER JOIN Table2.dbo.E15 ON Table1.SEQ_ID = Table2.dbo.E15.SEQ_ID 
                        AND Table1.Part = Table2.dbo.E15.Part 
                        AND Table1.Serial = Table2.dbo.E15.Serial 
                        AND Table1.DATE_REMOVED_DESCENDING = Table2.dbo.E15.DATE_REMOVED_DESCENDING 
INNER JOIN data.dbo.EE18 ON Table2.dbo.E15.Part = data.dbo.EE18.PART_NUMBER 
                        AND Table2.dbo.E15.TIME = data.dbo.EE18.TIME
WHERE Table1.Part LIKE '18%' 
    AND data.dbo.EE18.Allowable_Time > 0 
    AND
    (
        (
            Table2.dbo.E15.TIME = 'I'
            AND
            Table2.dbo.E15.TSO <= 2
        )
        OR
        (
            Table2.dbo.E15.TIME = 'T'
            AND
            Table2.dbo.E15.TSO <= 20
        )
        OR
        (
            Table2.dbo.E15.TIME = 'L'
            AND
            Table2.dbo.E15.TSO <= 8
        )
    )
ORDER BY Table1.date_removed DESC

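Why not use a stored procedure? Then you wouldn't have to worry about any odd dialect quirks in whatever programming language you're using.

I would keep a close eye on those WHERE predicates. You seem to have a lot of unnecessary parentheses and to be missing some important ones.

I don't know how to write stored procedures.

This might be a good time to learn.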
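
On the point about parentheses: in T-SQL, as in Python, AND binds tighter than OR, so factoring the shared conditions out of the three OR branches only stays equivalent if the OR group is itself wrapped in parentheses. Here is a rough brute-force check in plain Python, using boolean stand-ins for the column comparisons. The function names and sample values are made up for illustration.

```python
from itertools import product


def original(part_ok, allowable_ok, time, tso):
    # WHERE clause as written in the question: three fully spelled-out branches.
    return (
        part_ok and time == "I" and allowable_ok and tso <= 2
        or part_ok and time == "T" and allowable_ok and tso <= 20
        or part_ok and time == "L" and allowable_ok and tso <= 8
    )


def factored(part_ok, allowable_ok, time, tso):
    # Shared conditions pulled out, OR group wrapped in outer parentheses.
    return part_ok and allowable_ok and (
        (time == "I" and tso <= 2)
        or (time == "T" and tso <= 20)
        or (time == "L" and tso <= 8)
    )


def factored_without_outer_parens(part_ok, allowable_ok, time, tso):
    # Same factoring, but the shared conditions only guard the first branch.
    return (
        part_ok and allowable_ok and (time == "I" and tso <= 2)
        or (time == "T" and tso <= 20)
        or (time == "L" and tso <= 8)
    )


# Every combination of: Part matches?, Allowable_Time > 0?, TIME code, TSO value.
CASES = list(product([True, False], [True, False], ["I", "T", "L", "X"], [1, 5, 10, 25]))

mismatches_factored = [c for c in CASES if original(*c) != factored(*c)]
mismatches_unwrapped = [
    c for c in CASES if original(*c) != factored_without_outer_parens(*c)
]

print("factored vs original mismatches:", len(mismatches_factored))
print("unwrapped vs original mismatches:", len(mismatches_unwrapped))
```

The wrapped version agrees with the original on every case, while the unwrapped one lets through rows such as a 'T' row with TSO 5 whose Part does not start with 18.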