Apache spark Pyspark多重过滤器数据帧
我的输入火花数据帧是Apache spark Pyspark多重过滤器数据帧,apache-spark,pyspark,apache-spark-sql,Apache Spark,Pyspark,Apache Spark Sql,我的输入火花数据帧是 Year Month Client 2018 1 1 2018 2 1 2018 3 1 2018 4 1 2018 5 1 2018 6 1 2018 7 1 2018
Year Month Client
2018 1 1
2018 2 1
2018 3 1
2018 4 1
2018 5 1
2018 6 1
2018 7 1
2018 8 1
2018 9 1
2018 10 1
2018 11 1
2018 12 1
2019 1 1
2019 2 1
2019 3 1
2019 4 1
2019 5 1
2019 6 1
2019 7 1
2019 8 1
2019 9 1
2019 10 1
2019 11 1
2019 12 1
2018 1 2
2018 2 2
2018 3 2
2018 4 2
2018 5 2
2018 6 2
2018 7 2
2018 8 2
2018 9 2
2018 10 2
2018 11 2
2018 12 2
2019 1 2
2019 2 2
2019 3 2
2019 4 2
2019 5 2
2019 6 2
2019 7 2
2019 8 2
2019 9 2
2019 10 2
2019 11 2
2019 12 2
Dataframe按客户、年份和月份进行订购。我想为每个客户提取2019-06年后的数据
我根据上述数据共享了所需的输出
Year Month Client
2018 1 1
2018 2 1
2018 3 1
2018 4 1
2018 5 1
2018 6 1
2018 7 1
2018 8 1
2018 9 1
2018 10 1
2018 11 1
2018 12 1
2019 1 1
2019 2 1
2019 3 1
2019 4 1
2019 5 1
2019 6 1
2018 1 2
2018 2 2
2018 3 2
2018 4 2
2018 5 2
2018 6 2
2018 7 2
2018 8 2
2018 9 2
2018 10 2
2018 11 2
2018 12 2
2019 1 2
2019 2 2
2019 3 2
2019 4 2
2019 5 2
2019 6 2
你能帮我一下吗
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX您是指2019-06年之前吗?(您在2019-06年之后撰文) 如果是,您可以进行筛选:
df2 = df.filter('Year < 2019 or (Year = 2019 and Month <= 6)')
df2=df.filter('年<2019年或(年=2019年和月