Python 3.x: Translating a SAS RETAIN statement with BY groups and IF conditions to Python

python-3.x sas

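I am trying to flag a variable "RUN" that satisfies a GROUP BY and a set of IF conditions. Within each plant/material group, RUN should count consecutive rows with zero (or, once a run has started, negligible) sales, and reset to 0 otherwise. This is the SAS code I am converting to Python: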

data data_v1;
retain run;
set data;
by plant material;
if first.material then do;
    if sales_quantity = 0 then run = 1; else run = 0;
end;
*keep plant material sales_quantity run;
else do;
    if run > 0 then do;
        if sales_quantity = 0 or (sales_quantity < 0.01 * Annual_Sales and sales_quantity <= 9)
        then run = run + 1; else run = 0;
    end;
    else do;
        if sales_quantity = 0 then run = 1; else run = 0;
    end;
end;

Here is the Python code I have tried so far. Can anyone help me with this? Is pandas groupby the right way to go?

import pandas as pd

df = pd.DataFrame()
df['plant'] = ['a','a','a','a','b','b','b','b','b','b','b','c','c','c','c','c','c','c','d','d','d','d','d','d']
df['mater'] = ['x','x','x','y','x','x','x','y','y','y','y','x','x','x','x','y','y','y','y','y','x','x','x','x']
# Named 'sales_quantity' (not 'salqty') so it matches the column the loop reads
df['sales_quantity'] = [0,0,0,10,11,12,13,0,0,13,0,13,0,0,1,0,0,0,1,2,3,0,0,0]
df['plantmaterial'] = df['plant'].map(str) + df['mater']
df['annual_sales'] = 0.01

# flag = 1 on the first row of each plant/material block (SAS first.material)
df['flag'] = (df.plantmaterial != df.plantmaterial.shift()).astype(int)
df['run'] = 0

for i in range(len(df)):
    # df.loc[i, col] is a scalar, so compare it directly (no .any())
    qty = df.loc[i, 'sales_quantity']
    if df.loc[i, 'flag'] == 1:
        # First row of a group: only a zero starts a run
        df.loc[i, 'run'] = 1 if qty == 0 else 0
    elif df.loc[i - 1, 'run'] > 0:
        # A run is in progress: zero or negligible sales extend it
        if qty == 0 or (qty < 0.01 * df.loc[i, 'annual_sales'] and qty <= 9):
            df.loc[i, 'run'] = df.loc[i - 1, 'run'] + 1
        else:
            df.loc[i, 'run'] = 0
    else:
        # No run in progress: only a zero starts a new one
        df.loc[i, 'run'] = 1 if qty == 0 else 0
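
On the groupby question: one possible sketch, not the only way to do it, is to let groupby split the frame and carry the counter in a plain Python variable inside each group, which plays the role of the SAS RETAIN. The helper names run_lengths and add_run and the sample values are made up for illustration; the column names follow the code above. It assumes the rows of each plant/material pair are already in the intended order, as a SAS BY statement would require.

```python
import pandas as pd


def run_lengths(group: pd.DataFrame) -> list:
    """Return the retained counter for each row of one plant/material group."""
    run = 0
    out = []
    rows = zip(group["sales_quantity"], group["annual_sales"])
    for pos, (qty, annual) in enumerate(rows):
        if pos == 0 or run == 0:
            # First row of the group, or no run in progress:
            # only a zero quantity starts a run.
            run = 1 if qty == 0 else 0
        elif qty == 0 or (qty < 0.01 * annual and qty <= 9):
            # Run in progress and sales are zero or negligible: extend it.
            run += 1
        else:
            run = 0
        out.append(run)
    return out


def add_run(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a 'run' column computed per plant/material."""
    out = df.copy()
    out["run"] = 0
    # sort=False keeps groups in order of first appearance.
    for _, group in out.groupby(["plant", "mater"], sort=False):
        out.loc[group.index, "run"] = run_lengths(group)
    return out


# Hypothetical sample data, chosen so the "negligible sales" branch is exercised.
sample = pd.DataFrame({
    "plant": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "mater": ["x", "x", "x", "y", "x", "x", "x", "x"],
    "sales_quantity": [0, 5, 0, 10, 11, 0, 3, 20],
    "annual_sales": 1000,
})
result = add_run(sample)
print(result)  # run column: [1, 2, 3, 0, 0, 1, 2, 0]
```

Because each value of run depends on the previous row's run, the inner loop is kept explicit here; groupby only removes the need for the flag column and the i-1 lookups.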