Python: fill all datetime columns up to a specific date

I have a DataFrame that represents the daily demand for different products in different stores:

     SKU    Store    F  LeadTime    Date    Qty Value   Price   Level   
0   504777      1   135828  11  2018-01-22  1   3.99    3.99    45  
1   504777      1   135828  11  2018-01-23  0   0.00    0.00    45  
2   504777      1   135828  11  2018-01-24  3   11.97   3.99    42  
3   504777      1   135828  11  2018-01-25  1   3.99    3.99    41  
4   504777      1   135828  11  2018-01-26  0   0.00    0.00    41  


300 704777      2   135828  11  2018-01-22  1   4.99    3.99    45  
301 704777      2   135828  11  2018-01-23  0   0.00    0.00    47  
302 704777      2   135828  11  2018-01-24  4   12.97   3.99    48  
303 704777      2   135828  11  2018-01-25  1   3.99    3.99    49  
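In this example, I am trying to complete the dataset up to 2018-01-31 under the following conditions:

  • The following columns:
    SKU, Store, F, LeadTime, Date, Level
    should be filled with the last value

  • The following columns:
    Qty, Value, Price
    should be filled with 0

So my expected output should look like this: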

     SKU    Store    F  LeadTime    Date    Qty Value   Price   Level   
0   504777      1   135828  11  2018-01-22  1   3.99    3.99    45  
1   504777      1   135828  11  2018-01-23  0   0.00    0.00    45  
2   504777      1   135828  11  2018-01-24  3   11.97   3.99    42  
3   504777      1   135828  11  2018-01-25  1   3.99    3.99    41  
4   504777      1   135828  11  2018-01-26  1   3.99    3.99   41  
5   504777      1   135828  11  2018-01-27  0   0.00    0.00    41  
6   504777      1   135828  11  2018-01-28  0   0.00    0.00    41  
7   504777      1   135828  11  2018-01-29  0   0.00    0.00    41                                                                
8   504777      1   135828  11  2018-01-30  0   0.00    0.00    41  
9   504777      1   135828  11  2018-01-31  0   0.00    0.00    41  

300 704777      2   135828  11  2018-01-22  1   4.99    3.99    45  
301 704777      2   135828  11  2018-01-23  0   0.00    0.00    47  
302 704777      2   135828  11  2018-01-24  4   12.97   3.99    48  
303 704777      2   135828  11  2018-01-25  1   3.99    3.99    49
304 704777      2   135828  11  2018-01-26  0    0       0       49  
305 704777      2   135828  11  2018-01-27  0    0       0      49
306 704777      2   135828  11  2018-01-28  0    0       0      49  
307 704777      2   135828  11  2018-01-29  0    0       0      49  
308 704777      2   135828  11  2018-01-30  0    0       0      49  
309 704777      2   135828  11  2018-01-31  0    0       0      49  
I tried this:

df = df.set_index('Date').groupby(['SKU', 'Store']).date_range(end = '2018-01-31', freq='D').agg({
                                             'F':'last',
                                             'LeadTime':'last',
                                             'Price':0,
                                             'Value':0,
                                             'Qty':0,
                                             'Level':'last'}).reset_index()
But this is not the right approach; it raises:

'DataFrameGroupBy' object has no attribute 'date_range'

PS: each product has a different start date.
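A short note on the error itself, as a minimal sketch: `date_range` is a top-level pandas function (`pd.date_range`), not a method of a groupby object, which is why the attribute lookup fails. The one-row frame below is only illustrative.

```python
import pandas as pd

# Illustrative one-row frame using the column names from the question.
df = pd.DataFrame({
    "SKU": [504777],
    "Store": [1],
    "Date": pd.to_datetime(["2018-01-22"]),
})

grouped = df.groupby(["SKU", "Store"])

# A groupby object exposes no `date_range` attribute.
has_method = hasattr(grouped, "date_range")

# The date range has to be built with the module-level function instead.
days = pd.date_range(start="2018-01-22", end="2018-01-31", freq="D")
```

The range can then be used to reindex each group separately, which also handles groups that start on different days.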

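First, groupby on SKU and Store.

Then, for each group, create a date range with start set to the maximum date of the df (the group) and end set to 2018-01-31.

Note that I use a list comprehension here for a speed win.

Then fill with 0 where needed.

Finally, combine all the groupby DataFrames and forward-fill the remaining columns.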
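The code for this answer did not survive, so the following is only a sketch of the steps as described, not the answerer's original code. The helper `extend_group`, the constants `END` and `ZERO_COLS`, and the toy data are assumptions made for illustration.

```python
import pandas as pd

# Toy data: two (SKU, Store) groups that end on different days.
df = pd.DataFrame({
    "SKU":   [504777, 504777, 704777],
    "Store": [1, 1, 2],
    "Date":  pd.to_datetime(["2018-01-22", "2018-01-23", "2018-01-22"]),
    "Qty":   [1, 3, 1],
    "Level": [45, 42, 45],
})

END = pd.Timestamp("2018-01-31")
ZERO_COLS = ["Qty"]  # columns that should be 0 on the added days


def extend_group(g):
    """Append one empty row per missing day after the group's last date."""
    new_dates = pd.date_range(g["Date"].max() + pd.Timedelta(days=1), END, freq="D")
    return pd.concat([g, pd.DataFrame({"Date": new_dates})], ignore_index=True)


# List comprehension over the groups, then a single concat.
extended = pd.concat(
    [extend_group(g) for _, g in df.groupby(["SKU", "Store"])],
    ignore_index=True,
)

# Zero-fill first, so the forward fill cannot copy quantities onto new days.
extended[ZERO_COLS] = extended[ZERO_COLS].fillna(0)
# Every group starts with real rows, so ffill picks up that group's last values.
extended = extended.ffill()
```

The key and Level columns come back as floats because the appended rows held NaN before filling; they can be cast back with `astype` if integer columns are needed. The forward fill over the whole frame is only safe when the first real row of each group has no missing values.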



I suggest you reindex each group. Then create a list to store each group, and build a DataFrame from that list:

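import pandas as pd

df['Date'] = pd.to_datetime(df['Date'])

dfs = []
for _, d in df.groupby(['SKU', 'Store']):

    # First date of this group (assumes rows are sorted by Date within the group)
    start_date = d.Date.iloc[0]
    # Last day of that month; for a fixed end date use pd.Timestamp('2018-01-31').
    # Note: if start_date is already a month end, this rolls to the next month end.
    end_date = start_date + pd.offsets.MonthEnd()

    # One row per day; the added days are NaN until they are filled afterwards
    d = d.set_index('Date')
    d = d.reindex(pd.date_range(start_date, end_date))

    dfs.append(d)

new_df = pd.concat(dfs)

new_df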

                 SKU  Store         F  LeadTime  Qty  Value  Price  Level
2018-01-22  504777.0    1.0  135828.0      11.0  1.0   3.99   3.99   45.0
2018-01-23  504777.0    1.0  135828.0      11.0  0.0   0.00   0.00   45.0
2018-01-24  504777.0    1.0  135828.0      11.0  3.0  11.97   3.99   42.0
2018-01-25  504777.0    1.0  135828.0      11.0  1.0   3.99   3.99   41.0
2018-01-26  504777.0    1.0  135828.0      11.0  0.0   0.00   0.00   41.0
2018-01-27       NaN    NaN       NaN       NaN  NaN    NaN    NaN    NaN
2018-01-28       NaN    NaN       NaN       NaN  NaN    NaN    NaN    NaN
2018-01-29       NaN    NaN       NaN       NaN  NaN    NaN    NaN    NaN
2018-01-30       NaN    NaN       NaN       NaN  NaN    NaN    NaN    NaN
2018-01-31       NaN    NaN       NaN       NaN  NaN    NaN    NaN    NaN
2018-01-22  704777.0    2.0  135828.0      11.0  1.0   4.99   3.99   45.0
2018-01-23  704777.0    2.0  135828.0      11.0  0.0   0.00   0.00   47.0
2018-01-24  704777.0    2.0  135828.0      11.0  4.0  12.97   3.99   48.0
2018-01-25  704777.0    2.0  135828.0      11.0  1.0   3.99   3.99   49.0
2018-01-26       NaN    NaN       NaN       NaN  NaN    NaN    NaN    NaN
2018-01-27       NaN    NaN       NaN       NaN  NaN    NaN    NaN    NaN
2018-01-28       NaN    NaN       NaN       NaN  NaN    NaN    NaN    NaN
2018-01-29       NaN    NaN       NaN       NaN  NaN    NaN    NaN    NaN
2018-01-30       NaN    NaN       NaN       NaN  NaN    NaN    NaN    NaN
2018-01-31       NaN    NaN       NaN       NaN  NaN    NaN    NaN    NaN
Then fill the NaN values: use fillna(0) for Price, Qty and Value, and ffill for the remaining columns.

new_df = pd.concat(dfs)
new_df[['Price', 'Qty', 'Value']] = new_df[['Price', 'Qty', 'Value']].fillna(0)
new_df.ffill(inplace=True)
new_df
Out[17]: 
                 SKU  Store         F  LeadTime  Qty  Value  Price  Level
2018-01-22  504777.0    1.0  135828.0      11.0  1.0   3.99   3.99   45.0
2018-01-23  504777.0    1.0  135828.0      11.0  0.0   0.00   0.00   45.0
2018-01-24  504777.0    1.0  135828.0      11.0  3.0  11.97   3.99   42.0
2018-01-25  504777.0    1.0  135828.0      11.0  1.0   3.99   3.99   41.0
2018-01-26  504777.0    1.0  135828.0      11.0  0.0   0.00   0.00   41.0
2018-01-27  504777.0    1.0  135828.0      11.0  0.0   0.00   0.00   41.0
2018-01-28  504777.0    1.0  135828.0      11.0  0.0   0.00   0.00   41.0
2018-01-29  504777.0    1.0  135828.0      11.0  0.0   0.00   0.00   41.0
2018-01-30  504777.0    1.0  135828.0      11.0  0.0   0.00   0.00   41.0
2018-01-31  504777.0    1.0  135828.0      11.0  0.0   0.00   0.00   41.0
2018-01-22  704777.0    2.0  135828.0      11.0  1.0   4.99   3.99   45.0
2018-01-23  704777.0    2.0  135828.0      11.0  0.0   0.00   0.00   47.0
2018-01-24  704777.0    2.0  135828.0      11.0  4.0  12.97   3.99   48.0
2018-01-25  704777.0    2.0  135828.0      11.0  1.0   3.99   3.99   49.0
2018-01-26  704777.0    2.0  135828.0      11.0  0.0   0.00   0.00   49.0
2018-01-27  704777.0    2.0  135828.0      11.0  0.0   0.00   0.00   49.0
2018-01-28  704777.0    2.0  135828.0      11.0  0.0   0.00   0.00   49.0
2018-01-29  704777.0    2.0  135828.0      11.0  0.0   0.00   0.00   49.0
2018-01-30  704777.0    2.0  135828.0      11.0  0.0   0.00   0.00   49.0
2018-01-31  704777.0    2.0  135828.0      11.0  0.0   0.00   0.00   49.0
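The float columns in the output above (504777.0, 1.0, ...) are a side effect of the temporary NaN rows. As a hedged variation on the same reindex idea, the sketch below fills inside each group, uses a fixed end date instead of the month end, and restores the original dtypes. The function name `fill_until` and its parameters are hypothetical, and it assumes one row per day per group and no dates later than `end`.

```python
import pandas as pd


def fill_until(df, end, keys=("SKU", "Store"), zero_cols=("Qty", "Value", "Price")):
    """Extend every key group with daily rows up to `end` (hypothetical helper)."""
    df = df.assign(Date=pd.to_datetime(df["Date"]))
    end = pd.Timestamp(end)
    zero_cols = list(zero_cols)
    frames = []
    for _, g in df.groupby(list(keys)):
        g = g.sort_values("Date").set_index("Date")
        # Daily index from the group's own first date to the common end date.
        g = g.reindex(pd.date_range(g.index.min(), end, freq="D"))
        g[zero_cols] = g[zero_cols].fillna(0)  # no demand on the added days
        g = g.ffill()                          # last known value, within this group only
        frames.append(g.rename_axis("Date").reset_index())
    out = pd.concat(frames, ignore_index=True)
    # Restore the original column order and dtypes (reindexing produced floats).
    return out[df.columns.tolist()].astype(df.dtypes.to_dict())


# Sample rows copied from the question.
df = pd.DataFrame({
    "SKU": [504777] * 5 + [704777] * 4,
    "Store": [1] * 5 + [2] * 4,
    "F": [135828] * 9,
    "LeadTime": [11] * 9,
    "Date": ["2018-01-22", "2018-01-23", "2018-01-24", "2018-01-25", "2018-01-26",
             "2018-01-22", "2018-01-23", "2018-01-24", "2018-01-25"],
    "Qty": [1, 0, 3, 1, 0, 1, 0, 4, 1],
    "Value": [3.99, 0.0, 11.97, 3.99, 0.0, 4.99, 0.0, 12.97, 3.99],
    "Price": [3.99, 0.0, 3.99, 3.99, 0.0, 3.99, 0.0, 3.99, 3.99],
    "Level": [45, 45, 42, 41, 41, 45, 47, 48, 49],
})

result = fill_until(df, "2018-01-31")
```

Filling inside the loop keeps values from leaking between groups, and it is also why the groupby has to come before the reindex: the key columns are NaN on the newly added rows until they are forward-filled.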
I need to use groupby(['SKU','Store']). Where should I put it?