Python: iterating over isoweeks

I have a DataFrame with date, value, and isoweek fields, as shown below:

date       | value | isoweek 
-----------------------------
2018-04-01 | 5     | 2018-13
2018-04-10 | 10    | 2018-15
2018-05-01 | 10    | 2018-18

Here, isoweek is the year-week of the corresponding date. My goal is to iterate over the isoweeks, find the ones that are missing from the data, and insert a row with value 0 for each of them.

The expected output looks like this:

date       | value | isoweek 
-----------------------------
2018-04-01 | 5     | 2018-13
NaN        | 0     | 2018-14
2018-04-10 | 10    | 2018-15
NaN        | 0     | 2018-16
NaN        | 0     | 2018-17
2018-05-01 | 10    | 2018-18

How can I iterate over the original DataFrame and find all the isoweeks that are missing from the data?

This may be a bit verbose, but you can try resampling after converting the isoweeks to datetimes:

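import pandas as pd

# Parse each isoweek as the Sunday of that week (%W is Monday-based; it matches ISO weeks in 2018)
s = pd.to_datetime(df['isoweek'] + "-0", format='%Y-%W-%w')
# Weekly resampling adds an all-NaN row for every week that has no data
u = df.set_index(s).resample("W").first()

# Rebuild the label for the new rows (Index.weekofyear was removed in pandas 2.0)
iso = u.index.isocalendar()
iso_week = iso['year'].astype(str) + '-' + iso['week'].astype(str)
u['isoweek'] = u['isoweek'].fillna(iso_week)
out = u.fillna({"value": 0}).reset_index(drop=True)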
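
A note on the '%Y-%W-%w' format used above: %W counts weeks from the first Monday of the year, which is not the ISO 8601 week definition. The two agree for 2018 because January 1, 2018 was a Monday, but they can differ in other years. The sketch below uses only the standard library to compare both readings (date.fromisocalendar needs Python 3.8+). The helper names are made up for illustration and are not part of the answer.

```python
from datetime import date, datetime


def sunday_from_percent_w(isoweek):
    """Read 'YYYY-WW' with %W (Monday-based week of year) and return that week's Sunday."""
    return datetime.strptime(isoweek + "-0", "%Y-%W-%w").date()


def sunday_from_iso(isoweek):
    """Read 'YYYY-WW' as an ISO 8601 year and week and return that week's Sunday."""
    year, week = map(int, isoweek.split("-"))
    return date.fromisocalendar(year, week, 7)


# 2018 starts on a Monday, so both readings agree.
print(sunday_from_percent_w("2018-13"), sunday_from_iso("2018-13"))
# 2018-04-01 2018-04-01

# 2019 starts on a Tuesday, so the two readings are one week apart.
print(sunday_from_percent_w("2019-13"), sunday_from_iso("2019-13"))
# 2019-04-07 2019-03-31
```

If the data spans years where the two differ, building the dates from the ISO year and week directly may be the safer choice.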


You can try using apply:

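import numpy as np
import pandas as pd

def func(group):
    # group holds the rows for one year (e.g. from a groupby on the year); group.name is that year
    year = group.name
    weeks = group['isoweek'].str.split('-').str[1].astype(int)
    # Weeks between the first and last observed week that have no row
    missing = sorted(set(range(weeks.min(), weeks.max())) - set(weeks))
    filler = pd.DataFrame({'date': np.nan, 'value': 0,
                           'isoweek': [f"{year}-{w}" for w in missing]})
    # DataFrame.append was removed in pandas 2.0, so concatenate instead
    out = pd.concat([group, filler], ignore_index=True)
    return out.sort_values(by='isoweek').reset_index(drop=True)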
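
Filling a range of week numbers works within a single year, but it does not cover data that crosses a year boundary. As a rough sketch of one way around that, the missing weeks can be found by stepping through calendar weeks with the standard library (Python 3.8+ for date.fromisocalendar). The function names here are hypothetical, and the labels are assumed to be unpadded 'YYYY-W' strings as in the question.

```python
from datetime import date, timedelta


def iso_label(d):
    """Format a date as 'YYYY-W' from its ISO year and ISO week."""
    iso_year, iso_week, _ = d.isocalendar()
    return f"{iso_year}-{iso_week}"


def monday_of(label):
    """Return the Monday of the ISO week named by a 'YYYY-W' label."""
    year, week = map(int, label.split("-"))
    return date.fromisocalendar(year, week, 1)


def missing_isoweeks(present):
    """List the ISO weeks between the earliest and latest label that are absent."""
    mondays = sorted(monday_of(label) for label in present)
    seen = set(mondays)
    missing = []
    current = mondays[0]
    while current < mondays[-1]:
        if current not in seen:
            missing.append(iso_label(current))
        current += timedelta(weeks=1)
    return missing


print(missing_isoweeks(["2018-13", "2018-15", "2018-18"]))
# ['2018-14', '2018-16', '2018-17']
print(missing_isoweeks(["2018-51", "2019-2"]))
# ['2018-52', '2019-1']
```

The resulting labels could then be turned into filler rows in the same way as in func above.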


You can use pd.date_range to generate a list of dates, one per week, from the start to the end:

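import pandas as pd

# df['date'] must already be datetime, e.g. df['date'] = pd.to_datetime(df['date'])
dates = pd.Series(pd.date_range(start=df['date'].min(), end=df['date'].max(), freq='W'))
isoweeks = (dates.dt.isocalendar().year.astype(str) + '-' + dates.dt.isocalendar().week.astype(str)).tolist()
# The last date's week may fall after the final weekly tick, so add it if it is absent
max_isoweek = str(df['date'].max().isocalendar()[0]) + '-' + str(df['date'].max().isocalendar()[1])
if max_isoweek not in isoweeks:
    isoweeks.append(max_isoweek)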
This gets all the ISO weeks between the start date and the end date.

Then you can merge df with a helper DataFrame to get what you need:

df = df.merge(pd.DataFrame({'isoweek': isoweeks}), how='right')
# Assign the result back; fillna(inplace=True) on a selected column is unreliable
df['value'] = df['value'].fillna(0)
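
The steps above can be tried end to end on the sample data from the question. This is only a sketch: it assumes the date column is parsed as datetime, and the names iso_label, helper and filled are introduced here for illustration.

```python
import pandas as pd

# Sample data from the question; parsing `date` as datetime is an assumption.
df = pd.DataFrame({
    "date": pd.to_datetime(["2018-04-01", "2018-04-10", "2018-05-01"]),
    "value": [5, 10, 10],
    "isoweek": ["2018-13", "2018-15", "2018-18"],
})


def iso_label(ts):
    """Format a timestamp as 'YYYY-W' from its ISO year and ISO week."""
    iso_year, iso_week, _ = ts.isocalendar()
    return f"{iso_year}-{iso_week}"


# One date per week between the first and the last observation.
weekly = pd.date_range(df["date"].min(), df["date"].max(), freq="W")
isoweeks = [iso_label(ts) for ts in weekly]

# The week of the last observation may lie past the final weekly tick.
last = iso_label(df["date"].max())
if last not in isoweeks:
    isoweeks.append(last)

# A left merge from the helper keeps every week, in order, with NaN/NaT gaps.
helper = pd.DataFrame({"isoweek": isoweeks})
filled = helper.merge(df, on="isoweek", how="left")[["date", "value", "isoweek"]]
filled["value"] = filled["value"].fillna(0)
print(filled)
```

Merging from the helper side with how="left" is equivalent to the how='right' merge above; it just makes the week order explicit.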

Great, thanks, this works!
        date        value   isoweek
 0  0   2018-04-01  5       2018-13
    1   NaN         0       2018-14
    2   2018-04-10  10      2018-15
    3   NaN         0       2018-16
    4   NaN         0       2018-17
    5   2018-05-01  10      2018-18
# print(df)

        date  value  isoweek
0 2018-04-01    5.0  2018-13
1        NaT    0.0  2018-14
2 2018-04-10   10.0  2018-15
3        NaT    0.0  2018-16
4        NaT    0.0  2018-17
5 2018-05-01   10.0  2018-18