Python: how to add entries for all subsequent dates after the first occurrence

I have a database that is updated daily. At the moment it only stores "active cases", say those whose value is 1 or higher.

import pandas as pd

df = pd.DataFrame({
    "Date": [
        "2020-04-09", "2020-04-09",
        "2020-04-10", "2020-04-10", "2020-04-10",
        "2020-04-11", "2020-04-11",
        "2020-04-12", "2020-04-12",
        "2020-04-13", "2020-04-13", "2020-04-13"
    ],
    "ID": [2, 3, 1, 2, 3, 2, 3, 2, 3, 1, 2, 3],
    "Value": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
})


So on 2020-04-10 there may be 3 active cases, but on the next day only 2, and only those 2 cases are stored in the database.

What I need is to keep an entry with value 0 for each "inactive" ID on the dates after it first occurs (but not before).

Desired output:

result = pd.DataFrame({
    "Date": [
        "2020-04-09", "2020-04-09",
        "2020-04-10", "2020-04-10", "2020-04-10",
        "2020-04-11", "2020-04-11", "2020-04-11",
        "2020-04-12", "2020-04-12", "2020-04-12",
        "2020-04-13", "2020-04-13", "2020-04-13",
    ],
    "ID": [2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3],
    "Value": [1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1]
})

    Date        ID  Value
0   2020-04-09  2   1
1   2020-04-09  3   1
2   2020-04-10  1   1
3   2020-04-10  2   1
4   2020-04-10  3   1

5   2020-04-11  1   0

The idea is to reshape with pivot, then replace missing values with 0, but only where at least one non-missing value exists earlier in the same column:

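# pivot takes keyword arguments only in pandas 2.0 and later
df1 = df.pivot(index='Date', columns='ID', values='Value')
# Set 0 where a value is missing now but the ID already appeared on an earlier date
df2 = (df1.mask(df1.ffill().notna() & df1.isna(), 0)
          .stack()
          .dropna()   # drop any remaining NaN (dates before an ID first appears)
          .astype(int)
          .reset_index(name='Value'))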
print (df2)
          Date  ID  Value
0   2020-04-09   2      1
1   2020-04-09   3      1
2   2020-04-10   1      1
3   2020-04-10   2      1
4   2020-04-10   3      1
5   2020-04-11   1      0
6   2020-04-11   2      1
7   2020-04-11   3      1
8   2020-04-12   1      0
9   2020-04-12   2      1
10  2020-04-12   3      1
11  2020-04-13   1      1
12  2020-04-13   2      1
13  2020-04-13   3      1
Details:

print (df1)
ID            1    2    3
Date                     
2020-04-09  NaN  1.0  1.0
2020-04-10  1.0  1.0  1.0
2020-04-11  NaN  1.0  1.0
2020-04-12  NaN  1.0  1.0
2020-04-13  1.0  1.0  1.0

print (df1.ffill().notna() & df1.isna())
ID              1      2      3
Date                           
2020-04-09  False  False  False
2020-04-10  False  False  False
2020-04-11   True  False  False
2020-04-12   True  False  False
2020-04-13  False  False  False
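
The condition can be read one column at a time. The following is a small sketch on a single hand-built Series (the variable names `seen_before`, `missing_now` and `fill_with_zero` are only illustrative, not part of the answer): forward-filling makes every position from the first real value onward non-missing, so combining that with `isna()` selects exactly the gaps that come after an ID has already appeared.

```python
import numpy as np
import pandas as pd

# One ID's column from the pivoted table: NaN means "no row stored that day".
s = pd.Series(
    [np.nan, 1.0, np.nan, np.nan, 1.0],
    index=["2020-04-09", "2020-04-10", "2020-04-11", "2020-04-12", "2020-04-13"],
)

seen_before = s.ffill().notna()  # True from the first non-missing value onward
missing_now = s.isna()           # True wherever the entry is absent
fill_with_zero = seen_before & missing_now

print(fill_with_zero.tolist())
# [False, False, True, True, False]
```

The leading NaN on 2020-04-09 stays untouched because nothing precedes it to forward-fill from.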

print (df1.mask(df1.ffill().notna() & df1.isna(), 0))
ID            1    2    3
Date                     
2020-04-09  NaN  1.0  1.0
2020-04-10  1.0  1.0  1.0
2020-04-11  0.0  1.0  1.0
2020-04-12  0.0  1.0  1.0
2020-04-13  1.0  1.0  1.0
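
For convenience, the steps above can be put together into one self-contained script. This is only a sketch under a few assumptions: the helper name `fill_inactive` and its parameters are hypothetical, each (Date, ID) pair is assumed to occur at most once (otherwise `pivot` raises an error), and the explicit `dropna()` is there so the result does not depend on whether `stack()` drops missing values in the installed pandas version.

```python
import pandas as pd


def fill_inactive(df, date="Date", key="ID", value="Value"):
    """Hypothetical helper: add 0-valued rows for IDs that are missing
    on dates after their first appearance."""
    wide = df.pivot(index=date, columns=key, values=value)
    # Missing now, but present on some earlier date
    to_zero = wide.ffill().notna() & wide.isna()
    return (
        wide.mask(to_zero, 0)
        .stack()
        .dropna()          # removes dates before an ID first appears
        .astype(int)
        .reset_index(name=value)
    )


df = pd.DataFrame({
    "Date": [
        "2020-04-09", "2020-04-09",
        "2020-04-10", "2020-04-10", "2020-04-10",
        "2020-04-11", "2020-04-11",
        "2020-04-12", "2020-04-12",
        "2020-04-13", "2020-04-13", "2020-04-13",
    ],
    "ID": [2, 3, 1, 2, 3, 2, 3, 2, 3, 1, 2, 3],
    "Value": [1] * 12,
})

out = fill_inactive(df)
print(out)
```

ID 1 gets a 0 on 2020-04-11 and 2020-04-12, but no row on 2020-04-09, which is before its first occurrence.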