Python: Using pandas to create a new column from data in another row and another column
I am trying to create a new column, "New Confirmed", computed as the current row's "Confirmed" minus the previous day's "Confirmed". "Confirmed" is cumulative. My data looks like this:
Country,Date,Confirmed,Deaths,Recovered,Active
China,2020-01-21,10,5,1,100
China,2020-01-22,20,10,2,104
China,2020-01-23,30,15,3,116
France,2020-01-21,20,5,1,100
France,2020-01-22,30,10,2,118
France,2020-01-23,40,15,3,138
Desired output:
Country,Date,Confirmed,Deaths,Recovered,Active,New Confirmed
China,2020-01-21,10,5,1,100,0
China,2020-01-22,20,10,2,104,10
China,2020-01-23,30,15,3,116,10
France,2020-01-21,20,5,1,100,0
France,2020-01-22,30,10,2,118,10
France,2020-01-23,40,15,3,138,10
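If the value came from the same row, I would know how to add the new column, but I am not sure how to use data from another row. Any hints or suggestions would be greatly appreciated.

You can use the shift(), fillna(), and astype() methods: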
df['New Confirmed'] = df['Confirmed'] - df['Confirmed'].shift(1).fillna(df['Confirmed']).astype(int)
Now, if you print df, you will get the desired output:
Country Date Confirmed Deaths Recovered Active New Confirmed
0 China 2020-01-21 10 5 1 100 0
1 China 2020-01-22 20 10 2 104 10
2 China 2020-01-23 30 15 3 116 10
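For anyone who wants to run this end to end, here is a minimal sketch that assumes the China rows from the question are loaded from an in-memory CSV string (the variable names csv_text and previous are only illustrative). It splits the one-liner into steps to show what shift() and fillna() each contribute:

```python
import io

import pandas as pd

# Sample data copied from the question (China rows only).
csv_text = """Country,Date,Confirmed,Deaths,Recovered,Active
China,2020-01-21,10,5,1,100
China,2020-01-22,20,10,2,104
China,2020-01-23,30,15,3,116
"""
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# shift(1) moves every value down one row, so each row sees the previous row's value.
previous = df["Confirmed"].shift(1)

# The first row has no predecessor (NaN). Filling it with the row's own value
# makes the difference 0, and astype(int) undoes the float conversion caused by NaN.
previous = previous.fillna(df["Confirmed"]).astype(int)

df["New Confirmed"] = df["Confirmed"] - previous
print(df)
```

This relies on the rows already being in date order; if they might not be, sorting by Date first would be a sensible precaution.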
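Update:
When the data contains several countries, use the groupby(), shift(), fillna(), and astype() methods, so that the shift does not carry one country's last value into the next country's first row: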
df['New Confirmed'] = df['Confirmed'] - df.groupby('Country')['Confirmed'].shift(1).fillna(df['Confirmed']).astype(int)
Output of the code above:
Country Date Confirmed Deaths Recovered Active New Confirmed
0 China 2020-01-21 10 5 1 100 0
1 China 2020-01-22 20 10 2 104 10
2 China 2020-01-23 30 15 3 116 10
3 France 2020-01-21 20 5 1 100 0
4 France 2020-01-22 30 10 2 118 10
5 France 2020-01-23 40 15 3 138 10
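As a possible alternative, the same per-country difference can be sketched with groupby() followed by diff(), which subtracts the previous row within each group. This is a swapped-in variant, not the code from the answer above. It assumes the data is loaded from an in-memory CSV string and sorted by Country and Date first (the names csv_text and naive are only illustrative). The ungrouped version is included to show why the grouping matters:

```python
import io

import pandas as pd

# Sample data copied from the question (both countries).
csv_text = """Country,Date,Confirmed,Deaths,Recovered,Active
China,2020-01-21,10,5,1,100
China,2020-01-22,20,10,2,104
China,2020-01-23,30,15,3,116
France,2020-01-21,20,5,1,100
France,2020-01-22,30,10,2,118
France,2020-01-23,40,15,3,138
"""
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Assumption: rows should be ordered by date within each country before differencing.
df = df.sort_values(["Country", "Date"]).reset_index(drop=True)

# Without grouping, France's first row is compared with China's last row.
naive = df["Confirmed"].diff().fillna(0).astype(int)

# With grouping, the first row of each country has no predecessor (NaN), filled with 0.
df["New Confirmed"] = df.groupby("Country")["Confirmed"].diff().fillna(0).astype(int)

print(naive.tolist())
print(df)
```

In the ungrouped result, the first France row comes out negative (20 minus China's 30), which is the problem the grouped version avoids.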
That worked. Thanks.