Python: un-accumulate (ungroup) cumulative DataFrame values
The COVID-19 dataset looks like the one below. As you can see, each date holds the cumulative total of all previous dates. I only want to capture the cases added on each day, something like an "un-group by". I have this data in a pandas DataFrame. I tried using explode, but that is not the right solution.
Country 3/14/20 3/15/20 3/16/20 3/17/20 3/18/20 3/19/20 3/20/20 3/21/20
___________________________________________________________________________________
China 80977 81003 81033 81058 81102 81156 81250 81305
Italy 21157 24747 27980 31506 35713 41035 47021 53578
US 2727 3499 4632 6421 7783 13677 19100 25489
Spain 6391 7798 9942 11748 13910 17963 20410 25374
Germany 4585 5795 7272 9257 12327 15320 19848 22213
Iran 12729 13938 14991 16169 17361 18407 19644 20610
France 4496 4532 6683 7715 9124 10970 12758 14463
Korea, South 8086 8162 8236 8320 8413 8565 8652 8799
Switzerland 1359 2200 2200 2700 3028 4075 5294 6575
United Kingdom 1144 1145 1551 1960 2642 2716 4014 5067
The sample output I want looks like this, with only the cases added each day:
Country 3/14/20 3/15/20 3/16/20 3/17/20 3/18/20 3/19/20 3/20/20 3/21/20
___________________________________________________________________________________
China 32 26 30 25 44 54 94 55
Italy .....
US .....
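Any help is appreciated.
For me, DataFrame.diff works, but the first column is filled with missing values because the sample data has no previous column to subtract: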
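# If Country is still a regular column, make it the index first
# df = df.set_index('Country')
# Difference between each date column and the one before it
df = df.diff(axis=1)
print(df)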
3/14/20 3/15/20 3/16/20 3/17/20 3/18/20 3/19/20 3/20/20 \
Country
China NaN 26.0 30.0 25.0 44.0 54.0 94.0
Italy NaN 3590.0 3233.0 3526.0 4207.0 5322.0 5986.0
US NaN 772.0 1133.0 1789.0 1362.0 5894.0 5423.0
Spain NaN 1407.0 2144.0 1806.0 2162.0 4053.0 2447.0
Germany NaN 1210.0 1477.0 1985.0 3070.0 2993.0 4528.0
Iran NaN 1209.0 1053.0 1178.0 1192.0 1046.0 1237.0
France NaN 36.0 2151.0 1032.0 1409.0 1846.0 1788.0
Korea, South NaN 76.0 74.0 84.0 93.0 152.0 87.0
Switzerland NaN 841.0 0.0 500.0 328.0 1047.0 1219.0
United Kingdom NaN 1.0 406.0 409.0 682.0 74.0 1298.0
3/21/20
Country
China 55.0
Italy 6557.0
US 6389.0
Spain 4964.0
Germany 2365.0
Iran 966.0
France 1705.0
Korea, South 147.0
Switzerland 1281.0
United Kingdom 1053.0
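If the leading NaN column is unwanted, one option is to drop it and cast the rest back to integers. This is a minimal sketch, assuming a cut-down copy of the sample data (three countries, three dates); the names daily and daily_clean are made up for illustration.

```python
import pandas as pd

# Cut-down copy of the sample data from the question (assumption: Country is the index).
df = pd.DataFrame(
    {
        "3/14/20": [80977, 21157, 2727],
        "3/15/20": [81003, 24747, 3499],
        "3/16/20": [81033, 27980, 4632],
    },
    index=pd.Index(["China", "Italy", "US"], name="Country"),
)

# Column-wise difference along each row; the first date has no predecessor, so it is NaN.
daily = df.diff(axis=1)

# Drop the first (all-NaN) column and convert the remaining floats back to integers.
daily_clean = daily.iloc[:, 1:].astype(int)

print(daily_clean)
```

To get a real value for the first date (such as the 32 for China in the desired output), the DataFrame would need to include the preceding date's column before calling diff.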
Is it df.diff(axis=1)?
Or maybe df.T.diff().T?