Python: How to count categorical time series data in pandas

This week I decided to dive into pandas a bit. I have a pandas DataFrame containing historical IRC logs that looks like this:

timestamp           action   nick        message
2005-11-04 01:44:33 False    hack-cclub  lex, hey!
2005-11-04 01:44:43 False    hack-cclub  lol, yea thats broke
2005-11-04 01:44:56 False    lex         Slashdot - Updated 2005-11-04 00:23:00 | Micro...
2005-11-04 01:44:56 False    hack-cclub  lex slashdot
2005-11-04 01:45:12 False    lex         port 666 is doom - doom Id Software (or mdqs o..
2005-11-04 01:45:12 False    hack-cclub  lex, port 666
2005-11-04 01:45:21 False    hitokiri    lex, port 23485
2005-11-04 01:45:45 False    hitokiri    lex, port 1024
2005-11-04 01:45:46 True     hack-cclub  slaps lex around with a wet fish
There are about 5.5 million rows, and I'm trying to do some basic visualizations, such as the top 25 nicks. I know I can get the top 25 nicks like this:

df['nick'].value_counts()[:25]
What I would like is a rolling count like this:

hack-cclub lex hitokiri
1          0   0
2          0   0
2          1   0
3          1   0
3          2   0
4          2   0
4          2   1
4          2   2
5          2   2
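That way I can plot an area chart of messages for the top 25 nicks from the beginning of time. I know I could do this by iterating over the entire DataFrame and keeping a count, but since the whole point of this is to learn pandas, I was hoping there is a more idiomatic way to do it. It would also be nice to have the same data with ranks instead of running counts, like this: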

hack-cclub lex hitokiri
1          2   2
1          2   2
1          2   3
1          2   3
1          2   3
1          2   3
1          2   3
1          2   2
1          2   2
IIUC, you need two pandas operations here:
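The function names did not survive in this answer, so the following is only a minimal sketch of one common way to get the running counts, assuming the two operations are `pd.get_dummies` and `cumsum`. The toy frame below mirrors the sample rows from the question, and the variable names (`s`, `top`, `counts`) are made up for illustration.

```python
import pandas as pd

# Toy frame mirroring the nine sample rows from the question
# (only the columns needed for counting).
df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2005-11-04 01:44:33", "2005-11-04 01:44:43",
                "2005-11-04 01:44:56", "2005-11-04 01:44:56",
                "2005-11-04 01:45:12", "2005-11-04 01:45:12",
                "2005-11-04 01:45:21", "2005-11-04 01:45:45",
                "2005-11-04 01:45:46",
            ]
        ),
        "nick": [
            "hack-cclub", "hack-cclub", "lex", "hack-cclub", "lex",
            "hack-cclub", "hitokiri", "hitokiri", "hack-cclub",
        ],
    }
)

# Nick per row, indexed by time.
s = df.set_index("timestamp")["nick"]

# The most active nicks (25 in the question; the toy data only has 3).
top = s.value_counts().head(25).index

# Blank out every nick outside the top list so that get_dummies only
# creates columns for the nicks we care about; those rows become all zeros.
dummies = pd.get_dummies(s.where(s.isin(top))).astype(int)

# Cumulative sum down the rows = messages per nick so far.
counts = dummies.cumsum()

print(counts)
```

With the real data, `counts.plot.area()` should then draw the stacked area chart (it needs matplotlib). With 5.5 million rows, resampling first, for example `dummies.resample("D").sum().cumsum()`, would probably keep the frame and the plot much smaller, but that is a suggestion rather than something the thread confirms.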


I don't understand the last DataFrame. Can you explain it?
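If "the last DataFrame" means the ranking table at the end of the question, one possible reading (an assumption, the thread does not confirm it) is that each row holds the rank of every nick by its running count at that point, with ties sharing the lowest rank. Under that reading, a sketch using `DataFrame.rank` on the running counts seems to reproduce the table; the names `nicks`, `counts` and `ranks` are illustrative only.

```python
import pandas as pd

# Nick column of the nine sample rows from the question.
nicks = pd.Series(
    [
        "hack-cclub", "hack-cclub", "lex", "hack-cclub", "lex",
        "hack-cclub", "hitokiri", "hitokiri", "hack-cclub",
    ]
)

# Running message count per nick (one column per nick).
counts = pd.get_dummies(nicks).astype(int).cumsum()

# Rank the nicks within each row: the highest count gets rank 1,
# and tied nicks share the smallest rank of their group (method="min").
ranks = counts.rank(axis=1, method="min", ascending=False).astype(int)

print(ranks[["hack-cclub", "lex", "hitokiri"]])
```

For example, after the first row the counts are 1, 0, 0, so hack-cclub is ranked 1 and the other two tie at rank 2, which matches the first line of the ranking table.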