Python: find the distinct count within groups

Tags: python, pandas

I have a pandas DataFrame in the following format:

DATE        ID_1    ID_2
2017-01-20  J1234   1234567 
2017-01-20  K2345   2143567     
2017-01-21  K2345   1234567
2017-01-21  R2233   3840173
2017-01-21  J1234   9876543 
2017-01-21  J1234   0092861
2017-01-21  R2233   3792462
2017-01-22  J1234   3451628
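I am trying to get the distinct count of ID_2 values within each ID_1 for each date, so that I can eventually plot the date (x-axis) against the number of distinct ID_2 values in each ID_1 (y-axis). The DataFrame to plot would therefore look like this: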

DATE        ID_1    Count_ID_2
2017-01-20  J1234   1   
2017-01-20  K2345   1
2017-01-21  K2345   1   
2017-01-21  R2233   2
2017-01-21  J1234   2
2017-01-22  J1234   1

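Each ID_1 would have its own line on the plot. Note that the ID_2 column contains duplicates. I am new to Python and pandas and am trying to find the right code for this kind of operation. I usually do this in Excel, but the data file is now too large and Excel is too slow. Thanks in advance for your help.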

Try using groupby with count (use nunique in place of count if an ID_2 value can repeat within a group):

df.groupby(['DATE','ID_1'], as_index=False)['ID_2'].count()
Output:

         DATE   ID_1  ID_2
0  2017-01-20  J1234     1
1  2017-01-20  K2345     1
2  2017-01-21  J1234     2
3  2017-01-21  K2345     1
4  2017-01-21  R2233     2
5  2017-01-22  J1234     1
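
Since the question asks for a distinct count, here is a minimal sketch of the nunique variant on the sample data. It assumes ID_2 is stored as a string (so that values such as 0092861 keep their leading zeros), and the column name Count_ID_2 is taken from the desired output in the question.

```python
import pandas as pd

# Sample data from the question; ID_2 is kept as a string so leading zeros survive.
df = pd.DataFrame({
    "DATE": ["2017-01-20", "2017-01-20", "2017-01-21", "2017-01-21",
             "2017-01-21", "2017-01-21", "2017-01-21", "2017-01-22"],
    "ID_1": ["J1234", "K2345", "K2345", "R2233",
             "J1234", "J1234", "R2233", "J1234"],
    "ID_2": ["1234567", "2143567", "1234567", "3840173",
             "9876543", "0092861", "3792462", "3451628"],
})

# nunique() counts distinct ID_2 values per (DATE, ID_1) group,
# whereas count() counts every non-null row, repeats included.
distinct_counts = (
    df.groupby(["DATE", "ID_1"])["ID_2"]
      .nunique()
      .reset_index(name="Count_ID_2")
)
print(distinct_counts)
```

On this sample no ID_2 value repeats within a (DATE, ID_1) group, so count and nunique give the same numbers; they would differ only on data with repeated rows.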

Try using value_counts.
PS: this is a newer pandas feature (DataFrame.value_counts, added in pandas 1.1) that accepts multiple columns.

df.value_counts(['DATE','ID_1'])#.reset_index()

Out[9]: 
DATE        ID_1 
2017-01-21  R2233    2
            J1234    2
2017-01-22  J1234    1
2017-01-21  K2345    1
2017-01-20  K2345    1
            J1234    1
dtype: int64
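
Like count, value_counts tallies rows, so repeated ID_2 values would be counted more than once. One possible way to get distinct counts with this approach, sketched here under the assumption that ID_2 is stored as a string, is to drop repeated rows first. The last step reshapes the result so that each ID_1 becomes a column, which is the layout a per-ID_1 line plot would need; the variable names are illustrative only.

```python
import pandas as pd

# Sample data from the question; ID_2 is kept as a string so leading zeros survive.
df = pd.DataFrame({
    "DATE": ["2017-01-20", "2017-01-20", "2017-01-21", "2017-01-21",
             "2017-01-21", "2017-01-21", "2017-01-21", "2017-01-22"],
    "ID_1": ["J1234", "K2345", "K2345", "R2233",
             "J1234", "J1234", "R2233", "J1234"],
    "ID_2": ["1234567", "2143567", "1234567", "3840173",
             "9876543", "0092861", "3792462", "3451628"],
})

# Drop repeated (DATE, ID_1, ID_2) rows so each ID_2 is counted once per group.
# DataFrame.value_counts requires pandas 1.1 or later.
counts = (
    df.drop_duplicates(subset=["DATE", "ID_1", "ID_2"])
      .value_counts(["DATE", "ID_1"])
      .sort_index()
)

# One row per DATE and one column per ID_1; combinations that never occur become 0.
wide = counts.unstack("ID_1", fill_value=0)
print(wide)

# With matplotlib installed, wide.plot() would draw one line per ID_1.
```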