Python: counting occurrences when grouping by two columns

Suppose I have a pandas DataFrame like the following:

import numpy as np
import pandas as pd

df = pd.DataFrame()
df["person"] = ["p1", "p2", "p1", "p3", "p3", "p2", "p2", "p1", "p3", "p1",
  "p1", "p2", "p2", "p1", "p3"]
df["type"] = ["a", "a", "a", "a", "b", "a", "a", "b", "b", "b", "a", "a",
  "b", "a", "b"]
df["value"] = np.random.random(15)  # unseeded, so values differ on every run

# Left-closed quarter-width bins, labelled "0.0-0.25", "0.25-0.5", ...
bins = [0, 0.25, 0.5, 0.75, 1]
labels = [f"{float(i)}-{float(j)}" for i, j in zip(bins[:-1], bins[1:])]
df["bin"] = pd.cut(df["value"], bins=bins, labels=labels, right=False)
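I want to insert a new column that returns the count of each person grouped by type. From browsing, I found that the following line works, but only if I do not include the last column, bin. My question is how to insert the counter column into a DataFrame that also includes the bin column. Thanks in advance.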

df["counter"] = df.groupby(["person", "type"], as_index = False).transform("count")
Change it to

df["counter"] = df.groupby(["person", "type"], as_index = False)['value'].transform("count")
and you get

   person type     value       bin  counter
0      p1    a  0.134629  0.0-0.25        4
1      p2    a  0.997557  0.75-1.0        4
2      p1    a  0.911967  0.75-1.0        4
3      p3    a  0.278438  0.25-0.5        1
4      p3    b  0.539296  0.5-0.75        3
5      p2    a  0.722150  0.5-0.75        4
6      p2    a  0.724028  0.5-0.75        4
7      p1    b  0.989627  0.75-1.0        2
8      p3    b  0.978790  0.75-1.0        3
9      p1    b  0.197428  0.0-0.25        2
10     p1    a  0.330113  0.25-0.5        4
11     p2    a  0.806856  0.75-1.0        4
12     p2    b  0.430026  0.25-0.5        1
13     p1    a  0.265003  0.25-0.5        4
14     p3    b  0.037202  0.0-0.25        3
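
A minimal sketch of why the column selection appears to matter, assuming a recent pandas version. The seed and the names rng, grouped, and wide are illustrative and not part of the original post. Without a selection, transform("count") counts every non-key column and returns a DataFrame with one column per remaining column, which does not fit into a single new column once bin exists. Selecting value first yields a Series.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "person": ["p1", "p2", "p1", "p3", "p3", "p2", "p2", "p1",
               "p3", "p1", "p1", "p2", "p2", "p1", "p3"],
    "type": ["a", "a", "a", "a", "b", "a", "a", "b",
             "b", "b", "a", "a", "b", "a", "b"],
})
rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
df["value"] = rng.random(15)

bins = [0, 0.25, 0.5, 0.75, 1]
labels = [f"{float(i)}-{float(j)}" for i, j in zip(bins[:-1], bins[1:])]
df["bin"] = pd.cut(df["value"], bins=bins, labels=labels, right=False)

grouped = df.groupby(["person", "type"])

# No column selected: one count column per non-key column (value and bin).
wide = grouped.transform("count")
print(list(wide.columns))

# One column selected: a Series aligned to df's index, safe to assign.
df["counter"] = grouped["value"].transform("count")
print(df[["person", "type", "bin", "counter"]])
```

The counter values depend only on person and type, so they should match the table above even though the random values differ.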

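Thanks. What difference does the column selection in the added part make?
I don't think there is any difference, since you are only counting the size of each group.
Something is wrong with your numbers; you should include bin in your grouping.
The OP said "from browsing, I found that the following line works, but only if I do not include the last column, bin", so I think they do not want bin included in the calculation?
My wording may have been a bit confusing. I do not want to include bin in the calculation, but I do want it to be part of the dataset. The line of code I first stumbled upon does not work when bin is part of the dataset.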
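
On the point in the comments that any column gives the same result: that holds as long as the selected column has no missing values. A small sketch on made-up data (the frame and the column names n_non_null and n_rows are hypothetical) shows where "count" and "size" diverge, since "count" skips NaN while "size" counts every row in the group.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in the first group.
df = pd.DataFrame({
    "person": ["p1", "p1", "p1", "p2"],
    "type": ["a", "a", "a", "a"],
    "value": [0.1, np.nan, 0.3, 0.4],
})

grouped = df.groupby(["person", "type"])["value"]

# "count" ignores NaN entries in the selected column.
df["n_non_null"] = grouped.transform("count")

# "size" is the number of rows in the group, regardless of NaN.
df["n_rows"] = grouped.transform("size")

print(df)
```

If the goal is the number of rows per (person, type) pair rather than the number of non-null values, "size" may be the safer choice.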