Python pandas groupby


I have a DataFrame that looks like this:

import numpy as np
import pandas as pd

df=pd.DataFrame({'variable':["A","A","B","B","C","D","E","E","E","F","F","G"],'weight':[2,2,0,0,1,3,5,5,5,0,0,4]})


Out[129]: 
   variable  weight
0         A       2
1         A       2
2         B       0
3         B       0
4         C       1
5         D       3
6         E       5
7         E       5
8         E       5
9         F       0
10        F       0
11        G       4
I want to create a new column, grouped by `variable`, whose values are derived from the `weight` column itself.

In R, I can easily get the desired output using `rowwise` from `dplyr`:

library(dplyr)
test <-
  data.frame(
    variable    = c("A","A","B","B","C","D","E","E","E","F","F","G"), 
    weight      = c(2,2,0,0,1,3,5,5,5,0,0,4)
  )

test %>%
  group_by(variable) %>%
  rowwise() %>%
  mutate(Var = ifelse(weight == 2, 1, ifelse(.Last.value == 1 | weight > 1, 0, NA)))

Output

df['NEW']=l1
df
Out[232]: 
   variable  weight    NEW
0         A       2      1
1         A       2      1
2         B       0      1
3         B       0      1
4         C       1      1
5         D       3  ERROR
6         E       3  ERROR
7         E       5      0
8         E       5      0
9         F       0      1
10        F       0      1
11        G       4  ERROR
No `groupby` is needed.
Let me know if I have interpreted this correctly.


Option 1

df.assign(Var=df.weight.eq(2).mul(1).mask(df.weight.le(1))) 

   variable  weight  Var
0         A       2  1.0
1         A       2  1.0
2         B       0  NaN
3         B       0  NaN
4         C       1  NaN
5         D       3  0.0
6         E       5  0.0
7         E       5  0.0
8         E       5  0.0
9         F       0  NaN
10        F       0  NaN
11        G       4  0.0
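
Because the chained expression in Option 1 can be hard to read, here is a rough sketch that splits it into named steps. The intermediate names (`is_two`, `as_int`, `low`) are made up for illustration and are not part of the original answer; the logic is intended to match the one-liner above.

```python
import pandas as pd

df = pd.DataFrame({
    'variable': ["A", "A", "B", "B", "C", "D", "E", "E", "E", "F", "F", "G"],
    'weight': [2, 2, 0, 0, 1, 3, 5, 5, 5, 0, 0, 4],
})

is_two = df.weight.eq(2)   # boolean Series: True where weight == 2
as_int = is_two.mul(1)     # True/False becomes 1/0
low = df.weight.le(1)      # True where weight <= 1
var = as_int.mask(low)     # where weight <= 1, replace the 1/0 with NaN

result = df.assign(Var=var)
```

So `weight == 2` gives 1, `weight <= 1` gives NaN, and every other weight gives 0. The column becomes float because NaN cannot be stored in an integer column.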

Option 2

df.assign(Var=np.array([np.nan, 1, 0])[np.searchsorted([1, 2], df.weight.values)])

   variable  weight  Var
0         A       2  1.0
1         A       2  1.0
2         B       0  NaN
3         B       0  NaN
4         C       1  NaN
5         D       3  0.0
6         E       5  0.0
7         E       5  0.0
8         E       5  0.0
9         F       0  NaN
10        F       0  NaN
11        G       4  0.0
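
The `np.searchsorted` lookup can be unpacked as in the sketch below. It assumes integer weights, as in the sample data; the names `bins` and `lookup` are illustrative only. With the default `side='left'`, weights of 0 or 1 fall in bin 0, a weight of 2 falls in bin 1, and anything larger falls in bin 2.

```python
import numpy as np

weights = np.array([2, 2, 0, 0, 1, 3, 5, 5, 5, 0, 0, 4])

# Position at which each weight would be inserted into the sorted list [1, 2]
bins = np.searchsorted([1, 2], weights)

# bin 0 (weight <= 1) -> NaN, bin 1 (weight == 2) -> 1, bin 2 (weight > 2) -> 0
lookup = np.array([np.nan, 1, 0])
var = lookup[bins]
```

Note that a non-integer weight strictly between 1 and 2 would also land in bin 1 and be mapped to 1, so this variant matches Option 1 only for integer weights.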

Option 3

df.assign(Var=np.array([1, 0, np.nan])[np.sign(df.weight.values - 2)])

   variable  weight  Var
0         A       2  1.0
1         A       2  1.0
2         B       0  NaN
3         B       0  NaN
4         C       1  NaN
5         D       3  0.0
6         E       5  0.0
7         E       5  0.0
8         E       5  0.0
9         F       0  NaN
10        F       0  NaN
11        G       4  0.0
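
The `np.sign` variant relies on negative indexing, which the following sketch tries to make explicit. It assumes an integer `weight` column, since a float array of signs could not be used as an index; the names `signs` and `lookup` are illustrative.

```python
import numpy as np

weights = np.array([2, 2, 0, 0, 1, 3, 5, 5, 5, 0, 0, 4])

# -1 where weight < 2, 0 where weight == 2, 1 where weight > 2
signs = np.sign(weights - 2)

# index 0 -> 1, index 1 -> 0, index -1 wraps around to the last element (NaN)
lookup = np.array([1, 0, np.nan])
var = lookup[signs]
```

For integer weights this gives the same result as the other two options: weights below 2 map to NaN, a weight of exactly 2 maps to 1, and weights above 2 map to 0.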

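I don't see where `groupby` comes into play.

`df.assign(Var=df.weight.eq(2).mul(1).mask(df.weight.le(1)))` I can't follow the logic behind this combination... could you explain it?

@cᴏʟᴅsᴘᴇᴇᴅ I will open a new question.

@cᴏʟᴅsᴘᴇᴇᴅ Sorry, I made an update and have already accepted Pir's answer. Thank you both for your time!! This is my fault: my sample data could not distinguish this case.

No worries. Once I see the new data and understand the problem, I will update my answer.

Let me open a new question. Sorry about that, since your answer does correctly produce the output.

I made an update. To be precise, the question was put to me by a colleague... I feel like I have been stuck on it all night...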