Python pandas groupby
I have a DataFrame like this:
import pandas as pd
df=pd.DataFrame({'variable':["A","A","B","B","C","D","E","E","E","F","F","G"],'weight':[2,2,0,0,1,3,5,5,5,0,0,4]})
Out[129]:
variable weight
0 A 2
1 A 2
2 B 0
3 B 0
4 C 1
5 D 3
6 E 5
7 E 5
8 E 5
9 F 0
10 F 0
11 G 4
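I want to create a new column, grouped by variable, where the new column's values depend on the weight column and on the new column itself.
In R, I can easily get the desired output using rowwise from dplyr: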
library(dplyr)
test <-
data.frame(
variable = c("A","A","B","B","C","D","E","E","E","F","F","G"),
weight = c(2,2,0,0,1,3,5,5,5,0,0,4)
)
test%>%group_by(variable)%>%rowwise()%>%mutate(Var=ifelse (weight==2,1,ifelse(.Last.value ==1|weight>1,0,NA)))
Output:
df['NEW']=l1
df
Out[232]:
variable weight NEW
0 A 2 1
1 A 2 1
2 B 0 1
3 B 0 1
4 C 1 1
5 D 3 ERROR
6 E 3 ERROR
7 E 5 0
8 E 5 0
9 F 0 1
10 F 0 1
11 G 4 ERROR
No groupby. Let me know if I have interpreted this correctly.
Option 1
df.assign(Var=df.weight.eq(2).mul(1).mask(df.weight.le(1)))
variable weight Var
0 A 2 1.0
1 A 2 1.0
2 B 0 NaN
3 B 0 NaN
4 C 1 NaN
5 D 3 0.0
6 E 5 0.0
7 E 5 0.0
8 E 5 0.0
9 F 0 NaN
10 F 0 NaN
11 G 4 0.0
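Since the chained expression above is compact, here is a minimal sketch that splits it into its intermediate steps. The names is_two, as_int, low, var and steps are illustrative only and do not appear in the original answer; the data is the sample frame from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'variable': ["A", "A", "B", "B", "C", "D", "E", "E", "E", "F", "F", "G"],
    'weight': [2, 2, 0, 0, 1, 3, 5, 5, 5, 0, 0, 4],
})

# Step 1: boolean Series, True where weight == 2
is_two = df.weight.eq(2)

# Step 2: multiplying by 1 turns True/False into 1/0
as_int = is_two.mul(1)

# Step 3: boolean Series, True where weight <= 1
low = df.weight.le(1)

# Step 4: mask() replaces values with NaN wherever the condition is True,
# so rows with weight <= 1 become NaN and the column becomes float
var = as_int.mask(low)

# Side-by-side view of every intermediate step (illustrative helper frame)
steps = pd.DataFrame({
    'weight': df.weight,
    'is_two': is_two,
    'as_int': as_int,
    'low': low,
    'Var': var,
})
print(steps)
```

Read row by row, weight == 2 yields 1, weight <= 1 yields NaN, and everything else (weight > 2 in this sample) yields 0, which matches the Var column shown above.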
Option 2
df.assign(Var=np.array([np.nan, 1, 0])[np.searchsorted([1, 2], df.weight.values)])
variable weight Var
0 A 2 1.0
1 A 2 1.0
2 B 0 NaN
3 B 0 NaN
4 C 1 NaN
5 D 3 0.0
6 E 5 0.0
7 E 5 0.0
8 E 5 0.0
9 F 0 NaN
10 F 0 NaN
11 G 4 0.0
Option 3
df.assign(Var=np.array([1, 0, np.nan])[np.sign(df.weight.values - 2)])
variable weight Var
0 A 2 1.0
1 A 2 1.0
2 B 0 NaN
3 B 0 NaN
4 C 1 NaN
5 D 3 0.0
6 E 5 0.0
7 E 5 0.0
8 E 5 0.0
9 F 0 NaN
10 F 0 NaN
11 G 4 0.0
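The np.searchsorted and np.sign variants both work by turning each weight into an index into a small lookup array. The sketch below shows those indices explicitly on a handful of hand-picked weights; the names weights, pos, sgn, lookup_a and lookup_b are illustrative and not part of the original answer.

```python
import numpy as np

# A few representative integer weights (chosen for illustration)
weights = np.array([0, 1, 2, 3, 5])

# searchsorted finds where each weight would be inserted into [1, 2]
# (left side by default): weight <= 1 -> 0, weight == 2 -> 1, weight > 2 -> 2
pos = np.searchsorted([1, 2], weights)
lookup_a = np.array([np.nan, 1, 0])[pos]

# sign(weight - 2) is -1 below 2, 0 at exactly 2, and 1 above 2;
# index -1 picks the LAST element of the lookup array, i.e. NaN
sgn = np.sign(weights - 2)
lookup_b = np.array([1, 0, np.nan])[sgn]

print(pos, lookup_a)
print(sgn, lookup_b)
```

One caveat worth keeping in mind: the np.sign approach depends on the weights being integers, because np.sign of a float array returns floats, which NumPy does not accept as indices.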
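Comments:
I don't see where groupby comes into play.
df.assign(Var=df.weight.eq(2).mul(1).mask(df.weight.le(1))) I can't follow the logic behind this combination... could you explain it?
@cᴏʟᴅsᴘᴇᴇᴅ I will open a new question.
@cᴏʟᴅsᴘᴇᴇᴅ Sorry, I made an update and have already accepted Pir's answer. I appreciate the time from both of you! This is my fault: my sample data could not distinguish the cases.
No worries. Once I see the new data and understand the problem, I'll update my answer.
Let me open a new question. Sorry about that, since your answer does correctly produce the output.
I made an update. To be precise, this question was put to me by a colleague... I felt stuck on it all night...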