R: How to get values in a data frame column based on sequence and values in other columns
I have a data frame, df, sorted by v1 and v2. For each group of unique values in v1 (the values 1, 2 and 3 in the sample data), I want to compute a new variable, v5. The value of v5 depends on the values of v3 and v4:
If v3 == "New", then v5 == v4.
If v3 == "Old", then v5 takes the value of v4 from the most recent preceding row where v3 equals "New", within the same v1 "group".
Sample data:
df <- data.frame(v1=c(1,1,1,2,2,2,3,3,3,3),
v2=c(1,2,3,1,2,3,1,2,3,4),
v3=c("New", "Old", "Old","New", "Old", "New","New", "New", "Old","Old"),
v4=c("A","B","C","X","Y","Z","A","B","C","D"))
v1 v2 v3 v4
1 1 New A
1 2 Old B
1 3 Old C
2 1 New X
2 2 Old Y
2 3 New Z
3 1 New A
3 2 New B
3 3 Old C
3 4 Old D
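We can try using data.table. Convert the 'data.frame' to a 'data.table' (setDT(df)). Grouped by 'v1', we replace the 'v4' elements that correspond to 'Old' values in 'v3' with NA, then use na.locf (from library(zoo)) to replace the NA values with the preceding non-NA value, and assign (:=) the output to create the new column 'v5'.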
library(data.table)
library(zoo)
setDT(df)[, v5:= na.locf(replace(v4, v3=='Old', NA)) , by = v1]
df
# v1 v2 v3 v4 v5
# 1: 1 1 New A A
# 2: 1 2 Old B A
# 3: 1 3 Old C A
# 4: 2 1 New X X
# 5: 2 2 Old Y X
# 6: 2 3 New Z Z
# 7: 3 1 New A A
# 8: 3 2 New B B
# 9: 3 3 Old C B
#10: 3 4 Old D B
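One caveat with the approach above: by default, na.locf drops leading NAs (na.rm = TRUE), so a group that starts with "Old" would return fewer values than it has rows. The following is a minimal sketch of how na.rm = FALSE handles that case. The data frame df2 is hypothetical and not part of the original question.

```r
library(data.table)
library(zoo)

# Hypothetical data: group 2 starts with "Old", so its first row
# has no preceding "New" row to take a value from
df2 <- data.frame(v1 = c(1, 1, 2, 2, 2),
                  v3 = c("New", "Old", "Old", "New", "Old"),
                  v4 = c("A", "B", "X", "Y", "Z"),
                  stringsAsFactors = FALSE)

# na.rm = FALSE keeps leading NAs, so each group returns one value per row
setDT(df2)[, v5 := na.locf(replace(v4, v3 == "Old", NA), na.rm = FALSE), by = v1]

df2$v5
# "A" "A" NA  "Y" "Y"
```

Rows with no earlier "New" in their group simply stay NA.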
Or we can use ave from base R (na.locf still comes from zoo):
df$v5 <- with(df, ave(replace(v4, v3=='Old', NA),v1, FUN= na.locf))
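If zoo is not available, the same idea can be sketched with base R only. This is an alternative technique, not the one used in the answer: it carries forward the row number of the latest "New" row with cummax instead of filling NAs. The helper names new_row and last_new are made up for illustration.

```r
# Sample data from the question
df <- data.frame(v1 = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
                 v2 = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 4),
                 v3 = c("New", "Old", "Old", "New", "Old", "New", "New", "New", "Old", "Old"),
                 v4 = c("A", "B", "C", "X", "Y", "Z", "A", "B", "C", "D"),
                 stringsAsFactors = FALSE)

# Row number for each "New" row, 0 for each "Old" row
new_row <- ifelse(df$v3 == "New", seq_len(nrow(df)), 0)

# Within each v1 group, carry the latest "New" row number forward
last_new <- ave(new_row, df$v1, FUN = cummax)

# 0 means no "New" row has been seen yet in the group; map it to NA
df$v5 <- df$v4[replace(last_new, last_new == 0, NA)]

df$v5
# "A" "A" "A" "X" "X" "Z" "A" "B" "B" "B"
```

Like the other solutions, this assumes the rows are already in the intended order within each group.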
df$v5
You can also use the dplyr package:
library(dplyr)
library(zoo)
df <- data.frame(v1=c(1,1,1,2,2,2,3,3,3,3),
v2=c(1,2,3,1,2,3,1,2,3,4),
v3=c("New", "Old", "Old","New", "Old", "New","New", "New", "Old","Old"),
v4=c("A","B","C","X","Y","Z","A","B","C","D"),
stringsAsFactors = FALSE)
df %>%
group_by(v1) %>%
mutate(v5=ifelse(v3=="New", v4, NA),
v5=na.locf(v5))
# Source: local data frame [10 x 5]
# Groups: v1 [3]
#
# v1 v2 v3 v4 v5
# (dbl) (dbl) (chr) (chr) (chr)
# 1 1 1 New A A
# 2 1 2 Old B A
# 3 1 3 Old C A
# 4 2 1 New X X
# 5 2 2 Old Y X
# 6 2 3 New Z Z
# 7 3 1 New A A
# 8 3 2 New B B
# 9 3 3 Old C B
# 10 3 4 Old D B
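As a variation on the dplyr answer, the fill step can also be done with tidyr::fill instead of zoo::na.locf. This is a sketch of a swapped-in technique, not what the answer uses. fill works within groups on a grouped data frame and leaves leading NAs in place. The result name res is arbitrary.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
df <- data.frame(v1 = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
                 v2 = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 4),
                 v3 = c("New", "Old", "Old", "New", "Old", "New", "New", "New", "Old", "Old"),
                 v4 = c("A", "B", "C", "X", "Y", "Z", "A", "B", "C", "D"),
                 stringsAsFactors = FALSE)

res <- df %>%
  group_by(v1) %>%
  # Keep v4 on "New" rows, blank out "Old" rows
  mutate(v5 = if_else(v3 == "New", v4, NA_character_)) %>%
  # Fill NAs downward within each v1 group
  fill(v5) %>%
  ungroup()

res$v5
# "A" "A" "A" "X" "X" "Z" "A" "B" "B" "B"
```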
Great. Thx @docendo discimus