R: Create a new variable using a lookup table
I want to create a new variable using a lookup table. The data frame looks like this:
id sex age length
1 Female 1 45
2 Female 2 54
3 Female 3 56
4 Female 4 60
5 Female 5 60
6 Female 6 61
7 Female 7 63
8 Male 1 55
9 Male 2 54
10 Male 3 58
11 Male 4 61
12 Male 5 65
13 Male 6 63
14 Male 7 65
15 Male 8 67
16 Male 9 68
17 Male 10 69
The lookup table looks like this:
sex age length
Female 1 50
Female 2 53
Female 3 56
Female 4 58
Female 5 60
Female 6 61
Female 7 63
Male 1 50
Male 2 54
Male 3 57
Male 4 60
Male 5 62
Male 6 63
Male 7 65
Male 8 66
Male 9 67
Male 10 69
I want to create a new variable, growth.rate, with two levels, "Normal" and "Low", so that the final data frame looks like this:
id sex age length growth.rate
1 Female 1 45 Low
2 Female 2 54 Normal
3 Female 3 56 Low
4 Female 4 60 Normal
5 Female 5 60 Low
6 Female 6 61 Low
7 Female 7 63 Low
8 Male 1 55 Normal
9 Male 2 54 Low
10 Male 3 58 Normal
11 Male 4 61 Normal
12 Male 5 65 Normal
13 Male 6 63 Low
14 Male 7 65 Low
15 Male 8 67 Normal
16 Male 9 68 Normal
17 Male 10 69 Low
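In this example, growth.rate for id 1 is "Low" because her length is below the lookup-table value for 1-year-old females.
By contrast, growth.rate for id 2 is "Normal" because her length is above the lookup-table value for 2-year-old females.
I tried to adapt this solution, but without success.
Any help is much appreciated.
If we do a left join between the first dataset and the lookup dataset on 'sex' and 'age', we get two 'length' columns (length.x from the data and length.y from the lookup). We can then compare those columns and create the new column with ifelse or case_when: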
library(dplyr)
left_join(df1, lookup, by = c('sex', 'age')) %>%
transmute(id, sex, age,
growth.rate = case_when(length.x <= length.y ~ "Low",
TRUE ~ "Normal"), length = length.x)
# id sex age growth.rate length
#1 1 Female 1 Low 45
#2 2 Female 2 Normal 54
#3 3 Female 3 Low 56
#4 4 Female 4 Normal 60
#5 5 Female 5 Low 60
#6 6 Female 6 Low 61
#7 7 Female 7 Low 63
#8 8 Male 1 Normal 55
#9 9 Male 2 Low 54
#10 10 Male 3 Normal 58
#11 11 Male 4 Normal 61
#12 12 Male 5 Normal 65
#13 13 Male 6 Low 63
#14 14 Male 7 Low 65
#15 15 Male 8 Normal 67
#16 16 Male 9 Normal 68
#17 17 Male 10 Low 69
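One thing to watch with the case_when call above: if a sex/age combination is missing from the lookup table, length.y is NA after the left join, the comparison yields NA, and the row falls through to TRUE ~ "Normal". The following is a minimal sketch of handling that case explicitly. The small df1 and lookup here are made-up illustration data (not the data from the question), and returning NA for unmatched rows is an assumption about what you would want.

```r
library(dplyr)

# Hypothetical illustration data: id 3 has no matching row in the lookup
df1 <- data.frame(id = 1:3,
                  sex = c("Female", "Female", "Male"),
                  age = c(1L, 2L, 11L),
                  length = c(45, 54, 70))
lookup <- data.frame(sex = c("Female", "Female"),
                     age = c(1L, 2L),
                     length = c(50, 53))

result <- left_join(df1, lookup, by = c("sex", "age")) %>%
  mutate(growth.rate = case_when(
    is.na(length.y)      ~ NA_character_,  # no reference value available
    length.x <= length.y ~ "Low",
    TRUE                 ~ "Normal"
  )) %>%
  select(id, sex, age, length = length.x, growth.rate)
```

With the data from the question every row has a match, so this gives the same labels as the shorter version.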
Or use indexing with data.table:
library(data.table)
setDT(df1)[lookup, growth.rate :=
c("Normal", "Low")[1 + (length <= i.length)], on = .(sex, age)]
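The indexing trick in that expression may be easier to see in isolation. A logical vector is coerced to 0/1 when added to a number, so 1 + (condition) gives 1 where the condition is FALSE and 2 where it is TRUE, and that is then used to pick from c("Normal", "Low"). A small base R sketch, with len and ref_len as made-up names for the first three rows of the question's data:

```r
# First three Female rows: observed lengths and their lookup values
len     <- c(45, 54, 56)
ref_len <- c(50, 53, 56)

# TRUE/FALSE becomes 1/0, so the index is 2 when len <= ref_len, else 1
idx <- 1 + (len <= ref_len)

# Position 1 is "Normal", position 2 is "Low"
growth <- c("Normal", "Low")[idx]
growth
# [1] "Low"    "Normal" "Low"
```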
In base R, we can use merge to join the two data frames by sex and age, and create a new column by checking the condition with ifelse:
transform(merge(df1, lookup, all.x = TRUE, by = c("sex", "age")),
growth.rate = ifelse(length.x > length.y, "Normal", "Low"))
# sex age id length.x length.y growth.rate
#1 Female 1 1 45 50 Low
#2 Female 2 2 54 53 Normal
#3 Female 3 3 56 56 Low
#4 Female 4 4 60 58 Normal
#5 Female 5 5 60 60 Low
#6 Female 6 6 61 61 Low
#7 Female 7 7 63 63 Low
#8 Male 1 8 55 50 Normal
#9 Male 10 17 69 69 Low
#10 Male 2 9 54 54 Low
#11 Male 3 10 58 57 Normal
#12 Male 4 11 61 60 Normal
#13 Male 5 12 65 62 Normal
#14 Male 6 13 63 63 Low
#15 Male 7 14 65 65 Low
#16 Male 8 15 67 66 Normal
#17 Male 9 16 68 67 Normal
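The merged output above keeps both length columns and is sorted by the join keys rather than by id. As a rough sketch of getting back to the layout asked for in the question, the following base R code keeps only the wanted columns and restores the id order. The data are rebuilt compactly here so the block runs on its own; the name result is just illustrative.

```r
# Same values as in the question, built compactly
df1 <- data.frame(
  id = 1:17,
  sex = rep(c("Female", "Male"), times = c(7, 10)),
  age = c(1:7, 1:10),
  length = c(45, 54, 56, 60, 60, 61, 63, 55, 54, 58, 61, 65, 63, 65, 67, 68, 69)
)
lookup <- data.frame(
  sex = rep(c("Female", "Male"), times = c(7, 10)),
  age = c(1:7, 1:10),
  length = c(50, 53, 56, 58, 60, 61, 63, 50, 54, 57, 60, 62, 63, 65, 66, 67, 69)
)

# Join on sex and age; the clashing columns become length.x and length.y
merged <- merge(df1, lookup, all.x = TRUE, by = c("sex", "age"))
merged$growth.rate <- ifelse(merged$length.x > merged$length.y, "Normal", "Low")

# Restore the original row order and keep only the columns of interest
result <- merged[order(merged$id), c("id", "sex", "age", "length.x", "growth.rate")]
names(result)[names(result) == "length.x"] <- "length"
rownames(result) <- NULL
```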
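You can drop the columns you don't need.
I get the following error: "Please supply an even number of arguments in ..., consisting of logical condition, resulting value pairs (in that order); received 3 inputs."
@Chris I can't reproduce the error with the data I showed in my post.
Thanks. It works when I use indexing. But when I use fcase
@Chris It is in the devel version of data.table. Sorry, I should have mentioned it.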
df1 <- structure(list(id = 1:17, sex = c("Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Male", "Male", "Male",
"Male", "Male", "Male", "Male", "Male", "Male", "Male"), age = c(1L,
2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L
), length = c(45L, 54L, 56L, 60L, 60L, 61L, 63L, 55L, 54L, 58L,
61L, 65L, 63L, 65L, 67L, 68L, 69L)), class = "data.frame", row.names = c(NA,
-17L))
lookup <- structure(list(sex = c("Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Male", "Male", "Male", "Male",
"Male", "Male", "Male", "Male", "Male", "Male"), age = c(1L,
2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L
), length = c(50L, 53L, 56L, 58L, 60L, 61L, 63L, 50L, 54L, 57L,
60L, 62L, 63L, 65L, 66L, 67L, 69L)), class = "data.frame", row.names = c(NA,
-17L))