Lookup in an R data frame based on min and max values
I am looking for a simple way to join a dataset in R to a lookup dataset. The lookup dataset has a minimum value, a maximum value, and a label, as shown below:
library(dplyr)
min_age <- c(0, .5, 1, 5, 17, 35, 50, 75)
max_age <- c(.5, 1, 5, 17, 35, 50, 75, 125)
age_lbl <- c("< 6 months", "6 months - 1 year", "1 year - 5 years",
             "5 years - 17 years", "17 years - 35 years", "35 years - 50 years",
             "50 years - 75 years", "> 75 years")
age_lbl <- as.factor(age_lbl)
lkp_df <- data.frame(min_age = min_age, max_age = max_age, age_grp_lb = age_lbl)
I have a roundabout way of handling this:
# Introduce a dummy column so the join below yields a cartesian product
lkp_df <- lkp_df %>%
  mutate(join = 1)
set.seed(6789)
pat_id <- seq(1001, 1075)
pat_age <- runif(75, 0, 95)
pat_df <- data.frame(pat_id, pat_age) %>%
  mutate(pat_age_yrs = as.integer(pat_age),
         pat_age_mths = as.integer(pat_age * 12)) %>%
  # Introduce the same dummy column on the patient side
  mutate(join = 1) %>%
  # Create a cartesian product by joining on the dummy column
  inner_join(lkp_df, by = "join") %>%
  # Keep only the rows where the age falls in [min_age, max_age)
  filter(pat_age >= min_age & pat_age < max_age) %>%
  # Keep only the necessary columns
  select(pat_id, pat_age, pat_age_yrs, pat_age_mths, age_grp_lb)
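Can anyone suggest a better way to handle this kind of situation? Thanks in advance.
If the intervals are contiguous, as they are here, you could also consider cut(). @Jaap, thanks for the link. The fuzzyjoin and data.table approaches are very nice. They never showed up in my search because of the keywords I used.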
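The cut() suggestion from the comment above might look roughly like the following. This is a minimal sketch that reuses the question's lookup values and simulated patients; the breaks vector and the choice of right = FALSE are assumptions made here to mirror the question's pat_age >= min_age & pat_age < max_age filter, and it only applies because the intervals are contiguous.

```r
# Lookup values from the question: contiguous, left-closed intervals
min_age <- c(0, .5, 1, 5, 17, 35, 50, 75)
max_age <- c(.5, 1, 5, 17, 35, 50, 75, 125)
age_lbl <- c("< 6 months", "6 months - 1 year", "1 year - 5 years",
             "5 years - 17 years", "17 years - 35 years", "35 years - 50 years",
             "50 years - 75 years", "> 75 years")

# Same simulated patients as in the question
set.seed(6789)
pat_df <- data.frame(pat_id = seq(1001, 1075), pat_age = runif(75, 0, 95))

# Breaks are every lower bound plus the final upper bound.
# right = FALSE makes each interval [min, max), matching the question's filter.
breaks <- c(min_age, max_age[length(max_age)])
pat_df$age_grp_lb <- cut(pat_df$pat_age, breaks = breaks,
                         labels = age_lbl, right = FALSE)

head(pat_df)
```

Unlike as.factor(age_lbl) in the question, which sorts the levels alphabetically, the factor returned by cut() keeps its levels in age order. No dummy join column or cartesian product is needed.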
head(pat_df)
pat_id pat_age
1001 14.397769
1002 66.694280
1003 53.628013
1004 58.782156
1005 5.032531
1006 16.430463
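The data.table approach mentioned in the comments is probably a non-equi join. The sketch below is one way it could be written, not necessarily the code from the linked answer; it assumes the data.table package is installed, and the names lkp_dt and pat_dt are introduced here for illustration.

```r
library(data.table)

# Lookup values from the question
min_age <- c(0, .5, 1, 5, 17, 35, 50, 75)
max_age <- c(.5, 1, 5, 17, 35, 50, 75, 125)
age_lbl <- c("< 6 months", "6 months - 1 year", "1 year - 5 years",
             "5 years - 17 years", "17 years - 35 years", "35 years - 50 years",
             "50 years - 75 years", "> 75 years")
lkp_dt <- data.table(min_age = min_age, max_age = max_age, age_grp_lb = age_lbl)

# Same simulated patients as in the question
set.seed(6789)
pat_dt <- data.table(pat_id = seq(1001, 1075), pat_age = runif(75, 0, 95))

# Non-equi update join: for each lookup row, match the patients whose age
# lies in [min_age, max_age) and add the label to pat_dt by reference.
pat_dt[lkp_dt, on = .(pat_age >= min_age, pat_age < max_age),
       age_grp_lb := i.age_grp_lb]

head(pat_dt)
```

Because the join conditions are stated explicitly, this also works when the intervals have gaps; patients that fall in no interval would simply get NA as their label.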