Looking up values in an R data frame based on minimum and maximum values

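I am looking for a simple way to join a dataset in R to a lookup dataset. The lookup dataset has a minimum value, a maximum value, and a label, as shown below: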

library(dplyr)

min_age <- c(0, .5, 1, 5, 17, 35, 50, 75)
max_age <- c(.5, 1, 5, 17, 35, 50, 75, 125)
age_lbl <- c("< 6 months", "6 months - 1 year", "1 year - 5 years", 
             "5 years - 17 years", "17 years - 35 years", "35 years - 50 years"
             , "50 years - 75 years", "> 75 years")
age_lbl <- as.factor(age_lbl)
lkp_df <- data.frame(min_age = min_age, max_age = max_age, age_grp_lb = age_lbl)
I have a roundabout way of handling this:

# Introduce dummy column to obtain a cartesian
lkp_df <- lkp_df %>%
  mutate(join = 1)

set.seed(6789)
pat_id <- seq(1001, 1075)
pat_age <- runif(75, 0, 95)

pat_df <- data.frame(pat_id, pat_age) %>%
  mutate(pat_age_yrs = as.integer(pat_age),
         pat_age_mths = as.integer(pat_age * 12)) %>%
  # Introduce dummy column to obtain a cartesian
  mutate(join = 1) %>%
  # Create a cartesian product with the join column
  inner_join(lkp_df) %>%
  # Filter to keep only required records
  filter(pat_age >= min_age & pat_age < max_age) %>%
  # Keep only necessary columns
  select(pat_id, pat_age, pat_age_yrs, pat_age_mths, age_grp_lb)
Can anyone suggest a better way to handle a situation like this? Thanks in advance.
MasOrgR.

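If the intervals are contiguous, as they are here, you could also consider cut().

@Jaap, thanks for the link. The fuzzyjoin and data.table approaches are very nice. It never came up in my searches because of the keywords I used.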
head(pat_df)   
pat_id   pat_age 
1001     14.397769
1002     66.694280
1003     53.628013
1004     58.782156
1005      5.032531
1006     16.430463
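
The cut() suggestion from the comments could look roughly like the sketch below. It assumes the intervals are contiguous and closed on the left, i.e. [min_age, max_age), which matches the filter in the question. The name breaks is introduced here only for illustration, and the labels are passed as a plain character vector rather than through lkp_df.

```r
# Lookup bounds and labels, copied from the question
min_age <- c(0, .5, 1, 5, 17, 35, 50, 75)
max_age <- c(.5, 1, 5, 17, 35, 50, 75, 125)
age_lbl <- c("< 6 months", "6 months - 1 year", "1 year - 5 years",
             "5 years - 17 years", "17 years - 35 years", "35 years - 50 years",
             "50 years - 75 years", "> 75 years")

# Assumption: intervals are contiguous, so the break points are all the
# lower bounds followed by the last upper bound
breaks <- c(min_age, max_age[length(max_age)])

set.seed(6789)
pat_df <- data.frame(pat_id = seq(1001, 1075),
                     pat_age = runif(75, 0, 95))

# right = FALSE makes each interval closed on the left: [min_age, max_age)
pat_df$age_grp_lb <- cut(pat_df$pat_age, breaks = breaks,
                         labels = age_lbl, right = FALSE)
```

No cross join is needed with this approach, but it only works when there are no gaps or overlaps between the intervals.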
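
The dummy-column cross join followed by a filter can also be written as a single non-equi join. The sketch below assumes dplyr 1.1.0 or later, where join_by() accepts inequality conditions. The result name res is introduced here for illustration, and the data setup is repeated so the block runs on its own.

```r
library(dplyr)

# Lookup table, as defined in the question
lkp_df <- data.frame(
  min_age = c(0, .5, 1, 5, 17, 35, 50, 75),
  max_age = c(.5, 1, 5, 17, 35, 50, 75, 125),
  age_grp_lb = factor(c("< 6 months", "6 months - 1 year", "1 year - 5 years",
                        "5 years - 17 years", "17 years - 35 years",
                        "35 years - 50 years", "50 years - 75 years",
                        "> 75 years"))
)

set.seed(6789)
pat_df <- data.frame(pat_id = seq(1001, 1075),
                     pat_age = runif(75, 0, 95))

res <- pat_df %>%
  mutate(pat_age_yrs = as.integer(pat_age),
         pat_age_mths = as.integer(pat_age * 12)) %>%
  # Non-equi join: keep lookup rows where min_age <= pat_age < max_age
  # (requires dplyr >= 1.1.0)
  inner_join(lkp_df, by = join_by(pat_age >= min_age, pat_age < max_age)) %>%
  # Keep only necessary columns
  select(pat_id, pat_age, pat_age_yrs, pat_age_mths, age_grp_lb)
```

Unlike cut(), this form does not depend on the intervals being contiguous, and it avoids materialising the full cartesian product.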