R: How to create a new column by pasting selected column names from a lookup table
I am trying to generate a new column in a data frame (named "input") whose values are column names taken from another data frame (named "LookUp"), chosen according to the values in the relevant columns of the lookup table. Here is some fake data representing the two tables:
Create fake lookup table
Any suggestions would be greatly appreciated.
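First, since the values of the variables in the data frames LookUp and input are the same, and there appear to be no duplicates in LookUp$Drugs or input$Drug, it is best to merge them. Load the packages data.table and dplyr first: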
install.packages(c("data.table", "dplyr"))
library(data.table)
library(dplyr)
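The merge relies on the drug names being unique in both tables, so it may be worth checking that before joining. Below is a minimal sketch in base R; the two small data frames are hypothetical stand-ins for the question's tables (the original fake data is not reproduced here), and only the column names Drugs and Drug are taken from the thread.

```r
# Hypothetical stand-ins for the question's tables (assumed, not the original data)
LookUp <- data.frame(
  Drugs  = c("amitriptyline", "asenapine", "bupropion", "desipramine"),
  CYP1A2 = c("S_Inh", NA, "S", NA),
  stringsAsFactors = FALSE
)
input <- data.frame(
  rowID = 1:4,
  Drug  = c("amitriptyline", "asenapine", "bupropion", "desipramine"),
  stringsAsFactors = FALSE
)

# anyDuplicated() returns 0 when every key is unique
dup_lookup <- anyDuplicated(LookUp$Drugs)
dup_input  <- anyDuplicated(input$Drug)

# Keys in input with no partner in LookUp (such rows get NA after a left merge)
unmatched <- setdiff(input$Drug, LookUp$Drugs)
```

If either duplicate check returns a non-zero index, a merge would multiply rows for that drug, so the keys would need cleaning first.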
Let's join the tables:
output <- merge(input, LookUp, by.x = "Drug", by.y = "Drugs", all.x = TRUE)
           Drug rowID CYP1A1 CYP1A2 CYP1B1 CYP2A6 CYP2A13 CYP2B6 CYP2C8 CYP2C9
1 amitriptyline     1   <NA>  S_Inh   <NA>   <NA>      NA      S  S_Inh      S
2     asenapine     2   <NA>   <NA>   <NA>   <NA>      NA   <NA>   <NA>   <NA>
3     bupropion     3   <NA>      S   <NA>      S      NA  S_Inh      S      S
4   desipramine     4   <NA>   <NA>   <NA>    Inh      NA    Ind   <NA>   <NA>
Voilà.
Here is an idea using the dplyr and reshape2 packages:
#First you add stringsAsFactors = FALSE in your dataframes,
LookUp <- data.frame(Drugs, CYP1A1,CYP1A2, CYP1B1, CYP2A6,CYP2A13,CYP2B6,CYP2C8,CYP2C9, stringsAsFactors = FALSE)
input <- data.frame(rowID=c(1:4), Drug=Drugs[c(1,3,4,9)], stringsAsFactors = FALSE)
library(dplyr)
library(reshape2)
melt(LookUp, id.vars = 'Drugs', na.rm = TRUE) %>%
  group_by(Drugs) %>%
  summarise(metabCYPs = toString(variable[grepl('S', value)])) %>%
  left_join(input, ., by = c('Drug' = 'Drugs'))
#  rowID          Drug                              metabCYPs
#1     1 amitriptyline         CYP1A2, CYP2B6, CYP2C8, CYP2C9
#2     2     asenapine                                   <NA>
#3     3     bupropion CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C9
#4     4   desipramine
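The pipeline above picks columns with grepl('S', value), which is a substring match. The short sketch below shows which of the cell values seen in this thread that pattern accepts; the vector values is made up for illustration and simply lists those cell values plus a missing one.

```r
# Cell values that appear in the lookup table, plus a missing value
values <- c("S", "S_Inh", "Inh", "Ind", NA)

# Substring match on a capital "S"; NA inputs yield FALSE
is_substrate <- grepl("S", values)
# TRUE TRUE FALSE FALSE FALSE

# A stricter variant (an assumption about the coding scheme): "S" alone or "S_" as a prefix
is_substrate_strict <- grepl("^S(_|$)", values)
```

With the values shown here both patterns agree; the stricter one would only matter if other codes containing a capital S were added to the table.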
dplyr and reshape bug me... Here is another idea that uses an implicit loop over the Drugs variable:
cyps <- setdiff(names(LookUp), "Drugs")  # CYP columns only, so the logical index lines up with the names
metabCYPs <- sapply(LookUp$Drugs, function(x) paste0(cyps[grepl("S", unlist(LookUp[LookUp$Drugs == x, cyps]))], collapse = ", "))
output <- data.frame(input, metabCYPs = metabCYPs[match(input$Drug, names(metabCYPs))])  # the column in input is Drug, not Drugs
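Thanks, 卡克萨. This creates the output column from the values in the lookup table rather than from the column names. It also produces the same value for every row of the output table instead of the value for the matching drug. Sotos's solution, however, seems to work.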
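The merge-based answer then collapses columns 3 to 10 into a single string and blanks the original columns:
# Note: this pastes the cell values (not the column names) and assigns one string to every row
output$metabCYPs <- output[,3:10] %>%
  apply(1, paste0) %>%
  setdiff("NA") %>%
  paste0(collapse = ", ")
output[,3:10] <- NA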
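Because the collapse step in the merge-based answer pastes cell values and produces one string for the whole table, a row-wise variant that returns column names may be closer to what the question asks for. The following is only a sketch in base R: the data frame is a cut-down stand-in for the merged table (three CYP columns, values copied from the merged output shown in this thread), and cyp_cols is a name introduced here for illustration.

```r
# Cut-down stand-in for the merged table (values taken from the output shown in the thread)
output <- data.frame(
  Drug   = c("amitriptyline", "asenapine", "bupropion", "desipramine"),
  rowID  = 1:4,
  CYP1A2 = c("S_Inh", NA, "S", NA),
  CYP2A6 = c(NA, NA, "S", "Inh"),
  CYP2B6 = c("S", NA, "S_Inh", "Ind"),
  stringsAsFactors = FALSE
)

# Select the CYP columns by name rather than by position
cyp_cols <- grep("^CYP", names(output), value = TRUE)

# For each row, keep the NAMES of the columns whose value contains "S"
output$metabCYPs <- apply(output[cyp_cols], 1, function(row) {
  paste(cyp_cols[grepl("S", row)], collapse = ", ")
})
```

Unlike the reshape2 pipeline, this gives an empty string rather than NA for a drug with no entries at all (asenapine here), so the two results are not identical for such rows.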
The same pipeline can also collect the columns flagged Inh and Ind:
melt(LookUp, id.vars = 'Drugs', na.rm = TRUE) %>%
  group_by(Drugs) %>%
  summarise(metabCYPs = toString(variable[grepl('S', value)]),
            with_Inh = toString(variable[grepl('Inh', value)]),
            with_Ind = toString(variable[grepl('Ind', value)])) %>%
  left_join(input, ., by = c('Drug' = 'Drugs'))
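For readers who prefer to avoid extra packages, the three-flag summary can be approximated in base R. This is a sketch under assumptions: the LookUp table below is a three-column stand-in built from values shown in the thread, and cols_matching is a hypothetical helper written for this example, not part of any package.

```r
# Small stand-in for the lookup table (assumed; the real one has more columns and rows)
LookUp <- data.frame(
  Drugs  = c("amitriptyline", "bupropion", "desipramine"),
  CYP1A2 = c("S_Inh", "S", NA),
  CYP2A6 = c(NA, "S", "Inh"),
  CYP2B6 = c("S", "S_Inh", "Ind"),
  stringsAsFactors = FALSE
)
cyp_cols <- setdiff(names(LookUp), "Drugs")

# Hypothetical helper: per row, the names of the columns whose value matches `pattern`
cols_matching <- function(df, cols, pattern) {
  apply(df[cols], 1, function(row) toString(cols[grepl(pattern, row)]))
}

# One output column per pattern, named as in the pipeline above
patterns <- c(metabCYPs = "S", with_Inh = "Inh", with_Ind = "Ind")
flags <- as.data.frame(
  lapply(patterns, function(p) cols_matching(LookUp, cyp_cols, p)),
  stringsAsFactors = FALSE
)
result <- cbind(LookUp["Drugs"], flags)
```

The result can then be attached to input with merge(input, result, by.x = "Drug", by.y = "Drugs", all.x = TRUE), mirroring the left join used in the pipeline.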