R: Replace string matches in a data frame with values from another data frame/array


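I am writing code that matches different string values and, when a match is found, replaces the string with the matching value.

I have a data frame and a separate character vector: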

df1 <- data.frame(campaign_source=c("googleadwords", "google display" ,"twitter banner", "facebook-post", "facebook like","inmobi","organic"),cost=c(4,2,3,4,5,6,7))

source<-c("google","facebook","twitter")
(The answer above appears to be incorrect.) Try this alternative, which uses the result of grep as the index for assignment:

 df1$source <- NA
 for( item in source ) df1$source[grep(item,  df1$campaign_source)] <- item
 df1$source[is.na(df1$source)] <- "other"
 df1
#-----------------
  campaign_source cost   source
1  google adwords    4   google
2  google display    2   google
3  twitter banner    3  twitter
4   facebook post    4 facebook
5   facebook like    5 facebook
6          inmobi    6    other
7         organic    7    other
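
The loop above can also be written without iterating over the keywords. The following is a minimal sketch, not part of the original answer, that joins the keywords into one alternation pattern and extracts the match with regexpr and regmatches from base R. It assumes the keywords contain no regex metacharacters, and it uses the question's data (including the entries without spaces). One difference from the loop: if a string contains several keywords, this version keeps the leftmost match in the string, whereas the loop keeps the keyword that comes last in source.

```r
df1 <- data.frame(
  campaign_source = c("googleadwords", "google display", "twitter banner",
                      "facebook-post", "facebook like", "inmobi", "organic"),
  cost = c(4, 2, 3, 4, 5, 6, 7),
  stringsAsFactors = FALSE  # keep campaign_source as character
)
source <- c("google", "facebook", "twitter")

# Build a single pattern: "google|facebook|twitter"
pattern <- paste(source, collapse = "|")

# regexpr() returns -1 for strings that contain none of the keywords
hit <- regexpr(pattern, df1$campaign_source)

# Default everything to "other", then fill in the matched keyword
df1$source <- "other"
df1$source[hit != -1] <- regmatches(df1$campaign_source, hit)
```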

Here is an alternative solution using strsplit:

df1$source <- sapply(df1$campaign_source, function(x) {
    w <- unlist(strsplit(as.character(x), " "));
    if (length(w[w %in% source]) > 0) w[w %in% source] else "other";
})
#campaign_source cost   source
#1  google adwords    4   google
#2  google display    2   google
#3  twitter banner    3  twitter
#4   facebook post    4 facebook
#5   facebook like    5 facebook
#6          inmobi    6    other
#7         organic    7    other

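Hi Maurits, thanks for your help. Sometimes the text in df1$campaign_source is not separated by spaces; it can be "googleadwords" or "facebook new user".

@SaugataHalder, I see; unfortunately, your sample data did not include those cases. I was trying to provide a solution that does not explicitly loop over the entries in source. No matter, 42-'s solution will still work for you.

Yes @Maurits, this was a great help on your part. In any case, I just thought I would mention the use case.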
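
To illustrate the case raised in these comments, here is a small sketch (an illustration, not code from either answer) that compares the two approaches on strings without a space separator. The vector x and the names by_token and by_substring are made up for this example. The token-based lookup only matches whole space-separated words, while the substring lookup finds the keyword anywhere in the string.

```r
source <- c("google", "facebook", "twitter")
x <- c("googleadwords", "facebook-post", "twitter banner", "inmobi")

# Token-based lookup (strsplit idea): only whole space-separated words match
by_token <- vapply(strsplit(x, " ", fixed = TRUE), function(w) {
  hit <- w[w %in% source]
  if (length(hit) > 0) hit[1] else "other"
}, character(1))

# Substring lookup (grep idea): the keyword may appear anywhere in the string
by_substring <- rep("other", length(x))
for (item in source) {
  by_substring[grepl(item, x, fixed = TRUE)] <- item
}

print(data.frame(x, by_token, by_substring))
```

With this input, only "twitter banner" is recognised by the token-based version, which is why the grep-based loop is the better fit when the separator is missing or inconsistent.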
Sample data (space-separated values, matching the output printed in the answers):

df1 <- data.frame(campaign_source=c("google adwords", "google display", "twitter banner", "facebook post", "facebook like", "inmobi", "organic"), cost=c(4,2,3,4,5,6,7))

source <- c("google", "facebook", "twitter")