R: Replace multiple strings in a data frame column with multiple strings from another data frame column


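I have a data frame (df1) with a column "Partcipant_ID". Some of the participant IDs are wrong and should be replaced with the correct participant IDs. I have another data frame (df2) in which all participant IDs appear in the columns Goal_ID to T4. The participant IDs in the Goal_ID column are the correct IDs.
Now I want to replace every participant ID in df1 with the corresponding Goal_ID from df2.

This is my original data frame (df1):

This is the reference data frame (df2):

For example, in my df1 I want

"AR_BO_RA_92" to become "AJ_BO_RA_92"
"AL_MA_RA_95" to become "AL_EN_RA_95"
"AM_BO_AB_94" to become "AM_BO_AB_49"

and so on.

I considered using str_replace and started with the following: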

df1$Partcipant_ID <- str_replace(df1$Partcipant_ID, "AR_BO_RA_92", "AJ_BO_RA_92")

You can use match to find the positions of the strings and change only those that were found (that is, those that are not NA), like:

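# Map the position in the stacked T2..T4 columns back to a row of df2.
# Plain %% nrow(df2) would yield 0 for a match in the last row, so shift by one.
i <- (match(df1$Partcipant_ID, unlist(df2[-1])) - 1) %% nrow(df2) + 1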
j <- !is.na(i)
df1$Partcipant_ID[j] <- df2$Goal_ID[i[j]]
df1$Partcipant_ID
# [1] "AA_SH_RA_91" "AA_SH_RA_91" "AB_BA_PR_93" "AB_BH_VI_90" "AB_BH_VI_90"
# [6] "AB_SA_TA_91" "AJ_BO_RA_92" "AJ_BO_RA_92" "AK_SH_HA_91" "AL_EN_RA_95"
#[11] "AL_MA_RA_95" "AL_SH_BA_99" "AM_BO_AB_49" "AM_BO_AB_94" "AM_BO_AB_94"
#[16] "AM_BO_AB_94" "AN_JA_AN_91" "AN_KL_GE_11" "AN_KL_WO_91" "AN_MA_DI_95"
#[21] "AN_MA_DI_95" "AN_SE_RA_95" "AN_SE_RA_95" "AN_SI_RA_97" "AN_SO_PU_94"
#[26] "AN_SU_RA_91" "AR_BO_RA_92" "AR_KA_VI_94" "AR_KA_VI_94" "AS_AR_SO_90"
#[31] "AS_AR_SU_95" "AS_KU_SO_90" "AS_MO_AS_97" "AW_SI_OJ_97" "AW_SI_OJ_97"
#[36] "AY_CH_SU_97" "BH_BE_LD_84" "BH_BE_LI_83" "BH_BE_LI_83" "BH_BE_LI_84"
#[41] "BH_KO_SA_87" "BH_PE_AB_89" "BH_YA_SA_87" "BI_CH_PR_94" "BI_CH_PR_94"
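Since the post does not show df1 and df2, the following is a minimal sketch of the match approach on hypothetical toy data (the two data frames and the ID "ZZ_ZZ_ZZ_00" are invented for illustration). It converts the stacked position to a row with (pos - 1) %% nrow(df2) + 1, so a match in the last row of df2 maps to that row instead of to index 0.

```r
# Hypothetical toy data; the real df1 and df2 are not shown in the post.
df1 <- data.frame(
  Partcipant_ID = c("AR_BO_RA_92", "AL_MA_RA_95", "AM_BO_AB_94", "ZZ_ZZ_ZZ_00"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Goal_ID = c("AJ_BO_RA_92", "AL_EN_RA_95", "AM_BO_AB_49"),
  T2      = c("AR_BO_RA_92", "AL_EN_RA_95", "AM_BO_AB_49"),
  T3      = c("AJ_BO_RA_92", "AL_MA_RA_95", "AM_BO_AB_49"),
  T4      = c("AJ_BO_RA_92", "AL_EN_RA_95", "AM_BO_AB_94"),
  stringsAsFactors = FALSE
)

# Position of each ID in the stacked T2..T4 columns (column-major order).
pos <- match(df1$Partcipant_ID, unlist(df2[-1]))

# Convert the stacked position back to a 1-based row of df2.
row <- (pos - 1) %% nrow(df2) + 1

# Replace only the IDs that were found; unknown IDs stay untouched.
found <- !is.na(row)
df1$Partcipant_ID[found] <- df2$Goal_ID[row[found]]
df1$Partcipant_ID
```

In this toy data the third ID matches in the last row of df2, which is the case where the row conversion matters; the fourth ID is not in df2 and is left as it is.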

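I think this might work. Create a true lookup table with one column of correct codes and one column of incorrect codes. That is, stack the columns, then join the resulting df3 to df1 and use coalesce to create a new Goal_ID. You misspelled participant, which makes me feel more human; I do it all the time.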

library(dplyr)

# Stack T2, T3 and T4 into a single column (named T2) next to Goal_ID
df3 <- df2[1:2] %>% 
  bind_rows(df2[c(1,3)] %>% rename(T2 = T3), 
            df2[c(1,4)] %>% rename(T2 = T4)) %>% 
  distinct()

# Join the lookup table and fall back to the original ID where no match exists
df1 %>% 
  left_join(df3, by = c("Partcipant_ID" = "T2")) %>% 
  mutate(Goal_ID = coalesce(Goal_ID, Partcipant_ID)) %>% 
  select(Goal_ID, Partcipant_ID, Start_T2, End_T2)

So the participant IDs in df1 that appear in T2, T3 or T4 of df2 should be replaced with the ID in T1 of df2? Could you clarify the structure of df2 a bit more?

Exactly. The "target data frame" is df1 (this is where I want the incorrect IDs replaced with the correct ones). df2 is the reference data frame. The IDs in all columns of df2 also appear in df1 and should be replaced by the IDs in the Goal_ID column of df2. I have edited the structure of df2 into the question. Maybe it is clearer now.
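For readers without dplyr, the lookup-table idea from the second answer can also be sketched in base R. This is a swapped-in variant, not the answerer's code: it stacks the columns with rep and unlist, and uses a named vector plus ifelse in place of left_join and coalesce. The toy data, the Old_ID column name and the ID "ZZ_ZZ_ZZ_00" are hypothetical.

```r
# Hypothetical toy data; the real df1 and df2 are not shown in the post.
df1 <- data.frame(
  Partcipant_ID = c("AR_BO_RA_92", "AL_MA_RA_95", "AM_BO_AB_94", "ZZ_ZZ_ZZ_00"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Goal_ID = c("AJ_BO_RA_92", "AL_EN_RA_95", "AM_BO_AB_49"),
  T2      = c("AR_BO_RA_92", "AL_EN_RA_95", "AM_BO_AB_49"),
  T3      = c("AJ_BO_RA_92", "AL_MA_RA_95", "AM_BO_AB_49"),
  T4      = c("AJ_BO_RA_92", "AL_EN_RA_95", "AM_BO_AB_94"),
  stringsAsFactors = FALSE
)

# Stack T2..T4 into one column of old IDs, each paired with the Goal_ID of its row.
lookup <- unique(data.frame(
  Goal_ID = rep(df2$Goal_ID, times = ncol(df2) - 1),
  Old_ID  = unlist(df2[-1], use.names = FALSE),
  stringsAsFactors = FALSE
))

# Named vector: names are the old IDs, values are the correct IDs.
id_map <- setNames(lookup$Goal_ID, lookup$Old_ID)
new_id <- unname(id_map[df1$Partcipant_ID])

# Keep the original ID where the lookup has no entry (the coalesce step).
df1$Goal_ID <- ifelse(is.na(new_id), df1$Partcipant_ID, new_id)
df1
```

Keeping the corrected IDs in a separate Goal_ID column, as the answer does, leaves the original Partcipant_ID available for checking which rows were changed.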