R: replace names in multiple columns with IDs from another object
Tags: r, dataframe, indexing

I have a set of patient data, df, that I am trying to de-identify in R:
structure(list(name = structure(c(2L, 5L, 1L, 6L, 4L, 3L), .Label = c("Andrew",
"Jim", "Kurt", "Lester", "Mickey", "Taylor"), class = "factor"),
heart_rate = c(78L, 82L, 67L, 105L, 85L, 94L), age = c(35L,
23L, 43L, 52L, 33L, 45L), partner = structure(c(5L, 2L, 6L,
1L, 3L, 4L), .Label = c("Andrew", "Jim ", "Kurt ", "Lester ",
"Mickey ", "Taylor "), class = "factor")), class = "data.frame", row.names = c(NA,
-6L))
I want to replace the names in the name and partner columns with the values from the id column of an object called key:
structure(list(name = structure(c(2L, 5L, 1L, 6L, 4L, 3L), .Label = c("Andrew",
"Jim", "Kurt", "Lester", "Mickey", "Taylor"), class = "factor"),
id = structure(c(2L, 5L, 1L, 6L, 4L, 3L), .Label = c("A3",
"J9", "K5", "L4", "M4", "T7"), class = "factor")), class = "data.frame", row.names = c(NA,
-6L))
I can de-identify the name column using this code:
df[["name"]] <- key[ match(df[['name']], key[['name']] ) , 'id']
df[["partner"]] <- key[ match(df[['partner']], key[['name']] ) , 'id']
My data frame then looks like this (note that partner is almost entirely NA):
structure(list(name = structure(c(2L, 5L, 1L, 6L, 4L, 3L), .Label = c("A3",
"J9", "K5", "L4", "M4", "T7"), class = "factor"), heart_rate = c(78L,
82L, 67L, 105L, 85L, 94L), age = c(35L, 23L, 43L, 52L, 33L, 45L
), partner = structure(c(NA, NA, NA, 1L, NA, NA), .Label = c("A3",
"J9", "K5", "L4", "M4", "T7"), class = "factor")), row.names = c(NA,
-6L), class = "data.frame")
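Does anyone have any suggestions? A method that can be applied to all the relevant columns of the data set in one line, along with an explanation of the code, would be much appreciated.

The problem is that in the partner column of df, most of the names have a trailing space: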
.Label = c("Andrew", "Jim ", "Kurt ", "Lester ", "Mickey ", "Taylor ")
This means match() will not find an exact match for those names. Only "Andrew", which has no trailing space, returns the correct index.
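To make the failure concrete, here is a minimal sketch of how match() treats a trailing space. The vectors key_names and partners are small illustrative stand-ins, not the full data from the question.

```r
# match() compares strings exactly, so "Jim " (with a space) is not "Jim"
key_names <- c("Andrew", "Jim", "Kurt")   # illustrative lookup names
partners  <- c("Andrew", "Jim ")          # second value has a trailing space

raw_hits   <- match(partners, key_names)          # 1 NA
clean_hits <- match(trimws(partners), key_names)  # 1 2

print(raw_hits)
print(clean_hits)
```

The NA in the first result is what turns into NA in the partner column once it is used as a row index into key.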
The fix is to strip the whitespace with trimws():
df$partner = trimws(df$partner)
After that, your code works as expected:
> df[["partner"]] <- key[ match(df[['partner']], key[['name']] ) , 'id']
> df
name heart_rate age partner
1 J9 78 35 M4
2 M4 82 23 J9
3 A3 67 43 T7
4 T7 105 52 A3
5 L4 85 33 K5
6 K5 94 45 L4
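The question also asks for a way to handle several columns in one line. One possible sketch, not necessarily the only or best approach, is to apply the same trimws() plus match() lookup to every column of interest with lapply(). The vector cols is a name introduced here for illustration, and the data is rebuilt as plain character columns (the original uses factors) so the example runs on its own.

```r
# Example data from the question, rebuilt with character columns
df <- data.frame(
  name       = c("Jim", "Mickey", "Andrew", "Taylor", "Lester", "Kurt"),
  heart_rate = c(78L, 82L, 67L, 105L, 85L, 94L),
  age        = c(35L, 23L, 43L, 52L, 33L, 45L),
  partner    = c("Mickey ", "Jim ", "Taylor ", "Andrew", "Kurt ", "Lester "),
  stringsAsFactors = FALSE
)
key <- data.frame(
  name = c("Jim", "Mickey", "Andrew", "Taylor", "Lester", "Kurt"),
  id   = c("J9", "M4", "A3", "T7", "L4", "K5"),
  stringsAsFactors = FALSE
)

# Columns to de-identify (hypothetical helper variable)
cols <- c("name", "partner")

# For each column: coerce to character, trim whitespace, look up the row
# in key by name, and return the corresponding id
df[cols] <- lapply(df[cols], function(x) {
  as.character(key$id)[match(trimws(as.character(x)), key$name)]
})

print(df)
```

The as.character() calls mean the same line should also work when the columns are factors, as in the dput() output above. Any name that is missing from key comes back as NA, so it is worth checking anyNA(df[cols]) before sharing the result.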