R: Replacing factor values using a lookup table

I'm looking for a more efficient version of my code for replacing factor levels in a data frame.

Here is my dataset:

structure(list(Rio.Olympics.Sports.Participating.Team = structure(c(2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("American Gymnastics", 
"American Swimmers", "Boxing", "European Gymnastics", "Running", 
"Free-style swimming", "Breaststroke Swimming", "Diving", "Athletics", 
"Soccer"), class = "factor"), Calendar.Quarter = structure(c(16071, 
16161, 16252, 16344, 16436, 16526, 16617, 16709, 16801, 16892, 
16983, 17075, 16071, 16161, 16252, 16344, 16436, 16526, 16617, 
16709, 16801, 16892, 16983, 17075, 16071, 16161, 16252, 16344, 
16436, 16526, 16617, 16709, 16801, 16892, 16983, 17075, 16071, 
16161, 16252, 16344, 16436, 16526, 16617, 16709, 16801, 16892, 
16983, 17075, 16071, 16161, 16252, 16344, 16436, 16526, 16617, 
16709, 16801, 16892, 16983, 17075), class = "Date"), Randomized.Viewers = c(49, 
45, 51, 55, 47, 48, 54, 57, 53, 50, 52, 58, 32, 29, 33, 40, 34, 
36, 31, 39, 37, 30, 35, 41, 5, 1, 25, 46, 38, 4, 56, 27, 21, 
43, 42, 44, 2, 59, 3, 10, 60, 7, 14, 24, 13, 16, 17, 28, 15, 
6, 19, 23, 11, 12, 20, 22, 9, 8, 18, 26)), .Names = c("Rio.Olympics.Sports.Participating.Team", 
"Calendar.Quarter", "Randomized.Viewers"), row.names = c(NA, 
-60L), class = "data.frame")

Now I want to change the factor labels. Here is what I did:

Old_labels <- c("American Swimmers", "American Gymnastics", 
               "European Gymnastics", "Running", "Boxing")
New_labels <- c("Jupitean Swimmers", "Saturnish Gymastics", 
               "Plutoish Gymnastics", "Walking", "Fighting")
Apply_lables <- data.frame(Old_labels, New_labels)
colnames(Apply_lables)[1] <- "Old_labels"

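Question: As an R beginner, I spent several hours working this out. Although I managed to get what I wanted, is there a better way (i.e., fewer lines of code and faster execution) to change factor levels based on a lookup table? My original dataset has about 1M rows, and the code above takes a long time to run.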

I did research this topic, but I don't think it is covered anywhere, although several posts discuss using match() with a lookup table to change rows.

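If you want to replace old labels from a specific list, I think you can subset levels() of the desired factor and push the new labels there. Index the levels with match() so that each new label lands on its corresponding old label, regardless of the order of the lookup table:
levels(dta$Rio.Olympics.Sports.Participating.Team)[
    match(Old_labels, levels(dta$Rio.Olympics.Sports.Participating.Team))] <-
    New_labels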
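
A minimal, self-contained sketch of the same idea on a toy factor. The names `f`, `old`, and `new` are made up for illustration and are not part of the question's data. A factor stores integer codes plus a small vector of levels, so relabeling this way touches only the levels and not the individual rows, which is why it should stay cheap on a data frame with around 1M rows.

```r
# Toy factor; levels default to sorted order: "a", "b", "c", "d"
f <- factor(c("b", "a", "c", "b", "d"))

# Hypothetical lookup table, deliberately NOT in level order
old <- c("b", "a", "c")
new <- c("B2", "A2", "C2")

# Position of each old label inside levels(f)
idx <- match(old, levels(f))

# match() returns NA for labels that are not levels of f; stop early if so
stopifnot(!anyNA(idx))

# Overwrite only the matched levels; the underlying integer codes are unchanged
levels(f)[idx] <- new

print(as.character(f))  # "B2" "A2" "C2" "B2" "d"
print(levels(f))        # "A2" "B2" "C2" "d"
```

Levels that do not appear in the lookup table (here `"d"`) are left as they are.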

Check out the new forcats package, which arrived on CRAN this week. Or just assign the new labels to levels(df$var)[match(Old_labels, levels(df$var))].
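
For the forcats route, one possible sketch uses `fct_recode()`, which takes arguments of the form new label = "old label". The toy factor and the names `f`, `old`, `new`, `pairs`, and `g` are assumptions made for illustration. The example assumes forcats is installed and skips the recode otherwise.

```r
# Toy factor and hypothetical lookup table (illustrative names only)
f   <- factor(c("b", "a", "c", "b", "d"))
old <- c("b", "a", "c")
new <- c("B2", "A2", "C2")

# fct_recode() wants new_label = "old_label" pairs, so build a named
# character vector whose names are the new labels and values the old ones
pairs <- setNames(old, new)

if (requireNamespace("forcats", quietly = TRUE)) {
  # !!! splices the named vector into separate new = "old" arguments
  g <- forcats::fct_recode(f, !!!pairs)
  print(levels(g))
} else {
  message("forcats is not installed; skipping the fct_recode() example")
}
```

Compared with indexing `levels()` directly, this keeps the old-to-new mapping in a single named vector, at the cost of an extra package dependency.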