R: Modify string names in a data frame based on a condition

I have a data frame containing a variable named "Control_Category". The variable holds six distinct names; for simplicity, I have made them generic here:

df <- data.frame(Control_Category = c("Really Long Name One",
"Super Really Long Name Two",
"Another Really Flippin' Long Name Three",
",Seriously, It's a Fourth Long Name",
"Definitely a Fifth Long Name",
"Finally, This guy is done, number six"))

Just call gsub() once for each replacement:

df$Control_Category <- gsub('Really Long Name One', 'One',  df$Control_Category)
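If all six names need shortening, the same call can be repeated in a loop over a mapping. The following is only a minimal sketch: the short labels other than "One" do not appear in the original and are assumed here for illustration. fixed = TRUE makes gsub() treat each long name as a literal string rather than a regular expression.

```r
# Rebuild the example data frame from the question.
df <- data.frame(Control_Category = c("Really Long Name One",
                                      "Super Really Long Name Two",
                                      "Another Really Flippin' Long Name Three",
                                      ",Seriously, It's a Fourth Long Name",
                                      "Definitely a Fifth Long Name",
                                      "Finally, This guy is done, number six"),
                 stringsAsFactors = FALSE)

# Assumed mapping (hypothetical labels): names are the long strings,
# values are the short replacements.
replacements <- c("Really Long Name One"                    = "One",
                  "Super Really Long Name Two"              = "Two",
                  "Another Really Flippin' Long Name Three" = "Three",
                  ",Seriously, It's a Fourth Long Name"     = "Four",
                  "Definitely a Fifth Long Name"            = "Five",
                  "Finally, This guy is done, number six"   = "Six")

# One gsub() call per long name, applied to the whole column each time.
for (old in names(replacements)) {
  df$Control_Category <- gsub(old, replacements[[old]],
                              df$Control_Category, fixed = TRUE)
}

df
```

Because gsub() matches substrings, this is safe only when no long name is contained in another one, which holds for the six example names.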

Here is a larger data frame with the long names:

set.seed(101)
long_names <- c("Really Long Name One",
                "Super Really Long Name Two",
                "Another Really Flippin' Long Name Three",
                ",Seriously, It's a Fourth Long Name",
                "Definitely a Fifth Long Name",
                "Finally, This guy is done, number six")

df <- data.frame(control_category=sample(long_names, 100, replace=TRUE))
head(df)

##                          control_category
## 1 Another Really Flippin' Long Name Three
## 2                    Really Long Name One
## 3            Definitely a Fifth Long Name
## 4     ,Seriously, It's a Fourth Long Name
## 5              Super Really Long Name Two
## 6              Super Really Long Name Two
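Note that the levels are sorted alphabetically (see levels(category)). In this case, the simplest approach is to look at the current order and rearrange it by hand. Here, category[c(2,5,1,4,3,6)] gives you the correct order. Finally: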

df$control_category <- factor(
    df$control_category,
    levels=category[c(2, 5, 1, 4, 3, 6)],
    labels=c("one", "two", "three", "four", "five", "six")
)
head(df)

##   control_category
## 1            three
## 2              one
## 3             five
## 4             four
## 5              two
## 6              two
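The object category is not defined in the code shown. Judging from the head(df) output, the index vector c(2, 5, 1, 4, 3, 6) is consistent with category holding the distinct values in order of first appearance (for example, unique(df$control_category)), but that is an inference rather than something the answer states. A sketch that avoids depending on that order, and on the particular random sample (which can differ between R versions), passes the long names as levels in the intended order:

```r
long_names <- c("Really Long Name One",
                "Super Really Long Name Two",
                "Another Really Flippin' Long Name Three",
                ",Seriously, It's a Fourth Long Name",
                "Definitely a Fifth Long Name",
                "Finally, This guy is done, number six")
short_names <- c("one", "two", "three", "four", "five", "six")

set.seed(101)
df <- data.frame(control_category = sample(long_names, 100, replace = TRUE),
                 stringsAsFactors = FALSE)
original <- df$control_category  # copy kept only to check the result

# levels and labels are matched by position, so the i-th long name
# becomes the i-th short label regardless of alphabetical order.
df$control_category <- factor(df$control_category,
                              levels = long_names,
                              labels = short_names)
head(df)
```

Any value that is not listed in levels would become NA, so this approach assumes the six names are the only values in the column.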

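Comments:

In your real data, are 4 and 5 always written as "Fourth" and "Fifth", while 1, 2, 3 and 6 are always "One", "Two", "Three" and "Six"? I think what you are looking for is a factor.

@parksw3 I think the real data has more than 6 rows, and the values to be replaced are not sorted.

@neilfws Yes, but if he knows what the names are and how he wants them ordered, he can specify the levels and labels accordingly. Let me try to write a longer answer below...

I know factors aren't cool anymore, but this is a perfect application for them.

Order doesn't matter. I just need to replace all of the names specified in the updated question above. That is: loop over that column, evaluate the name, update the name, move on, repeat.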
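For the case raised in the comments, where order does not matter and each name simply needs to be replaced, a named lookup vector can update the whole column without an explicit loop. This is a minimal sketch in base R; the short labels and the extra value "Some Other Name" are hypothetical, and it assumes that values without a mapping should be left unchanged.

```r
long_names <- c("Really Long Name One",
                "Super Really Long Name Two",
                "Another Really Flippin' Long Name Three",
                ",Seriously, It's a Fourth Long Name",
                "Definitely a Fifth Long Name",
                "Finally, This guy is done, number six")

# Assumed lookup table: names are the long strings, values the short labels.
lookup <- setNames(c("one", "two", "three", "four", "five", "six"), long_names)

# Small unsorted example; "Some Other Name" is a hypothetical unmapped value.
df <- data.frame(control_category = c(long_names[c(3, 1, 5)], "Some Other Name"),
                 stringsAsFactors = FALSE)

# match() compares whole values, so partial matches are never replaced.
idx <- match(df$control_category, names(lookup))

# Keep the original value where no mapping exists (idx is NA there).
df$control_category <- ifelse(is.na(idx),
                              df$control_category,
                              unname(lookup)[idx])
df
```

Unlike the gsub() approach, this compares entire strings, and unlike factor(), it leaves unexpected values in place instead of turning them into NA.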