R: Coerce a character vector to a factor, with factor levels ordered by another vector

Consider a dataset like this:

# creating data for test
set.seed(1839)
id <- as.character(1:10)
frequency <- sample(c("n", "r", "s", "o", "a"), 10, TRUE)
frequency_value <- sapply(
  frequency, switch, "n" = -2, "r" = -1, "s" = 0, "o" = 1, "a" = 2
)
(test <- data.frame(id, frequency, frequency_value))
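The variable frequency holds the responses I am interested in. They range from never to rarely, sometimes, often, and always. The labels are just the first letter of each word. The order is given in frequency_value.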

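What I want to do is turn frequency into a factor whose levels are ordered n, r, s, o, a. However, I want that order to depend on the values in frequency_value: the levels should follow the order stored in frequency_value rather than being hard-coded (as in factor(frequency, levels = c("n", "r", "s", "o", "a"))).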

I have considered this tidyverse solution:

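library(dplyr)  # provides %>%, arrange() and pull()

# distinct (label, value) pairs, sorted by value, reduced to the labels
levels <- test[, c("frequency", "frequency_value")] %>% 
  unique() %>% 
  arrange(as.numeric(frequency_value)) %>% 
  pull(frequency) %>% 
  as.character()
test$frequency <- factor(test$frequency, levels)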

Use order on the unique combinations (which you are already using):

test$frequency <- factor(test$frequency, 
                         with(unique(test[, -1]), frequency[order(frequency_value)]))
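
The one-liner above can be unpacked into separate steps. The following is only a sketch: the data are typed in by hand (using the values from this question) so that the result does not depend on the random number generator, and the helper names value_map, pairs and lvls are made up for illustration. It assumes each label maps to exactly one numeric value.

```r
# Data entered by hand so the example is reproducible in any R version
test <- data.frame(
  id = as.character(1:10),
  frequency = c("a", "o", "r", "o", "o", "s", "n", "n", "r", "n"),
  stringsAsFactors = FALSE
)
# value_map is a hypothetical lookup vector, not part of the original post
value_map <- c(n = -2, r = -1, s = 0, o = 1, a = 2)
test$frequency_value <- unname(value_map[test$frequency])

# One row per distinct (frequency, frequency_value) pair
pairs <- unique(test[, c("frequency", "frequency_value")])

# Labels sorted by their numeric value become the factor levels
lvls <- pairs$frequency[order(pairs$frequency_value)]
test$frequency <- factor(test$frequency, levels = lvls)

levels(test$frequency)
# [1] "n" "r" "s" "o" "a"
```

The row order of test is left untouched; only the level order of the factor changes.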

One option is to use just dplyr:

library(dplyr)
test <- test %>% arrange(frequency_value) %>% 
  mutate(frequency = factor(frequency, levels = unique(frequency))) 

test

#    id frequency frequency_value
# 1   7         n              -2
# 2   8         n              -2
# 3  10         n              -2
# 4   3         r              -1
# 5   9         r              -1
# 6   6         s               0
# 7   2         o               1
# 8   4         o               1
# 9   5         o               1
# 10  1         a               2

str(test)
#'data.frame':  10 obs. of  3 variables:
# $ id             : Factor w/ 10 levels "1","10","2","3",..: 8 9 2 4 10 7 3 5 6 1
# $ frequency      : Factor w/ 5 levels "n","r","s","o",..: 1 1 1 2 2 3 4 4 4 5
# $ frequency_value: num  -2 -2 -2 -1 -1 0 1 1 1 2
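
Note that the dplyr pipeline above also sorts the rows. If the original row order should be kept, one alternative that is not taken from the answers here is base R's reorder(), which orders the levels by a summary of a second vector. A minimal sketch, with the data typed in by hand and assuming each label corresponds to a single value (so the mean per level is just that value):

```r
test <- data.frame(
  id = as.character(1:10),
  frequency = c("a", "o", "r", "o", "o", "s", "n", "n", "r", "n"),
  frequency_value = c(2, 1, -1, 1, 1, 0, -2, -2, -1, -2),
  stringsAsFactors = FALSE
)

# reorder() sorts the levels by the mean of frequency_value within each
# level; the rows themselves stay where they are
test$frequency <- reorder(factor(test$frequency), test$frequency_value, FUN = mean)

levels(test$frequency)
# [1] "n" "r" "s" "o" "a"
```

reorder() also attaches a "scores" attribute to the result, holding the per-level summary it used for sorting.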

With the original row order kept (as in the order-based approach), test$frequency prints as:

# [1] a o r o o s n n r n
# Levels: n r s o a