
R: replacing the levels of multiple factors


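I need to replace the levels of several factors in a data frame so that they are all uniform. For example, these are the levels of one of the factors: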

> levels(workco[,5])
 [1] " "                              "1"                              "2"                             
 [4] "kóko"                          "kesätyö"                      "Kesätyö kokoaika"            
 [7] "koko"                           "kokop"                          "kokop."                        
[10] "Kokopäivä"                    "kokopäiväinen"                "Kokopäiväinen"               
[13] "kokopäiväinen / osa-aikainen" "kokopäivänen"                 "kokp"                          
[16] "kokp."                          "Kokp."                          "osa-aik"                       
[19] "Osa-aik / Kokopäiv."           "osa-aik."                       "Osa-aik."                      
[22] "osa-aikainen"                   "Osa-aikainen"                   "osa-aikainen/kokopäiväinen"  
[25] "Osa/kokoaikainen"               "Osap."                  
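Suppose I have 12 columns, all factors, whose levels use different names for the same things. As the example shows, many of the level names share the same letters: koko, kok, kop ... I would like to end up with three unified levels: kokop, osa and kes. In addition, the levels named with the numbers 1 and 2 should be recoded as kokop and osa, respectively.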

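Nothing I have tried so far has worked, and I fear that is because I am making this more complicated than it really is: I have tried loops using adist() and grep(), but I got errors. For example: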

code <- c("kok","osa","ma","kes",1,2," ")
list.names <- c("1", "2", "3", "4", "5", "6","7","8","9","10","11","12")
mylist <- vector("list", length(list.names))
names(mylist) <- list.names
D <- mylist
index <- mylist

for (i in ncol(workco2)){                            
  D[[i]] <- adist(workco2[,i],code,ignore.case=TRUE)
  index[[i]] <- lapply(D[[i]],which.min)
  workco2[,i] <- data.frame(code[index[[i]]])
}

Could you tell me how to solve this? It is probably much simpler than I think.

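I usually merge factor levels as in the example below. I subset the levels that match my criterion (... %in% c(...)) and overwrite them with the new level.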

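set.seed(357)
# stringsAsFactors = TRUE is needed from R 4.0.0 on, where data.frame() no longer creates factors by default
xy <- data.frame(name = sample(letters[1:4], size = 20, replace = TRUE), value = runif(20),
                 stringsAsFactors = TRUE)
xy$name
 [1] a a b a c b d c d d c c b a c a b d c b
Levels: a b c d
# Overwrite the matching levels; giving several old levels the same new name merges them
levels(xy$name)[levels(xy$name) %in% c("a", "b")] <- "a-b"
levels(xy$name)[levels(xy$name) %in% c("c", "d")] <- "c-d"
xy$name
 [1] a-b a-b a-b a-b c-d a-b c-d c-d c-d c-d c-d c-d a-b a-b c-d a-b a-b c-d c-d a-b
Levels: a-b c-d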
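
The same merge can also be written as a single assignment, because levels<- on a factor accepts a named list that maps each new level to the old levels it absorbs. A minimal sketch, assuming a small hand-made factor f (hypothetical, not taken from the question). Old levels that are not mentioned in the list become NA, so every level you want to keep should be listed.

```r
# Hypothetical factor, for illustration only
f <- factor(c("a", "b", "c", "d", "a", "c"))

# Named list: new level name = old levels to merge into it
levels(f) <- list("a-b" = c("a", "b"), "c-d" = c("c", "d"))

print(f)
print(levels(f))
```

This form keeps the whole mapping in one place, which can be easier to review than a series of separate %in% assignments.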

I guess you need a combination of grep and replace. This may speed up recoding levels that share similar syllables ("ko", "kok").

Example data

code <- as.factor(c("kok","osa","ma","kes", "koko", "osa-aikainen", "osa/kes"))

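Please include your code and the expected output. What should happen with mixed levels such as "kokopäiväinen / osa-aikainen"?
Sorry Roland, I had only pasted the error message. Mixed levels should be coded as "osa", or as "kes" if that appears; if osa and kes appear together, choose "kes".
@Gina Zetkin. Did our answers help you?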
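code <- as.factor(c("kok","osa","ma","kes", "koko", "osa-aikainen", "osa/kes"))
# Add the target level first: replace() can only assign values that are already levels
levels(code) <- c(levels(code), "kokop")
# Chain each step on new.code; starting again from code would discard the earlier replacements
new.code <- replace(code, grep("kok", code), "kokop")
new.code <- replace(new.code, grep("osa/kes", code), "kes")
new.code <- replace(new.code, grep("ko", code), "kokop")
# Drop the old levels that are no longer used
new.code <- droplevels(new.code)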
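
Putting the two answers together, one option is to recode the level names rather than the individual values, and to apply the priority stated in the comments (kes over osa over kokop) to every factor column. This is only a sketch under assumptions: the helper name unify_levels, the patterns "ko", "osa" and "kes", and the toy data frame workco_demo are made up for illustration. Levels that match nothing, such as " ", become NA here, which may or may not be what is wanted.

```r
# Hypothetical helper: map every level of a factor onto "kokop", "osa" or "kes"
unify_levels <- function(f) {
  old <- levels(f)
  new <- rep(NA_character_, length(old))
  # Patterns run from lowest to highest priority, so later matches overwrite earlier ones
  new[grepl("ko", old, ignore.case = TRUE) | old == "1"] <- "kokop"
  new[grepl("osa", old, ignore.case = TRUE) | old == "2"] <- "osa"
  new[grepl("kes", old, ignore.case = TRUE)] <- "kes"
  levels(f) <- new  # identical new names merge the old levels; NA drops them
  factor(f, levels = c("kokop", "osa", "kes"))  # same level order in every column
}

# Toy data standing in for the real data frame (assumption)
workco_demo <- data.frame(
  a = factor(c("koko", "Kokp.", "osa-aik.", "1", " ")),
  b = factor(c("Osa/kokoaikainen", "kes", "2", "kokop", "osa/kes"))
)

# Recode only the columns that are factors
is_fac <- vapply(workco_demo, is.factor, logical(1))
workco_demo[is_fac] <- lapply(workco_demo[is_fac], unify_levels)
print(workco_demo)
```

Working on levels(f) means each distinct spelling is matched once per column instead of once per row, and the lapply() call avoids the explicit loop over column indices.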