How to recode only selected columns in R


I have a data frame with the following column names and values:

|     ss1         |     ss2      |      ss3        |          
|Strongly Agree   |Disagree      |Agree            |
|Agree            |Agree         |Disagree         |
|Strongly Disagree|Agree         |Disagree         |
|Disagree         |Strongly Agree|Strongly Disagree|

I am looking for a way to recode only columns ss1 and ss3 so that:

Strongly Agree - 1
Agree - 2
Disagree - 3
Strongly Disagree - 4

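However, column ss2 should be recoded in the opposite direction, i.e. Strongly Disagree - 1, Disagree - 2, Agree - 3, and Strongly Agree - 4.

So far I have tried the following code: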

If((names(df=="ss1")) |(names(df=="ss3"))) {
   lapply(df, 
     FUN = function(x) recode(x, 
        "'Strongly Disagree'=4; 
         'Disagree'=3; 
         'Agree'=2; 
         'Strongly Agree'=1; 
         'No Opinion'=''"))}

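I know that my lapply call, as written, recodes every column. Is there a way to restrict the recoding to only the columns whose names match the IF expression?

Also, is there a way to use a logical "OR" in my IF condition?

I want to keep the IF condition because I want to match on column names and then specify the recoding for each match.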

The output should look like this:

|     ss1         |     ss2      |      ss3        |          
|1                |2             |2                |
|2                |3             |3                |
|4                |3             |3                |
|3                |4             |4                |

Apologies if the question is a bit unclear.

Here is how to do this with dplyr. To recode columns, use mutate_at and recode (as suggested by p o m). Because ss1 and ss3 are coded in the opposite order from ss2, you need two separate mutate_at calls.

library(dplyr)

# Rebuild the example data; quoted fields keep multi-word answers together
df1 <- read.table(text="ss1              ss2            ss3
'Strongly Agree'   Disagree      Agree
Agree            Agree         Disagree
'Strongly Disagree' Agree         Disagree
Disagree         'Strongly Agree' 'Strongly Disagree'", header=TRUE, stringsAsFactors=FALSE)

df1 %>%
  # ss1 and ss3: Strongly Agree = 1 up to Strongly Disagree = 4
  mutate_at(.vars = vars(ss1, ss3),
            .funs = ~ recode(., 'Strongly Disagree' = 4, 'Disagree' = 3, 'Agree' = 2,
                             'Strongly Agree' = 1, .default = NA_real_)) %>%
  # ss2: the reversed scale, Strongly Disagree = 1 up to Strongly Agree = 4
  mutate_at(.vars = vars(ss2),
            .funs = ~ recode(., 'Strongly Disagree' = 1, 'Disagree' = 2, 'Agree' = 3,
                             'Strongly Agree' = 4, .default = NA_real_))
  ss1 ss2 ss3
1   1   2   2
2   2   3   3
3   4   3   3
4   3   4   4
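
For readers on a newer dplyr, the same two-step recode can be sketched with across() inside a single mutate(). This is only an illustrative variant, assuming dplyr 1.0.0 or later; the data frame is rebuilt with data.frame() and the name `recoded` is hypothetical, not part of the original answer.

```r
library(dplyr)

# Example data from the question, rebuilt so the block is self-contained
df1 <- data.frame(
  ss1 = c("Strongly Agree", "Agree", "Strongly Disagree", "Disagree"),
  ss2 = c("Disagree", "Agree", "Agree", "Strongly Agree"),
  ss3 = c("Agree", "Disagree", "Disagree", "Strongly Disagree"),
  stringsAsFactors = FALSE
)

# `recoded` is an illustrative name (assumption, not from the answer)
recoded <- df1 %>%
  mutate(
    # ss1 and ss3 use the forward scale
    across(c(ss1, ss3),
           ~ recode(.x, "Strongly Agree" = 1, "Agree" = 2, "Disagree" = 3,
                    "Strongly Disagree" = 4, .default = NA_real_)),
    # ss2 uses the reversed scale
    across(ss2,
           ~ recode(.x, "Strongly Disagree" = 1, "Disagree" = 2, "Agree" = 3,
                    "Strongly Agree" = 4, .default = NA_real_))
  )

recoded
```

Any value not listed in a recode() call, such as 'No Opinion', becomes NA because of .default = NA_real_.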

A quick solution using data.table:

library(data.table)

# Function to reclassify a column; unmatched values become NA so the result stays numeric
myfun <- function(x) {
  ifelse(x == 'Strongly Disagree', 4,
  ifelse(x == 'Disagree', 3,
  ifelse(x == 'Agree', 2,
  ifelse(x == 'Strongly Agree', 1, NA_real_))))
}

# Indicate which columns should be transformed
cols <- c('ss1', 'ss3')

# Reclassify the selected columns in place
setDT(df1)[, (cols) := lapply(.SD, myfun), .SDcols = cols]
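
The same in-place update can also be sketched without nested ifelse() by indexing a named lookup vector, which extends to the reversed ss2 scale as well. This is a minimal sketch rather than the answerer's code: the names `dt`, `forward`, and `reversed` are hypothetical, and it assumes the columns are character vectors (not factors).

```r
library(data.table)

# Example data from the question, rebuilt so the block is self-contained
dt <- data.table(
  ss1 = c("Strongly Agree", "Agree", "Strongly Disagree", "Disagree"),
  ss2 = c("Disagree", "Agree", "Agree", "Strongly Agree"),
  ss3 = c("Agree", "Disagree", "Disagree", "Strongly Disagree")
)

# Hypothetical lookup vectors: names are the old labels, values the new codes
forward  <- c("Strongly Agree" = 1, "Agree" = 2, "Disagree" = 3, "Strongly Disagree" = 4)
reversed <- 5 - forward  # same names, codes flipped: 4, 3, 2, 1

# Forward scale for ss1 and ss3; labels missing from the lookup become NA
cols <- c("ss1", "ss3")
dt[, (cols) := lapply(.SD, function(x) unname(forward[x])), .SDcols = cols]

# Reversed scale for ss2
dt[, ss2 := unname(reversed[ss2])]

dt
```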

Instead of nested ifelse() calls, dplyr provides recode() ;-): df1 %>% mutate_at(.vars = vars(ss1, ss3), .funs = ~ recode(., 'Strongly Disagree' = 4, 'Disagree' = 3, 'Agree' = 2, 'Strongly Agree' = 1, .default = NA_real_))

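I think this is quite the opposite of data.table style: ifelse is inefficient and messy. I think the standard approach would be a join, as in the linked example (though using on= rather than setting keys).

Thanks for the heads-up, @Frank. I hadn't thought of that approach; it looks less intuitive but is more efficient. Feel free to add your own answer using the strategy from your link.

@Pereira, thank you, sir! I really like the second approach.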

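library(data.table)
setDT(df1)

# Columns to recode
cols <- c('ss1', 'ss3')

# Lookup table mapping each old label to its new code
recDT = data.table(
  old = c('Strongly Disagree', 'Disagree', 'Agree', 'Strongly Agree'), 
  new = 4:1)

# Update join: match each column against recDT$old and store the code in <col>_new
for (col in cols) df1[recDT, on=setNames("old", col), paste0(col, "_new") := i.new]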
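
Returning to the original question about keeping an if condition with a logical OR on column names: none of the answers show this directly, so the following is only a base R sketch under stated assumptions. The names `df`, `forward`, and `reversed` are hypothetical, and the columns are assumed to be character vectors (indexing a lookup by a factor would use its integer codes instead of its labels).

```r
# Example data from the question
df <- data.frame(
  ss1 = c("Strongly Agree", "Agree", "Strongly Disagree", "Disagree"),
  ss2 = c("Disagree", "Agree", "Agree", "Strongly Agree"),
  ss3 = c("Agree", "Disagree", "Disagree", "Strongly Disagree"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup vectors: names are the old labels, values the new codes
forward  <- c("Strongly Agree" = 1, "Agree" = 2, "Disagree" = 3, "Strongly Disagree" = 4)
reversed <- c("Strongly Disagree" = 1, "Disagree" = 2, "Agree" = 3, "Strongly Agree" = 4)

for (nm in names(df)) {
  # `||` is the scalar logical OR, which is what an if() condition needs
  if (nm == "ss1" || nm == "ss3") {
    df[[nm]] <- unname(forward[df[[nm]]])
  } else if (nm == "ss2") {
    df[[nm]] <- unname(reversed[df[[nm]]])
  }
}

df
```

With more than a couple of names, `nm %in% c("ss1", "ss3")` expresses the same OR condition more compactly.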