R: conditionally replace values in an ordered factor column without losing levels or other attributes
Background: I am working with a number of large survey datasets exported from Qualtrics. Each dataset contains duplicate questions in Spanish and English. The subset of survey questions a participant answered depends on their response to the lang question in the survey. Answers to the Spanish and English questions are recorded in separate columns of the data frame, and the column names for the Spanish answers carry the suffix _sp. See the example data frame below.
df <- structure(list(id = c(1,2,3,4,5,6,7,8,9,10), lang = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L), .Label = c("English / Inglés", "Spanish / Español"), class = c("ordered", "factor")), mob_1 = structure(c(5L, 2L, 6L, 1L, 6L, 8L, 8L, 8L, 8L, 8L), .Label = c("Strongly agree", "Agree", "Somewhat agree", "Neither agree nor disagree", "Somewhat disagree", "Disagree", "Strongly disagree", NA), class = c("ordered", "factor")), mob_2 = structure(c(2L, 3L, 2L, 3L, 5L, 6L, 6L, 6L, 6L, 6L), .Label = c("A lot worse", "A little worse", "The same", "A little better", "A lot better", NA), class = c("ordered", "factor")), mob_1_sp = structure(c(8L, 8L, 8L, 8L, 8L, 5L, 2L, 6L, 1L, 6L), .Label = c("Totalmente de acuerdo", "De acuerdo", "Algo de acuerdo", "Ni de acuerdo ni en desacuerdo", "Algo en desacuerdo", "En desacuerdo", "Totalmente en desacuerdo", NA), class = c("ordered", "factor")), mob_2_sp = structure(c(6L, 6L, 6L, 6L, 6L, 2L, 3L, 2L, 3L, 5L), .Label = c("Mucho peor", "Un poco peor", "Igual", "Un poco mejor", "Mucho mejor", NA), class = c("ordered", "factor"))), row.names = c(NA, -10L), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"))
# A tibble: 10 x 6
      id lang              mob_1             mob_2          mob_1_sp              mob_2_sp
   <dbl> <ord>             <ord>             <ord>          <ord>                 <ord>
 1     1 English / Inglés  Somewhat disagree A little worse NA                    NA
 2     2 English / Inglés  Agree             The same       NA                    NA
 3     3 English / Inglés  Disagree          A little worse NA                    NA
 4     4 English / Inglés  Strongly agree    The same       NA                    NA
 5     5 English / Inglés  Disagree          A lot better   NA                    NA
 6     6 Spanish / Español NA                NA             Algo en desacuerdo    Un poco peor
 7     7 Spanish / Español NA                NA             De acuerdo            Igual
 8     8 Spanish / Español NA                NA             En desacuerdo         Un poco peor
 9     9 Spanish / Español NA                NA             Totalmente de acuerdo Igual
10    10 Spanish / Español NA                NA             En desacuerdo         Mucho mejor
> str(df)
Classes ‘spec_tbl_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 10 obs. of 6 variables:
$ id : num 1 2 3 4 5 6 7 8 9 10
$ lang : Ord.factor w/ 2 levels "English / Inglés"<..: 1 1 1 1 1 2 2 2 2 2
$ mob_1 : Ord.factor w/ 8 levels "Strongly agree"<..: 5 2 6 1 6 8 8 8 8 8
$ mob_2 : Ord.factor w/ 6 levels "A lot worse"<..: 2 3 2 3 5 6 6 6 6 6
$ mob_1_sp: Ord.factor w/ 8 levels "Totalmente de acuerdo"<..: 8 8 8 8 8 5 2 6 1 6
$ mob_2_sp: Ord.factor w/ 6 levels "Mucho peor"<"Un poco peor"<..: 6 6 6 6 6 2 3 2 3 5
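I think the level mapping goes wrong because the survey responses have 8 levels, whereas my reproducible data frame contains only 4 unique values.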
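To illustrate that suspicion, here is a minimal sketch on a single vector rather than the full data frame. It assumes a 7-level scale without the explicit NA level, and as.integer() stands in for the bare integer codes that ifelse() returns when it is given a factor. The variable names are made up for the example.

```r
# Assumed 7-level agreement scale (the explicit NA level is left out here).
lv <- c("Strongly agree", "Agree", "Somewhat agree",
        "Neither agree nor disagree", "Somewhat disagree",
        "Disagree", "Strongly disagree")

x <- factor(c("Somewhat disagree", "Agree", "Disagree",
              "Strongly agree", "Disagree"),
            levels = lv, ordered = TRUE)

codes <- as.integer(x)        # 5 2 6 1 6
y <- as.ordered(codes)        # levels are built from the distinct codes only
collapsed_levels <- levels(y) # "1" "2" "5" "6"

# Relabelling now pairs the 4 surviving levels with the first 4 labels,
# so code 5 ends up under the third label.
levels(y) <- lv
relabelled <- as.character(y)
relabelled[1]                 # "Somewhat agree", originally "Somewhat disagree"
```

If this reading is right, the labels shift because as.ordered() never sees the unused codes, not because the Spanish and English scales disagree.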
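I would appreciate help pinpointing where I went wrong, and whether there is a way to insert the Spanish column values into the English columns without affecting the column attributes.

How would you add values from mob_1_sp without changing the levels of the ordered factor? The levels in mob_1_sp do not originally occur in mob_1.

@Ronaksha The levels in mob_1_sp are the same as the levels in mob_1, just in Spanish rather than English. I don't think R knows that the levels are equivalent and differ only in language.

@Ronaksha Agreed. However, the ordered values of the factors are the same across the English and Spanish columns in the original dataset (for example, mob_1 == 1L is labelled "Strongly agree" and mob_1_sp == 1L is labelled "Totalmente de acuerdo", the Spanish translation of "Strongly agree"). Can I just insert the numeric values of mob_1_sp into mob_1 without dropping mob_1's levels/labels?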
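One way to read that last comment is as a code transfer: take the integer codes, fill in the Spanish ones where the English answer is missing, and rebuild the factor with the full English level set spelled out. A rough sketch on two toy vectors follows. The 3-level scale and the variable names are assumptions for illustration, and real NA values are used instead of an NA level.

```r
# Toy scales, assumed to be position-for-position translations of each other.
en_lv <- c("Strongly agree", "Agree", "Disagree")
sp_lv <- c("Totalmente de acuerdo", "De acuerdo", "En desacuerdo")

en <- factor(c("Agree", NA, NA), levels = en_lv, ordered = TRUE)
sp <- factor(c(NA, "En desacuerdo", "Totalmente de acuerdo"),
             levels = sp_lv, ordered = TRUE)

# Work on the integer codes and fill the gaps from the Spanish vector.
codes <- as.integer(en)
use_sp <- is.na(codes)
codes[use_sp] <- as.integer(sp)[use_sp]

# Naming every level explicitly keeps unused levels from being dropped.
merged <- factor(codes, levels = seq_along(en_lv), labels = en_lv,
                 ordered = TRUE)
as.character(merged)  # "Agree" "Disagree" "Strongly agree"
```

This only makes sense if both scales really list their levels in the same order, which is worth checking before trusting the result.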
for (i in colnames(df)) {
  if (grepl("_sp", i)) {
    eng_var <- gsub("_sp", "", i)  # get name of English variable equivalent
    levels(df[[i]]) <- levels(df[[eng_var]])  # assign English levels to Spanish variable
    # conditionally replace values of English variable
    df[[eng_var]] <- as.ordered(ifelse(df$lang == "Spanish / Español",
                                       as.numeric(df[[i]]),
                                       df[[eng_var]]))
    levels(df[[eng_var]]) <- levels(df[[i]])  # re-assign English levels from Spanish variable
  }
}
> df
# A tibble: 10 x 6
      id lang              mob_1                      mob_2          mob_1_sp          mob_2_sp
   <dbl> <ord>             <ord>                      <ord>          <ord>             <ord>
 1     1 English / Inglés  Somewhat agree             A lot worse    NA                NA
 2     2 English / Inglés  Agree                      A little worse NA                NA
 3     3 English / Inglés  Neither agree nor disagree A lot worse    NA                NA
 4     4 English / Inglés  Strongly agree             A little worse NA                NA
 5     5 English / Inglés  Neither agree nor disagree The same       NA                NA
 6     6 Spanish / Español Somewhat agree             A lot worse    Somewhat disagree A little worse
 7     7 Spanish / Español Agree                      A little worse Agree             The same
 8     8 Spanish / Español Neither agree nor disagree A lot worse    Disagree          A little worse
 9     9 Spanish / Español Strongly agree             A little worse Strongly agree    The same
10    10 Spanish / Español Neither agree nor disagree The same       Disagree          A lot better
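Compared with the original data, the labels have shifted: row 1 of mob_1 was Somewhat disagree and is now Somewhat agree. A possible alternative, sketched under assumptions, is to replace only the integer codes and then re-attach the English column's original attributes, so the levels, the ordered class and any other attributes are carried over unchanged. The helper name merge_sp_into_en, its arguments and the toy data frame are hypothetical. The sketch also assumes the English and Spanish level sets line up position by position.

```r
# Hypothetical helper: copy Spanish answers into the matching English columns.
# `is_spanish` is a logical vector marking the rows answered in Spanish.
merge_sp_into_en <- function(df, is_spanish, suffix = "_sp") {
  pattern <- paste0(suffix, "$")  # anchor the suffix at the end of the name
  for (sp_col in grep(pattern, names(df), value = TRUE)) {
    en_col <- sub(pattern, "", sp_col)
    en <- df[[en_col]]
    sp <- df[[sp_col]]
    # Codes can only be transferred if both scales have the same length.
    stopifnot(is.factor(en), is.factor(sp), nlevels(en) == nlevels(sp))
    codes <- as.integer(en)
    codes[is_spanish] <- as.integer(sp)[is_spanish]
    attributes(codes) <- attributes(en)  # restore levels, class, etc.
    df[[en_col]] <- codes
  }
  df
}

# Toy data, assumed for illustration only.
toy <- data.frame(lang = c("English", "English", "Spanish", "Spanish"),
                  stringsAsFactors = FALSE)
toy$q1 <- factor(c("Agree", "Disagree", NA, NA),
                 levels = c("Agree", "Neutral", "Disagree"), ordered = TRUE)
toy$q1_sp <- factor(c(NA, NA, "En desacuerdo", "De acuerdo"),
                    levels = c("De acuerdo", "Neutral", "En desacuerdo"),
                    ordered = TRUE)

out <- merge_sp_into_en(toy, toy$lang == "Spanish")
as.character(out$q1)  # "Agree" "Disagree" "Disagree" "Agree"
```

For the data frame in the question, the call would presumably look like merge_sp_into_en(df, df$lang == "Spanish / Español"). Because the codes are copied as they are, an NA stored as an explicit level should be carried along in the same way as any other level.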