Data cleaning: converting to tidyverse

Tags: r, tidyverse, data-munging

I'm curious whether the code below can be converted to tidyverse code. I tried dplyr::mutate but couldn't get it to work.

df$Gender[df$Gender == "M"] <- "Man"
df$Gender[df$Gender == "Male"] <- "Man"
df$Gender[df$Gender == "F"] <- "Woman"
df$Gender[df$Gender == "Female"] <- "Woman"
df$Gender[df$Gender == "M & F"] <- "Man and Woman"
df$Gender[df$Gender == "Male & Female"] <- "Man and Woman"

One approach is to use dplyr::case_when() inside mutate().
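As a minimal sketch of what this might look like (not necessarily the exact code the answer had in mind), assuming Gender is a character column. The sample data frame and its "Other" value are made up for illustration:

```r
library(dplyr)

# Hypothetical example data covering the values from the question,
# plus one value ("Other") that should pass through untouched
df <- data.frame(
  Gender = c("M", "Male", "F", "Female", "M & F", "Male & Female", "Other"),
  stringsAsFactors = FALSE
)

df_clean <- df %>%
  mutate(
    Gender = case_when(
      Gender %in% c("M", "Male")              ~ "Man",
      Gender %in% c("F", "Female")            ~ "Woman",
      Gender %in% c("M & F", "Male & Female") ~ "Man and Woman",
      TRUE                                    ~ Gender  # leave everything else unchanged
    )
  )

df_clean
```

Grouping the raw values with %in% keeps it to one line per target label, and the final TRUE branch means unmatched values are kept rather than turned into NA.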

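Finally, one more tip: when there are many unique values to group, using case_when() (or nested ifelse() calls, subset assignment, and so on) can get quite tedious. One way to avoid a lot of that pain is to use a named vector as a dictionary-style "lookup table" (an informal term) and replace each value through it. In my experience this usually feels cleanest: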

# the unique values 
gender_values <- c("M","Man","Male","F","Woman","Female","MF","male-female")

# associate unique values with our new labels: "m", "f", and "b"
gender_lkup <- setNames(c("m","m","m","f","f","f","b","b"), gender_values)

# suppose this is a column of a df 
raw_column <- sample(gender_values, 10, replace=TRUE)

# create a clean one with `gender_lkup` 
clean_column <- gender_lkup[raw_column]

# inspect the two vectors side-by-side
data.frame(original=raw_column, cleaned=clean_column)
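
The same lookup idea can be applied to the data frame from the question inside mutate(). This is only a sketch under some assumptions: gender_map is a hypothetical name, the sample rows are invented, and values missing from the table are assumed to be kept as they are.

```r
library(dplyr)

# Hypothetical lookup table: names are the raw values, elements are the new labels
gender_map <- c(
  "M" = "Man", "Male" = "Man",
  "F" = "Woman", "Female" = "Woman",
  "M & F" = "Man and Woman", "Male & Female" = "Man and Woman"
)

# Invented sample data; "Other" is deliberately absent from gender_map
df <- data.frame(
  Gender = c("M", "Female", "M & F", "Other"),
  stringsAsFactors = FALSE
)

df_clean <- df %>%
  mutate(
    # Indexing a named vector by the raw values returns the new labels (NA if
    # there is no match); unname() drops the names that come along, and
    # coalesce() falls back to the original value wherever the lookup gave NA
    Gender = coalesce(unname(gender_map[Gender]), Gender)
  )

df_clean
```

Adding a new raw value then only means adding one entry to gender_map, with no change to the mutate() call.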