R: creating a new variable
I have the following data.

Using the dplyr package, I want to create from "NAME" a variable "TITLE" that takes the values MASTER, MISS, MR, MRS and OTHER. MISS sometimes appears as MLLE, and MRS sometimes appears as Ms or MME.

I tried this:
Title_Master <- titanic2 %>%
  filter(str_detect(Name, "Master") & Sex == "male") %>%
  mutate(Title = "Master")

Title_Miss <- titanic2 %>%
  filter((str_detect(Name, "Miss") | str_detect(Name, "Mlle")) & Sex == "female") %>%
  mutate(Title = "Miss")

Title_Mr <- titanic2 %>%
  filter(str_detect(Name, "Mr") & Sex == "male") %>%
  mutate(Title = "Mr")

Title_Mrs <- titanic2 %>%
  filter((str_detect(Name, "Mrs") | str_detect(Name, "Ms") |
            str_detect(Name, "Mme")) & Sex == "female") %>%
  mutate(Title = "Mrs")

T_Title <- rbind(Title_Master, Title_Miss, Title_Mr, Title_Mrs)
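See dplyr::case_when.

Hello, thank you for your reply. I am trying this:

titanic2 %>% select(Name, Sex) %>% mutate(Type = case_when(("Master") & Sex == "male" ~ "Master", ("Miss" | "Mlle") & Sex == "female" ~ "Miss", ("Mr") & Sex == "male" ~ "Mr", (("Mrs" | "Ms" | "Mme") & Sex == "female" ~ "Mrs" TRUE ~ "Other"))

but I get the following error:

Error: in: "(("Mrs" | "Ms" | "Mme") & Sex == "female" ~ "Mrs" TRUE"

dplyr::case_when is an improvement over multiple nested ifelse(...) expressions: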
# Always including the libraries and the data set used is important for reproducibility
library(tidyverse)
library(stringr)
# install.packages("titanic")
library(titanic)

titanic2 <- titanic::titanic_test

titanic2 %>%
  mutate(Title = case_when(
    str_detect(Name, "Master") & Sex == "male" ~ "Master",
    str_detect(Name, "Miss|Mlle") & Sex == "female" ~ "Miss",
    str_detect(Name, "Mr") & Sex == "male" ~ "Mr",
    str_detect(Name, "Mrs|Ms|Mme") & Sex == "female" ~ "Mrs",
    TRUE ~ "OTHER"
  )) %>%
  group_by(Sex, Title) %>%
  summarise(N = n())
# A tibble: 6 x 3
# Groups:   Sex [?]
  Sex    Title      N
  <chr>  <chr>  <int>
1 female Miss      78
2 female Mrs       73
3 female OTHER      1
4 male   Master    21
5 male   Mr       240
6 male   OTHER      5
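
As for the attempt posted in the comments, two things stop it from parsing or running: a bare string such as "Miss" | "Mlle" is not a logical test (each condition needs str_detect() on Name), and the comma before TRUE is missing. The sketch below is one way it could be repaired. It does not use the titanic package; the data frame `passengers` and the names in it are made up for illustration, and `titled` and `nested` are hypothetical variable names. The nested ifelse() version is included only to show what case_when replaces.

```r
library(dplyr)
library(stringr)

# Made-up rows in a "Surname, Title. Given names" layout (an assumption
# about the data; these are not real passengers).
passengers <- data.frame(
  Name = c("Doe, Master. John", "Doe, Miss. Jane", "Roe, Mlle. Anne",
           "Doe, Mr. James", "Doe, Mrs. James (Mary)", "Roe, Ms. Ella",
           "Roe, Mme. Claire", "Poe, Dr. Alan"),
  Sex = c("male", "female", "female", "male",
          "female", "female", "female", "male"),
  stringsAsFactors = FALSE
)

# case_when() checks the conditions in order and uses the first one that
# is TRUE; every case is separated by a comma, including the last one
# before the TRUE fallback.
titled <- passengers %>%
  mutate(Title = case_when(
    str_detect(Name, "Master") & Sex == "male" ~ "Master",
    str_detect(Name, "Miss|Mlle") & Sex == "female" ~ "Miss",
    str_detect(Name, "Mrs|Ms|Mme") & Sex == "female" ~ "Mrs",
    str_detect(Name, "Mr") & Sex == "male" ~ "Mr",
    TRUE ~ "OTHER"
  ))

# The same logic written as nested ifelse() calls, for comparison.
nested <- with(
  passengers,
  ifelse(str_detect(Name, "Master") & Sex == "male", "Master",
  ifelse(str_detect(Name, "Miss|Mlle") & Sex == "female", "Miss",
  ifelse(str_detect(Name, "Mrs|Ms|Mme") & Sex == "female", "Mrs",
  ifelse(str_detect(Name, "Mr") & Sex == "male", "Mr", "OTHER"))))
)

# Count passengers per Sex and Title.
titled %>% count(Sex, Title)
```

Unlike filtering each title into its own data frame and combining the pieces with rbind(), this keeps every row: names that match none of the patterns end up in OTHER instead of being dropped.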