Combining factor levels in an R data frame
I have a factor variable with three levels: Fatal injury, Non-fatal injury, and P.D. only:
head(OttawaCollisions$Collision_Classification)
[1] P.D. only Non-fatal injury P.D. only P.D. only P.D. only P.D. only
Levels: Fatal injury Non-fatal injury P.D. only
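How can I merge "Fatal injury" and "Non-fatal injury" into a single level, so that fatalities are counted together with the injuries?
Better yet, can I somehow eliminate the fatalities altogether? In that case I need to remove every fatal instance from the data frame, not just recode it as NA or something else.
Data: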
x <- factor( rep( c('P.D. only', 'Non-fatal injury' , 'fatal injury'), 2) )
x
# [1] P.D. only Non-fatal injury fatal injury P.D. only
# [5] Non-fatal injury fatal injury
# Levels: fatal injury Non-fatal injury P.D. only
EDIT: Code for combining the levels, using your data frame's names
# Level names must match the data exactly ('Fatal injury' in OttawaCollisions);
# any value not listed in levels silently becomes NA.
OttawaCollisions$CollisionClass <- factor( x = OttawaCollisions$CollisionClass,
                                           levels = c('P.D. only', 'Non-fatal injury' , 'Fatal injury'),
                                           labels = c('P.D. only', 'Fatalities', 'Fatalities'))
OttawaCollisions$CollisionClass <- droplevels(OttawaCollisions$CollisionClass)
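For the second part of the question (removing fatal collisions altogether rather than merging them), droplevels() is the piece that clears the leftover empty level once the rows are gone. A minimal sketch, assuming a small made-up data frame `collisions` standing in for OttawaCollisions, a hypothetical `id` column, and the level name "Fatal injury" as printed in the question:

```r
# Hypothetical stand-in for OttawaCollisions (values are illustrative only).
collisions <- data.frame(
  id = 1:6,
  CollisionClass = factor(c("P.D. only", "Non-fatal injury", "Fatal injury",
                            "P.D. only", "Non-fatal injury", "Fatal injury"))
)

# Keep every row that is not a fatal collision.
# %in% never returns NA, so rows with a missing class are kept as they are.
no_fatal <- collisions[!(collisions$CollisionClass %in% "Fatal injury"), ]

# Subsetting keeps the unused "Fatal injury" level; droplevels() removes it.
no_fatal$CollisionClass <- droplevels(no_fatal$CollisionClass)

nrow(no_fatal)
levels(no_fatal$CollisionClass)
```

Without the droplevels() step the rows are gone but "Fatal injury" would still show up as a zero-count level in table() and in plots.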
EDIT3: Another base R solution. I prefer this one over the first (which uses labels in factor()), because it makes life easier when the data has more levels.
OttawaCollisions$CollisionClass <- as.character(OttawaCollisions$CollisionClass)
# Overwrite both injury categories with one label, then convert back to a factor.
OttawaCollisions$CollisionClass <- factor( with(OttawaCollisions,
                                                replace( CollisionClass,
                                                         CollisionClass %in% c( "Fatal injury", "Non-fatal injury"),
                                                         "Fatalities") ) )
You can also reassign the levels directly:
> library(tibble)
> test_df <- tibble(x=as.factor(c('Fatal','Non-fatal','PD','Fatal','Non-fatal','PD')), y=1:6)
> test_df
# A tibble: 6 x 2
  x             y
  <fct>     <int>
1 Fatal         1
2 Non-fatal     2
3 PD            3
4 Fatal         4
5 Non-fatal     5
6 PD            6
> levels(test_df$x)
[1] "Fatal"     "Non-fatal" "PD"
Now that you know the order, replace the names of the levels you want to combine:
> levels(test_df$x) <- c("Fatal","Other","Other")
> test_df
# A tibble: 6 x 2
  x         y
  <fct> <int>
1 Fatal     1
2 Other     2
3 Other     3
4 Fatal     4
5 Other     5
6 Other     6
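Positional reassignment depends on getting the level order right. Base R's `levels<-` also accepts a named list, where each name is a new level and each value lists the old levels it absorbs, so the order no longer matters. A minimal sketch on a plain factor (the vector `f` is made up for illustration):

```r
# Illustrative factor with the same three levels as test_df$x.
f <- factor(c("Fatal", "Non-fatal", "PD", "Fatal", "Non-fatal", "PD"))

# Names are the new levels; values are the old levels folded into each one.
levels(f) <- list(Fatal = "Fatal", Other = c("Non-fatal", "PD"))

levels(f)
as.character(f)
```

Any old level that is left out of the list becomes NA, so it is worth listing every level explicitly.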
You can then do further processing, for example counting the rows in each level:
> library(dplyr)
> test_df %>% group_by(x) %>% summarize(n = n())
# A tibble: 2 x 2
  x         n
  <fct> <int>
1 Fatal     2
2 Other     4
Thanks, I'll let you know shortly. So x = x would be OttawaCollisions$CollisionClass = OttawaCollisions$CollisionClass?