R: add NULL (zero-count) rows when a group has no data
Below is a sample data frame that illustrates my problem. One group has no observations for some of the variable combinations, so R returns nothing for them. That is, for the data below, R returns:
Course Gender n
English1 Female 1
English1 Male 3
English2 Female 2
English2 Male 1
English2 Unknown 1
English3 Female 3
English3 Unknown 1
df1 <- data.frame("Course"=c("English1", "English1", "English1", "English1",
"English2", "English2", "English2", "English2",
"English3", "English3", "English3", "English3"),
Gender=c("Male", "Female", "Male", "Male", "Male", "Female",
"Unknown", "Female", "Female", "Female", "Female",
"Unknown"), Grade=c("A", "A", "C", "D", "D", "A", "B",
"C", "B", "D", "A", "C"))
library(dplyr)
df1 %>% group_by(Course, Gender) %>% count
I need this because every course must have the same groups (three genders per course) for my R Markdown output. Any help is greatly appreciated.
data.frame(xtabs(a~Gender+Course,cbind(a=1,df1)))[c(2,1,3)]
Course Gender Freq
1 English1 Female 1
2 English1 Male 3
3 English1 Unknown 0
4 English2 Female 2
5 English2 Male 1
6 English2 Unknown 1
7 English3 Female 3
8 English3 Male 0
9 English3 Unknown 1
If you don't care about the column order:
data.frame(xtabs(Grade~.,cbind(Grade=1,df1)))
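The xtabs calls above work because a cross-tabulation covers every combination of factor levels, including those that never occur. A minimal base R sketch of the same idea using table() follows. The explicit factor() conversion and the column name n are assumptions added here for illustration, not part of the original answer.

```r
# Rebuild the example data with explicit factor columns
# (assumption: Course and Gender should be treated as factors).
df1 <- data.frame(
  Course = factor(c(rep("English1", 4), rep("English2", 4), rep("English3", 4))),
  Gender = factor(c("Male", "Female", "Male", "Male", "Male", "Female",
                    "Unknown", "Female", "Female", "Female", "Female", "Unknown"))
)

# table() cross-tabulates every combination of factor levels,
# so combinations that never occur receive a count of 0.
counts <- as.data.frame(table(Course = df1$Course, Gender = df1$Gender),
                        responseName = "n")

# Sort by Course, then Gender, to match the layout used in the question.
counts <- counts[order(counts$Course, counts$Gender), ]
rownames(counts) <- NULL
print(counts)
```

The result has one row per Course and Gender pair, with n equal to 0 for the pairs that are absent from the data.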
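Actually, there is a tidyverse solution: use the complete function (from tidyr) after the count function in your code. The fill option, for example fill = list(n = 0), fills the missing rows with the value you want; it can be any other value as well.
Note that you must ungroup first, otherwise the operation is performed once per group, which duplicates rows.
This is now quite simple, and it fits the way you expressed your requirement more closely: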
library(tidyr)  # provides complete()
df1 %>%
group_by(Course, Gender) %>%
count() %>%
ungroup() %>%
complete(Course, Gender, fill = list(n = 0))
# A tibble: 9 x 3
Course Gender n
<fct> <fct> <dbl>
1 English1 Female 1
2 English1 Male 3
3 English1 Unknown 0
4 English2 Female 2
5 English2 Male 1
6 English2 Unknown 1
7 English3 Female 3
8 English3 Male 0
9 English3 Unknown 1
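To make clear what the completion step does, here is a rough base R sketch of the same logic: count the observed pairs, build the full grid of Course and Gender values, then join and fill the gaps with 0. It illustrates the idea only and is not how tidyr implements complete(). The names observed, full_grid and completed are hypothetical.

```r
# Example data from the question, kept as plain character columns.
df1 <- data.frame(
  Course = c(rep("English1", 4), rep("English2", 4), rep("English3", 4)),
  Gender = c("Male", "Female", "Male", "Male", "Male", "Female",
             "Unknown", "Female", "Female", "Female", "Female", "Unknown"),
  stringsAsFactors = FALSE
)

# Step 1: count the combinations that actually occur.
observed <- aggregate(n ~ Course + Gender, data = cbind(df1, n = 1), FUN = sum)

# Step 2: build every possible Course x Gender combination.
full_grid <- expand.grid(Course = unique(df1$Course),
                         Gender = unique(df1$Gender),
                         stringsAsFactors = FALSE)

# Step 3: left-join the counts onto the grid and replace missing counts with 0.
completed <- merge(full_grid, observed, by = c("Course", "Gender"), all.x = TRUE)
completed$n[is.na(completed$n)] <- 0
completed <- completed[order(completed$Course, completed$Gender), ]
rownames(completed) <- NULL
print(completed)
```

Because the grid is built from the values present in the data, this approach does not require the columns to be factors.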
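Since dplyr 0.8.0, you only need to add .drop = FALSE to the group_by statement. This keeps empty groups only for grouping columns that are factors: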
df1 %>%
group_by(Course, Gender, .drop = FALSE) %>%
count
Output:
# A tibble: 9 x 3
# Groups: Course, Gender [9]
Course Gender n
<fct> <fct> <int>
1 English1 Female 1
2 English1 Male 3
3 English1 Unknown 0
4 English2 Female 2
5 English2 Male 1
6 English2 Unknown 1
7 English3 Female 3
8 English3 Male 0
9 English3 Unknown 1
See tidyr::complete(). Alternatively, group_by(Course, Gender, .drop = FALSE) solves this if you are using dplyr 0.8.0 or later:
# A tibble: 9 x 3
# Groups: Course, Gender [9]
Course Gender n
<fct> <fct> <int>
1 English1 Female 1
2 English1 Male 3
3 English1 Unknown 0
4 English2 Female 2
5 English2 Male 1
6 English2 Unknown 1
7 English3 Female 3
8 English3 Male 0
9 English3 Unknown 1
df1 %>% count(Course, Gender, .drop = FALSE)
# A tibble: 9 x 3
Course Gender n
<fct> <fct> <int>
1 English1 Female 1
2 English1 Male 3
3 English1 Unknown 0
4 English2 Female 2
5 English2 Male 1
6 English2 Unknown 1
7 English3 Female 3
8 English3 Male 0
9 English3 Unknown 1
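One caveat for the .drop = FALSE approaches: empty groups are kept only for factor columns. Since R 4.0.0, data.frame() no longer converts strings to factors by default, so the grouping columns may need an explicit conversion first. The sketch below assumes dplyr 0.8.0 or later; the names df_chr and df_fct and the chosen level order are illustrative.

```r
library(dplyr)

# Example data with character columns, as data.frame() creates them in R >= 4.0.0.
df_chr <- data.frame(
  Course = c(rep("English1", 4), rep("English2", 4), rep("English3", 4)),
  Gender = c("Male", "Female", "Male", "Male", "Male", "Female",
             "Unknown", "Female", "Female", "Female", "Female", "Unknown"),
  stringsAsFactors = FALSE
)

# Convert the grouping columns to factors. Spelling out the levels guarantees
# that every expected Gender is present even if one never occurs in the data.
df_fct <- df_chr %>%
  mutate(Course = factor(Course),
         Gender = factor(Gender, levels = c("Female", "Male", "Unknown")))

# With factor columns, .drop = FALSE keeps the empty combinations with n = 0.
res <- df_fct %>% count(Course, Gender, .drop = FALSE)
print(res)
```

Listing the levels explicitly is a design choice: it makes the set of groups independent of what happens to appear in a given data extract, which is useful when every report must have the same rows.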