R: handling NA in a specific way with the aggregate function


I have a data frame that looks like this:

Project Week Number
Project1   01  46.0
Project2   01  46.4
Project3   01 105.0
Project1   02  70.0
Project2   02  84.0
Project3   02  34.8
Project1   03  83.0
Project3   03  37.9
Edit:

I want to compute the sum of Number for each project in each week.

So I use the aggregate function:

aggregate(Number ~ Project + Week, data = my.df, sum)
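As you can see, Project2 has no value for week 3.

aggregate simply leaves that combination out of the result. What I want is a row for it with Number filled in as 0.

I tried: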

aggregate(Number ~ Project + Week, data = my.df, sum, na.action = 0)

But that doesn't work. Any ideas?

You can use xtabs, which sums Number over every Project/Week cell and returns 0 for cells that have no data.
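The code for the xtabs suggestion is not shown in this thread, so the following is only a minimal sketch of how it might look. It assumes the question's data is held in `my.df` with Week stored as character; the names `tab` and `res` are made up for illustration.

```r
# Rebuild the question's data (Week kept as character so "01" survives).
my.df <- data.frame(
  Project = c("Project1", "Project2", "Project3",
              "Project1", "Project2", "Project3",
              "Project1", "Project3"),
  Week    = c("01", "01", "01", "02", "02", "02", "03", "03"),
  Number  = c(46.0, 46.4, 105.0, 70.0, 84.0, 34.8, 83.0, 37.9),
  stringsAsFactors = FALSE
)

# xtabs sums Number within every Project x Week cell; cells with no
# observations come out as 0 rather than being dropped.
tab <- xtabs(Number ~ Project + Week, data = my.df)

# Convert the contingency table back to long form. The summed column is
# called "Freq" by default, so rename it to match the original data.
res <- as.data.frame(tab, stringsAsFactors = FALSE)
names(res)[names(res) == "Freq"] <- "Number"
res
```

The result has one row per Project/Week combination, including a row for Project2 in week 03.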

We can also use the complete function from the tidyr package to fill in a value for Project2 in week 3. After that, we can aggregate the data.

library(tidyr)

my.df2 <- my.df %>% 
  complete(Project, Week, fill = list(Number = 0))

my.df2

# # A tibble: 9 x 3
#    Project  Week Number
#      <chr> <chr>  <dbl>
# 1 Project1    01   46.0
# 2 Project1    02   70.0
# 3 Project1    03   83.0
# 4 Project2    01   46.4
# 5 Project2    02   84.0
# 6 Project2    03    0.0
# 7 Project3    01  105.0
# 8 Project3    02   34.8
# 9 Project3    03   37.9
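The aggregation step that follows complete is not shown above. As a rough sketch, assuming tidyr is installed and the data matches the question, it could look like this (`filled` and `res` are illustrative names, not from the original answer):

```r
library(tidyr)

# Rebuild the question's data (Week kept as character so "01" survives).
my.df <- data.frame(
  Project = c("Project1", "Project2", "Project3",
              "Project1", "Project2", "Project3",
              "Project1", "Project3"),
  Week    = c("01", "01", "01", "02", "02", "02", "03", "03"),
  Number  = c(46.0, 46.4, 105.0, 70.0, 84.0, 34.8, 83.0, 37.9),
  stringsAsFactors = FALSE
)

# Add a row for every missing Project/Week combination, with Number = 0.
filled <- complete(my.df, Project, Week, fill = list(Number = 0))

# Every combination now exists in the input, so aggregate keeps all of them.
res <- aggregate(Number ~ Project + Week, data = filled, FUN = sum)
res
```

Because the zero rows exist before aggregation, no special NA handling is needed in the aggregate call itself.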
Data: my.df is built by the read.table call near the end of this post.

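Alternatively, you can use tidyr's spread with fill = 0 to get a wide table with zeros for the missing cells,

then use gather to bring it back to the original long form: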

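library(tidyr)
# Sum per Project/Week, widen with 0 for missing cells, then go back to long form.
# -Project gathers every week column, whatever the week labels are ("01" or 1).
aggregate(Number ~ Project + Week, data = my.df, sum) %>% 
  spread(key = Week, value = Number, fill = 0) %>% 
  gather(key = Week, value = Number, -Project)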

You can do this in base R. The code below is the base R translation of the tidyr::complete approach; see @www's answer.

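# df is the data frame from the question (called my.df there)
# Build every Project/Week combination and left-join the data onto it
df <- merge(
  setNames(expand.grid(unique(df$Project), unique(df$Week)), c("Project", "Week")),
  df, all.x = TRUE)
# Combinations that were missing now have NA; replace those with 0
df$Number[is.na(df$Number)] <- 0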

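Please share your data using dput. Well, an aggregation function won't magically create data that wasn't there in the first place! You need to either explicitly create rows for the missing combinations first, or merge the output with a data.frame that contains all combinations.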
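The second option in the comment above (merging the aggregate output with a data frame of all combinations) can be sketched in base R as follows. This is only an illustration under the assumption that the data matches the question; `agg`, `all.combos`, and `res` are hypothetical names.

```r
# Rebuild the question's data (Week kept as character so "01" survives).
my.df <- data.frame(
  Project = c("Project1", "Project2", "Project3",
              "Project1", "Project2", "Project3",
              "Project1", "Project3"),
  Week    = c("01", "01", "01", "02", "02", "02", "03", "03"),
  Number  = c(46.0, 46.4, 105.0, 70.0, 84.0, 34.8, 83.0, 37.9),
  stringsAsFactors = FALSE
)

# Aggregate first; the Project2 / week 03 combination is absent here.
agg <- aggregate(Number ~ Project + Week, data = my.df, FUN = sum)

# Every Project x Week combination that could occur.
all.combos <- expand.grid(Project = unique(my.df$Project),
                          Week    = unique(my.df$Week),
                          stringsAsFactors = FALSE)

# Left join on the shared columns (Project, Week); unmatched rows get NA,
# which is then replaced by 0.
res <- merge(all.combos, agg, all.x = TRUE)
res$Number[is.na(res$Number)] <- 0
res
```

Compared with completing the raw data first, this version fills the gaps after aggregation, which keeps the original data frame untouched.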

Data used in the answers above:

my.df <- read.table(text = "Project Week Number
Project1   01  46.0
Project2   01  46.4
Project3   01 105.0
Project1   02  70.0
Project2   02  84.0
Project3   02  34.8
Project1   03  83.0
Project3   03  37.9",
                    header = TRUE, stringsAsFactors = FALSE)

# read.table parses Week as integer (1, 2, 3); restore the leading zero
my.df$Week <- paste0("0", my.df$Week)