How to partition a data frame in R

Here is my data:

      Date    male  female test
2013-10-06    7.21   0.651  1
2013-10-12      NA      NA  1
2013-10-18    4.68   1.040  1
2013-10-24    3.47   0.363  2
2013-10-30    2.42   0.507  2

Basically, I need to count the number of valid (complete) cases for each test:

test    nobs
   1       2
   2       2

I am completely new to R. My current code keeps producing 0 nobs:

partition <- function(directory, id = 1:200) {
  files = list.files(directory)
  results = NULL

  for(file in files) {
    data = read.csv(file)
    comp = complete.cases(data)

    for(i in id) {
      results["test"] = i
      r = comp["test" == i]
      results["nobs"] = length(r)
    }

  }     
  results
}
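
A possible reading of why the counts come out as 0, offered as a sketch rather than a confirmed diagnosis: the expression "test" == i compares the literal string "test" with the number i, which is always FALSE, so comp["test" == i] selects nothing and its length is 0. In addition, complete.cases(data) returns a plain logical vector with one entry per row, so it has no test column to look up. The snippet below shows both points on the sample data; the names d, bad and good are only for illustration.

```r
# Rebuild the sample data from the question (d is an illustrative name).
d <- data.frame(
  Date   = c("2013-10-06", "2013-10-12", "2013-10-18", "2013-10-24", "2013-10-30"),
  male   = c(7.21, NA, 4.68, 3.47, 2.42),
  female = c(0.651, NA, 1.040, 0.363, 0.507),
  test   = c(1, 1, 1, 2, 2)
)

comp <- complete.cases(d)   # logical vector, one entry per row

i <- 1
"test" == i                 # FALSE: the string "test" is compared with the number 1
bad <- comp["test" == i]    # indexing with FALSE selects nothing
length(bad)                 # 0

# Comparing the column values instead gives the intended count.
good <- sum(comp & d$test == i)
good                        # 2
```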

Using base R functions, where df is your data:
> res <- sapply(split(df, df$test), function(x) sum(complete.cases(x)), USE.NAMES=FALSE)
> res <- data.frame(test=names(res), nobs=res)
> res
  test nobs
1    1    2
2    2    2
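
To make this reproducible, here is a small sketch that reads the sample data from inline text and gets the same counts with tapply, a close base R relative of the split/sapply approach. The names counts and res2 are assumptions chosen for the example.

```r
# Sample data from the question, read from inline text.
df <- read.table(header = TRUE, text = "
Date        male  female  test
2013-10-06  7.21  0.651   1
2013-10-12  NA    NA      1
2013-10-18  4.68  1.040   1
2013-10-24  3.47  0.363   2
2013-10-30  2.42  0.507   2
")

# Sum the complete-case flags within each value of test.
counts <- tapply(complete.cases(df), df$test, sum)

# Reshape the named result into the requested two-column layout.
res2 <- data.frame(test = names(counts), nobs = as.vector(counts))
res2
```

Summing a logical vector counts its TRUE values, which is why sum works here as a row counter.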

Using aggregate in base R:
aggregate(list(nobs=complete.cases(data)), data["test"], FUN=sum)
#  test nobs
#1    1    2
#2    2    2

Using dplyr:
library(dplyr)
data %>% na.omit %>% group_by(test) %>% summarise(nobs = n())
Source: local data frame [2 x 2]

  test nobs
1    1    2
2    2    2

Using plyr, keeping the per-file structure of your partition function:
library("plyr")
partition <- function(directory, id = 1:200) {
  # full.names = TRUE so read.csv() gets a usable path from any working directory
  files <- list.files(directory, full.names = TRUE)
  ldply(files, function (file) {
    data <- read.csv(file)
    data <- data[complete.cases(data), ]
    setNames(data.frame(file, table(factor(data$test, levels = id))), c("file", "test", "nobs"))
  })
}
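
For readers without plyr, here is a base R sketch of the same per-file idea. The function name partition_base, the ".csv" filename filter and the temporary-directory demo are assumptions made for illustration. It passes full.names = TRUE to list.files so that read.csv receives a usable path whatever the working directory is.

```r
# Hypothetical base R counterpart of the plyr function above.
partition_base <- function(directory, id = 1:200) {
  files <- list.files(directory, pattern = "\\.csv$", full.names = TRUE)
  per_file <- lapply(files, function(file) {
    data <- read.csv(file)
    data <- data[complete.cases(data), ]             # keep complete rows only
    counts <- table(factor(data$test, levels = id))  # ids with no rows get a count of 0
    data.frame(file = basename(file),
               test = as.integer(names(counts)),
               nobs = as.vector(counts))
  })
  do.call(rbind, per_file)
}

# Demo on a throwaway directory holding the sample data from the question.
demo_dir <- file.path(tempdir(), "partition_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(
  data.frame(Date   = c("2013-10-06", "2013-10-12", "2013-10-18", "2013-10-24", "2013-10-30"),
             male   = c(7.21, NA, 4.68, 3.47, 2.42),
             female = c(0.651, NA, 1.040, 0.363, 0.507),
             test   = c(1, 1, 1, 2, 2)),
  file.path(demo_dir, "sample.csv"), row.names = FALSE
)
out <- partition_base(demo_dir, id = 1:2)
out
```

With the default id = 1:200 the result has one row per id per file, most of them with nobs equal to 0, so narrowing id to the tests you care about keeps the output readable.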

Thanks. I have updated the answer with your comment.
Or data %>% na.omit %>% count(test), or data %>% na.omit %>% group_by(test) %>% tally()