R programming: how to use plyr's ddply to count values in a column


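I want to summarise the pass/fail status of my data as shown below. In other words, I want to report the number of passing and failing cases for each product/type combination.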

library(ggplot2)
library(plyr)
product=c("p1","p1","p1","p1","p1","p1","p1","p1","p1","p1","p1","p1","p2","p2","p2","p2","p2","p2","p2","p2","p2","p2","p2","p2")
type=c("t1","t1","t1","t1","t1","t1","t2","t2","t2","t2","t2","t2","t1","t1","t1","t1","t1","t1","t2","t2","t2","t2","t2","t2")
skew=c("s1","s1","s1","s2","s2","s2","s1","s1","s1","s2","s2","s2","s1","s1","s1","s2","s2","s2","s1","s1","s1","s2","s2","s2")
color=c("c1","c2","c3","c1","c2","c3","c1","c2","c3","c1","c2","c3","c1","c2","c3","c1","c2","c3","c1","c2","c3","c1","c2","c3")
result=c("pass","pass","fail","pass","pass","pass","fail","pass","fail","pass","fail","pass","fail","pass","fail","pass","pass","pass","pass","fail","fail","pass","pass","fail")
df = data.frame(product, type, skew, color, result)
The command below returns the total number of pass + fail cases, but I want separate columns for pass and fail:

dfSummary <- ddply(df, c("product", "type"), summarise, N=length(result))
The desired result is:

         product type Pass Fail
 1       p1      t1   5    1
 2       p1      t2   3    3
 3       p2      t1   4    2
 4       p2      t2   3    3
I tried something like this:

 dfSummary <- ddply(df, c("product", "type"), summarise, Pass=length(df$product[df$result=="pass"]), Fail=length(df$product[df$result=="fail"]) )
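 dfSummary

Try counting with sum() over a logical comparison inside summarise, using Pass=sum(result=="pass") and Fail=sum(result=="fail"). Refer to the column as result, not df$result: inside summarise, a bare column name is evaluated within the current piece, whereas df$result always refers to the full data frame, so every group would get the same totals.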

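Explanation:

  • You pass the dataset df to the ddply function.
  • ddply splits it on the variables product and type.
    • This yields up to length(unique(product)) * length(unique(type)) pieces (i.e. subsets of df), one for each combination of the two variables that occurs in the data. Here that is 2 * 2 = 4 pieces.
  • For each piece, ddply applies the function you supply. In this case, you count how many rows have result == "pass" and how many have result == "fail".
  • ddply now holds a result for each piece: the variables you split on (product and type) and the values you requested (Pass and Fail).
  • It combines all the pieces into one data frame and returns it.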
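The split, apply, and combine steps described above can be sketched by hand in base R. This is only an illustration of the idea under simplifying assumptions, not how plyr is implemented; the names pieces, summaries and manual_summary are made up for this sketch, and the data frame is rebuilt with only the columns needed.

```r
# Rebuild the example data (same values as in the question,
# restricted to the columns used for the summary).
df <- data.frame(
  product = rep(c("p1", "p2"), each = 12),
  type    = rep(rep(c("t1", "t2"), each = 6), times = 2),
  result  = c("pass", "pass", "fail", "pass", "pass", "pass",
              "fail", "pass", "fail", "pass", "fail", "pass",
              "fail", "pass", "fail", "pass", "pass", "pass",
              "pass", "fail", "fail", "pass", "pass", "fail"),
  stringsAsFactors = FALSE
)

# Split: one piece per product/type combination that occurs in the data.
pieces <- split(df, list(df$product, df$type), drop = TRUE)

# Apply: summarise each piece using only the rows of that piece.
summaries <- lapply(pieces, function(piece) {
  data.frame(
    product = piece$product[1],
    type    = piece$type[1],
    Pass    = sum(piece$result == "pass"),
    Fail    = sum(piece$result == "fail")
  )
})

# Combine: stack the per-piece summaries into a single data frame.
manual_summary <- do.call(rbind, summaries)
manual_summary <- manual_summary[order(manual_summary$product,
                                       manual_summary$type), ]
rownames(manual_summary) <- NULL
print(manual_summary)
```

The counts should agree with the desired result table in the question (5/1, 3/3, 4/2, 3/3).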

  • You can also use reshape2::dcast:

    library(reshape2)
    dcast(product + type~result,data=df, fun.aggregate= length,value.var = 'result')
    ##   product type fail pass
    ## 1      p1   t1    1    5
    ## 2      p1   t2    3    3
    ## 3      p2   t1    2    4
    ## 4      p2   t2    3    3
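dcast names the new columns after the values of result and orders them alphabetically, so the output has fail before pass. If the exact layout from the question (Pass, then Fail) matters, one possible follow-up is to reorder and rename the columns afterwards. The sketch below assumes reshape2 is installed; the variable name wide is made up for illustration.

```r
library(reshape2)

# Same example data as in the question (only the columns needed here).
df <- data.frame(
  product = rep(c("p1", "p2"), each = 12),
  type    = rep(rep(c("t1", "t2"), each = 6), times = 2),
  result  = c("pass", "pass", "fail", "pass", "pass", "pass",
              "fail", "pass", "fail", "pass", "fail", "pass",
              "fail", "pass", "fail", "pass", "pass", "pass",
              "pass", "fail", "fail", "pass", "pass", "fail"),
  stringsAsFactors = FALSE
)

# Count rows per product/type, one column per distinct value of result.
wide <- dcast(df, product + type ~ result,
              fun.aggregate = length, value.var = "result")

# Reorder the count columns and rename them to match the desired layout.
wide <- wide[, c("product", "type", "pass", "fail")]
names(wide) <- c("product", "type", "Pass", "Fail")
print(wide)
```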
    

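    Great, this is exactly what I needed! Thank you so much for the prompt reply!

    This works too. Much faster than ddply. Thanx :)

    The ddply call from the answer, with its output:
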
    dfSummary <- ddply(df, c("product", "type"), summarise, 
                       Pass=sum(result=="pass"), Fail=sum(result=="fail") )
    
      product type Pass Fail
    1      p1   t1    5    1
    2      p1   t2    3    3
    3      p2   t1    4    2
    4      p2   t2    3    3
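The difference between the attempt in the question and the working call comes down to which rows are counted. The sketch below mimics that by hand for a single piece in base R; it does not call ddply, and the names piece, whole_count and piece_count are made up for illustration.

```r
# Same example data as in the question (only the columns needed here).
df <- data.frame(
  product = rep(c("p1", "p2"), each = 12),
  type    = rep(rep(c("t1", "t2"), each = 6), times = 2),
  result  = c("pass", "pass", "fail", "pass", "pass", "pass",
              "fail", "pass", "fail", "pass", "fail", "pass",
              "fail", "pass", "fail", "pass", "pass", "pass",
              "pass", "fail", "fail", "pass", "pass", "fail"),
  stringsAsFactors = FALSE
)

# One piece, as it would be formed for product p1 and type t1.
piece <- df[df$product == "p1" & df$type == "t1", ]

# Expression from the question: it indexes the full data frame,
# so it counts passes across all 24 rows regardless of the piece.
whole_count <- length(df$product[df$result == "pass"])

# Counting within the piece only, which is what the summary needs.
piece_count <- sum(piece$result == "pass")

print(c(whole = whole_count, piece = piece_count))  # whole = 15, piece = 5
```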
    