R: Using tapply with multiple variables

I have a data set containing information about customers and how much they spent. Each customer appears only once:

customer<-c("Andy","Bobby","Oscar","Oliver","Jane","Cathy","Emma","Chris")
age<-c(25,34,20,35,23,35,34,22)
gender<-c("male","male","male","male","female","female","female","female")
moneyspent<-c(100,100,200,200,400,400,500,200)

data<-data.frame(customer=customer,age=age,gender=gender,moneyspent=moneyspent)

Using tapply to average the amount spent by gender gives:

female   male 
  375    150
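
The call that produced this output is not shown in the post. As an assumption, a one-factor tapply of the following form reproduces the same numbers from the vectors defined above:

```r
# Data copied from the question
gender <- c("male","male","male","male","female","female","female","female")
moneyspent <- c(100,100,200,200,400,400,500,200)

# Assumed form of the call: mean of moneyspent within each gender
by_gender <- tapply(moneyspent, gender, mean)
print(by_gender)
```

The second argument is the grouping factor; passing a list of factors instead of a single one is what extends this to several grouping variables.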

However, I now want to find the average amount spent by gender and age group. My goal is:

 Male Age 20-30      Female Age 20-30      Male Age 30-40      Female Age 30-40
    150                     300                 150                   450

How can I modify the tapply code so that it gives these results?

Thanks

You may need to use cut to bin age into groups:

mat <- tapply(moneyspent, list(gender, age=cut(age, breaks=c(20,30,40), 
                include.lowest=TRUE)), mean)

nm1 <- outer(rownames(mat), colnames(mat), FUN=paste)
setNames(c(mat), nm1)
#female [20,30]   male [20,30] female (30,40]   male (30,40] 
#       300            150            450            150 
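
To see why include.lowest=TRUE matters here, and how the names could be brought closer to the headings asked for in the question, here is a small sketch. The label strings "Age 20-30" and "Age 30-40" are hypothetical choices for illustration, not something taken from the answer above.

```r
# Data copied from the question
age <- c(25,34,20,35,23,35,34,22)
gender <- c("male","male","male","male","female","female","female","female")
moneyspent <- c(100,100,200,200,400,400,500,200)

# By default the intervals are (20,30] and (30,40], so age 20 falls outside
# the first interval and becomes NA
open_left <- cut(age, breaks = c(20, 30, 40))

# include.lowest = TRUE closes the first interval: [20,30]
closed_left <- cut(age, breaks = c(20, 30, 40), include.lowest = TRUE)

# Hypothetical labels chosen to resemble the headings in the question
age_group <- cut(age, breaks = c(20, 30, 40), include.lowest = TRUE,
                 labels = c("Age 20-30", "Age 30-40"))

# Group on the combined gender/age-group string; names come out sorted
res <- tapply(moneyspent, paste(gender, age_group), mean)
print(res)
```

The gender values stay lowercase ("male Age 20-30"), so the names only approximate the headings in the question.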

Or with dplyr:

library(dplyr)
data %>% 
     group_by(gender, age=cut(age, breaks=c(20,30,40), 
              include.lowest=TRUE)) %>% 
     summarise(moneyspent=mean(moneyspent))

Or with data.table:

library(data.table)
setDT(data)[, list(moneyspent=mean(moneyspent)),
     by=list(gender, age=cut(age, breaks= c(20,30,40), include.lowest=TRUE))]

Using the plyr package will also give the same result:

library(plyr)

ddply(data,.(gender, age=cut(age, breaks=c(20,30,40), 
                  include.lowest=TRUE)), summarize, moneyspent=mean(moneyspent))

Note: summarize and summarise perform the same function.

Warning: loading plyr masks dplyr's summarise! You will need to detach plyr before using functions such as summarise again.