How to create order statistics by group in R?
How can I compute order statistics by group in R? I want to aggregate the results based on one column and then return only one row per group. That row should be the n-th element of the group according to some ordering. Ideally, I would like to use only base functions.
x <- data.frame(Group=c("A","A", "A", "C", "C"),
Name=c("v", "u", "w", "x", "y"),
Quantity=c(3,3,4,2,0))
> x
Group Name Quantity
1 A v 3
2 A u 3
3 A w 4
4 C x 2
5 C y 0
I tried the following, but got an uninformative error message:
aggregate.data.frame(x, list(x$Group), function(y){ max(y[,'Quantity'])})
Error in `[.default`(y, , "Quantity") (from #1) : incorrect number of dimensions
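A likely explanation for the error: aggregate.data.frame applies FUN to each column's per-group subset, so y arrives as a plain vector rather than a data frame, and y[,'Quantity'] has no second dimension to index. A minimal sketch of the per-group maximum using the formula interface instead (the name max_by_group is only illustrative):

```r
x <- data.frame(Group = c("A", "A", "A", "C", "C"),
                Name = c("v", "u", "w", "x", "y"),
                Quantity = c(3, 3, 4, 2, 0))

# With the formula interface, FUN receives the Quantity values
# of one group as a numeric vector, so max() can be used directly.
max_by_group <- aggregate(Quantity ~ Group, data = x, FUN = max)
max_by_group
```

This only returns the maximum Quantity per group; it does not yet recover the matching Name or handle an arbitrary N.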
Some aggregate/merge magic:
f <- function(x, N) {
  # Choose the N-th highest value from the set, or the lowest element if
  # there are fewer than N unique elements. Is there a built-in for this?
  sel <- function(x) {
    z <- unique(x)  # This assumes that you want the N-th highest unique value. Simply don't filter by unique if not.
    z[order(z, decreasing=TRUE)][min(N, length(z))]
  }
  xNq <- aggregate(Quantity ~ Group, data=x, sel)  # Choose the N-th highest Quantity within each Group
  xNm <- merge(x, xNq)  # Add the matching Name values
  x <- aggregate(Name ~ Quantity + Group, data=xNm, sel)  # Choose the N-th highest Name in each group
  x[c('Group', 'Name', 'Quantity')]  # Put the columns into the original order
}
> f(x, 2)
## Group Name Quantity
## 1 A u 3
## 2 C y 0
> f(x, 1)
## Group Name Quantity
## 1 A w 4
## 2 C x 2
I would go with by and order:
do.call(rbind, by(x, x$Group, function(x)
x[order(-x$Quantity, x$Name),][1,]))
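The snippet above returns the first row of each group. A minimal sketch of how the same by/order idea might be extended to the N-th row, assuming ties count as separate rows and that N should be clamped to the group size (nth_row_by_group is a hypothetical helper, not a base R function):

```r
x <- data.frame(Group = c("A", "A", "A", "C", "C"),
                Name = c("v", "u", "w", "x", "y"),
                Quantity = c(3, 3, 4, 2, 0))

# Hypothetical helper: returns the n-th row of each Group after sorting.
nth_row_by_group <- function(df, n) {
  rows <- by(df, df$Group, function(g) {
    g <- g[order(-g$Quantity, g$Name), ]  # Quantity descending, Name ascending
    g[min(n, nrow(g)), ]                  # clamp n to the number of rows in the group
  })
  do.call(rbind, rows)
}

second <- nth_row_by_group(x, 2)
second
```

Unlike the aggregate/merge approach, this counts tied Quantity values as separate rows instead of ranking unique values.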
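I went with someone else's suggestion. It fits my way of thinking better than the other posted solutions (which I appreciate).
I think your "N" and "ranks" should agree: x[ranks==2,]$Name returns c('v','y') rather than c('u','y') as desired. I made the same mistake at first.
With your edit, you take the minimum of Name within each group. That happens to be correct for the example, because Name has only one value in the rank-1 case, but it is not correct in general.
+1, the simplest solution! May I suggest adding a parameter to the in.order function for dec/inc order?
@agstudy That is a valid suggestion. If I were going to use this myself, I would definitely do that. For the sake of brevity, though, I'll leave it as is.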
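x <-
    data.frame(
        Group = c("A","A", "A", "C", "C", "A", "A") ,
        Name = c("v", "u", "w", "x", "y" ,"v", "u") ,
        Quantity = c(3,3,4,2,0,4,1) ,
        # the as.numeric() calls below rely on factor columns,
        # which R >= 4.0 no longer creates by default
        stringsAsFactors = TRUE
    )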
# sort your data to start..
# note that Quantity vs. Group and Name
# are sorted in different directions,
# so the -as.numeric() flips them
x <-
x[
order(
-as.numeric( x$Group ) ,
x$Quantity ,
-as.numeric( x$Name ) ,
decreasing = TRUE
) ,
]
# once your data frame is sorted the way you want your Ns to occur, the rest is easy
# rank your data..
# just create the numerical order,
# but within each group..
# (or you could add those ranks directly to the data frame if you like)
ranks <-
unlist(
tapply(
order( x$Group ) ,
as.numeric( x$Group ) ,
order
)
)
# N = 1
x[ ranks == 1 , ]
# N = 2
x[ ranks == 2 , ]
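As the comment above notes, the ranks could also be stored directly in the data frame. A minimal sketch of that variant, assuming base R's ave() with seq_along to number rows within each group (the column name Rank is an illustrative choice):

```r
x <- data.frame(Group = c("A", "A", "A", "C", "C", "A", "A"),
                Name = c("v", "u", "w", "x", "y", "v", "u"),
                Quantity = c(3, 3, 4, 2, 0, 4, 1))

# Sort by Group, then Quantity descending, then Name ascending
x <- x[order(x$Group, -x$Quantity, x$Name), ]

# ave() applies seq_along within each Group, giving a 1..k counter per group
x$Rank <- ave(seq_len(nrow(x)), x$Group, FUN = seq_along)

# N = 2
x[x$Rank == 2, ]
```

Because the counter is computed per group by ave(), this version does not depend on the rows of each group being contiguous, although the sort still determines which row gets which rank.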
# define ordering function, increasing on Quantity, decreasing on Name
in.order <- function(group) with(group, group[order(Quantity, -rank(Name)), ])
# set desired rank for each Group
N <- 2
# get Nth row by Group, according to in.order
group.rows <- by(x, x$Group, function(group) head(tail(in.order(group), N), 1))
# collapse rows into data.frame
do.call(rbind, group.rows)
# Group Name Quantity
# A A u 3
# C C y 0
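One of the comments suggests giving in.order a parameter for increasing/decreasing order. A rough sketch of what that might look like; in.order2 and its two flags are hypothetical names introduced here, and the defaults reproduce the original behavior (Quantity increasing, Name decreasing):

```r
x <- data.frame(Group = c("A", "A", "A", "C", "C"),
                Name = c("v", "u", "w", "x", "y"),
                Quantity = c(3, 3, 4, 2, 0))

# Hypothetical variant of in.order with a direction flag per sort key.
in.order2 <- function(group, quantity.decreasing = FALSE, name.decreasing = TRUE) {
  q <- if (quantity.decreasing) -group$Quantity else group$Quantity
  n <- rank(group$Name)          # rank() makes the Name key numeric so it can be negated
  if (name.decreasing) n <- -n
  group[order(q, n), ]
}

N <- 2
# N-th row from the end of each sorted group, as in the original snippet
group.rows <- by(x, x$Group, function(group) head(tail(in.order2(group), N), 1))
res <- do.call(rbind, group.rows)
res
```

Keeping the flags as separate arguments avoids hard-coding the minus sign inside order(), at the cost of a slightly longer function.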