Results for two columns appear in the same column in R
I have a data frame with a non-numeric column sol.grp and a numeric column age. I am trying to store the mean of age and the count of observations in two separate columns, but both results end up in the same column. I used aggregate for this; the call is quoted in the answer below.
Sample data (first 20 rows):
sol.grp age
Account A 29.6
Account B 29.6
WMID 26.9
Qty 1.7
PM 3.0
CS 2043.8
ED 24.3
TM 24.3
Account A 24.3
Account B 133.3
WMID 27.0
Qty 2.1
PM 29.2
CS 29.4
ED 97.8
TM 192.9
Account A 651.6
Account B 148.6
WMID 125.2
Qty 31.1
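You can try this using data.table, or by wrapping aggregate in do.call(data.frame, ...). The code for both, and the sample data as dput output, are listed after the comments below.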
The following code, taken from your own code, works fine:
aggregate(age~sol.grp, data=na.omit(all.tkts), FUN=function(x) c(mean= mean(x), count=length(x)))
sol.grp age.mean age.count
1 Account A 235.16667 3.00000
2 Account B 103.83333 3.00000
3 CS 1036.60000 2.00000
4 ED 61.05000 2.00000
5 PM 16.10000 2.00000
6 Qty 11.63333 3.00000
7 TM 108.60000 2.00000
8 WMID 59.70000 3.00000
Avoid wrapping aggregate in data.frame, because aggregate already returns a data.frame.
Edit:
The details of the output are as follows:
> dd = aggregate(age~sol.grp, data=na.omit(all.tkts), FUN=function(x) c(mean= mean(x), count=length(x)))
> str(dd)
'data.frame': 8 obs. of 2 variables:
$ sol.grp: Factor w/ 8 levels "Account A","Account B",..: 1 2 3 4 5 6 7 8
$ age : num [1:8, 1:2] 235.2 103.8 1036.6 61 16.1 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr "mean" "count"
>
> dd$sol.grp
[1] Account A Account B CS ED PM Qty TM WMID
Levels: Account A Account B CS ED PM Qty TM WMID
> dd$age
mean count
[1,] 235.16667 3
[2,] 103.83333 3
[3,] 1036.60000 2
[4,] 61.05000 2
[5,] 16.10000 2
[6,] 11.63333 3
[7,] 108.60000 2
[8,] 59.70000 3
>
> dd$age[,2]
[1] 3 3 2 2 2 3 2 3
>
> dd$age[,1]
[1] 235.16667 103.83333 1036.60000 61.05000 16.10000 11.63333 108.60000 59.70000
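As the str() output above shows, dd has only two variables, and age is a single matrix column holding both mean and count. The following is a minimal sketch, not part of the original answers, of one way to spread that matrix into ordinary columns. The sample data is rebuilt by hand from the 20 rows in the question, and the name flat is only illustrative.

```r
# Rebuild the 20 sample rows from the question
all.tkts <- data.frame(
  sol.grp = rep(c("Account A", "Account B", "WMID", "Qty",
                  "PM", "CS", "ED", "TM"), length.out = 20),
  age = c(29.6, 29.6, 26.9, 1.7, 3, 2043.8, 24.3, 24.3, 24.3, 133.3,
          27, 2.1, 29.2, 29.4, 97.8, 192.9, 651.6, 148.6, 125.2, 31.1)
)

dd <- aggregate(age ~ sol.grp, data = all.tkts,
                FUN = function(x) c(mean = mean(x), count = length(x)))

ncol(dd)           # 2: sol.grp plus one matrix column named age
is.matrix(dd$age)  # TRUE

# Convert the matrix to a data frame and bind it next to the group column
flat <- cbind(dd["sol.grp"], as.data.frame(dd$age))
names(flat)        # "sol.grp" "mean" "count"
```

Unlike do.call(data.frame, dd), this keeps the plain names mean and count rather than age.mean and age.count.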
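Comments:
It is better to show a sample dataset using dput, i.e. dput(head(data, 20)). Try do.call(data.frame, aggregate(...)). Not tested, as no sample data was available.
@Mathan I could not reproduce the problem using a sample dataset: set.seed(25); all.tkts ...
I have now added the sample dataset at the end. Thanks.
@Mathan I tested with the updated version of your example and did not find any problem.
@Mathan No problem. I would love to know how you got the output mentioned in your post. :-)
Did you try my code on your end and get the same result?
@Mathan Yes, I did. The only difference is that it does not coerce the matrix into data frame columns, so the second column (age) stays a matrix.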
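library(data.table)
# setDT() converts all.tkts to a data.table by reference (no copy is made).
# .N is the number of rows in each sol.grp group, including rows where age is NA.
res1 <- setDT(all.tkts)[, list(Mean = mean(age, na.rm = TRUE), Count = .N),
                        keyby = sol.grp]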
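# do.call() passes each column of the aggregate result to data.frame() separately,
# so the matrix column age is split into age.mean and age.count.
res2 <- do.call(data.frame, aggregate(age ~ sol.grp,
    data = na.omit(all.tkts), FUN = function(x) c(mean = mean(x), count = length(x))))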
res2
# sol.grp age.mean age.count
#1 Account A 235.16667 3
#2 Account B 103.83333 3
#3 CS 1036.60000 2
#4 ED 61.05000 2
#5 PM 16.10000 2
#6 Qty 11.63333 3
#7 TM 108.60000 2
#8 WMID 59.70000 3
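If the matrix column is unwanted from the start, another option is to compute the two statistics in separate aggregate calls and merge them. This is only a sketch and not taken from the answers above; the names means, counts and res3 are hypothetical, and the sample data is rebuilt by hand from the question.

```r
# Rebuild the 20 sample rows from the question
all.tkts <- data.frame(
  sol.grp = rep(c("Account A", "Account B", "WMID", "Qty",
                  "PM", "CS", "ED", "TM"), length.out = 20),
  age = c(29.6, 29.6, 26.9, 1.7, 3, 2043.8, 24.3, 24.3, 24.3, 133.3,
          27, 2.1, 29.2, 29.4, 97.8, 192.9, 651.6, 148.6, 125.2, 31.1)
)

# One aggregate call per statistic, each returning a plain numeric column
means  <- aggregate(age ~ sol.grp, data = all.tkts, FUN = mean)
counts <- aggregate(age ~ sol.grp, data = all.tkts, FUN = length)
names(means)[2]  <- "age.mean"
names(counts)[2] <- "age.count"

# Join the two results on the grouping column
res3 <- merge(means, counts, by = "sol.grp")
str(res3)
```

This costs a second pass over the data, but every column of res3 is an ordinary vector, so no flattening step is needed afterwards.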
# Sample data, as dput output
all.tkts <- structure(list(sol.grp = structure(c(1L, 2L, 8L, 6L, 5L, 3L,
4L, 7L, 1L, 2L, 8L, 6L, 5L, 3L, 4L, 7L, 1L, 2L, 8L, 6L), .Label = c("Account A",
"Account B", "CS", "ED", "PM", "Qty", "TM", "WMID"), class = "factor"),
age = c(29.6, 29.6, 26.9, 1.7, 3, 2043.8, 24.3, 24.3, 24.3,
133.3, 27, 2.1, 29.2, 29.4, 97.8, 192.9, 651.6, 148.6, 125.2,
31.1)), .Names = c("sol.grp", "age"), class = "data.frame", row.names = c(NA,
-20L))