R: Extracting data using the attributes of an 'ftable'
I sometimes use the ftable function purely for its presentation of hierarchical categories. However, when the table is large, I sometimes want to subset it further before using it.

Suppose we start with the following:
mytable <- ftable(Titanic, row.vars = 1:3)
mytable
##                    Survived  No Yes
## Class Sex    Age
## 1st   Male   Child            0   5
##              Adult          118  57
##       Female Child            0   1
##              Adult            4 140
## 2nd   Male   Child            0  11
##              Adult          154  14
##       Female Child            0  13
##              Adult           13  80
## 3rd   Male   Child           35  13
##              Adult          387  75
##       Female Child           17  14
##              Adult           89  76
## Crew  Male   Child            0   0
##              Adult          670 192
##       Female Child            0   0
##              Adult            3  20
str(mytable)
##  ftable [1:16, 1:2] 0 118 0 4 0 154 0 13 35 387 ...
##  - attr(*, "row.vars")=List of 3
##   ..$ Class: chr [1:4] "1st" "2nd" "3rd" "Crew"
##   ..$ Sex  : chr [1:2] "Male" "Female"
##   ..$ Age  : chr [1:2] "Child" "Adult"
##  - attr(*, "col.vars")=List of 1
##   ..$ Survived: chr [1:2] "No" "Yes"
## NULL
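My current approach is to convert the ftable to a table with as.table(), subset that, and convert the result back with ftable(). However, I don't like this approach, because the overall layout sometimes changes if you are not careful. Compare the result of subsetting on children only with the following, where that requirement is dropped and a requirement to keep only those who did not survive is added: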
ftable(as.table(mytable)[c("1st", "3rd"), , , "No"])
##              Age Child Adult
## Class Sex
## 1st   Male           0   118
##       Female         0     4
## 3rd   Male          35   387
##       Female        17    89
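I don't like that the overall layout of rows and columns has changed. It is a classic case of having to remember to use drop = FALSE to maintain dimensions when extracting a single column: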
ftable(as.table(mytable)[c("1st", "3rd"), , , "No", drop = FALSE])
##                    Survived  No
## Class Sex    Age
## 1st   Male   Child            0
##              Adult          118
##       Female Child            0
##              Adult            4
## 3rd   Male   Child           35
##              Adult          387
##       Female Child           17
##              Adult           89
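The effect of drop is easiest to see on the dimensions of the intermediate table. The small check below uses only base R; the object names tab, dropped and kept are chosen here purely for illustration.

```r
mytable <- ftable(Titanic, row.vars = 1:3)
tab <- as.table(mytable)

# Selecting a single level of Survived drops that dimension by default
dropped <- tab[c("1st", "3rd"), , , "No"]

# drop = FALSE keeps Survived as a dimension of extent 1
kept <- tab[c("1st", "3rd"), , , "No", drop = FALSE]

dim(dropped)              # three dimensions remain
names(dimnames(dropped))  # Survived is no longer a variable
dim(kept)                 # four dimensions, the last of extent 1
names(dimnames(kept))     # Survived is still available to ftable()
```

Because ftable() by default puts the last remaining variable in the columns, losing Survived is what moves Age out of the rows in the earlier output.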
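I know there are many ways to get the data I want, starting with subsetting the original data and then making my ftable, but for this question let's assume that is not possible.

The end goal is an approach that lets me extract from an ftable while preserving the display format of the nested "row" hierarchy.

Are there other solutions? Can we use the row.vars and col.vars attributes to extract data from an ftable and retain its formatting?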
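As a rough illustration of what attribute-based extraction could look like, here is a minimal sketch. The helper subset_ftable and its rows/cols arguments are hypothetical names invented for this example, not part of base R, and the sketch assumes the ftable has at least one row variable and one column variable. It relies on the fact that an ftable stores its cells as a plain matrix in which the last row (or column) variable varies fastest.

```r
# Hypothetical helper: subset an ftable by level names using its attributes
subset_ftable <- function(x, rows = list(), cols = list()) {
  stopifnot(inherits(x, "ftable"))

  pick <- function(vars, keep) {
    stopifnot(all(names(keep) %in% names(vars)))
    # expand.grid() varies its first variable fastest, whereas an ftable
    # varies its last variable fastest, so reverse before expanding
    grid <- expand.grid(rev(vars), stringsAsFactors = FALSE)
    idx <- rep(TRUE, nrow(grid))
    for (nm in names(keep)) {
      idx <- idx & grid[[nm]] %in% keep[[nm]]
      vars[[nm]] <- vars[[nm]][vars[[nm]] %in% keep[[nm]]]
    }
    list(idx = idx, vars = vars)
  }

  r <- pick(attr(x, "row.vars"), rows)
  k <- pick(attr(x, "col.vars"), cols)

  # Subsetting the underlying matrix discards the ftable attributes,
  # so they are rebuilt from the reduced level lists
  out <- unclass(x)[r$idx, k$idx, drop = FALSE]
  attr(out, "row.vars") <- r$vars
  attr(out, "col.vars") <- k$vars
  class(out) <- "ftable"
  out
}

mytable <- ftable(Titanic, row.vars = 1:3)
children <- subset_ftable(mytable, rows = list(Class = c("1st", "3rd"), Age = "Child"))
children
```

Levels are kept in their original order regardless of the order given in rows or cols, and a variable reduced to a single level (Age here) stays in the layout rather than being dropped.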
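My current approach also doesn't work for hierarchical columns, so I would hope that a proposed solution can handle those cases too. The examples below show direct indexing failing, the round trip through as.table(), and a table with hierarchical columns: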
mytable[c("1st", "3rd"), , "Child", ]
## Error: incorrect number of dimensions
## Only the underlying data are seen as having dims
dim(mytable)
## [1] 16 2
## I'm OK with the "Age" column being dropped in this case....
ftable(as.table(mytable)[c("1st", "3rd"), , "Child", ])
##              Survived No Yes
## Class Sex
## 1st   Male             0   5
##       Female           0   1
## 3rd   Male            35  13
##       Female          17  14
tab2 <- ftable(Titanic, row.vars = 1:2, col.vars = 3:4)
tab2
##              Age      Child     Adult
##              Survived    No Yes    No Yes
## Class Sex
## 1st   Male                0   5   118  57
##       Female              0   1     4 140
## 2nd   Male                0  11   154  14
##       Female              0  13    13  80
## 3rd   Male               35  13   387  75
##       Female             17  14    89  76
## Crew  Male                0   0   670 192
##       Female              0   0     3  20
I can get back to what I want with:
ftable(as.table(tab2)[c("1st", "3rd"), , , , drop = FALSE], row.vars = 1:2, col.vars = 3:4)
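The round trip above can be wrapped so that the row.vars and col.vars split is recorded before converting and restored afterwards. This is only a sketch: subset_via_table and its keep argument are hypothetical names made up for this example, and the helper assumes the variables in keep are named exactly as in the ftable.

```r
# Hypothetical helper: subset an ftable by level names via a table round trip,
# then restore the original split between row and column variables
subset_via_table <- function(x, keep = list()) {
  stopifnot(inherits(x, "ftable"))
  rv <- names(attr(x, "row.vars"))
  cv <- names(attr(x, "col.vars"))

  tab <- as.table(x)
  idx <- dimnames(tab)  # start by keeping every level of every variable
  stopifnot(all(names(keep) %in% names(idx)))
  idx[names(keep)] <- keep

  # Same as tab[idx[[1]], idx[[2]], ..., drop = FALSE]
  sub <- do.call(`[`, c(list(tab), unname(idx), list(drop = FALSE)))

  # ftable() accepts variable names as well as positions
  ftable(as.table(sub), row.vars = rv, col.vars = cv)
}

tab2 <- ftable(Titanic, row.vars = 1:2, col.vars = 3:4)
first_third <- subset_via_table(tab2, list(Class = c("1st", "3rd")))
first_third
```

Using drop = FALSE unconditionally means single-level variables always stay in the layout, which avoids the rearranged rows and columns shown earlier.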
But I was hoping for something more direct.

Here is what I was able to hack together, from:
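Once data have been aggregated to frequencies by combinations of factors, as in the Titanic data set, it is arguably easier to subset the original data and tabulate it for display than to manipulate the output object.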
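In base R alone, that subset-then-tabulate idea might look like the sketch below. The object names kids and kids_tab are illustrative only; xtabs() is used to rebuild the counts from the Freq column before flattening.

```r
# Long format: one row per combination of factor levels, with a Freq column
df <- as.data.frame(Titanic)

# Subset first; droplevels() removes unused factor levels so that they do not
# reappear as empty rows in the table
kids <- droplevels(subset(df, Class %in% c("1st", "3rd") & Age == "Child"))

# Rebuild the contingency table from the counts and flatten it,
# keeping Class, Sex and Age as row variables
kids_tab <- ftable(xtabs(Freq ~ Class + Sex + Age + Survived, data = kids),
                   row.vars = 1:3)
kids_tab
```

Leaving out droplevels() would keep all four classes and both age groups in the layout, with zero counts for the excluded combinations.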
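I know the OP asked for an ftable solution, but after the back-and-forth in the comments soliciting other ideas, I thought I would post a different take on the question, because it illustrates a way to subset the data and generate the hierarchical structure of a contingency table at the same time, without a custom function.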
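Here is an approach using the tables package that preserves the hierarchical structure of the Titanic data and eliminates empty cells when we subset the data frame.

First, we convert the incoming table to a data frame so that we can subset it within the tabular() function.

library(tables)  # provides tabular(), Heading() and DropEmpty()
df <- as.data.frame(Titanic)  # Titanic ships with R's datasets package

Subsetting to children in 1st and 3rd class gives the following output: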
> tabular((Class * Sex) ~ (Age)*Survived*Heading()*Freq*Heading()*sum*DropEmpty(empty=0),
+         data=df[df$Class %in% c("1st","3rd") & df$Age=="Child",])
              Age
              Child
              Survived
 Class Sex    No       Yes
 1st   Male    0        5
       Female  0        1
 3rd   Male   35       13
       Female 17       14
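If we remove DropEmpty(), the entire table structure is replicated from the factor variables in the table, including the levels we filtered out.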
> # remove DropEmpty() to replicate entire factor structure
> tabular((Class * Sex) ~ (Age)*Survived*Heading()*Freq*Heading()*sum,
+         data=df[df$Class %in% c("1st","3rd") & df$Age=="Child",])
              Age
              Child        Adult
              Survived     Survived
 Class Sex    No       Yes No       Yes
 1st   Male    0        5   0        0
       Female  0        1   0        0
 2nd   Male    0        0   0        0
       Female  0        0   0        0
 3rd   Male   35       13   0        0
       Female 17       14   0        0
 Crew  Male    0        0   0        0
       Female  0        0   0        0
Replicating the second and third examples from the OP is also straightforward.
> # second example from question
> tabular((Class * Sex * Age) ~ Survived*Heading()*Freq*Heading()*sum*DropEmpty(empty=0),
+         data=df[df$Class %in% c("1st","3rd") & df$Survived=="No",])
                    Survived
 Class Sex    Age   No
 1st   Male   Child   0
              Adult 118
       Female Child   0
              Adult   4
 3rd   Male   Child  35
              Adult 387
       Female Child  17
              Adult  89
> # third example from question
> tabular((Class * Sex) ~ (Age)*Survived*Heading()*Freq*Heading()*sum*DropEmpty(empty=0),
+         data=df[df$Class %in% c("1st","3rd"),])
              Age
              Child        Adult
              Survived     Survived
 Class Sex    No       Yes No       Yes
 1st   Male    0        5  118       57
       Female  0        1    4      140
 3rd   Male   35       13  387       75
       Female 17       14   89       76
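I've been down this road before. I gave up, and now I subset the original data and then use ftable.

@RomanLuštrik, any progress you'd like to share? I like ftable, but sadly it seems to be neglected. It doesn't even have a proper as.data.frame method...

Is your desired result to view the output in the console, or do you intend to format this table for use in a document? Would you be open to an alternative approach that doesn't use ftable?

@KevinArseneau, more for viewing in the console, and out of curiosity about what the best approach would be. I know there are several packages that create great LaTeX and HTML tables for reports with hierarchical rows and columns.

Hi. Thanks for your answer. I'm aware of the "tables" package, which I use in my reports, but as I mentioned in the comments under the question, it's not quite what I'm looking for. Still, these are some nice examples of the tabular function for those who might not be familiar with it.

@A5C1D2H2I1M1N2O1R2T1 - understood that the original question is about ftable(), as I noted in my answer. I think that aggregating the data into a narrow-format tidy data set and subsetting it in the data= argument of tabular() to display it hierarchically is easier than manipulating the structure of the object output by ftable(). That said, the particular version of the extract operator you posted is a creative way to manipulate ftable() objects.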