Functions available for Tufte boxplots in R?
I have some data that I've divided into enough groupings that standard boxplots look very crowded. Tufte has his own boxplot variants, in which you basically drop all or half of the box.
Some sample data:
cw <- transform(ChickWeight,
Time = cut(ChickWeight$Time,4)
)
cw$Chick <- as.factor( sample(LETTERS[seq(3)], nrow(cw), replace=TRUE) )
levels(cw$Diet) <- c("Low Fat","Hi Fat","Low Prot.","Hi Prot.")
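To see why a standard boxplot gets crowded here, it can help to count how many boxes the three grouping factors produce. This is only a quick sketch: it rebuilds the sample data with a `set.seed()` call (an addition, so the random `Chick` labels are reproducible) and stores the count in a hypothetical variable `n_groups`.

```r
# Rebuild the sample data; set.seed() is an assumption added for reproducibility
set.seed(42)
cw <- transform(ChickWeight, Time = cut(ChickWeight$Time, 4))
cw$Chick <- factor(sample(LETTERS[1:3], nrow(cw), replace = TRUE))
levels(cw$Diet) <- c("Low Fat", "Hi Fat", "Low Prot.", "Hi Prot.")

# One box per Diet x Time x Chick combination
n_groups <- nlevels(interaction(cw$Diet, cw$Time, cw$Chick))
n_groups
```

With 4 diets, 4 time bins, and 3 chick labels, a single panel would have to fit 4 * 4 * 3 boxes side by side.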
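You apparently only need a vertical version, so I took the panel.bwplot code, stripped out all the non-essentials such as the box and the caps, set horizontal=FALSE in the arguments, and created a panel.tuftebxp function. I also set the cex of the outlier points to half the default. There are still plenty of options you can tweak to your taste. The "numeric" factor names for "Time" look sloppy, but I think the proof of concept is clear, and you can clean up whatever matters to you: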
panel.tuftebxp <-
function (x, y, box.ratio = 1, box.width = box.ratio/(1 + box.ratio), horizontal=FALSE,
pch = box.dot$pch, col = box.dot$col,
alpha = box.dot$alpha, cex = box.dot$cex, font = box.dot$font,
fontfamily = box.dot$fontfamily, fontface = box.dot$fontface,
fill = box.rectangle$fill, varwidth = FALSE, notch = FALSE,
notch.frac = 0.5, ..., levels.fos = if (horizontal) sort(unique(y)) else sort(unique(x)),
stats = boxplot.stats, coef = 1.5, do.out = TRUE, identifier = "bwplot")
{
if (all(is.na(x) | is.na(y)))
return()
x <- as.numeric(x)
y <- as.numeric(y)
box.dot <- trellis.par.get("box.dot")
box.rectangle <- trellis.par.get("box.rectangle")
box.umbrella <- trellis.par.get("box.umbrella")
plot.symbol <- trellis.par.get("plot.symbol")
fontsize.points <- trellis.par.get("fontsize")$points
cur.limits <- current.panel.limits()
xscale <- cur.limits$xlim
yscale <- cur.limits$ylim
if (!notch)
notch.frac <- 0
#removed horizontal code
blist <- tapply(y, factor(x, levels = levels.fos), stats,
coef = coef, do.out = do.out)
blist.stats <- t(sapply(blist, "[[", "stats"))
blist.out <- lapply(blist, "[[", "out")
blist.height <- box.width
if (varwidth) {
maxn <- max(table(x))
blist.n <- sapply(blist, "[[", "n")
blist.height <- sqrt(blist.n/maxn) * blist.height
}
blist.conf <- if (notch)
sapply(blist, "[[", "conf")
else t(blist.stats[, c(2, 4), drop = FALSE])
ybnd <- cbind(blist.stats[, 3], blist.conf[2, ], blist.stats[,
4], blist.stats[, 4], blist.conf[2, ], blist.stats[,
3], blist.conf[1, ], blist.stats[, 2], blist.stats[,
2], blist.conf[1, ], blist.stats[, 3])
xleft <- levels.fos - blist.height/2
xright <- levels.fos + blist.height/2
xbnd <- cbind(xleft + notch.frac * blist.height/2, xleft,
xleft, xright, xright, xright - notch.frac * blist.height/2,
xright, xright, xleft, xleft, xleft + notch.frac *
blist.height/2)
xs <- cbind(xbnd, NA_real_)
ys <- cbind(ybnd, NA_real_)
panel.segments(rep(levels.fos, 2), c(blist.stats[, 2],
blist.stats[, 4]), rep(levels.fos, 2), c(blist.stats[,
1], blist.stats[, 5]), col = box.umbrella$col, alpha = box.umbrella$alpha,
lwd = box.umbrella$lwd, lty = box.umbrella$lty, identifier = paste(identifier,
"whisker", sep = "."))
if (all(pch == "|")) {
mult <- if (notch)
1 - notch.frac
else 1
panel.segments(levels.fos - mult * blist.height/2,
blist.stats[, 3], levels.fos + mult * blist.height/2,
blist.stats[, 3], lwd = box.rectangle$lwd, lty = box.rectangle$lty,
col = box.rectangle$col, alpha = alpha, identifier = paste(identifier,
"dot", sep = "."))
}
else {
panel.points(x = levels.fos, y = blist.stats[, 3],
pch = pch, col = col, alpha = alpha, cex = cex,
identifier = paste(identifier,
"dot", sep = "."))
}
panel.points(x = rep(levels.fos, sapply(blist.out, length)),
y = unlist(blist.out), pch = plot.symbol$pch, col = plot.symbol$col,
alpha = plot.symbol$alpha, cex = plot.symbol$cex*0.5,
identifier = paste(identifier, "outlier", sep = "."))
}
library(lattice)  # provides bwplot() and the panel.* helpers used above
bwplot(weight ~ Diet + Time + Chick, data=cw, panel=
function(x,y, ...) panel.tuftebxp(x=x,y=y,...))
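The panel function above takes everything it draws from `boxplot.stats`: the five-number summary supplies the whisker segments and the median dot, and `out` supplies the outlier points. A minimal sketch on a made-up vector (the data and the name `whiskers` are illustrative only) shows which elements are used:

```r
# Toy data with one obvious outlier (illustrative values, not from the question)
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 50)
s <- boxplot.stats(x, coef = 1.5)

# s$stats holds: lower whisker end, lower hinge, median, upper hinge, upper whisker end
s$stats
# s$out holds the points beyond coef * (hinge spread) from the hinges
s$out

# The two line segments a box-less plot draws for this group:
# lower hinge down to the lower whisker end, upper hinge up to the upper whisker end
whiskers <- rbind(lower = s$stats[c(2, 1)], upper = s$stats[c(4, 5)])
whiskers
```

The median (`s$stats[3]`) is drawn as a single dot, and the gap between the two hinges is left empty where the box would normally be.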
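Here is my function. Unfortunately, although it references panel.tuftebox, I wrote this code in my first few months of learning R for a very specific purpose (so, sadly, I had no intention of generalizing it), and it was therefore never written as a separate panel function.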
library(lattice)
library(taRifx)
compareplot(~weight | Diet * Time * Chick,
data.frame=cw ,
main = "Chick Weights",
box.show.mean=FALSE,
box.show.whiskers=FALSE,
box.show.box=FALSE
)
Here is an idiomatic ggplot solution (or rather an elegant hack).
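One way such a ggplot approach might look is sketched below. It is not necessarily the code this answer originally used. It assumes ggplot2 (version 3.3.0 or later, for the `fun` argument) is installed, uses the built-in `ChickWeight` data, and the helper names `whisker_lower` and `whisker_upper` are hypothetical.

```r
# Helpers returning the two whisker segments from boxplot.stats (hypothetical names)
whisker_lower <- function(y) {
  s <- boxplot.stats(y)$stats
  data.frame(ymin = s[1], ymax = s[2])  # lower whisker end up to the lower hinge
}
whisker_upper <- function(y) {
  s <- boxplot.stats(y)$stats
  data.frame(ymin = s[4], ymax = s[5])  # upper hinge up to the upper whisker end
}

# Only build the plot when ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(ChickWeight, aes(x = Diet, y = weight)) +
    stat_summary(fun.data = whisker_lower, geom = "linerange") +
    stat_summary(fun.data = whisker_upper, geom = "linerange") +
    stat_summary(fun = median, geom = "point") +
    theme_minimal()
  print(p)
}
```

Each group gets two line segments and a median dot, with the box itself left out. Outliers are not drawn in this sketch.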
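Here is a solution that uses no packages at all and only manipulates the graphical parameters of boxplot (its pars argument). My suggestion is closest to @DWin's, but it strips out the colours and axes and takes only a few lines of code. The suggestions from @gsk3 and @Ramnath are both very good and far more advanced than mine, but, if I may comment, they do not address Tufte's main philosophy. All of the solutions above would gain clarity, simplicity, and a proper data-ink balance if we got rid of the grey background, the white "prison bars", and the unnecessary colours.
Credit should go to the creators of the package that provides the lovely chart.Boxplot wrapper, which was inspired by Tufte's work. I just extracted some elements of that function to make it simpler. Just attach @gsk3's "cw" sample data from above.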
attach(cw)
par(mfrow=c(1,3))
# White box, no median line, no staples: only whiskers and a median dot remain
boxplot(weight~Time, horizontal = FALSE, main = "", xlab="Time", ylab="Weight",
        pars = list(boxcol = "white", medlty = "blank", medpch=16, medcex = 1.3,
                    whisklty = c(1, 1), staplelty = "blank", outcex = 0.5), axes = FALSE)
axis(1,at=1:4,labels=c(1:4))
axis(2)
boxplot(weight~Chick, horizontal = FALSE, main = "", xlab = "Chick",
        ylab = "", pars = list(boxcol = "white", medlty = "blank", medpch=16,
                    medcex = 1.3, whisklty = c(1, 1), staplelty = "blank", outcex = 0.5),
        axes = FALSE)
axis(1,at=1:3,labels=c("A","B","C"))
boxplot(weight~Diet, horizontal = FALSE, main = "", xlab = "Diet", ylab = "",
        pars = list(boxcol = "white", medlty = "blank", medpch=16, medcex = 1.3,
                    whisklty = c(1, 1), staplelty = "blank", outcex = 0.5), axes = FALSE)
axis(1,at=1:4,labels=c("LoFat","HiFat","LoProt","HiProt"))
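The three calls above repeat the same `pars` list, so the settings could be collected once. The following is a minimal sketch of that idea in base R. The names `tufte_pars` and `tufte_boxplot` are hypothetical, and the example uses the built-in `ChickWeight` data with an explicit `data` argument rather than `attach()`.

```r
# Shared graphical parameters: white (invisible) box, median drawn as a dot,
# solid whiskers, no staples, small outlier symbols
tufte_pars <- list(boxcol = "white", medlty = "blank", medpch = 16, medcex = 1.3,
                   whisklty = c(1, 1), staplelty = "blank", outcex = 0.5)

# Hypothetical wrapper around boxplot(); extra arguments (xlab, ylab, ...) pass through
tufte_boxplot <- function(formula, data, ...) {
  boxplot(formula, data = data, horizontal = FALSE, main = "",
          pars = tufte_pars, axes = FALSE, ...)
}

op <- par(mfrow = c(1, 2))
b <- tufte_boxplot(weight ~ Diet, data = ChickWeight, xlab = "Diet", ylab = "Weight")
axis(1, at = 1:4, labels = levels(ChickWeight$Diet))
axis(2)
tufte_boxplot(weight ~ cut(Time, 4), data = ChickWeight, xlab = "Time", ylab = "")
axis(1, at = 1:4, labels = 1:4)
par(op)
```

Because `boxplot()` returns its summary statistics invisibly, the wrapper does too, so `b$stats` can be inspected after plotting.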
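The ggthemes package provides functions for making some Tufte-style plots. It supplies a range of themes for ggplot, including:
geom_tufterangeframe: Tufte's range frame
geom_tufteboxplot: Tufte's boxplot
theme_tufte: a minimal-ink theme based on Tufte's The Visual Display of Quantitative Information
Below is the Tufte minimal boxplot example from the package README on GitHub: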
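A sketch in the spirit of that README example follows. It may not match the README line for line. It assumes both ggplot2 and ggthemes are installed, and the choice of the built-in `mtcars` data is an assumption made for illustration.

```r
# Only build the plot when both packages are available
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("ggthemes", quietly = TRUE)) {
  library(ggplot2)
  library(ggthemes)
  # geom_tufteboxplot() draws the box-less boxplot; theme_tufte() removes
  # the grey background and grid lines
  p_tufte <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
    geom_tufteboxplot() +
    theme_tufte() +
    labs(x = "cyl", y = "mpg")
  print(p_tufte)
}
```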
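Also lovely. But sadly there is no ggbwTufte function.
You might want to first read "W.A. Stock and J.T. Behrens. Box, line, and midgap plots: Effects of display characteristics on the accuracy and bias of estimates of whisker length. Journal of Educational Statistics, 16(1):1-20, 1991", which found that Tukey's boxplot variations are inferior to the classic form.
@hadley: OK. Thanks for the reference. I assume you mean Tufte's versions, since Tukey's is the classic form?
Ah yes, I meant that Tufte's variations are inferior. I wish you could edit comments.
@hadley: Cool paper. Thanks again for the reference. I would point out, though, that their "line plot" (which showed results nearly identical to the classic boxplot) is closer to the second Tufte design, which is the one I have always preferred. There is also a question of external validity, since college students have almost certainly seen far more Tukey than Tufte boxplots. But I will use them with caution either way :-)
Well done. Yes, definitely more Tufte-like than the others.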