How to create a Marimekko/mosaic plot in ggplot2
Marimekko/mosaic plots are a good default plot when both x and y are categorical variables. What is the best way to create them with ggplot2? The only reference I could find is four years old and seems a bit outdated. Are there better or simpler implementations available by now? The GGally package has a function, ggally_ratio, but it produces something quite different.

Here is a first try. I am not sure how to put the factor labels on the axis, though.
library(ggplot2)

makeplot_mosaic <- function(data, x, y, ...) {
  # Capture the unquoted column names as strings
  xvar <- deparse(substitute(x))
  yvar <- deparse(substitute(y))
  mydata <- data[c(xvar, yvar)]
  mytable <- table(mydata)
  # Column edges on the x axis: cumulative counts per level of x
  widths <- c(0, cumsum(apply(mytable, 1, sum)))
  # Cell edges on the y axis: cumulative proportions of y within each level of x
  heights <- apply(mytable, 1, function(x) c(0, cumsum(x / sum(x))))
  alldata <- data.frame()
  # One rectangle per cell: outer loop over x levels, inner loop over y levels
  for (i in 1:nrow(mytable)) {
    for (j in 1:ncol(mytable)) {
      alldata <- rbind(alldata, c(widths[i], widths[i + 1], heights[j, i], heights[j + 1, i]))
    }
  }
  colnames(alldata) <- c("xmin", "xmax", "ymin", "ymax")
  alldata[[xvar]] <- rep(dimnames(mytable)[[1]], rep(ncol(mytable), nrow(mytable)))
  alldata[[yvar]] <- rep(dimnames(mytable)[[2]], nrow(mytable))
  # Note: aes_string() is deprecated as of ggplot2 3.0.0 but still works
  ggplot(alldata, aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax)) +
    geom_rect(color = "black", aes_string(fill = yvar)) +
    xlab(paste(xvar, "(count)")) + ylab(paste(yvar, "(proportion)"))
}
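On the axis-label question: one possible approach, sketched here rather than taken from the thread, is to keep the continuous count axis and place a break at the midpoint of each column, labelled with the factor level, via scale_x_continuous(). The variable names (counts, x_mid, cols) and the choice of mtcars are assumptions made for illustration only.

```r
library(ggplot2)

# Contingency table: rows are the x variable (vs), columns the y variable (gear)
tab    <- table(vs = mtcars$vs, gear = mtcars$gear)
counts <- unname(rowSums(tab))   # column widths = number of cars per vs level

# Left edge, right edge and midpoint of each column on the count axis
xmax  <- cumsum(counts)
xmin  <- xmax - counts
x_mid <- (xmin + xmax) / 2

# Minimal illustration: one full-height rectangle per vs level,
# with the level name printed under the middle of its column
cols <- data.frame(xmin = xmin, xmax = xmax, vs = rownames(tab))
p <- ggplot(cols) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = 0, ymax = 1, fill = vs),
            colour = "black") +
  scale_x_continuous(breaks = x_mid, labels = rownames(tab))
```

Inside makeplot_mosaic() the same midpoints could be derived from the widths vector and passed to scale_x_continuous() in the same way.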
I did this myself a while ago using only geom_bar. I turned it into a general function, so it should work on any two factors.
library(ggplot2)

ggMMplot <- function(var1, var2) {
  levVar1 <- length(levels(var1))
  levVar2 <- length(levels(var2))

  # Joint proportions, one row per (var1, var2) cell
  jointTable <- prop.table(table(var1, var2))
  plotData <- as.data.frame(jointTable)
  # Marginal proportion of var1, recycled across the levels of var2
  plotData$marginVar1 <- prop.table(table(var1))
  # Conditional proportion of var2 within each level of var1
  plotData$var2Height <- plotData$Freq / plotData$marginVar1
  # Bar centres: left edge of each bar plus half its width
  plotData$var1Center <- c(0, cumsum(plotData$marginVar1)[seq_len(levVar1 - 1)]) +
    plotData$marginVar1 / 2

  ggplot(plotData, aes(var1Center, var2Height)) +
    geom_bar(stat = "identity", aes(width = marginVar1, fill = var2), col = "Black") +
    geom_text(aes(label = as.character(var1), x = var1Center, y = 1.05))
}

ggMMplot(diamonds$cut, diamonds$clarity)
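plotluck is a library based on ggplot2 that aims to choose the type of plot automatically from the characteristics of 1-3 variables. It includes a function for mosaic plots. Example:
plotluck(mtcars, vs, gear)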
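You can use the ggplot2 extension package called ggmosaic. A detailed tutorial with example code and the resulting visualizations is available for the package.

I had the same problem on a project a while ago. My solution was to use geom_bar together with the scales = "free_x", space = "free_x" options of facet_grid to accommodate the different bar widths: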
library(dplyr)
library(ggplot2)

# using diamonds dataset for illustration
df <- diamonds %>%
  group_by(cut, clarity) %>%
  summarise(count = n()) %>%
  mutate(cut.count = sum(count),
         prop = count / sum(count)) %>%
  ungroup()

ggplot(df,
       aes(x = cut, y = prop, width = cut.count, fill = clarity)) +
  geom_bar(stat = "identity", position = "fill", colour = "black") +
  # geom_text(aes(label = scales::percent(prop)), position = position_stack(vjust = 0.5)) + # if labels are desired
  facet_grid(~cut, scales = "free_x", space = "free_x") +
  scale_fill_brewer(palette = "RdYlGn") +
  # theme(panel.spacing.x = unit(0, "npc")) + # if no spacing preferred between bars
  theme_void()
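The pipeline above derives two quantities per group: the group total, used as the bar width (cut.count), and the within-group proportion, used as the bar height (prop). As a rough cross-check, the same two quantities can be computed in base R. This sketch uses mtcars only because it ships with R; the names tab, widths and heights are made up for illustration.

```r
# Rows play the role of `cut`, columns the role of `clarity`
tab <- table(cyl = mtcars$cyl, gear = mtcars$gear)

# Bar widths: number of observations in each row group
widths <- rowSums(tab)

# Bar heights: share of each column level within its row group
heights <- prop.table(tab, margin = 1)
```

Each row of heights sums to 1, which is why the stacked bars all reach the same height while their widths differ.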
As suggested, here is a version that uses ggmosaic. (Note that you will need the latest version of the package.)
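library(tidyverse)
library(ggmosaic)

# Data copied from linked blog post
df <- data.frame(
  segment = LETTERS[1:4],
  segpct = c(40, 30, 20, 10),
  Alpha = c(60, 40, 30, 25),
  Beta = c(25, 30, 30, 25),
  Gamma = c(10, 20, 20, 25),
  Delta = c(5, 10, 20, 25)
)

# Convert to "long" for plotting
df_long <- gather(df, key = "greek_letter", value = "pct",
                  -c("segment", "segpct")) %>%
  mutate(
    greek_letter = factor(
      greek_letter,
      levels = c("Alpha", "Beta", "Gamma", "Delta")
    ),
    weight = (segpct * pct) / 10000
  )

# Plot
ggplot(df_long) +
  geom_mosaic(aes(x = product(greek_letter, segment), fill = greek_letter,
                  weight = weight))

Thanks to everyone who contributed to this entry. It really helped me, because ggmosaic did not do what I wanted (and did not label the axes correctly). The nice function from Z.Lin throws a warning whose wording is, I think, technically untrue; it is really warning us that the ggplotocracy, bless and thank them, think geom_bar should not really have variable widths. I think I see the point, so I took the function from Jake Fisher and tweaked it to my own needs. In case it is useful to others, here it is: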
makeplot_mosaic2 <- function(data, x, y, statDigits = 1, residDigits = 1, pDigits = 3, ...){
  ### from https://stackoverflow.com/questions/19233365/how-to-create-a-marimekko-mosaic-plot-in-ggplot2,
  ### this from Jake Fisher (I think)
  xvar <- deparse(substitute(x))
  yvar <- deparse(substitute(y))
  mydata <- data[c(xvar, yvar)]
  mytable <- table(mydata)
  widths <- c(0, cumsum(apply(mytable, 1, sum)))
  heights <- apply(mytable, 1, function(x){c(0, cumsum(x/sum(x)))})

  alldata <- data.frame()
  allnames <- data.frame()
  for(i in 1:nrow(mytable)){
    for(j in 1:ncol(mytable)){
      alldata <- rbind(alldata, c(widths[i], widths[i+1], heights[j, i], heights[j+1, i]))
    }
  }
  colnames(alldata) <- c("xmin", "xmax", "ymin", "ymax")

  alldata[[xvar]] <- rep(dimnames(mytable)[[1]], rep(ncol(mytable), nrow(mytable)))
  alldata[[yvar]] <- rep(dimnames(mytable)[[2]], nrow(mytable))

  chisq <- chisq.test(mytable)
  df <- chisq$parameter
  pval <- chisq$p.value
  chisqval <- chisq$statistic
  # stdResids <- chisq$stdres
  alldata$xcent <- (alldata$xmin + alldata$xmax)/2
  alldata$ycent <- (alldata$ymin + alldata$ymax)/2
  alldata$stdres <- round(as.vector(t(chisq$stdres)), residDigits)
  # print(chisq$stdres)
  # print(alldata)

  titleTxt1 <- paste0("Mosaic plot of ",
                      yvar,
                      " against ",
                      xvar,
                      ", ")
  titleTxt2 <- paste0("chisq(",
                      df,
                      ") = ",
                      round(chisqval, statDigits),
                      ", p = ",
                      format.pval(pval, digits = pDigits))
  titleTxt <- paste0(titleTxt1, titleTxt2)
  subTitleTxt <- "Cell labels are standardised residuals"

  ggplot(data = alldata,
         aes(xmin=xmin, xmax=xmax, ymin=ymin, ymax=ymax)) +
    geom_rect(color="black", aes_string(fill=yvar)) +
    geom_text(aes(x = xcent, y = ycent, label = stdres)) +
    xlab(paste0("Count of '",
                xvar,
                "', total = ",
                max(alldata$xmax))) + # tweaked by CE
    ylab(paste0("Proportion of '",
                yvar,
                "' per level of '",
                xvar,
                "'")) +
    ggtitle(titleTxt,
            subtitle = subTitleTxt) +
    theme_bw() +
    theme(plot.title = element_text(hjust = .5),
          plot.subtitle = element_text(hjust = .5))
}

makeplot_mosaic2(mtcars, vs, gear)
makeplot_mosaic2(diamonds, cut, clarity)
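The cell labels in makeplot_mosaic2() depend on the residuals being flattened in the same order in which the nested loops create the rectangles: all y levels for the first x level, then all y levels for the second, and so on. That is why chisq$stdres is transposed before as.vector(). The following base-R sketch checks that ordering on a made-up 2 x 3 table; the table values and the names tab, res, row_major and cells are illustrative assumptions, not part of the original answer.

```r
# Made-up 2 x 3 contingency table, purely for illustration
tab <- matrix(c(12, 5,  7,
                 3, 9, 14),
              nrow = 2, byrow = TRUE,
              dimnames = list(x = c("a", "b"), y = c("u", "v", "w")))

# Standardised residuals, same shape as tab
res <- chisq.test(tab)$stdres

# R stores matrices column by column, so as.vector(res) would run down
# the columns; transposing first yields row-by-row order instead
row_major <- as.vector(t(res))

# Cells in the order the nested loops build them: y varies fastest, x slowest
cells <- expand.grid(y = colnames(tab), x = rownames(tab))
cells$stdres <- row_major
```

Without the transpose, the labels would be attached to the wrong cells whenever the table is not symmetric.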
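Have you tried graphics::mosaicplot?

I would like to stay with ggplot2 so that I can extend the plot with other features (faceting, etc.).

Not much of an answer, but see the interactive version of the Marimekko plot using rCharts and dimplejs.

Thanks! I have made some updates to this to tidy up the labels and to allow the colour scale to be specified easily via colour brewer; the updated version is available for download.

There is a warning: "Ignoring unknown aesthetics: width". Maybe this could be updated.

I think Z.Lin did a good job with the current R/tidyverse implementation.

This might be a good answer, but ggmosaic is a bit complicated. Maybe you should explain how to get the plot with it, or at least give a usable line of code.

To be fair, the question was asked without reproducible code. I have added a reference to the package tutorial, which should help.

Oops, I probably should have said that my tweaks were to label the cells with standardised residuals, add a title and subtitle, and adjust the axis labels.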