R: Creating a different color scale for each bar in a ggplot2 stacked bar chart
I have a stacked bar chart. While the colors look nice, it is confusing to have so many similar colors representing different drugs. I would like each bar in the chart to have its own color palette; for example, class1 could use the palette "Blues" and class2 could use the palette "BuGn" (ColorBrewer palette names).
I have found cases where people hand-coded the colors for each bar, but I am not sure whether what I am asking is possible: the bars need to be based on palettes, because there are many drugs in each drug class.
The code that creates the chart:
library(ggplot2)
library(plyr)
library(RColorBrewer)
drug_name <- c("a", "a", "b", "b", "b", "c", "d", "e", "e", "e", "e", "e", "e",
               "f", "f", "g", "g", "g", "g", "h", "i", "j", "j", "j", "k", "k",
               "k", "k", "k", "k", "l", "l", "m", "m", "m", "n", "o")
df <- data.frame(drug_name)
# get the frequency of each drug name
# (plyr::count is called explicitly so that a later library(dplyr) cannot mask it)
df_count <- plyr::count(df, 'drug_name')
#add a column that specifies the drug class
df_count$drug_class <- vector(mode='character', length=nrow(df_count))
df_count$drug_class[df_count$drug_name %in% c("a", "c", "e", "f")] <- 'class1'
df_count$drug_class[df_count$drug_name %in% c("b", "o")] <- 'class2'
df_count$drug_class[df_count$drug_name %in% c("d", "h", "i")] <- 'class3'
df_count$drug_class[df_count$drug_name %in% c("g", "j", "k", "l", "m", "n")] <- 'class4'
#expand color palette (from http://novyden.blogspot.com/2013/09/how-to-expand-color-palette-with-ggplot.html)
colorCount = length(unique(df_count$drug_name))
getPalette = colorRampPalette(brewer.pal(9, "Set1"))
test_plot <- ggplot(data = df_count, aes(x=drug_class, y=freq, fill=drug_name) ) +
  geom_bar(stat="identity") +
  scale_fill_manual(values=getPalette(colorCount))
test_plot
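With this many colors, your plot is going to be confusing. It is probably better to just label each bar segment with the drug name and the count. The code below shows one way to create a separate palette for each bar, and how to label the bars.
First, add a column for positioning the bar labels: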
library(dplyr) # for the chaining (%>%) operator
## Add a column for positioning drug labels on graph
df_count = df_count %>% group_by(drug_class) %>%
  mutate(cum.freq = cumsum(freq) - 0.5*freq)
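To see what the cum.freq column holds, here is the same arithmetic applied to the class1 counts from the question (a = 2, c = 1, e = 6, f = 2). This is only an illustrative base R sketch; it assumes the segments are stacked from the bottom in row order, and the variable names top and mid are made up for the example.

```r
# Counts for the class1 drugs a, c, e, f (taken from the question's data)
freq <- c(a = 2, c = 1, e = 6, f = 2)

# Top edge of each segment when the segments are stacked in this order
top <- cumsum(freq)

# Midpoint of each segment: top edge minus half the segment height
mid <- top - 0.5 * freq
print(mid)
```

The midpoints come out as 1, 2.5, 6 and 10, which is where the labels for a, c, e and f would be drawn.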
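Second, create the palettes. The code below uses four different ColorBrewer palettes, but you can use any combination of palette-creating functions or methods to control the colors as finely as you like.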
## Create separate palette for each drug class
# Count the number of colors we'll need for each bar
ncol = table(df_count$drug_class)
# Make the palettes
pal = mapply(function(x,y) brewer.pal(x,y), ncol, c("BrBG","OrRd","YlGn","Set2"))
pal[[2]] = pal[[2]][1:2] # We only need 2 colors but brewer.pal creates 3 minimum
pal = unname(unlist(pal)) # Combine palettes into single vector of colors
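ggplot(data = df_count, aes(x=drug_class, y=freq, fill=drug_name) ) +
  # reverse=TRUE puts the first drug at the bottom of each bar, so that the
  # cumulative label positions in cum.freq line up with the segments
  geom_bar(stat="identity", colour="black", lwd=0.2,
           position=position_stack(reverse=TRUE)) +
  geom_text(aes(label=paste0(drug_name,": ", freq), y=cum.freq), colour="grey20") +
  scale_fill_manual(values=pal) +
  guides(fill="none") # fill=FALSE is deprecated in current ggplot2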
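brewer.pal only returns between 3 colors and the palette's maximum (9 for the sequential palettes), which may not be enough when a class contains many drugs. Below is a minimal sketch of one way around that, assuming that interpolating with colorRampPalette is acceptable. The helper class_palette and the example counts are hypothetical and are not part of the answer above.

```r
library(RColorBrewer)

# Hypothetical helper: return exactly n colors based on a ColorBrewer palette.
class_palette <- function(n, palette) {
  # Largest number of colors this palette provides natively
  max_n <- brewer.pal.info[palette, "maxcolors"]
  # brewer.pal needs at least 3 and at most max_n colors
  base_cols <- brewer.pal(min(max(n, 3), max_n), palette)
  if (n <= length(base_cols)) {
    base_cols[seq_len(n)]              # take the first n colors
  } else {
    colorRampPalette(base_cols)(n)     # interpolate when more are needed
  }
}

# Example counts per class (made up; class4 deliberately exceeds 9)
n_per_class <- c(class1 = 4, class2 = 2, class3 = 3, class4 = 12)
pals <- Map(class_palette, n_per_class, c("Blues", "BuGn", "OrRd", "Purples"))
print(lengths(pals))
```

Interpolated colors get closer together as n grows, so with very many drugs per class the labels still carry most of the information.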
There are many strategies and functions for creating palettes. Here is another approach, using the hcl function:
lum = seq(100, 50, length.out=4) # Vary the luminance for each bar
shift = seq(20, 60, length.out=4) # Shift the hues for each bar
pal2 = mapply(function(n, l, s) hcl(seq(0 + s, 360 + s, length.out=n+1)[1:n], 100, l),
              ncol, lum, shift)
pal2 = unname(unlist(pal2))
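If each bar should instead read as a single color family, one option is to fix the hue per class and vary only the luminance within the bar. The following is a sketch under that assumption; the hue angles, the chroma of 60 and the luminance range are arbitrary choices for illustration, and the names n_per_class, hues and pal3 are hypothetical.

```r
# Number of drugs in each class (counts from the question's data)
n_per_class <- c(class1 = 4, class2 = 2, class3 = 3, class4 = 6)

# One hue angle per class (arbitrary illustrative values)
hues <- c(class1 = 250, class2 = 140, class3 = 30, class4 = 300)

# Within a class, keep the hue fixed and step the luminance from dark to light
pal3 <- Map(function(n, h) hcl(h = h, c = 60, l = seq(40, 85, length.out = n)),
            n_per_class, hues)

# Flatten to a single vector of colors, ordered class by class
pal3 <- unname(unlist(pal3))
print(pal3)
```

The resulting vector is ordered class by class, so it would still need to be matched to the drugs (by name or by sorting) before being passed to scale_fill_manual.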
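The palettes above are not carried over consistently to the individual classes. The colors are assigned according to the order of the names (a, b, c, ...), so each palette ends up split across several classes.
To "match" the palettes to each set of bars, we need to sort the data.frame by class and align the palette with the names accordingly.
Create a repeating palette to test for the correct (expected) order.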
repeating.pal = mapply(function(x,y) brewer.pal(x,y), ncol, c("Set2","Set2","Set2","Set2"))
repeating.pal[[2]] = repeating.pal[[2]][1:2] # We only need 2 colors but brewer.pal creates 3 minimum
repeating.pal = unname(unlist(repeating.pal))
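# Sort the data by class (the order we want the colors to stay in)
df_sorted <- df_count[order(df_count$drug_class), ]
# Keep the original (alphabetical) order of the names to use as the fill variable
df_sorted$labOrder <- df_count$drug_name
# Attach the test palette, one color per class-sorted row
df_sorted$colours <- repeating.pal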
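ggplot(data = df_sorted, aes(x=drug_class, y=freq, fill=labOrder) ) +
  # reverse=TRUE puts the first drug at the bottom of each bar, so that the
  # cumulative label positions in cum.freq line up with the segments
  geom_bar(stat="identity", colour="black", lwd=0.2,
           position=position_stack(reverse=TRUE)) +
  geom_text(aes(label=paste0(drug_name,": ", freq), y=cum.freq), colour="grey20") +
  scale_fill_manual(values=df_sorted$colours) +
  guides(fill="none") # fill=FALSE is deprecated in current ggplot2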
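Another way to keep the colors aligned, without reordering the labels, is to pass scale_fill_manual a named vector, since named values are matched to the fill levels by name. The sketch below assumes one row per drug, as in df_count; the data frame df_named, the palette choices and the object names are hypothetical and only illustrate the idea.

```r
library(ggplot2)
library(RColorBrewer)

# Toy data mirroring the question: one row per drug with its class and count
df_named <- data.frame(
  drug_name  = letters[1:15],
  drug_class = c("class1", "class2", "class1", "class3", "class1",
                 "class1", "class4", "class3", "class3", "class4",
                 "class4", "class4", "class4", "class4", "class2"),
  freq       = c(2, 3, 1, 1, 6, 2, 4, 1, 1, 3, 6, 2, 3, 1, 1)
)

# Drugs grouped by class, and one ColorBrewer palette per class (arbitrary picks)
classes  <- split(df_named$drug_name, df_named$drug_class)
palettes <- c(class1 = "Blues", class2 = "BuGn", class3 = "OrRd", class4 = "Purples")

# Build a vector of colors whose names are the drug names
named_pal <- unlist(lapply(names(classes), function(cl) {
  drugs <- classes[[cl]]
  # brewer.pal needs at least 3 colors; keep only as many as there are drugs
  cols <- brewer.pal(max(length(drugs), 3), palettes[[cl]])[seq_along(drugs)]
  setNames(cols, drugs)
}))

# Colors are matched to drugs by name, so row and level order no longer matter
p <- ggplot(df_named, aes(x = drug_class, y = freq, fill = drug_name)) +
  geom_col(colour = "black") +
  scale_fill_manual(values = named_pal) +
  guides(fill = "none")
print(p)
```

With this layout each class draws only from its own palette, and adding or removing a drug does not shift the colors of the other classes.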
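Note that the data must be sorted by class first (this is the order we want the colors to stay in!).
You can take a look at the linked post; at first glance it seems to be a similar case.
This is great. I came up with a (sort of) workaround that achieves this to some extent, but it does not fully solve the problem. It builds on @eipi10's answer, with pal replaced.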