R ggplot2: log transformation of data and scales

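This is a follow-up to my previous question, which I answered myself yesterday. My current problem is that, in the reproducible example below, the lines meant to show the components of the data's mixture distribution appear neither in the expected position nor with the expected shape (see the red line at y = 0 in the second plot).

Complete reproducible example: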

    library(ggplot2)
    library(scales)
    library(RColorBrewer)
    library(mixtools)

    NUM_COMPONENTS <- 2

    set.seed(12345) # for reproducibility

    data(diamonds, package='ggplot2') # use built-in data
    myData <- diamonds$price

    # extract 'k' components from mixed distribution 'data'
    mix.info <- normalmixEM(myData, k = NUM_COMPONENTS,
                            maxit = 100, epsilon = 0.01)
    summary(mix.info)

    numComponents <- length(mix.info$sigma)
    message("Extracted number of component distributions: ",
            numComponents)

    calc.components <- function(x, mix, comp.number) {
      mix$lambda[comp.number] *
        dnorm(x, mean = mix$mu[comp.number], sd = mix$sigma[comp.number])
    }

    g <- ggplot(data.frame(x = myData)) +
      scale_fill_continuous("Count", low="#56B1F7", high="#132B43") +
      scale_x_log10("Diamond Price [log10]",
                    breaks = trans_breaks("log10", function(x) 10^x),
                    labels = prettyNum) +
      scale_y_continuous("Count") +
      geom_histogram(aes(x = myData, fill = 0.01 * ..density..),
                     binwidth = 0.01)
    print(g)

    # we could select needed number of colors randomly:
    #DISTRIB_COLORS <- sample(colors(), numComponents)

    # or, better, use a palette with more color differentiation:
    DISTRIB_COLORS <- brewer.pal(numComponents, "Set1")

    distComps <- lapply(seq(numComponents), function(i)
      stat_function(fun = calc.components,
                    arg = list(mix = mix.info, comp.number = i),
                    geom = "line", # use alpha=.5 for "polygon"
                    size = 1,
                    color = "red")) # DISTRIB_COLORS[i]
    print(g + distComps)
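For reference, calc.components above returns one weighted component density, lambda_k times the normal density with mean mu_k and standard deviation sigma_k. The base-R sketch below uses made-up parameters (the `mix` list is hypothetical, not the values normalmixEM returns for the diamonds data) to illustrate two properties of such curves: the weighted components together integrate to roughly 1, and on a scale measured in thousands the density values themselves are very small. That may be one reason a density curve can look flat when drawn against a count axis.

```r
# Hypothetical mixture parameters on a raw, price-like scale.
# These numbers are invented for illustration only.
mix <- list(lambda = c(0.6, 0.4),
            mu     = c(1000, 6000),
            sigma  = c(500, 3000))

# Same definition as in the question: weighted density of one component.
calc.components <- function(x, mix, comp.number) {
  mix$lambda[comp.number] *
    dnorm(x, mean = mix$mu[comp.number], sd = mix$sigma[comp.number])
}

# The mixture density is the sum of the weighted component densities.
mixture.density <- function(x) {
  calc.components(x, mix, 1) + calc.components(x, mix, 2)
}

# Approximate the area under the mixture with a simple Riemann sum.
step <- 1
grid <- seq(-20000, 40000, by = step)
total.area <- sum(mixture.density(grid)) * step

# Largest density value on the grid, for comparison with histogram counts.
peak <- max(mixture.density(grid))

print(c(total_area = total.area, peak_density = peak))
```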

Finally, I have solved the problem, deleted my previous answer, and am providing my latest solution below. The only thing I have not solved is the legend panel for the components: for some reason it does not appear, but for EDA, to demonstrate the presence of a mixture distribution, I think it is good enough. The complete reproducible solution follows. Thanks to everyone who helped, directly or indirectly.

    library(ggplot2)
    library(scales)
    library(RColorBrewer)
    library(mixtools)

    NUM_COMPONENTS <- 2

    set.seed(12345) # for reproducibility

    data(diamonds, package='ggplot2') # use built-in data
    myData <- diamonds$price

    calc.components <- function(x, mix, comp.number) {
      mix$lambda[comp.number] *
        dnorm(x, mean = mix$mu[comp.number], sd = mix$sigma[comp.number])
    }

    overlayHistDensity <- function(data, calc.comp.fun) {

      # extract 'k' components from mixed distribution 'data'
      mix.info <- normalmixEM(data, k = NUM_COMPONENTS,
                              maxit = 100, epsilon = 0.01)
      summary(mix.info)

      numComponents <- length(mix.info$sigma)
      message("Extracted number of component distributions: ",
              numComponents)

      DISTRIB_COLORS <-
        suppressWarnings(brewer.pal(NUM_COMPONENTS, "Set1"))

      # create (plot) histogram and ...
      g <- ggplot(as.data.frame(data), aes(x = data)) +
        geom_histogram(aes(y = ..density..),
                       binwidth = 0.01, alpha = 0.5) +
        theme(legend.position = 'top', legend.direction = 'horizontal')

      comp.labels <- lapply(seq(numComponents),
                            function (i) paste("Component", i))

      # ... fitted densities of components
      distComps <- lapply(seq(numComponents), function (i)
        stat_function(fun = calc.comp.fun,
                      args = list(mix = mix.info, comp.number = i),
                      size = 2, color = DISTRIB_COLORS[i]))

      legend <- list(scale_colour_manual(name = "Legend:",
                                         values = DISTRIB_COLORS,
                                         labels = unlist(comp.labels)))

      return (g + distComps + legend)
    }

    overlayPlot <- overlayHistDensity(log10(myData), 'calc.components')
    print(overlayPlot)

I just realized that, for both this question and the previous one, I probably need to multiply the component distribution values by the total number of elements in each component distribution (in our case they are equal) in order to move from a density distribution to a count distribution. If that makes sense, how should I do it with stat_function()? I suppose by adding a multiplier as a corresponding parameter of the calc.components function and a corresponding entry in the argument list of stat_function.

I downvoted this question because it is too verbose and the bold text reduces readability. Please make your question more to the point. Also, you acknowledge that we need a minimal reproducible example. Please try to create one.

@Roland: I have no problem with a downvote, as long as it is substantiated the way you just did. Sorry about the bold text; I was trying to emphasize the important elements and points. I will limit its use in the future and will try to write more compact questions. As for the reproducible example, I have just created one and will update my question with it shortly. Thanks for your help!

@托尼托诺夫: Thanks, I will keep that in mind. I will update my question with a reproducible example within the next 5 minutes…

Thanks for the edit. Much better now.
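The comment above about moving from a density to a count distribution can be sketched in base R. For a histogram with fixed-width bins, the expected count in a bin is approximately the sample size times the bin width times the density at the bin midpoint, so a weighted component density is rescaled with the factor n * binwidth. Everything below is an illustrative assumption: the mixture parameters are invented, and calc.component.counts is a hypothetical helper, not part of mixtools or ggplot2.

```r
# Hypothetical two-component mixture on a log10-like scale.
# Values are invented for illustration; they are not a normalmixEM fit.
set.seed(12345)
n        <- 10000
lambda   <- c(0.4, 0.6)
mu       <- c(3.0, 3.8)
sigma    <- c(0.15, 0.25)
binwidth <- 0.05

# Draw a sample from the mixture.
comp <- sample(1:2, n, replace = TRUE, prob = lambda)
x    <- rnorm(n, mean = mu[comp], sd = sigma[comp])

# Weighted component density, rescaled to expected counts per bin.
# (Hypothetical helper: density * n * binwidth.)
calc.component.counts <- function(x, lambda, mu, sigma, n, binwidth) {
  n * binwidth * lambda * dnorm(x, mean = mu, sd = sigma)
}

# Fixed-width bins that cover the whole sample.
breaks <- binwidth * (floor(min(x) / binwidth):ceiling(max(x) / binwidth))
h <- hist(x, breaks = breaks, plot = FALSE)

# Expected counts at the bin midpoints, summed over both components.
expected <- rowSums(sapply(1:2, function(i)
  calc.component.counts(h$mids, lambda[i], mu[i], sigma[i], n, binwidth)))

# Observed and expected totals are now on the same (count) scale.
print(round(c(observed_total = sum(h$counts),
              expected_total = sum(expected))))
```

With ggplot2, the same multiplier could presumably be passed as an extra entry in the `args` list of stat_function(), next to `mix` and `comp.number`, as the comment suggests.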