R: Unexplained behavior of ggplot inside a function


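I wrote a function that uses ggplot2 to generate histograms of the numeric columns of a data frame passed to it. The function stores the plots in a list and then returns the list.

However, when I run the function, I get the same plot over and over again.

My code is below, along with a reproducible example.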

hist_of_columns = function(data, class, variables_to_exclude = c()){

    library(ggplot2)
    library(ggthemes)

    data = as.data.frame(data)

    variables_numeric = names(data)[unlist(lapply(data, function(x){is.numeric(x) | is.integer(x)}))]

    variables_not_to_plot = c(class, variables_to_exclude)



    variables_to_plot = setdiff(variables_numeric, variables_not_to_plot)

    indices = match(variables_to_plot, names(data))

    index_of_class = match(class, names(data))

    plots = list()

    for (i in (1 : length(variables_to_plot))){



          p  = ggplot(data, aes(x= data[, indices[i]], color= data[, index_of_class], fill=data[, index_of_class])) +
           geom_histogram(aes(y=..density..), alpha=0.3,
           position="identity", bins = 100)+ theme_economist() +
           geom_density(alpha=.2) + xlab(names(data)[indices[i]]) + labs(fill = class) + guides(color = FALSE)

          name = names(data)[indices[i]]

          plots[[name]] = p
    }

   plots

}


data(mtcars)

mtcars$am = factor(mtcars$am)

data = mtcars

variables_to_exclude = 'mpg'

class = 'am'

plots = hist_of_columns(data, class, variables_to_exclude)

If you inspect the list plots, you will find that it contains the same plot repeated.

Here is a strategy using tidyeval that does what you want:

library(rlang)
library(tidyverse)

hist_of_cols <- function(data, class, drop_vars) {

    # tidyeval overhead
    class_enq <- enquo(class)
    drop_enqs <- enquo(drop_vars)

    data %>%
        group_by(!!class_enq) %>% # keep the 'class' column always
        select(-!!drop_enqs) %>% # drop any 'drop_vars'
        select_if(is.numeric) %>% # keep only numeric columns
        gather("key", "value", -!!class_enq) %>% # go to long form
        split(.$key) %>% # make a list of data frames
        map(~ ggplot(., aes(value, fill = !!class_enq)) + # plot as usual
                geom_histogram() +
                geom_density(alpha = .5) +
                labs(x = unique(.$key)))

}
hist_of_cols(mtcars, am, mpg)

hist_of_cols(mtcars, am, c(mpg, wt))


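Simply use aes_string to pass string variables into the ggplot() call. As written, your plot draws on separate data sources that are not aligned with ggplot's data argument. Below, x, color, and fill are independent, unrelated vectors; although they come from the same source, ggplot does not know that:
ggplot(data, aes(x= data[, indices[i]], color= data[, index_of_class], fill=data[, index_of_class]))

With aes_string, however, the string names passed to x, color, and fill refer to columns of data.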
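The answer's own aes_string call is not shown here, so the following is only a sketch of what it might look like when applied to the question's function. It assumes column names are syntactic (as in mtcars), drops theme_economist() to avoid the ggthemes dependency, and replaces the index bookkeeping with a loop over names; those simplifications are mine, not the answerer's. Recent ggplot2 releases mark aes_string() as deprecated, so it may emit a warning.

```r
library(ggplot2)

# Hypothetical rewrite of the question's function using aes_string().
# `class` and `variables_to_exclude` are column names given as strings.
hist_of_columns <- function(data, class, variables_to_exclude = character()) {
  data <- as.data.frame(data)

  # Numeric columns, minus the class column and any excluded columns.
  is_num <- vapply(data, is.numeric, logical(1))
  variables_to_plot <- setdiff(names(data)[is_num],
                               c(class, variables_to_exclude))

  plots <- list()
  for (name in variables_to_plot) {
    # aes_string() receives the column name as a string, so each plot
    # records its own column instead of an expression that depends on
    # the loop variable.
    plots[[name]] <- ggplot(data,
                            aes_string(x = name, color = class, fill = class)) +
      geom_histogram(aes(y = ..density..), alpha = 0.3,
                     position = "identity", bins = 100) +
      geom_density(alpha = 0.2) +
      labs(fill = class) +
      guides(color = "none")
  }
  plots
}

# Same setup as in the question.
data(mtcars)
mtcars$am <- factor(mtcars$am)
plots <- hist_of_columns(mtcars, "am", "mpg")
```

With this setup, plots should hold one entry per remaining numeric column, each mapped to its own x variable.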
Your gather() call takes -am as an argument, but I would like a general form (mtcars is just an example). Also, is it possible to pass class as a string (the name of the class variable) and drop_vars as a character vector of variable names? Finally, can you explain why my code fails?

Updated the issue in gather(). Your current solution does not work because you are passing a vector to each aes() variable via data[, index]. Because ggplot2 evaluates the aes() arguments lazily, it looks up the actual value of that vector only after the loop has finished. It therefore always sees the same vector (because i is at its maximum value), so you get a list of identical plots.
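The effect described above can be reproduced without ggplot2. The base R sketch below is an analogy rather than ggplot2's actual internals: it stores an unevaluated expression that mentions the loop variable i, then evaluates it only after the loop has ended. The names captured, forced, late, and early are illustrative.

```r
# Remember the environment in which the loop variable lives.
here <- environment()

# Store the expression `i * 10` three times WITHOUT evaluating it,
# loosely mirroring how aes() keeps its arguments unevaluated.
captured <- list()
for (i in 1:3) {
  captured[[i]] <- quote(i * 10)
}

# The loop is over, so i is now 3. Evaluating late gives the same
# result for every stored expression.
late <- vapply(captured, function(e) eval(e, envir = here), numeric(1))

# bquote() substitutes the current value of i at capture time, so
# each stored expression keeps its own value.
forced <- list()
for (i in 1:3) {
  forced[[i]] <- bquote(.(i) * 10)
}
early <- vapply(forced, function(e) eval(e, envir = here), numeric(1))

print(late)   # 30 30 30
print(early)  # 10 20 30
```

In the plotting function, passing column names as strings plays the role that bquote() plays here: each plot is tied to a fixed column rather than to whatever i happens to be when the plot is finally drawn.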