R: Can I visualize the mean of "another variable" for the boxes in a boxplot?


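I have the following data, showing the percentage of credits earned for different high-school backgrounds:

My code is as follows:

ggplot(fulldata,aes(x=fct_reorder(gymnasiegrov, PERC_CREDIT, .fun = median,na.rm=T), y=PERC_CREDIT))+geom_boxplot()+coord_flip()

I have been asked to add the mean age for each group (that is, each boxplot), since age may be a confounding variable.

Can this actually be done (using geom_text or something similar), or do I have to visualize this information in some other way?

The mean age values should be shown in connection with each group. They do not have to be overlaid on the plot. If I can persuade R Markdown to show a table and a boxplot on the same page, then showing the values next to the plot is perfectly acceptable, as long as they are in the correct order.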

A small excerpt of the data:

structure(list(start_date = structure(c(17776, 17776, 17776, 
17776, 17776, 17776), class = "Date"), PERC_CREDIT = c(56.2962962962963, 
69.6296296296296, 0, 1.48148148148148, 60, 0), gymnasiegrov = structure(c(11L, 
9L, 6L, 13L, 13L, 4L), .Label = c("medieprogrammet/medieproduktion", 
"Hotell- och Restaurang", "komvux", "teknikprogrammet", "specialutformat program", 
"naturvetenskapliga programmet", "ekonomiprogrammet/ ekonomi", 
"bygg, el, fordon, hantverk, sjöfart, industriteknik", "ekonomiprogrammet/ juridik", 
"Oklart", "samhällsvetenskapliga programmet", "Handels- och administrationsprogrammet", 
"estetiska programmet", "friskoleprogram", "samhälls- och ekonomiprogrammet"
), class = c("ordered", "factor")), ålder = structure(c(20, 20, 
19, 32, 27, 26), class = "difftime", units = "days")), row.names = c(NA, 
-6L), groups = structure(list(start_date = structure(17776, class = "Date"), 
    .rows = list(1:6)), row.names = c(NA, -1L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"))
A large excerpt of the data:

structure(list(start_date = structure(c(17776, 17776, 17776, 
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 
17776, 17776, 17776, 17776, 17776, 17776, 17776), class = "Date"), 
    PERC_CREDIT = c(56.2962962962963, 69.6296296296296, 0, 1.48148148148148, 
    60, 0, 0, 0, 0, 16.2962962962963, 1.48148148148148, 1.48148148148148, 
    0, 93.3333333333333, 45.1851851851852, 0, 0, 93.3333333333333, 
    0, 71.1111111111111, 5.18518518518519, 65.1851851851852, 
    69.6296296296296, 1.48148148148148, 1.48148148148148, 86.6666666666667, 
    84.4444444444444, 97.037037037037, 85.1851851851852, 83.7037037037037, 
    0, 80, 57.037037037037, 61.4814814814815, 0, 80.7407407407407, 
    80, 0, 0, 84.4444444444444, 34.8148148148148, 1.48148148148148, 
    44.4444444444444, 0, 70.3703703703704, 0, 76.2962962962963, 
    14.0740740740741, 94.8148148148148, 86.6666666666667, 0, 
    80, 94.0740740740741, 95.5555555555556, 100, 84.4444444444444, 
    79.2592592592593, 28.1481481481481, 94.0740740740741, 83.7037037037037, 
    55.5555555555556, 95.5555555555556, 0, 0, 14.0740740740741, 
    22.962962962963, 0, 47.4074074074074, 50.3703703703704, 0, 
    51.8518518518518, 84.4444444444444, 88.1481481481482, 82.2222222222222, 
    45.9259259259259, 37.7777777777778, 84.4444444444444, 0, 
    0, 0, 86.6666666666667, 6.66666666666667, 76.2962962962963, 
    25.9259259259259, 34.0740740740741, 0, 0, 0, 8.88888888888889, 
    51.8518518518518, 102.222222222222, 94.0740740740741, 86.6666666666667, 
    33.3333333333333, 80, 0, 1.48148148148148, 48.8888888888889, 
    0, 28.1481481481481, 0, 82.2222222222222, 0, 0, 84.4444444444444, 
    97.7777777777778, 78.5185185185185, 95.5555555555556, 70.3703703703704, 
    1.48148148148148, 27.4074074074074, 80.7407407407407, 82.962962962963, 
    97.7777777777778, 94.0740740740741, 72.5925925925926, 82.962962962963, 
    95.5555555555556, 0, 82.962962962963, 0, 82.2222222222222, 
    70.3703703703704, 97.7777777777778, 1.48148148148148, 20, 
    82.962962962963, 0, 68.8888888888889, 60.7407407407407, 97.7777777777778, 
    25.9259259259259, 46.6666666666667, 0, 84.4444444444444, 
    69.6296296296296, 82.2222222222222, 100, 0, 82.2222222222222, 
    1.48148148148148, 80, 85.9259259259259, 95.5555555555556, 
    77.7777777777778, 97.7777777777778, 97.7777777777778, 53.3333333333333, 
    33.3333333333333, 33.3333333333333, 12.5925925925926, 23.7037037037037, 
    77.7777777777778, 77.7777777777778), gymnasiegrov = structure(c(11L, 
    9L, 6L, 13L, 13L, 4L, 3L, 8L, 7L, 7L, 5L, 5L, 5L, 8L, 6L, 
    12L, 4L, 11L, 2L, 11L, 3L, 3L, 6L, 7L, 4L, 14L, 12L, 7L, 
    8L, 7L, 8L, 7L, 11L, 5L, 5L, 7L, 7L, 11L, 4L, 5L, 14L, 7L, 
    2L, 10L, 10L, 7L, 6L, 3L, 5L, 9L, 8L, 13L, 3L, 4L, 6L, 4L, 
    9L, 9L, 8L, 4L, 4L, 5L, 1L, 7L, 12L, 7L, 7L, 11L, 6L, 6L, 
    7L, 11L, 7L, 9L, 8L, 6L, 7L, 7L, 11L, 4L, 7L, 7L, 7L, 7L, 
    11L, 6L, 10L, 7L, 9L, 7L, 11L, 9L, 8L, 5L, 7L, 3L, 11L, 7L, 
    6L, 7L, 7L, 8L, 7L, 7L, 7L, 7L, 7L, 13L, 6L, 7L, 7L, 9L, 
    7L, 12L, 7L, 7L, 11L, 15L, 7L, 6L, 6L, 7L, 7L, 2L, 7L, 4L, 
    7L, 5L, 7L, 11L, 7L, 9L, 11L, 7L, 6L, 7L, 7L, 5L, 7L, 7L, 
    11L, 8L, 4L, 13L, 9L, 7L, 7L, 10L, 10L, 10L, 10L, 10L, 10L, 
    10L), .Label = c("medieprogrammet/medieproduktion", "Hotell- och Restaurang", 
    "komvux", "teknikprogrammet", "specialutformat program", 
    "naturvetenskapliga programmet", "ekonomiprogrammet/ ekonomi", 
    "bygg, el, fordon, hantverk, sjöfart, industriteknik", "ekonomiprogrammet/ juridik", 
    "Oklart", "samhällsvetenskapliga programmet", "Handels- och administrationsprogrammet", 
    "estetiska programmet", "friskoleprogram", "samhälls- och ekonomiprogrammet"
    ), class = c("ordered", "factor")), ålder = structure(c(20, 
    20, 19, 32, 27, 26, 23, 22, 20, 20, 25, 25, 23, 22, 19, 26, 
    24, 26, 23, 20, 25, 25, 24, 21, 19, 26, 24, 24, 23, 22, 21, 
    20, 20, 29, 27, 21, 20, 20, 20, 25, 24, 19, 39, 34, 29, 22, 
    20, 33, 25, 19, 22, 21, 30, 24, 22, 21, 19, 22, 25, 19, 26, 
    24, 29, 20, 22, 19, 19, 20, 30, 20, 21, 19, 19, 19, 22, 21, 
    19, 19, 23, 19, 20, 20, 20, 20, 24, 24, 33, 19, 19, 21, 24, 
    19, 23, 33, 21, 27, 23, 20, 19, 20, 19, 22, 21, 19, 21, 19, 
    21, 19, 20, 19, 19, 20, 19, 21, 22, 19, 20, 25, 19, 22, 19, 
    19, 19, 25, 23, 20, 19, 26, 19, 21, 19, 20, 25, 20, 19, 23, 
    19, 28, 19, 19, 19, 32, 20, 23, 21, 19, 20, 47, 39, 27, 26, 
    25, 24, 21), class = "difftime", units = "days")), row.names = c(NA, 
-154L), groups = structure(list(start_date = structure(17776, class = "Date"), 
    .rows = list(1:154)), row.names = c(NA, -1L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"))

You can simply plot the mean age next to the boxplot:

library(ggplot2)
library(forcats) # for fct_reorder
library(ggpubr)  # for ggarrange

# ålder is stored as a difftime; convert it to a plain number
fulldata$age <- as.numeric(fulldata$ålder)

# your plot
g1 <- ggplot(fulldata, aes(x = fct_reorder(gymnasiegrov, PERC_CREDIT, .fun = median, na.rm = TRUE),
                           y = PERC_CREDIT)) +
  geom_boxplot() +
  coord_flip()

# mean age plot, using the same group order as the boxplot
g2 <- ggplot(fulldata) +
  stat_summary(aes(x = fct_reorder(gymnasiegrov, PERC_CREDIT, .fun = median, na.rm = TRUE),
                   y = age),
               fun.data = "mean_se") +
  coord_flip() +
  theme(axis.text.y = element_blank(),  # remove y axis labels since they're long
        axis.title.y = element_blank()) # and the same as in the first plot

ggarrange(g1, g2, ncol = 2, widths = c(.65, .35))
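The question also asks whether the mean ages could be written onto the boxplot itself with geom_text. A minimal sketch of that variant follows. It is not part of the answer above, and it makes some assumptions: the `toy` data frame and its values are invented stand-ins for `fulldata`, and the label position (`y = 105`) and the `expand_limits()` value are arbitrary choices. The idea is to fix the group order once in a new factor column, so that the boxes and the labels are guaranteed to share it.

```r
library(dplyr)
library(forcats)
library(ggplot2)

# Toy data standing in for fulldata (all values are made up for illustration)
toy <- data.frame(
  gymnasiegrov = factor(rep(c("komvux", "teknikprogrammet", "Oklart"), each = 4)),
  PERC_CREDIT  = c(0, 10, 20, 30, 60, 70, 80, 90, 40, 50, 55, 65),
  age          = c(30, 32, 28, 34, 19, 20, 21, 20, 25, 27, 24, 26)
)

# Fix the group order once (by median PERC_CREDIT) so boxes and labels share it
toy <- toy %>%
  mutate(group = fct_reorder(gymnasiegrov, PERC_CREDIT, .fun = median, na.rm = TRUE))

# One row per group: the mean age and a text label built from it
age_means <- toy %>%
  group_by(group) %>%
  summarise(mean_age = mean(age, na.rm = TRUE), .groups = "drop") %>%
  mutate(label = sprintf("mean age %.1f", mean_age))

# Boxplot with the mean age written to the right of each box
p <- ggplot(toy, aes(x = group, y = PERC_CREDIT)) +
  geom_boxplot() +
  geom_text(data = age_means, aes(y = 105, label = label), hjust = 0, size = 3) +
  expand_limits(y = 130) + # leave room for the labels
  coord_flip()

p
```

Because `age_means` is grouped by the same reordered factor, it could also be printed as a table (for example with `knitr::kable(age_means)` in R Markdown) and its rows would follow the order of the factor levels used in the plot.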
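Your data does not contain the column PERC_CREDIT. Please try to make a reprex. You could also use a built-in dataset to condense the code.

Sorry, I was tired when I posted that question. I guess that by "reprex" you mean "reproducible example". I was not sure whether the phrase has an implied meaning on Stack Exchange, but just in case, I have posted the data I used to compute the first plot. If there is anything else I can help with, please let me know.

Yes, "reprex" is a common word here. I have removed my downvote, although the code is still very extensive. Please try to reduce your data to the most relevant parts in future questions. Also, I only saw your reply just now. Next time, if you add @tjebo to your comment, I will get notified. I don't need to add this to my own comments, because the "post owner" (here: you) gets notified in any case.

Great answer, thanks! Can I multiply the standard error by 1.96 so that I get something close to a confidence interval? Or is the standard error already multiplied by 2, so that we see 1*(standard error) on each side of the mean?

You can specify the descriptive functions you want in the stat_summary() function. Use the fun.ymin and fun.ymax arguments to change what the segment corresponds to.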
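As a hedged illustration of the last two comments: `mean_se()` in ggplot2 takes a `mult` argument that defaults to 1, so the segments in the answer's plot span one standard error on each side of the mean. Passing `mult = 1.96` through `fun.args` gives an approximate 95% interval under a normal approximation. The `toy` data below is invented for the example. Note also that in ggplot2 3.3.0 and later the `fun.ymin` and `fun.ymax` arguments mentioned above are named `fun.min` and `fun.max`.

```r
library(ggplot2)

# Toy ages for two hypothetical groups (made-up values)
toy <- data.frame(
  group = rep(c("a", "b"), each = 5),
  age   = c(19, 20, 21, 22, 23, 25, 27, 29, 31, 33)
)

# mean_se() returns a data frame with y (the mean), ymin and ymax
se1  <- mean_se(toy$age[toy$group == "a"])              # mean +/- 1 SE (default)
ci95 <- mean_se(toy$age[toy$group == "a"], mult = 1.96) # mean +/- 1.96 SE

# Same summary inside a plot: fun.args forwards mult to mean_se()
p <- ggplot(toy, aes(x = group, y = age)) +
  stat_summary(fun.data = mean_se, fun.args = list(mult = 1.96)) +
  coord_flip()

p
```

With small groups a t-based interval is usually a better choice than the 1.96 multiplier; `mean_cl_normal` (which requires the Hmisc package) is one option for `fun.data` in that case.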