Data extraction and summary of an interactive histogram selection in R
I would like to create an interactive histogram using plotly, or another package if it is better suited to R, with data similar to this sample set:
test<-data.frame(sex=c("m","m","f","f","m","m","f","m","f","m","m","m","f","f","m","m","f","m","f","m","m","m","f","f","m","m","f","m","f","m","m","m","f","f","m","m","f","m","f","m","m","m","f","f","m","m","f","m","f","m","m","m","f","f","m","m","f","m","f","m","m","m","f","f","m","m","f","m","f","m","m","m","f","f","m","m","f","m","f","m"),weight=runif(80,5,9))
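I would like to show two overlaid histograms of the weight distribution, one per sex, together with some summary statistics such as the standard deviation, the mean, and the sample size, both per sex and overall.
In addition, I would like to be able to make a selection, preferably with a range slider or a selection box, and have these summary statistics update to match the selection. I would then like to add a variable to the original data set that indicates whether each sample is part of the selection.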
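Thanks for your help! Even a pointer to a relevant online resource would be welcome, as I have had a hard time finding one that addresses a similar problem.
@DataZhukov this is a revised answer based on your larger data sample. Following your feedback, I dropped the side-by-side age-pyramid idea and show how to use {plotly} for histograms.
Although {plotly} supports interaction, it is built on the concept of a static html page. This means that no active calculation takes place on the client, i.e. on the side of the user viewing the page.
For simple statistics/summaries you could look at {crosstalk} to enable some dynamic updating, i.e. client-side computation.
For full-blown dynamic interaction of the select/filter/recalculate kind, {shiny} is a good choice. But that is another ball game.
{plotly} lets you place text annotations freely by specifying an add_text layer.
I constructed the annotation text from your data. You can also define it manually as a vector.
If you use a data frame as the input data structure, note that {plotly} uses the tilde notation to refer to variables.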
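As a small illustration of the tilde notation, the sketch below passes the same column to `plot_ly()` once as a formula and once as a plain vector. The data frame `toy` and the names `p_formula` and `p_vector` are made up for this example and are not part of the original answer.

```r
library(plotly)

# hypothetical toy data, used only to illustrate the two calling styles
toy <- data.frame(
  sex    = c("m", "f", "m", "f"),
  weight = c(6.1, 7.4, 8.2, 5.9)
)

# data frame as input: columns are referenced with a formula (~column)
p_formula <- plot_ly(data = toy, x = ~weight, type = "histogram")

# no data frame: the values are passed directly as a vector
p_vector <- plot_ly(x = toy$weight, type = "histogram")
```

Both calls should describe the same histogram; the formula form is the one used in the rest of this answer.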
test<-data.frame(sex=c("m","m","f","f","m","m","f","m","f","m","m","m","f","f","m","m","f","m","f","m","m","m","f","f","m","m","f","m","f","m","m","m","f","f","m","m","f","m","f","m","m","m","f","f","m","m","f","m","f","m","m","m","f","f","m","m","f","m","f","m","m","m","f","f","m","m","f","m","f","m","m","m","f","f","m","m","f","m","f","m"),weight=runif(80,5,9))
# load the packages used below: {dplyr} for the summaries, {plotly} for the plots
library(dplyr)
library(plotly)
# calculate sample size, mean and sd based on the given data
# note: you can also define these values by hand as simple vectors
total_stats <- test %>%
  summarise(SAMPLE = n(), MEAN_WEIGHT = mean(weight), SD = sd(weight)) %>%
  mutate(sex = "m+f")
group_stats <- test %>%
  group_by(sex) %>%
  summarise(SAMPLE = n(), MEAN_WEIGHT = mean(weight), SD = sd(weight))
# one label per row: overall ("m+f") first, then one per sex
my_stats <- bind_rows(total_stats, group_stats) %>%
  mutate(LABEL = paste0(sex, " sample size: ", SAMPLE,
                        " with mean ", round(MEAN_WEIGHT, 2),
                        " and SD ", round(SD, 2)))
# format your text, e.g. font face and size ---- format to your liking
tf <- list(
  family = "sans serif",
  size = 11
)
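This yields the histogram with the text annotations.
Obviously, you are free to define the x and y positions of the text annotations.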
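A minimal sketch of how such an add_text layer might be positioned is given here. It is not the plotting call from the original answer: the stand-in data, the `X` and `Y` columns, and the hand-picked coordinates are assumptions made for illustration only.

```r
library(dplyr)
library(plotly)

# assumption: stand-in data with a fixed seed, shaped like the question's `test`
set.seed(1)
test <- data.frame(
  sex    = rep(c("m", "f"), times = c(48, 32)),
  weight = runif(80, 5, 9)
)

# per-sex summary with a label and hand-picked label coordinates (X, Y)
my_stats <- test %>%
  group_by(sex) %>%
  summarise(SAMPLE = n(), MEAN_WEIGHT = mean(weight), SD = sd(weight)) %>%
  mutate(
    LABEL = paste0(sex, " sample size: ", SAMPLE,
                   " with mean ", round(MEAN_WEIGHT, 2),
                   " and SD ", round(SD, 2)),
    X = 5,        # left edge of the weight axis
    Y = c(9, 8)   # one line per group; adjust to sit above your bars
  )

tf <- list(family = "sans serif", size = 11)

p <- plot_ly() %>%
  # one histogram trace per sex
  add_histogram(data = test, x = ~weight, color = ~sex, nbinsx = 20) %>%
  # text layer: x/y set where each label is drawn
  add_text(
    data = my_stats, x = ~X, y = ~Y, text = ~LABEL,
    textfont = tf, textposition = "middle right",
    showlegend = FALSE, inherit = FALSE
  )
```

Changing `X` and `Y` moves the labels; the values above are placeholders and will probably need tuning once the actual bin counts are known.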
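By default, the count bars are placed side by side. To force an overlay, you can draw two histograms and make the two trace layers overlap; for this you need to set barmode in the layout layer. I also used alpha transparency, because your data sample may contain overlapping counts. Text placement etc. follows the principles described above.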
# split test data frame in a male and female df
males <- test %>% filter(sex == "m")
fems <- test %>% filter(sex == "f")
plot_ly(
  alpha = 0.5     # set alpha to ensure visibility on overlapping counts
  , nbinsx = 20   # set number of bins
) %>%
  #------------ add a histogram layer per group -------------------
  add_histogram(data = males, x = ~weight, name = "male") %>%
  add_histogram(data = fems, x = ~weight, name = "female") %>%
  #------------ tweak layout --------------------------------------
  layout(
    barmode = "overlay"   # to change side-by-side default to overlay
  )
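Thanks for your answer, it helped me at least to some extent. I edited my question to show a sample data set that is closer to the real data I plan to use, but it should be two histograms rather than the pyramid plot you propose here. Also, do the SD and mean in your answer update with the selection? It does not seem so to me, and it is still unclear to me how to extract or flag the selected data on the graph. I have been trying to do this with shiny, without success.
I revised my answer, just in case. In short, the level of interactivity is limited to manipulating how things are displayed, e.g. you can choose which "male/female" layer to show or hide, or zoom in. There is no active recalculation on the client side, i.e. the page is static. html widgets offer some support for this. However, if you want to combine your plot with such widgets, you need to use a {crosstalk}-compatible widget, and the choice is currently limited. If you want active filtering of the data with recalculated stats, i.e. something dynamic, then {shiny} is a good choice. Good luck!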
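To make the Shiny route mentioned in these comments more concrete, here is a minimal sketch under stated assumptions. It uses a range slider rather than a selection drawn on the plot. The helpers `flag_selection()` and `summarise_selection()`, the input id `rng`, and the stand-in data are all hypothetical names introduced for this example, not part of the original answer.

```r
library(shiny)
library(dplyr)
library(plotly)

# assumption: stand-in data with a fixed seed, shaped like the question's `test`
set.seed(1)
test <- data.frame(
  sex    = rep(c("m", "f"), times = c(48, 32)),
  weight = runif(80, 5, 9)
)

# hypothetical helper: add a logical column marking rows inside the range
flag_selection <- function(df, rng) {
  df %>% mutate(selected = weight >= rng[1] & weight <= rng[2])
}

# hypothetical helper: sample size, mean and SD of the selected rows,
# per sex and overall ("m+f")
summarise_selection <- function(df) {
  sel <- df %>% filter(selected)
  bind_rows(
    sel %>% group_by(sex) %>%
      summarise(SAMPLE = n(), MEAN_WEIGHT = mean(weight), SD = sd(weight)),
    sel %>%
      summarise(SAMPLE = n(), MEAN_WEIGHT = mean(weight), SD = sd(weight)) %>%
      mutate(sex = "m+f")
  )
}

ui <- fluidPage(
  sliderInput("rng", "Weight range", min = 5, max = 9,
              value = c(5, 9), step = 0.1),
  plotlyOutput("hist"),
  tableOutput("stats")
)

server <- function(input, output, session) {
  # original data plus the `selected` flag, recomputed when the slider moves
  flagged <- reactive(flag_selection(test, input$rng))

  output$hist <- renderPlotly({
    sel <- flagged() %>% filter(selected)
    plot_ly(alpha = 0.5) %>%
      add_histogram(data = filter(sel, sex == "m"), x = ~weight, name = "male") %>%
      add_histogram(data = filter(sel, sex == "f"), x = ~weight, name = "female") %>%
      layout(barmode = "overlay")
  })

  # summary statistics follow the current selection
  output$stats <- renderTable(summarise_selection(flagged()))
}

app <- shinyApp(ui, server)
# in an interactive session, start it with: runApp(app)
```

Here the recalculation happens in the R session behind the app rather than in the browser, which is why the statistics can follow the slider. The data frame returned by `flagged()` is the original data with the extra `selected` column, so it could be saved or exported from inside the server function.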