R: How can I visualize the following dataset?
I have the following dataset, which can be reproduced in the statistical programming language R:
library(data.table)
sheet1 <- data.table(userID = c('abc123', 'abc123', 'abc123', 'def456', 'def456'),
sessionID = c('1529665492722.251rq8',
'1529922427795.g2k607go',
'1529931067235.0yw5eqfa6',
'1529945600035.345m7ym1',
'1529950171742.fhmkcj6l'),
month = '6',
totalpageviews = c('10', '15', '56', '23', '24'),
pagePath = c('application/123', 'application/456', 'application/789', 'application/101112', 'application/131415'))
sheet2 <- data.table(userID = c('abc123', 'abc123'),
sessionID = c('1529665492722.251rq8', '1529922427795.g2k607go'),
eventCategory = c('x', 'x', 'c'),
eventAction = c('y', 'z', 'a'),
pagePath = c('application/123', 'application/123', 'application/123'))
ggplot2 takes some time to learn, but it will give you a lot of mileage.
If you convert sessionID to an ordered or numeric form, rather than the categorical variable it is now, it can also help you see time-series trends.
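The conversion suggested above could look something like the following minimal sketch. It assumes that the part of each sessionID before the first dot is a Unix timestamp in milliseconds; the question does not confirm this, although the values do fall in June 2018, which matches month = '6'. The column names sessionStartMs and sessionStart are made up for illustration.

```r
library(data.table)

# Session IDs and page views copied from sheet1 in the question
sessions <- data.table(
  sessionID = c('1529665492722.251rq8',
                '1529922427795.g2k607go',
                '1529931067235.0yw5eqfa6',
                '1529945600035.345m7ym1',
                '1529950171742.fhmkcj6l'),
  totalpageviews = c('10', '15', '56', '23', '24')
)

# ASSUMPTION: the prefix before the first '.' is epoch time in milliseconds
sessions[, sessionStartMs := as.numeric(sub('\\..*$', '', sessionID))]

# Convert milliseconds to a date-time (hypothetical column name)
sessions[, sessionStart := as.POSIXct(sessionStartMs / 1000,
                                      origin = '1970-01-01', tz = 'UTC')]

# Page views are stored as strings in the question; make them numeric
sessions[, totalpageviews := as.numeric(totalpageviews)]

# Order sessions chronologically so a time-series plot reads left to right
setorder(sessions, sessionStart)
```

With sessionStart on the x axis instead of sessionID, ggplot2 would treat the axis as continuous time rather than as unordered categories.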
Here is how I would visualize what you currently have:
# install.packages('dplyr')
library(dplyr)
sheet <- full_join(sheet1, sheet2)
# install.packages('ggplot2') # visualization package
library(ggplot2)
# all data; bars including NAs and Event category/action
(p <- ggplot(sheet) +
geom_col(aes(sessionID, totalpageviews, fill = interaction(eventCategory, eventAction)), position = 'dodge') +
guides(fill = guide_legend(title = 'Event Category.Action')) +
theme(axis.text.x = element_text(angle = -30, hjust = .3)))
# just application/123
(p2 <- p %+% (sheet %>% filter(pagePath == 'application/123')))
# just page views and page path
(p3 <- ggplot(sheet %>% select(totalpageviews, pagePath)) +
geom_bar(aes(totalpageviews, pagePath), stat = 'identity', fill = scales::muted('blue')))
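There are many ways to visualize data, but what is the main goal of the visualization? What do you want to show to someone with no training? All of the activity that happened in a session. In session 1529665492722.251rq8, 10 page views were recorded for the page application/123, during which the user triggered 2 events, x and c.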