R ggplot: showing year-over-year enrollment continuity between grade and year
I have student enrollment data for 1990-2017:
nominal_roll1 <- tribble(
  ~"Grade", ~"1991-92", ~"1992-93", ~"1993-94", ~"1994-95", ~"1995-96", ~"1996-97", ~"1997-98", ~"1998-99", ~"1999-00",
  ~"2000-01", ~"2001-02", ~"2002-03", ~"2003-04", ~"2004-05", ~"2005-06", ~"2006-07", ~"2007-08", ~"2008-09",
  ~"2009-10", ~"2010-11", ~"2011-12", ~"2012-13", ~"2013-14", ~"2014-15", ~"2015-16", ~"2016-17", ~"2017-18",
  "K4", 88,92,99,101,90,99,103,111,95,92,84,92,107,86,93,82,98,92,96,121,154,137,137,145,155,160,160,
  "K5", 87,89,88,102,107,94,102,106,111,102,98,88,72,89,84,108,82,115,98,93,121,154,137,137,145,155,160,
  "Gr. 1", 107,102,105,104,122,114,119,134,111,125,120,113,118,121,104,109,103,113,135,88,93,121,154,137,137,137,155,
  "Gr. 2", 90,113,100,109,99,118,102,105,130,104,132,128,114,108,97,99,109,98,97,87,88,93,121,154,137,137,137,
  "Gr. 3", 81,86,102,102,112,108,119,103,112,121,105,121,107,113,90,101,93,101,102,97,87,88,93,121,154,154,137,
  "Gr. 4", 67,84,86,91,88,105,111,113,94,114,122,127,138,109,92,92,99,89,98,90,97,87,88,93,121,121,154,
  "Gr. 5", 67,76,84,94,96,97,117,112,119,109,106,104,121,145,100,102,90,103,94,98,90,97,87,88,93,93,121,
  "Gr. 6", 66,76,74,83,92,95,81,113,105,102,106,106,100,115,120,107,101,89,106,127,98,90,97,87,88,88,93,
  # This row was labelled "Gr. 3" a second time; it sits between Gr. 6 and Gr. 8, so it is Gr. 7.
  "Gr. 7", 81,77,86,85,88,88,112,96,113,110,120,111,120,121,94,126,103,110,93,83,127,98,90,97,87,87,88,
  "Gr. 8", 59,76,71,68,84,74,48,85,94,85,102,124,131,111,84,113,123,104,111,88,83,127,98,90,97,97,87,
  "Sr. 1", 62,62,64,89,77,73,90,82,104,122,120,106,103,177,138,149,152,174,184,88,111,83,127,98,90,90,97,
  "Sr. 2", 55,78,62,68,62,76,71,131,69,85,130,132,113,141,91,175,125,159,182,182,184,111,83,127,98,98,90,
  "Sr. 3", 3,71,60,51,66,44,53,97,75,59,82,143,136,136,76,108,144,126,98,98,182,184,88,83,127,127,98,
  # Spelled "Sr. 4" (not "SR. 4") so it matches the factor levels used later.
  "Sr. 4", 0,66,65,32,49,67,83,56,77,45,79,68,182,160,69,121,97,127,157,157,98,182,59,88,83,83,127,
  "MSP", 0,1,1,1,0,0,0,0,0,0,16,20,41,10,22,36,42,38,51,NA,NA,NA,20,NA,NA,NA,NA)
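This works, and in the data you can see quite clearly how enrollment has stabilized over the last 5 years: the number of students moving from Gr. 4 to Gr. 5 to Gr. 6 stays the same. However, the way it is plotted makes it look unstable.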
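Does anyone know how I could represent this better, showing the link between one grade-year and the next? I have been trying cumsum and other approaches, but I cannot get the year-to-year connection. I would like the result to reflect the stability of the last few years; as it stands, it looks chaotic.

If you want people to be less sensitive to changes in the attendance numbers, maybe a tile plot: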
library(tidyverse)

nominal_tidy1 %>%
  drop_na(Grade) %>%
  ggplot(aes(x = Year, y = Grade, fill = Attendance)) +
  geom_tile() +
  scale_fill_viridis_c() +
  theme_minimal(16) +
  theme(legend.title = element_text(size = 14),
        legend.text = element_text(size = 14),
        axis.text.x = element_text(angle = 90),
        text = element_text(family = "Lato"),
        plot.title = element_text(size = 18, hjust = 0.5),
        plot.caption = element_text(size = 12, hjust = 1),
        axis.text.y = element_text(hjust = 0),
        panel.grid = element_line(colour = "#F0F0F0"),
        plot.margin = unit(c(1, 1, 0.5, 1), "cm")) +
  labs(title = "Nominal Roll, 1991 - 2018")
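The tile plot above relies on a long-format table, nominal_tidy1, whose construction is not shown in this thread. The following is a minimal sketch of one way such a table might be built, assuming it has one row per grade and school year with columns Grade, Year and Attendance. The names nominal_roll_demo and nominal_tidy_demo are hypothetical, and the values are a small subset copied from the question's data.

```r
library(tidyr)
library(tibble)

# Toy subset of the wide table from the question (three grades, three years).
nominal_roll_demo <- tribble(
  ~Grade,  ~`2011-12`, ~`2012-13`, ~`2013-14`,
  "K4",    154,        137,        137,
  "K5",    121,        154,        137,
  "Gr. 1",  93,        121,        154
)

# Assumed shape of the long table: one row per grade and school year.
nominal_tidy_demo <- pivot_longer(
  nominal_roll_demo,
  cols = -Grade,
  names_to = "Year",
  values_to = "Attendance"
)

# Keep the grades in school order so the y axis is not sorted alphabetically.
nominal_tidy_demo$Grade <- factor(nominal_tidy_demo$Grade,
                                  levels = nominal_roll_demo$Grade)
```

Applying the same pivot_longer() call to the full nominal_roll1 table should give a table the plotting code can consume, provided the column names match.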
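OK, expanding on my comment:
We assume that enrollment in grade g in year t is roughly the same as enrollment in grade (g-1) in year (t-1). For example, the students who were in Gr. 4 in 2000 should be in Gr. 5 a year later (give or take some random fluctuation). Put differently, enrollment e(g, t) equals e(g-1, t-1) multiplied by a growth factor \gamma(g, t), plus a noise term \epsilon.
(Sorry, Stack Overflow does not seem to support LaTeX formulas.)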
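The function \gamma(g,t) is the growth function; it is essentially a matrix too, just like your nominal_roll1. If your assumption is correct, its rows (the elements for the same grade in different years) should stay more or less constant. The columns may behave differently; for example, you might expect a disproportionate rise in first-grade enrollment.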
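However, if you draw a tile plot of \gamma, you get the following result (credits to ...):
The values are around 1 with some random noise, but from 2011 onward the matrix is suspiciously calm (no noise and no fluctuation, except for 2016-17). Evidently the policy change had some effect.
The code is as follows: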
library(tidyverse)

# Growth factor: enrollment in grade g, year t divided by enrollment in
# grade g-1, year t-1 (rows shift down one grade, columns shift right one year).
gamma <- nominal_roll1[2:nrow(nominal_roll1), 3:ncol(nominal_roll1)] /
  nominal_roll1[1:(nrow(nominal_roll1) - 1), 2:(ncol(nominal_roll1) - 1)]
gamma$intoGrade <- nominal_roll1$Grade[2:nrow(nominal_roll1)]

gamma_tidy <- gamma %>%
  # Random placeholder columns; they are not used in the plot below.
  mutate(FakeCrudeBirthRate = rnorm(nrow(.), mean = 12.5, sd = .5),
         FakeFertilityRate = rnorm(nrow(.), mean = 2.2, sd = .05)) %>%
  gather(Year, AttndRise, `1992-93`:`2017-18`) %>%
  # Year_ holds the numeric start year; intoGrade is ordered by school grade.
  mutate(Year_ = as.numeric(str_trunc(Year, side = "right", width = 4, ellipsis = "")),
         intoGrade = factor(intoGrade, levels = c("K5", "Gr. 1", "Gr. 2", "Gr. 3", "Gr. 4",
                                                  "Gr. 5", "Gr. 6", "Gr. 7", "Gr. 8",
                                                  "Sr. 1", "Sr. 2", "Sr. 3", "Sr. 4", "MSP")))

# Dividing by an enrollment of 0 yields Inf; treat those cells as missing.
gamma_tidy$AttndRise[is.infinite(gamma_tidy$AttndRise)] <- NA

gamma_tidy %>%
  drop_na(intoGrade) %>%
  ggplot(aes(x = Year, y = intoGrade, fill = AttndRise)) +
  geom_tile() +
  scale_fill_viridis_c() +
  theme_minimal(16) +
  theme(legend.title = element_text(size = 14),
        legend.text = element_text(size = 14),
        axis.text.x = element_text(angle = 90),
        text = element_text(family = "Lato"),
        plot.title = element_text(size = 18, hjust = 0.5),
        plot.caption = element_text(size = 12, hjust = 1),
        axis.text.y = element_text(hjust = 0),
        panel.grid = element_line(colour = "#F0F0F0"),
        plot.margin = unit(c(1, 1, 0.5, 1), "cm")) +
  labs(title = "Rise in Roll, 1992 - 2018")
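To make the diagonal shift behind \gamma easier to follow, here is a small sketch on plain matrices. The helper name growth() is hypothetical (it is not part of the answer), and the two toy matrices are short excerpts copied from the question's data: three grades over an early and a late three-year window.

```r
# Hypothetical helper: growth factor for every (grade, year) cell that has a
# predecessor cell one grade lower and one year earlier.
growth <- function(e) {
  g <- nrow(e)
  t <- ncol(e)
  e[2:g, 2:t, drop = FALSE] / e[1:(g - 1), 1:(t - 1), drop = FALSE]
}

grades <- c("K4", "K5", "Gr. 1")

# Early window: enrollment fluctuates from one grade-year to the next.
early <- matrix(c( 88,  92,  99,
                   87,  89,  88,
                  107, 102, 105),
                nrow = 3, byrow = TRUE,
                dimnames = list(grades, c("1991-92", "1992-93", "1993-94")))

# Late window: each cohort moves up one grade with an unchanged head count.
late <- matrix(c(154, 137, 137,
                 121, 154, 137,
                  93, 121, 154),
               nrow = 3, byrow = TRUE,
               dimnames = list(grades, c("2011-12", "2012-13", "2013-14")))

growth(early)  # ratios scattered around 1
growth(late)   # every ratio is exactly 1
```

In the late window every cell equals its diagonal predecessor, which is what shows up as the "suspiciously calm" region of the tile plot.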
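I don't understand: what do you want to show? Or, better: what decision do you want to make based on the chart?
Thanks, what I should have said is this: around 2010, among other factors, there was a policy change, and I want to be able to show the effect of those changes. The stability of enrollment is visible in the df, but I cannot express it well.
Shouldn't you be showing the relative change in enrollment? That should be roughly the same percentage over the years, and the same for all grades, unless the policy change affected it, right?
Yes, you are right, I think showing the relative change in enrollment illustrates this best. However, I am not sure it should be compared by year (Gr. 4 in 2006 vs. Gr. 4 in 2007) rather than in sequence, e.g. Gr. 4 enrollment (2006) -> Gr. 5 enrollment (2007). Thoughts?
@IgorF., would you be willing to submit this as an answer?
This is great. I will filter some of the results, e.g. look only at certain grade-to-grade transitions, to see whether it pins down the positive attendance change we see at the elementary level.
Thanks, I will look at the suggestion above about showing the relative change with lines or tiles; I think that suits my purpose better.
This is great, the gamma is what I had been racking my brain over; I really did not know how to frame the question. Thanks @igor-f, I will keep using this approach in the next steps, building a model for the next 5 or 10 years.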
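The sequential comparison discussed in the comments (Gr. 4 in one year -> Gr. 5 in the next) amounts to following a cohort along the diagonal of the table. Here is a minimal sketch of tagging each cell with a cohort label, assuming grades are listed in school order and each column name starts with a four-digit year. The names demo, grade_idx, year_start and cohort are hypothetical, and the values are a small subset of the question's data.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Toy subset of the wide table from the question.
demo <- tribble(
  ~Grade,  ~`2011-12`, ~`2012-13`, ~`2013-14`,
  "K4",    154,        137,        137,
  "K5",    121,        154,        137,
  "Gr. 1",  93,        121,        154
)
grade_levels <- demo$Grade  # assumed to be in school order

cohorts <- demo %>%
  pivot_longer(-Grade, names_to = "Year", values_to = "Attendance") %>%
  mutate(
    grade_idx  = match(Grade, grade_levels),
    year_start = as.integer(substr(Year, 1, 4)),
    # Start year in which this group would have been in the first listed grade.
    cohort     = year_start - (grade_idx - 1L)
  ) %>%
  arrange(cohort, grade_idx)

# The group that was in K4 in 2011-12, followed through K5 and Gr. 1.
filter(cohorts, cohort == 2011)
```

With such a label in place, a line per cohort (for example ggplot2's geom_line() with group = cohort) would be one way to draw the year-to-year connection the question asks about.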
The model from the answer above, in LaTeX-style notation:
e(g, t) = e(g-1, t-1) * \gamma(g, t) + \epsilon