R - counting into a new column
I am trying to chart some dummy data (shown at the bottom of this post) and have a couple of questions.
Is it advisable to generate a new data frame of summary statistics, so that the Year column becomes unique and a second column gives the total count, or can I work with the data as it is?
Related to this: if I did want to create a new data frame, what is the best way to make it contain year, total count, count per term and count per society?
My dummyyearcount data frame was created using:
dummyyearcount <- count(dummydata, 'Year')
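Output of dput():
An example plot of the look I ultimately want to achieve:
I am still new to R, so any help is greatly appreciated.
I like to do this with the data.table package, because I find it very easy to work with (but it is not the only way):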
Use the count function from the plyr package to count the number of occurrences:
# dummy data
df <- data.frame(
  Year = sample(1984:2014, 200, replace = TRUE),
  Title = sample(c("Paper A", "Paper B", "Paper C", "Paper D", "Paper E", "Paper F", "Paper G"), 200, replace = TRUE),
  Authors = sample(c("Stuart", "Jerry", "Kevin", "Phil", "Gru", "Nefario", "Phil", "Josh"), 200, replace = TRUE),
  Society = sample(c("lab1", "lab2", "lab3", "lab4", "lab5"), 200, replace = TRUE),
  Term = sample(c("1st", "2nd", "3rd", "4th"), 200, replace = TRUE)
)
# group the data by society and year; count() adds a freq column
library(plyr)
df.1 <- count(df, vars = c("Society", "Year"))
# line plot of the yearly count for each society
library(ggplot2)
p <- ggplot(df.1, aes(x = Year, y = freq, color = Society, group = Society)) +
  geom_line() + geom_point() +
  scale_x_continuous(breaks = unique(df.1$Year))
p
Could you provide an example of what you are trying to accomplish?
@Ashish As far as I can tell, the OP did provide an MWE.
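library(data.table)
library(ggplot2)
# Turn the data.frame into a data.table keyed (and therefore sorted) by Term and Year
setDT(dummydata, key = c("Term", "Year"))
# Count the records in each group, one row per Year/Term combination
termcount <- dummydata[, .(N = .N), by = .(Term, Year)]
# Running total within each term, so the sum does not carry over between terms
termcount[, cumN := cumsum(N), by = Term]
# Plot the cumulative count per term
ggplot(termcount, aes(x = Year, y = cumN, colour = Term)) +
  geom_line()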
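For readers who do not have data.table installed, the same idea (count per Year/Term group, then a running total within each term) can be sketched in base R. This is only an illustration under assumptions: `dummy` is made-up data with the same Year and Term columns, and `term_count` and `cumN` are names chosen here, not part of the original post.

```r
# Made-up data standing in for the poster's dummydata (assumption)
set.seed(42)
dummy <- data.frame(
  Year = sample(1984:2014, 200, replace = TRUE),
  Term = sample(c("1st", "2nd", "3rd", "4th"), 200, replace = TRUE)
)

# One row per Term/Year combination with its number of records
# (combinations that never occur get N = 0)
term_count <- as.data.frame(
  table(Term = dummy$Term, Year = dummy$Year),
  responseName = "N",
  stringsAsFactors = FALSE
)
term_count$Year <- as.integer(term_count$Year)

# Sort by term and year so the running total accumulates in time order
term_count <- term_count[order(term_count$Term, term_count$Year), ]

# Running total of N within each term
term_count$cumN <- ave(term_count$N, term_count$Term, FUN = cumsum)

head(term_count)
```

Because zero-count combinations are kept, every term has a value for every year that occurs in the data, which tends to give cleaner cumulative lines when plotted.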
# also split the counts by term; shape and point size distinguish the terms
df.2 <- count(df, vars = c("Society", "Year", "Term"))
p2 <- ggplot(df.2, aes(x = Year, y = freq, color = Society, group = Society, shape = Term)) +
  geom_line() + geom_point(aes(size = Term)) +
  scale_x_continuous(breaks = unique(df.2$Year))
p2
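The question also asks for a single data frame with one row per year holding the total count, the count per term and the count per society. The count() calls above return long-format results, so here is a minimal base R sketch of the wide layout. It assumes data shaped like the dummy data above; `dummy`, `by_term`, `by_society` and `year_summary` are illustrative names, not something from the original post.

```r
# Made-up data with the same Year, Society and Term columns (assumption)
set.seed(1)
dummy <- data.frame(
  Year    = sample(1984:2014, 200, replace = TRUE),
  Society = sample(c("lab1", "lab2", "lab3", "lab4", "lab5"), 200, replace = TRUE),
  Term    = sample(c("1st", "2nd", "3rd", "4th"), 200, replace = TRUE)
)

# Cross-tabulate Year against Term and against Society (rows are years)
by_term    <- table(dummy$Year, dummy$Term)
by_society <- table(dummy$Year, dummy$Society)

# One row per year: total count, then one column per term and per society
year_summary <- data.frame(
  Year  = as.integer(rownames(by_term)),
  Total = as.vector(rowSums(by_term)),
  as.data.frame.matrix(by_term),
  as.data.frame.matrix(by_society),
  check.names = FALSE  # keep column names such as "1st" unchanged
)
rownames(year_summary) <- NULL

head(year_summary)
```

Whether this wide table or the long count() output is more convenient depends on the next step: ggplot2 generally expects the long form for colour and group mappings, while the wide form is easier to read as a summary table.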