R: counting into a new column

Tags: r, ggplot2

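I'm trying to chart the dummy data at the bottom of this message, and I have a couple of questions.

Is it advisable to generate a new data frame of summary statistics, so that the Year column becomes unique and a second column gives the total count, or can I work with the data as it is?

Related to this: if I do want to create a new data frame, what is the best way to give it these columns: year, total count, count per term, count per society?

My dummyyearcount data frame was created with: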

dummyyearcount <- count(dummydata, 'Year')

Output of dput():

Example plot of the look I would eventually like to achieve:

I am still new to R, so any help is much appreciated.

I like to use the data.table package for this, because I find it very easy to follow (though it is not the only way).

Use the count function from the plyr package to count the occurrences:

# dummy data
df <- data.frame(
  Year = sample(1984:2014, 200, replace = TRUE),
  Title = sample(c("Paper A", "Paper B", "Paper C", "Paper D", "Paper E", "Paper F", "Paper G"), 200, replace = TRUE),
  Authors = sample(c("Stuart", "Jerry", "Kevin", "Phil", "Gru", "Nefario", "Phil", "Josh"), 200, replace = TRUE),
  Society = sample(c("lab1", "lab2", "lab3", "lab4", "lab5"), 200, replace = TRUE),
  Term = sample(c("1st", "2nd", "3rd", "4th"), 200, replace = TRUE)
)

# count the rows for each society/year combination (plyr names the result column freq)
library(plyr)
df.1 <- count(df, vars = c("Society", "Year"))

# line plot of the counts, one line per society, with an axis break at each year present
library(ggplot2)
p <- ggplot(df.1, aes(x = Year, y = freq, color = Society, group = Society)) +
  geom_line() + geom_point() +
  scale_x_continuous(breaks = sort(unique(df.1$Year)))
p
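The plyr counts above are in long format (one row per society/year pair). If the goal is the one-row-per-year frame described in the question, a minimal base R sketch could look like the following. The reduced dummy data and the names summary_df and Total are assumptions made for illustration; they do not come from the original answers.

```r
# Hypothetical dummy data with the same Year, Society and Term columns as above
set.seed(1)
df <- data.frame(
  Year    = sample(1984:2014, 200, replace = TRUE),
  Society = sample(c("lab1", "lab2", "lab3", "lab4", "lab5"), 200, replace = TRUE),
  Term    = sample(c("1st", "2nd", "3rd", "4th"), 200, replace = TRUE)
)

# Cross-tabulate: one row per year, one column per term / per society
by_term    <- table(df$Year, df$Term)
by_society <- table(df$Year, df$Society)

# as.data.frame.matrix keeps the wide layout of each table
summary_df <- data.frame(
  Year  = as.integer(rownames(by_term)),
  Total = as.vector(rowSums(by_term)),
  as.data.frame.matrix(by_term),
  as.data.frame.matrix(by_society),
  check.names = FALSE  # keep column names such as "1st" unchanged
)
rownames(summary_df) <- NULL
head(summary_df)
```

For plotting with ggplot2 the long format is usually the more convenient one, so a wide frame like this is mainly useful as a table to read or export.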

Could you provide an example of what you are trying to accomplish?

@Ashish As far as I can tell, the OP did provide an MWE.

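# data.table approach: count records per Term and Year, then plot the running total
library(data.table)
library(ggplot2)
# Turn the asker's data.frame into a data.table with Term and Year as key columns
setDT(dummydata, key = c("Term", "Year"))
# Number of records in each group, one row per Term/Year combination
counts <- dummydata[, .N, by = .(Term, Year)]
# Running total within each Term, accumulated in year order
setorder(counts, Term, Year)
counts[, cumN := cumsum(N), by = Term]
# Plot
ggplot(counts, aes(x = Year, y = cumN, colour = Term)) +
  geom_line()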
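The question also asks for a frame with one row per year, a total, and a count per term. A rough data.table sketch of that reshaping is shown here. The dummydata built in the block is a hypothetical stand-in, because the asker's dput() output is not shown, and the names counts, wide and Total are illustrative only.

```r
library(data.table)

# Hypothetical stand-in for the asker's dummydata (the real one is not shown)
set.seed(1)
dummydata <- data.table(
  Year = sample(1984:2014, 200, replace = TRUE),
  Term = sample(c("1st", "2nd", "3rd", "4th"), 200, replace = TRUE)
)

# Long format: one row per Year/Term combination with its count N
counts <- dummydata[, .N, by = .(Year, Term)]

# Wide format: one row per Year, one count column per Term (0 where a term is absent)
wide <- dcast(counts, Year ~ Term, value.var = "N", fill = 0L)

# Total per year is the sum across the term columns
wide[, Total := rowSums(.SD), .SDcols = setdiff(names(wide), "Year")]
wide
```

The long table (counts) is the one to hand to ggplot2; the wide table is closer to the summary layout described in the question. Counts per society could be added the same way by casting on a Society column.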

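# same idea with Term as a third grouping variable (df is the dummy data defined earlier)
df.2 <- count(df, vars = c("Society", "Year", "Term"))

# shape and size both encode Term; ggplot2 warns that size is not advised for a discrete variable
p2 <- ggplot(df.2, aes(x = Year, y = freq, color = Society, group = Society, shape = Term)) +
  geom_line() + geom_point(aes(size = Term)) +
  scale_x_continuous(breaks = sort(unique(df.2$Year)))
p2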