R: How to group rows in a data frame by quarter?

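I have a data frame with 213 rows and 2 columns (Date and Article). The goal is to reduce the number of rows by grouping the dates by quarter. Naturally, I want the text in the Article column to be merged accordingly.

Here is an example:
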
Date <- c("2000-01-05", "2000-02-03", "2000-03-02", "2000-03-30", "2000-04-13", "2000-05-11", "2000-06-08", "2000-07-06", "2000-09-14", "2000-10-05", "2000-10-19", "2000-11-02", "2000-12-14")
Article <- c("Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text")

Date <- data.frame(Date)
Article <- data.frame(Article)

df <- cbind(Date, Article)

#Dataframe

Date           Article
1  2000-01-05 Long Text
2  2000-02-03 Long Text
3  2000-03-02 Long Text
4  2000-03-30 Long Text
5  2000-04-13 Long Text
6  2000-05-11 Long Text
7  2000-06-08 Long Text
8  2000-07-06 Long Text
9  2000-09-14 Long Text
10 2000-10-05 Long Text
11 2000-10-19 Long Text
12 2000-11-02 Long Text
13 2000-12-14 Long Text
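
Basically, the rows should be grouped by quarter, with the corresponding text combined.

I tried looking around, but I have no idea how to do this.

Can anyone help me?

Thanks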

One dplyr and lubridate option could be:

library(dplyr)
library(lubridate)

df %>%
 group_by(Date = as.character(quarter(ymd(Date), with_year = TRUE))) %>%
 summarise(Article = paste0(Article, collapse = ","))

  Date   Article                                
  <chr>  <chr>                                  
1 2000.1 Long Text,Long Text,Long Text,Long Text
2 2000.2 Long Text,Long Text,Long Text          
3 2000.3 Long Text,Long Text                    
4 2000.4 Long Text,Long Text,Long Text,Long Text

We can use as.yearqtr from zoo to summarise:

library(zoo)
library(data.table)
setDT(df)[, .(Article = toString(Article)),.(Date = as.yearqtr(as.IDate(Date)))]
#   Date                                    Article
#1: 2000 Q1 Long Text, Long Text, Long Text, Long Text
#2: 2000 Q2            Long Text, Long Text, Long Text
#3: 2000 Q3                       Long Text, Long Text
#4: 2000 Q4 Long Text, Long Text, Long Text, Long Text

Base R solution:

# Concatenate the Article values within each year-and-quarter group:
aggregate(list(Article = df$Article),
          by = list(Date = paste(gsub("[-].*", "", df$Date), quarters(df$Date), sep = " ")),
          paste, collapse = ", ")

Data:

df <- data.frame(Date = as.Date(c("2000-01-05",
                                   "2000-02-03",
                                   "2000-03-02",
                                   "2000-03-30", "2000-04-13", "2000-05-11", "2000-06-08",
                                   "2000-07-06", "2000-09-14", "2000-10-05", "2000-10-19",
                                   "2000-11-02", "2000-12-14"),
                                 "%Y-%m-%d"),
            Article = c("Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text"))

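df

Thanks! It works on the example, but on my actual data frame I get the following error: Error in as.Date.default(x, tz = tz, ...): do not know how to convert 'x' to class "Date". Any ideas?

@Arma_91 If Date is in %Y-%m-%d format, you don't have to specify the format argument in as.Date or as.IDate (that is the default date format here). If it is different, specify format, or use library(anytime) with as.yearqtr(anydate(Date)) if you want the format to be detected and converted automatically (assuming the format is available in getFormats()).

Then my format is actually %Y-%d-%m. So should it look like this: setDT(df)[, .(Article = toString(Article)), .(Date = as.yearqtr(as.IDate(Date)), format = "%Y-%d-%m")]? I can't get it to work. I just get the same error.

@Arma_91 The format should go inside as.IDate, i.e. as.IDate(Date, format = "%Y-%d-%m").

Got it! I had to change the name of one column. Thank you very much! Life saver!

Thanks! It works on the example, but on my actual dates it says: "Warning message: All formats failed to parse. No formats found." Any ideas?

It is based on your example data. If you use a different date format, you need to specify it. What is your actual date format?

Thank you very much! I sorted it out as you told me. Thanks again for your help.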
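
For reference, the date-format issue from the comments can be reproduced without any packages. The following is only a minimal base R sketch, not the code from any of the answers. The data frame raw and its values are made up for illustration, and it assumes the dates are stored as %Y-%d-%m strings, as mentioned in the comments.

```r
# Hypothetical sample data: dates stored as year-day-month strings
raw <- data.frame(
  Date = c("2000-05-01", "2000-03-02", "2000-13-04", "2000-06-07"),
  Article = c("A", "B", "C", "D"),
  stringsAsFactors = FALSE
)

# Parse with an explicit format; without it, as.Date() would assume
# year-month-day and either fail or silently return wrong dates
raw$Date <- as.Date(raw$Date, format = "%Y-%d-%m")

# Fail early if any value did not match the assumed format
stopifnot(!anyNA(raw$Date))

# Build a "YYYY Qn" label and collapse the text within each quarter
raw$Quarter <- paste(format(raw$Date, "%Y"), quarters(raw$Date))
result <- aggregate(Article ~ Quarter, data = raw,
                    FUN = paste, collapse = ", ")
print(result)
```

Checking for NA right after parsing makes a format mismatch visible immediately, rather than as a less obvious error further down in the grouping step.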