R: Combining text elements in a data frame and deleting the rows the text came from


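This toy data frame represents time entries made by people. The format I have to work with contains multiple text entries for the same person and the same day, in a completely random pattern. There can be up to 15 text entries for the same person on the same day. For the additional text entries, the row has no Date or Person value.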

df <- structure(list(Date = structure(c(1514764800, 1514764800, NA, 
1517443200, 1519862400, NA, NA, NA, 1519862400, NA, NA), class = c("POSIXct", 
"POSIXt"), tzone = "UTC"), Person = c("FMC", "ABC", NA, "FMC", 
"ABC", NA, NA, NA, "RWM", NA, NA), Text = c("work on request", 
"More text", "third line", "email to re: summary", "work on loan documents", 
"sixth line of text", "text seven", "eighth in a series", "conferences with working group", 
"line ten", "review and provide comments")), row.names = c(NA, 
-11L), class = c("tbl_df", "tbl", "data.frame"))

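How can I combine the text elements so that each person has only one row per day, and delete the rows that are no longer needed once the text has been pasted together?

The edited question omits the loop I tried without success.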


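There must be a way to merge all of the text for a given person on a given date into one row (for example, ABC has two entries on January 1, 2018) and delete the rows the merged text came from.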
library(dplyr)

merge_lines <- function(x) paste(x, collapse = ' ')

df %>% 
  zoo::na.locf(.) %>%
  group_by(Person) %>%
  summarise(Text = merge_lines(Text))  # funs() is deprecated in current dplyr

Result:
# A tibble: 4 x 2
  Person Text                                                                   
  <chr>  <chr>                                                                  
1 ABC    More text third line                                                   
2 FMC    work on request email to re: summary                                   
3 HIL    work on loan documents sixth line of text text seven eighth in a series
4 RWM    conferences with working group line ten review and provide comments    

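We can use na.locf to fill the missing values (NA) with the last non-missing value, then group by runs of consecutive Person values (rleid) and summarise Text by pasting it together.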
library(dplyr)
library(zoo)
library(data.table)

df %>%
  na.locf(.) %>%
  group_by(group = rleid(Person)) %>%
  summarise(Text = paste0(Text, collapse = " "))


#  group Text                                                                   
#  <int> <chr>                                                                  
#1     1 work on request                                                        
#2     2 More text third line                                                   
#3     3 email to re: summary                                                   
#4     4 work on loan documents sixth line of text text seven eighth in a series
#5     5 conferences with working group line ten review and provide comments 
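
To see why grouping on rleid(Person) keeps the two FMC blocks and the two ABC blocks apart, it may help to look at the intermediate vectors. The following is only a minimal sketch: `person` is a hypothetical stand-alone copy of the Person column from the question, and `filled` and `run_id` are names made up for this illustration.

```r
library(zoo)
library(data.table)

# Hypothetical stand-alone copy of the Person column from the question
person <- c("FMC", "ABC", NA, "FMC", "ABC", NA, NA, NA, "RWM", NA, NA)

# Carry the last non-missing value forward into the NA positions
filled <- na.locf(person)

# Give each run of consecutive identical values its own integer id
run_id <- rleid(filled)

print(data.frame(person, filled, run_id))
```

Because the id changes whenever the filled value changes, a person who reappears later gets a new group instead of being merged with their earlier entry.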

No need to overcomplicate it; just use the tidyverse.

Adjusted for the change to the question:
library(tidyverse)

df %>%
   fill(Date, Person) %>% # Fill missing values using the previous entry.
   group_by(Date, Person) %>%
   summarise(Text = paste(Text, collapse = ' '))

# A tibble: 5 x 3
  Date                Person Text                                                                   
  <dttm>              <chr>  <chr>                                                                  
1 2018-01-01 00:00:00 ABC    More text third line                                                   
2 2018-01-01 00:00:00 FMC    work on request                                                        
3 2018-02-01 00:00:00 FMC    email to re: summary                                                   
4 2018-03-01 00:00:00 ABC    work on loan documents sixth line of text text seven eighth in a series
5 2018-03-01 00:00:00 RWM    conferences with working group line ten review and provide comments   

Data:

# A tibble: 11 x 3
   Date                Person Text                          
   <dttm>              <chr>  <chr>                         
 1 2018-01-01 00:00:00 FMC    work on request               
 2 2018-01-01 00:00:00 ABC    More text                     
 3 NA                  NA     third line                    
 4 2018-02-01 00:00:00 FMC    email to re: summary          
 5 2018-03-01 00:00:00 ABC    work on loan documents        
 6 NA                  NA     sixth line of text            
 7 NA                  NA     text seven                    
 8 NA                  NA     eighth in a series            
 9 2018-03-01 00:00:00 RWM    conferences with working group
10 NA                  NA     line ten                      
11 NA                  NA     review and provide comments   
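
For comparison, the same merge can be sketched in base R without any packages. This is not taken from any of the answers above; it just illustrates the idea that every row with a non-missing Person starts a new entry and the NA rows below it belong to that entry. The names `entry_id`, `merged`, and `result` are made up here, and the data frame is a shortened copy of the question's data with Date stored as character, which is an assumption made for brevity.

```r
# Shortened copy of the question's data (Date as character is an assumption)
df <- data.frame(
  Date   = c("2018-01-01", "2018-01-01", NA, "2018-02-01"),
  Person = c("FMC", "ABC", NA, "FMC"),
  Text   = c("work on request", "More text", "third line", "email to re: summary"),
  stringsAsFactors = FALSE
)

# Each row with a Person starts a new entry; continuation rows inherit its id
entry_id <- cumsum(!is.na(df$Person))

# Paste together the Text of all rows that share an entry id
merged <- tapply(df$Text, entry_id, paste, collapse = " ")

# Keep only the rows that start an entry and attach the merged text
result <- df[!is.na(df$Person), ]
result$Text <- as.vector(merged)

print(result)
```

Unlike grouping by Date and Person, this variant keeps the rows in their original order, and it relies on the first row of the data having a Person value.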
