Merging strings and timestamps based on a condition in R


I have a transcript of speech with timestamps:

df
   line speaker                                utterance                   timestamp
1  0001  ID16.1                                    ah-ha 00:00:07.060 - 00:00:07.660
3  0002    <NA>                                      yes 00:00:07.964 - 00:00:08.610
5  0003    <NA> okay so where do we know each other from 00:00:16.350 - 00:00:22.170
7  0004  ID16.2        U uh Upper Rhine Cruises? maybe?  00:00:23.400 - 00:00:26.600
9  0005  ID16.3           yeah? ((pause)) well I do n't- 00:00:26.305 - 00:00:28.210
11 0006  ID16.1                               (...) Meg? 00:00:27.385 - 00:00:29.305
13 0007    <NA>                         do you know Meg? 00:00:29.100 - 00:00:33.879

I have been trying to solve this with paste0, dplyr::lag, and dplyr::lead, but have not made any progress.

Reproducible data:

df <- structure(list(
  line = c("0001", "0002", "0003", "0004", "0005", "0006", "0007"),
  speaker = c("ID16.1", NA, NA, "ID16.2", "ID16.3", "ID16.1", NA),
  utterance = c("ah-ha", "yes",
                "okay so where do we know each other from",
                "U uh Upper Rhine Cruises? maybe? ",
                "yeah? ((pause)) well I do n't-",
                "(...) Meg?", "do you know Meg?"),
  timestamp = c("00:00:07.060 - 00:00:07.660", "00:00:07.964 - 00:00:08.610",
                "00:00:16.350 - 00:00:22.170", "00:00:23.400 - 00:00:26.600",
                "00:00:26.305 - 00:00:28.210", "00:00:27.385 - 00:00:29.305",
                "00:00:29.100 - 00:00:33.879")),
  row.names = c(1L, 3L, 5L, 7L, 9L, 11L, 13L), class = "data.frame")

Try dplyr::group_by. FYI, the displayed data differs from df, which changes the aggregation.

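library(dplyr)

df %>%
  # start a new group at every row that names a speaker
  group_by(notna = cumsum(!is.na(speaker))) %>%
  summarise(line = first(line),
            speaker = first(speaker),
            utterance = paste(utterance, collapse = ", "),
            # keep the first start time and the last end time of each group
            timestamp = paste(unlist(strsplit(timestamp, "[ -]+"))[c(1, n() * 2)],
                              collapse = " - "),
            .groups = "drop") %>%
  select(-notna)

# A tibble: 4 x 4
  line  speaker utterance                                            timestamp
1 0001  ID16.1  ah-ha, yes, okay so where do we know each other from 00:00:07.060 - 00:00:22.170
2 0004  ID16.2  U uh Upper Rhine Cruises? maybe?                     00:00:23.400 - 00:00:26.600
3 0005  ID16.3  yeah? ((pause)) well I do n't-                       00:00:26.305 - 00:00:28.210
4 0006  ID16.1  (...) Meg?, do you know Meg?                         00:00:27.385 - 00:00:33.879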
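To see why grouping on cumsum(!is.na(speaker)) works, it may help to look at the two building blocks on their own. The sketch below uses base R only rather than dplyr, so it is an illustration of the idea and not the code from the answer. The helper name merge_span is hypothetical, and the sketch assumes every timestamp has the form "start - end".

```r
# Columns copied from the question; NA in `speaker` marks a row that
# continues the previous speaker's turn.
speaker <- c("ID16.1", NA, NA, "ID16.2", "ID16.3", "ID16.1", NA)
timestamp <- c("00:00:07.060 - 00:00:07.660", "00:00:07.964 - 00:00:08.610",
               "00:00:16.350 - 00:00:22.170", "00:00:23.400 - 00:00:26.600",
               "00:00:26.305 - 00:00:28.210", "00:00:27.385 - 00:00:29.305",
               "00:00:29.100 - 00:00:33.879")

# !is.na(speaker) is TRUE wherever a new turn starts; the running sum
# of those TRUE values gives every row the id of the turn it belongs to.
turn_id <- cumsum(!is.na(speaker))

# Hypothetical helper: split all "start - end" strings of one turn and
# keep only the first start and the last end.
merge_span <- function(ts) {
  parts <- unlist(strsplit(ts, " - ", fixed = TRUE))
  paste(parts[c(1, length(parts))], collapse = " - ")
}

# Apply the helper to the timestamps of each turn.
merged <- unname(vapply(split(timestamp, turn_id), merge_span, character(1)))

print(turn_id)
print(merged)
```

Rows 1 to 3 share one turn id and rows 6 and 7 share another, so those are the rows that get collapsed. The same split-and-paste pattern would apply to the utterance column, with collapse = ", ".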

Thanks for the hint. I have changed the displayed data.
