
R: concatenating multiple strings in a call transcript


I have a dataset that looks like the following, for a thousand rows:

dat = c("Speaker 1: ONE TWO THREE | Speaker 2: FOUR FIVE SIX SEVEN | Speaker 1: EIGHT NINE TEN | Speaker 2: ELEVEN* TWELVE THIRTEEN | Speaker 1: FOURTEEN FIFTEEN","Speaker 1: ONE TWO")

The lowercasing and the asterisk can be handled with:

dat = tolower(dat)          # lowercase
dat = gsub("\\*", "", dat)  # remove the asterisk

I am trying to get it to look like the following:

dat[1]:
Four five six seven. Eleven twelve thirteen.
dat[2]:
NA #(or blank)
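That is, I want to remove everything said by Speaker 1, remove the asterisk, change what remains to sentence case, and put a period at the end of each statement.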


Any help is much appreciated, especially if this solution already exists on here and I failed to find it.

Since you need to apply several operations to the same object, and you need the str_trim function, it is best to use the tidyverse:
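
The code for this approach is not shown here. As a rough sketch of what such a pipeline could look like (not the answerer's code), the steps can be written with stringr, assuming the package is installed. The helper name clean_speaker2 is hypothetical and only used for illustration.

```r
library(stringr)  # assumption: stringr (>= 1.4.0) is installed

dat <- c("Speaker 1: ONE TWO THREE | Speaker 2: FOUR FIVE SIX SEVEN | Speaker 1: EIGHT NINE TEN | Speaker 2: ELEVEN* TWELVE THIRTEEN | Speaker 1: FOURTEEN FIFTEEN",
         "Speaker 1: ONE TWO")

# clean_speaker2 is a hypothetical helper, not part of any package
clean_speaker2 <- function(x) {
  # Split one transcript into turns and trim surrounding whitespace
  turns <- str_trim(str_split(x, fixed("|"))[[1]])
  # Keep only the turns spoken by Speaker 2
  keep <- turns[str_starts(turns, fixed("Speaker 2:"))]
  if (length(keep) == 0) return(NA_character_)
  # Drop the speaker label and any asterisks
  keep <- str_trim(str_remove(keep, fixed("Speaker 2:")))
  keep <- str_remove_all(keep, fixed("*"))
  # Sentence case, one period per statement, joined by a space
  str_c(str_to_sentence(keep), ".", collapse = " ")
}

result <- vapply(dat, clean_speaker2, character(1), USE.NAMES = FALSE)
result
# [1] "Four five six seven. Eleven twelve thirteen."
# [2] NA
```

Splitting on "|" first keeps each step readable, at the cost of a per-row function call; for a thousand rows that should be fast enough.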


Using base R, you can do the following:

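# Keep only the text after each "Speaker 2:", lowercased (\\L) and followed by ". "; drop everything else
a = gsub(".*?2:\\s*([^|]*)\\b|(?:(?!Speaker 2).)*","\\L\\1. ", dat, perl = TRUE)
# Strip the leftover trailing punctuation and spaces, then remove asterisks
b = gsub("\\*", "", sub("(?|(?<=^)|(?<=\\W))\\W*$", '', a, perl = TRUE))
# Turn empty strings into NA
`is.na<-`(b,nchar(b)==0)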


[1] "four five six seven. eleven twelve thirteen."
[2] NA
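
The output above is all lowercase, while the question asks for sentence case. As a small sketch of one way to close that gap, the first letter of each sentence can be uppercased with a further gsub call using \\U (available with perl = TRUE). The variable name sentence_case is only illustrative, and b is re-created here from the output shown above so the snippet runs on its own.

```r
# b as printed above: lowercase sentences, NA for rows without Speaker 2
b <- c("four five six seven. eleven twelve thirteen.", NA)

# Uppercase a lowercase letter at the start of the string or after ". ".
# \\1 restores the matched prefix unchanged; \\U uppercases what follows (\\2).
sentence_case <- gsub("(^|\\. )([a-z])", "\\1\\U\\2", b, perl = TRUE)
sentence_case
# [1] "Four five six seven. Eleven twelve thirteen."
# [2] NA
```

NA values pass through gsub unchanged, so no extra handling is needed for rows where Speaker 2 says nothing.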