Remove the first n words from text documents in R
Tags: r, string, text, tidyverse

I have a problem in R and could not find a similar solution on Stack Overflow.

I have a data frame that contains many different text documents. I used gsub to remove some characters after a specific pattern, and that works well. Now I want to remove the first 5 words from each text document.

For example:

"Hey I am Tom and I like Bananas"
"Hey I am Moritz and I like Chocolate"

The result should be:

"I like Bananas"
"I like Chocolate"

Is there a specific function in R that can do this? That would help me a lot.

Kind regards,
Tom
We can use strsplit, sapply, and paste:
xx <- c("Hey I am Tom and I like Bananas", "Hey I am Moritz and I like Chocolate")
sapply(strsplit(xx, split = " "),
FUN = function(x) paste(x[6:length(x)], collapse = " "))
# [1] "I like Bananas" "I like Chocolate"
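One caveat with the index 6:length(x): when a document has fewer than six words, the sequence counts downward and the result contains NA entries. The following is a minimal sketch of a more defensive variant, assuming words are separated by single spaces; the helper name drop_first_words and its parameter n are illustrative and not part of the original answer.

```r
# Hypothetical helper: drop the first n space-separated words from each string.
drop_first_words <- function(x, n = 5) {
  vapply(
    strsplit(x, split = " ", fixed = TRUE),
    # Keep only the words whose position is greater than n; this yields
    # character(0) (pasted to "") when a string has n or fewer words.
    function(words) paste(words[seq_along(words) > n], collapse = " "),
    character(1)
  )
}

xx <- c("Hey I am Tom and I like Bananas", "Hey I am Moritz and I like Chocolate")
drop_first_words(xx, 5)
# [1] "I like Bananas"   "I like Chocolate"
drop_first_words("Too short", 5)
# [1] ""
```

Using vapply instead of sapply is a design choice here: it guarantees a character vector of the same length as the input.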
Try gsub as follows:
> gsub("(\\w+\\s+){5}", "", s)
[1] "I like Bananas" "I like Chocolate"
Data
s <- c(
"Hey I am Tom and I like Bananas",
"Hey I am Moritz and I like Chocolate"
)
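Note that the pattern above is not anchored, so on longer documents gsub would remove every consecutive run of five words, not just the first one. A possible variant, sketched here under the assumption that words are separated by whitespace, anchors the match at the start of the string and uses sub. The helper name remove_leading_words and the parameter n are hypothetical.

```r
# Hypothetical helper: remove only the leading n words of each string.
remove_leading_words <- function(s, n = 5) {
  # ^ anchors the match at the start of the string;
  # \\S+\\s+ matches one word (any non-whitespace run) plus the whitespace after it.
  pattern <- sprintf("^\\s*(\\S+\\s+){%d}", as.integer(n))
  # Strings with n or fewer words do not match and are returned unchanged.
  sub(pattern, "", s)
}

long <- "one two three four five six seven eight nine ten eleven twelve"
remove_leading_words(long, 5)
# [1] "six seven eight nine ten eleven twelve"

# For comparison, the unanchored gsub removes two runs of five words:
gsub("(\\w+\\s+){5}", "", long)
# [1] "eleven twelve"
```

Matching \\S+ rather than \\w+ also lets words with attached punctuation (for example "Hey,") count as words.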
A similar option with str_remove:
library(stringr)
str_remove(s, '(\\w+\\s+){5}')
#[1] "I like Bananas" "I like Chocolate"
Another stringr option:
library(stringr)
s <- c("Hey I am Tom and I like Bananas", "Hey I am Moritz and I like Chocolate")
word(s, 6, str_count(s, '\\s')+1)
#[1] "I like Bananas" "I like Chocolate"
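Since the question mentions a data frame, here is a short sketch of how the word() approach might be applied to a column. It relies on word() accepting a negative end, where -1 refers to the last word, which avoids counting spaces with str_count. The data frame docs and its columns id, text, and text_trimmed are illustrative names, not taken from the question.

```r
library(stringr)

# Hypothetical data frame; `docs`, `id`, and `text` are illustrative names.
docs <- data.frame(
  id = 1:2,
  text = c("Hey I am Tom and I like Bananas",
           "Hey I am Moritz and I like Chocolate"),
  stringsAsFactors = FALSE
)

n <- 5  # number of leading words to drop

# word() extracts words start..end; end = -1 means "through the last word".
docs$text_trimmed <- word(docs$text, start = n + 1, end = -1)
docs$text_trimmed
# [1] "I like Bananas"   "I like Chocolate"
```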