R: How do I split text into a vector in which each entry is the index value assigned to each unique word?


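Suppose I have a document containing some text, such as the following:

doc <- 'Questions with similar titles have frequently been downvoted and/or closed. Consider using a title that more accurately describes your question.'
I tried a dplyr approach, but it did not work.

Is there a more efficient way?

To give a simpler example of the expected output, I want a data frame that looks like this: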

  words id
1    to  1
2   row  2
3   zip  3
4   zip  3

where my starting word vector is doc.

A cheap way using sapply

Data

doc <- 'Questions with with titles have frequently been downvoted and/or closed. Consider using a title that more accurately describes your question.'


Please add your expected output here, if possible. I'm not even sure whether your question is a dupe, but if you show what you are trying to do, someone can edit your title.
@TimBiegeleisen I've added it.
Just dfall$id
@DavidArenburg That works, and it is much faster than sapply. If you post it as an answer, I'll accept it.
It looks like there are duplicates? Did you do that for the demonstration?
Yes, I did it on purpose, because your example didn't have any duplicates.
I'm a dummy. Thanks a lot.
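# Assumed setup (not shown in the original): one lower-cased word per row, plus the distinct words
dfall <- data.frame(words = strsplit(tolower(doc), "[^a-z]+")[[1]], stringsAsFactors = FALSE)
uniquedf <- unique(dfall)
# For each row, find the position of its word among the unique words
alldf <- cbind(dfall, sapply(1:nrow(dfall), function(x) which(uniquedf$words == dfall$words[x])))
colnames(alldf) <- c("words", "id")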
> alldf
        words id
1   questions  1
2        with  2
3        with  2
4      titles  3
5        have  4
6  frequently  5
7        been  6
8   downvoted  7
9         and  8
10         or  9
11     closed 10
12   consider 11
13      using 12
14          a 13
15      title 14
16       that 15
17       more 16
18 accurately 17
19  describes 18
20       your 19
21   question 20
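
The comments mention a vectorized alternative that was faster than sapply, but its code is not preserved here. As a minimal sketch of one such alternative (not necessarily the one suggested), base R's match() can assign the ids in a single call. The tokenization below (lower-casing, then splitting on runs of non-letters) is an assumption chosen to reproduce the words shown in the output above.

```r
# The text from the "Data" section
doc <- 'Questions with with titles have frequently been downvoted and/or closed. Consider using a title that more accurately describes your question.'

# Assumed tokenization: lower-case, then split on runs of non-letters
words <- strsplit(tolower(doc), "[^a-z]+")[[1]]

# match() gives, for each word, its position in unique(words),
# so ids follow the order in which words first appear
alldf <- data.frame(words = words,
                    id = match(words, unique(words)),
                    stringsAsFactors = FALSE)

head(alldf, 4)
```

Because match() is vectorized, it avoids the per-row which() lookup done inside sapply, which is the likely reason such an approach would scale better on long documents.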