R: How to split a numeric vector of bigrams from a TDM matrix

I have a large numeric vector in R (46201 elements, 3.3 MB).

I would like to split each element so that the output looks like this:

 "i" "know" "46"
 "i" "dont" "42"
Something like this:

> top_pairs <- structure(c(46, 42), .Names = c("i know", "i dont"))
> do.call(rbind, strsplit(paste(names(top_pairs), top_pairs), " "))
     [,1] [,2]   [,3]
[1,] "i"  "know" "46"
[2,] "i"  "dont" "42"
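
Note that the matrix above is entirely character, so the counts come back as strings. If numeric counts are needed, a minimal base R sketch along the following lines might help. It assumes every name contains exactly one space (a true bigram); the column names first, second and count are illustrative choices, not part of the original answer.

```r
top_pairs <- structure(c(46, 42), .Names = c("i know", "i dont"))

# Split each bigram name on the literal space; one list element per pair
parts <- strsplit(names(top_pairs), " ", fixed = TRUE)

# Stack the pieces into a two-column character matrix
# (assumes each name splits into exactly two words)
words <- do.call(rbind, parts)

# Build a data frame, keeping the counts as numbers rather than strings
result <- data.frame(
  first  = words[, 1],
  second = words[, 2],
  count  = unname(top_pairs),
  stringsAsFactors = FALSE
)
result
```

Names that split into a different number of words would not line up in rbind, so it may be worth checking lengths(parts) before relying on this.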

Since the data is large, you may want to use stringi:

library(stringi)
data.frame(stri_split_fixed(names(top_pairs), " ", simplify=TRUE),
    count=top_pairs, row.names=seq_along(top_pairs))

#   X1   X2 count
# 1  i know    46
# 2  i dont    42

Please give us the output of dput(head(sort(top_pairs, decreasing=T), 2)).
Have you tried unlist(strsplit(names(top_pairs), " "))?
@RichardScriven If I remove as.character, I get the error: "non character argument"
@nongkrong - that returns the words, but loses the pairs
@RichardScriven - structure(c(46, 42), .Names = c("i know", "i dont"))