R: How to build combinations based on strings


I have a data frame with many columns, like the one below

Column1           Column2           Column3
Q9Y6Y8             P28074           Q9Y6A4
Q9Y6W5             P28066           Q9Y623
Q9Y6H1             P27695           Q9Y5W9
Q5T1J5             P25786;Q9Y623 
Q9Y6A4
Q9Y623;P27695;Q9Y623
Q9Y5W9
Q9Y6Y8
So I first want to pool all the values together and keep the unique ones, and then build every combination of those strings taken two at a time, as shown below

Q9Y6Y8                        
Q9Y6W5                     
Q9Y6H1                       
Q5T1J5             
Q9Y6A4
Q9Y623
P27695
Q9Y623
Q9Y5W9
Q9Y6Y8 
P25786
P28074
P28066   
Q9Y6Y8 Q9Y6W5   
Q9Y6Y8 Q9Y6H1                       
Q9Y6Y8 Q9Y6A4                           
Q9Y6Y8 Q5T1J5             
Q9Y6Y8 Q9Y6A4
Q9Y6Y8 Q9Y623
Q9Y6Y8 P27695
Q9Y6Y8 Q9Y623
    .
    .
    .
Q9Y6W5 Q9Y6H1
Q9Y6W5 Q9Y6A4
Q9Y6W5 Q5T1J5 
    .
    .
    .

and so on, until every string has been paired once with every other string.

We can unlist the data.frame (a data.frame is a list) into a vector, split each element on ";" with strsplit, unlist the list that strsplit returns, and keep the unique elements as a vector:

Un1 <- unique(unlist(strsplit(unlist(df1), ";")))
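
To make the one-liner easier to follow, here is a minimal sketch that runs the same steps one at a time. The layout of df1 is an assumption reconstructed from the table in the question (empty cells are padded with NA, which the original post does not show), and the intermediate names cells and pieces are only illustrative.

```r
# Assumed reconstruction of the question's table; NA marks the empty cells.
df1 <- data.frame(
  Column1 = c("Q9Y6Y8", "Q9Y6W5", "Q9Y6H1", "Q5T1J5", "Q9Y6A4",
              "Q9Y623;P27695;Q9Y623", "Q9Y5W9", "Q9Y6Y8"),
  Column2 = c("P28074", "P28066", "P27695", "P25786;Q9Y623", NA, NA, NA, NA),
  Column3 = c("Q9Y6A4", "Q9Y623", "Q9Y5W9", NA, NA, NA, NA, NA),
  stringsAsFactors = FALSE
)

cells  <- unlist(df1, use.names = FALSE)  # all cells, column by column
cells  <- cells[!is.na(cells)]            # drop the NA padding
pieces <- unlist(strsplit(cells, ";"))    # one ID per element
Un1    <- unique(pieces)                  # keep the first occurrence of each ID
print(Un1)
```

Under this assumed layout, Un1 holds 11 distinct IDs. Dropping NA before splitting is a choice made for this sketch; with real data the padding might be empty strings instead.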
If we only need each unordered pair once, we can use combn:

t(combn(Un1, 2))
#        [,1]     [,2]    
# [1,] "Q9Y6Y8" "Q9Y6W5"
# [2,] "Q9Y6Y8" "Q9Y6H1"
# [3,] "Q9Y6Y8" "Q5T1J5"
# [4,] "Q9Y6Y8" "Q9Y6A4"
# [5,] "Q9Y6Y8" "Q9Y623"
# [6,] "Q9Y6Y8" "P27695"
# [7,] "Q9Y6Y8" "Q9Y5W9"
# [8,] "Q9Y6Y8" "P28074"
# [9,] "Q9Y6Y8" "P28066"
#[10,] "Q9Y6Y8" "P25786"
#[11,] "Q9Y6W5" "Q9Y6H1"
#[12,] "Q9Y6W5" "Q5T1J5"
#[13,] "Q9Y6W5" "Q9Y6A4"
#[14,] "Q9Y6W5" "Q9Y623"
#[15,] "Q9Y6W5" "P27695"
#[16,] "Q9Y6W5" "Q9Y5W9"
#[17,] "Q9Y6W5" "P28074"
#[18,] "Q9Y6W5" "P28066"
#[19,] "Q9Y6W5" "P25786"
#[20,] "Q9Y6H1" "Q5T1J5"
#[21,] "Q9Y6H1" "Q9Y6A4"
#[22,] "Q9Y6H1" "Q9Y623"
#[23,] "Q9Y6H1" "P27695"
#[24,] "Q9Y6H1" "Q9Y5W9"
#[25,] "Q9Y6H1" "P28074"
#[26,] "Q9Y6H1" "P28066"
#[27,] "Q9Y6H1" "P25786"
#[28,] "Q5T1J5" "Q9Y6A4"
#[29,] "Q5T1J5" "Q9Y623"
#[30,] "Q5T1J5" "P27695"
#[31,] "Q5T1J5" "Q9Y5W9"
#[32,] "Q5T1J5" "P28074"
#[33,] "Q5T1J5" "P28066"
#[34,] "Q5T1J5" "P25786"
#[35,] "Q9Y6A4" "Q9Y623"
#[36,] "Q9Y6A4" "P27695"
#[37,] "Q9Y6A4" "Q9Y5W9"
#[38,] "Q9Y6A4" "P28074"
#[39,] "Q9Y6A4" "P28066"
#[40,] "Q9Y6A4" "P25786"
#[41,] "Q9Y623" "P27695"
#[42,] "Q9Y623" "Q9Y5W9"
#[43,] "Q9Y623" "P28074"
#[44,] "Q9Y623" "P28066"
#[45,] "Q9Y623" "P25786"
#[46,] "P27695" "Q9Y5W9"
#[47,] "P27695" "P28074"
#[48,] "P27695" "P28066"
#[49,] "P27695" "P25786"
#[50,] "Q9Y5W9" "P28074"
#[51,] "Q9Y5W9" "P28066"
#[52,] "Q9Y5W9" "P25786"
#[53,] "P28074" "P28066"
#[54,] "P28074" "P25786"
#[55,] "P28066" "P25786"
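
The question asks for the single IDs followed by the pairs written as space-separated strings. One possible way to get that shape, sketched here as an assumption rather than as part of the original answer, is to pass paste to combn through its FUN argument. Un1 is written out by hand so the block runs on its own; pairs and result are illustrative names.

```r
# The 11 unique IDs, in the order they appear in the combn output above.
Un1 <- c("Q9Y6Y8", "Q9Y6W5", "Q9Y6H1", "Q5T1J5", "Q9Y6A4", "Q9Y623",
         "P27695", "Q9Y5W9", "P28074", "P28066", "P25786")

# combn() applies FUN to each pair; paste() joins the two IDs with a space.
pairs <- combn(Un1, 2, FUN = paste, collapse = " ")

# Singles first, then every pair, mirroring the layout in the question.
result <- c(Un1, pairs)

# Sanity check: the number of pairs should equal choose(n, 2).
print(length(pairs) == choose(length(Un1), 2))
print(head(pairs, 3))
```

Because combn treats ("A", "B") and ("B", "A") as the same combination, each pair appears only once. If ordered pairs were needed, a different approach such as expand.grid would be required.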

Note: here I am assuming that all columns are of class character.
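
If the columns are factors instead, strsplit rejects the input, and converting with as.character first is one way around it. The sketch below uses a small hypothetical data frame, df_f, which is not the asker's data, to show both the failure and the conversion.

```r
# Hypothetical two-column data frame whose columns are factors.
df_f <- data.frame(
  Column1 = c("Q9Y6Y8", "P25786;Q9Y623"),
  Column2 = c("P28074", "Q9Y6Y8"),
  stringsAsFactors = TRUE
)

# strsplit() only accepts character input, so the factor version errors out.
failed <- inherits(try(strsplit(unlist(df_f), ";"), silent = TRUE), "try-error")

# Converting to character first makes the same pipeline work again.
Un_f <- unique(unlist(strsplit(as.character(unlist(df_f)), ";")))
print(failed)
print(Un_f)
```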

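@nik Your columns are factors, so use strsplit(as.character(unlist(df1)), ";")
I like your answer, but I have to wait 2 minutes before I can accept it.
@nik Added some description.
Yes, I just checked the combn function; it is the greatest function ever.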