R: creating a similarity matrix

I have a matrix that looks like this:
col_1 col_2 value
A B 2.1
A C 1.3
B C 4.6
A D 1.4
....
I would like to get a similarity matrix:
    A    B    C    D
A   X    2.1  1.3  1.4
B   2.1  X    4.6  ...
C   ...  ...  X    ...
D   ...  ...  ...  X
So the row and column names are A, B, C, D, and the values are taken from the third column and placed in the matrix.
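A further complication is that the original matrix is about 10000 rows long.

You can do it as follows. I wrote the code in Python because no language was specified.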
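import numpy as np
import pandas as pd

# The data is assumed to be in a pandas DataFrame called df.
# The rows below are the sample from the question; replace them with your own data.
df = pd.DataFrame({'col_1': ['A', 'A', 'B', 'A'],
                   'col_2': ['B', 'C', 'C', 'D'],
                   'value': [2.1, 1.3, 4.6, 1.4]})
list_of_labels = ['A', 'B', 'C', 'D']
nb_labels = len(list_of_labels)
similarity = np.zeros((nb_labels, nb_labels))
for l1, l2, val in zip(df['col_1'], df['col_2'], df['value']):
    i = list_of_labels.index(l1)
    j = list_of_labels.index(l2)
    similarity[i, j] = val
    similarity[j, i] = val  # mirror the value so the matrix is symmetric
similarity_df = pd.DataFrame(data=similarity, index=list_of_labels, columns=list_of_labels)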
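
Since the question turned out to be about R, the same indexing idea can be sketched in base R without any extra packages. This is a minimal sketch rather than a definitive solution: the names labels, sim and idx are made up for illustration, the data frame holds only the four sample rows from the question, and the code assumes each pair of labels occurs at most once. Pairs without a value, and the diagonal, are left as NA.

```r
# Sample rows from the question; replace with the real ~10000-row data.
df <- data.frame(
  col_1 = c("A", "A", "B", "A"),
  col_2 = c("B", "C", "C", "D"),
  value = c(2.1, 1.3, 4.6, 1.4),
  stringsAsFactors = FALSE
)

# Collect every label that occurs in either column.
labels <- sort(unique(c(as.character(df$col_1), as.character(df$col_2))))

# Start from an all-NA square matrix; NA marks pairs with no value.
sim <- matrix(NA_real_, nrow = length(labels), ncol = length(labels),
              dimnames = list(labels, labels))

# Two-column index matrix: one (row, column) position per input row.
idx <- cbind(match(df$col_1, labels), match(df$col_2, labels))

sim[idx] <- df$value                            # entries as given (A-B, A-C, ...)
sim[idx[, c(2, 1), drop = FALSE]] <- df$value   # mirrored entries (B-A, C-A, ...)

sim
```

Indexing a matrix with a two-column matrix assigns all values in one vectorized step, so there is no explicit loop over the rows.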

As Roland suggested, you can use dcast(), where df is:

df <- data.frame(
  col_1 = c("A", "A", "B", "A"),
  col_2 = c("B", "C", "C", "D"),
  value = c(2.1, 1.3, 4.6, 1.4)
)
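
A minimal sketch of what such a dcast() call could look like, assuming the reshape2 package. The formula col_1 ~ col_2 and the value.var argument are my assumptions about a typical call, not necessarily the answerer's exact code, and wide is a hypothetical name.

```r
library(reshape2)  # assumed package providing dcast()

df <- data.frame(
  col_1 = c("A", "A", "B", "A"),
  col_2 = c("B", "C", "C", "D"),
  value = c(2.1, 1.3, 4.6, 1.4)
)

# One row per col_1 label, one column per col_2 label,
# with the cells filled from the value column.
wide <- dcast(df, col_1 ~ col_2, value.var = "value")
wide
```

With the sample data the result has rows A and B only and columns B, C and D, because dcast() creates rows and columns only for labels that actually occur in col_1 and col_2 respectively. Getting the full square, symmetric matrix therefore needs an additional step.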
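
With xtabs and mutate_at; sparse = TRUE makes xtabs return its output as a sparseMatrix:

library(dplyr)
library(Matrix)  # sparse matrix classes and methods (t(), +)
# Give both columns the same factor levels so the table is square.
lev <- sort(unique(c(as.character(df$col_1), as.character(df$col_2))))
mat <- df %>%
  mutate_at(1:2, factor, levels = lev) %>%
  xtabs(value ~ col_1 + col_2, data = ., sparse = TRUE)
# Mirror the entries; assumes each pair occurs once, in one orientation only.
mat <- mat + t(mat)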

Which programming language are you using?
R. Thanks for asking.
library(reshape2); help("dcast") or google "r reshape wide format".
Hi, sorry, I forgot to mention that I am using R! But thanks for your reply! Do you think this can be done in R as well? Cheers
Thank you very much.