Convert the least frequent strings in a matrix to the string with the next higher frequency in R


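I have the following matrix with different strings in each column. The minimum number of distinct strings in a column is 2; some columns have up to 20 distinct strings, others only two. I want to convert the strings that have a frequency of 1 or 2 (<= 2). That is, in a column where a string occurs only once or twice, it should be converted to the string with the next higher frequency in the same column. If two strings tie for that next higher frequency (> 2), just pick either of them; it doesn't matter which. All other columns should stay as they are. Every column always contains at least one string with a frequency greater than 2.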

My example matrix is:

n.mat <- structure(c("M", "M", "M", "M", "M", "M", "Y", "Y", "M", "M", 
"Y", "Y", "F", "F", "F", "F", "M", "M", "X", "Y", "Y", "F", "F", 
"F", "A", "A", "A", "A", "A", "A", "A", "B", "A", "A", "A", "A", 
"A", "B", "B", "B", "C", "D", "D", "D", "E", "E", "E", "G"), .Dim = c(8L, 
6L), .Dimnames = list(c("r1", "r2", "r3", "r4", "r5", "r6", "r7", 
"r8"), NULL))

  [,1] [,2] [,3] [,4] [,5] [,6]
r1 "M"  "M"  "M"  "A"  "A"  "C" 
r2 "M"  "M"  "M"  "A"  "A"  "D" 
r3 "M"  "Y"  "X"  "A"  "A"  "D" 
r4 "M"  "Y"  "Y"  "A"  "A"  "D" 
r5 "M"  "F"  "Y"  "A"  "A"  "E" 
r6 "M"  "F"  "F"  "A"  "B"  "E" 
r7 "Y"  "F"  "F"  "A"  "B"  "E" 
r8 "Y"  "F"  "F"  "B"  "B"  "G" 
Try this:

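apply(n.mat, 2, function(x) {
  # Count each string in the column, most frequent first
  tx <- sort(table(x), decreasing = TRUE)
  # Strings occurring <= 2 times become the least frequent string that occurs > 2 times
  x[x %in% names(tx[tx <= 2])] <- names(rev(tx[names(tx[tx > 2])])[1])
  x
})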
#   [,1] [,2] [,3] [,4] [,5] [,6]
# r1 "M"  "F"  "F"  "A"  "A"  "E" 
# r2 "M"  "F"  "F"  "A"  "A"  "D" 
# r3 "M"  "F"  "F"  "A"  "A"  "D" 
# r4 "M"  "F"  "F"  "A"  "A"  "D" 
# r5 "M"  "F"  "F"  "A"  "A"  "E" 
# r6 "M"  "F"  "F"  "A"  "B"  "E" 
# r7 "M"  "F"  "F"  "A"  "B"  "E" 
# r8 "M"  "F"  "F"  "A"  "B"  "E" 
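
To make the one-liner above easier to follow, here is a rough step-by-step sketch of the same idea applied to the last column only. The intermediate names (rare, common, target) are introduced here for illustration and are not part of the original answer; tail() is used in place of rev(...)[1], which should select the same element.

```r
# Rebuild the example matrix so the snippet runs on its own
n.mat <- structure(c("M", "M", "M", "M", "M", "M", "Y", "Y", "M", "M",
"Y", "Y", "F", "F", "F", "F", "M", "M", "X", "Y", "Y", "F", "F",
"F", "A", "A", "A", "A", "A", "A", "A", "B", "A", "A", "A", "A",
"A", "B", "B", "B", "C", "D", "D", "D", "E", "E", "E", "G"), .Dim = c(8L,
6L), .Dimnames = list(c("r1", "r2", "r3", "r4", "r5", "r6", "r7",
"r8"), NULL))

x  <- n.mat[, 6]                         # last column: C D D D E E E G
tx <- sort(table(x), decreasing = TRUE)  # counts per string, most frequent first

rare   <- names(tx[tx <= 2])             # strings occurring once or twice (C and G)
common <- names(tx[tx > 2])              # strings occurring more than twice (D and E)
target <- tail(common, 1)                # least frequent of the common strings

x[x %in% rare] <- target                 # overwrite the rare strings
x
```

Because "D" and "E" both occur three times here, either one is an acceptable target under the rule in the question; which one is picked depends on how sort() orders the tie.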
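Hi, thanks a lot. This is almost what I want, except that it converts the strings to the most frequent string, whereas I want them converted to the string with the next higher frequency. Do you see what I mean?

@Luker354 I see. But how would you distinguish between, e.g., "D" and "E" in the last column, i.e. when the highest and second-highest frequencies are equal?

@Luker354 I've updated the code. Does that work for you?

Great, that's exactly what I wanted. Thank you very much!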
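
The difference discussed in the comments (most frequent string versus next higher frequency) does not show up in the example matrix, so here is a small sketch on a made-up column. The vector x and the names most_frequent and next_higher are hypothetical and chosen only for illustration; the cutoff of 2 is taken from the question.

```r
# Hypothetical column (not from the question): A occurs 5 times, B 3 times, C once
x  <- c("A", "A", "A", "A", "A", "B", "B", "B", "C")
tx <- sort(table(x), decreasing = TRUE)      # A = 5, B = 3, C = 1

most_frequent <- names(tx)[1]                # what a "highest frequency" rule would pick
next_higher   <- tail(names(tx[tx > 2]), 1)  # least frequent string above the cutoff

# Convert the rare strings (frequency <= 2) to the next higher frequency string
x[x %in% names(tx[tx <= 2])] <- next_higher
x
```

Here the single "C" should end up as "B" rather than "A", which is the behaviour asked for in the comments.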