R: most frequent value (mode) by group


I'm trying to find the most frequent value by group. In the following example data frame:

df<-data.frame(a=c(1,1,1,1,2,2,2,3,3),b=c(2,2,1,2,3,3,1,1,2))  
> df  
  a b  
1 1 2  
2 1 2  
3 1 1  
4 1 2  
5 2 3  
6 2 3  
7 2 1  
8 3 1  
9 3 2

I tried using table and tapply but couldn't get it right. Is there a quick way to do this?

Thanks

We can use ave with a custom Mode function:

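Mode <- function(x) {
  # distinct values, in order of first appearance
  ux <- unique(x)
  # count occurrences of each distinct value and return the most frequent one
  ux[which.max(tabulate(match(x, ux)))]
}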

df$c <-  with(df, ave(b, a, FUN=Mode))
df$c
#[1] 2 2 2 2 3 3 3 1 1
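
To see why Mode picks the most frequent value, it can help to trace its intermediate steps on a single group. The sketch below uses the b values of group a = 1 from the question; the variable names x, ux, idx and counts are only for illustration.

```r
x <- c(2, 2, 1, 2)        # b values where a == 1
ux <- unique(x)           # distinct values in order of first appearance: 2 1
idx <- match(x, ux)       # position of each element within ux: 1 1 2 1
counts <- tabulate(idx)   # occurrences of each distinct value: 3 1
ux[which.max(counts)]     # value with the highest count: 2
```

Because which.max returns the position of the first maximum, a tie resolves to whichever tied value appears first in the group.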

Based on David's comment, your solution would be as follows:

Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

library(dplyr)
df %>% group_by(a) %>% mutate(c=Mode(b))
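
Both approaches above attach the group mode to every row. If one value per group is enough, tapply (which the question mentions) can apply the same Mode function directly. This is only a sketch in base R; the name mode_by_group is an assumption made for illustration.

```r
df <- data.frame(a = c(1, 1, 1, 1, 2, 2, 2, 3, 3),
                 b = c(2, 2, 1, 2, 3, 3, 1, 1, 2))

Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# apply Mode to b within each level of a; the result is named by group
mode_by_group <- tapply(df$b, df$a, Mode)
mode_by_group
# groups 1, 2, 3 get modes 2, 3, 1
```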


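Note that group a = 3 has a tie: b takes the values 1 and 2 once each. Mode returns the value that appears first in the group, here 1.

Here is a base R method that uses table to compute a cross tab, max.col to find the mode of each group, and rep together with rle to fill in the mode across each group's rows.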

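# calculate a cross tab, frequencies by group
myTab <- table(df$a, df$b)
# colnames(myTab)[max.col(myTab)] gives the mode of each group (as a character string);
# max.col() breaks ties at random by default, so a tied group such as a = 3 can get either value
# repeat each group's mode by the number of times the group ID is observed
# (rle() counts consecutive runs, so df must be sorted by a)
df$c <- rep(colnames(myTab)[max.col(myTab)], rle(df$a)$lengths)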

df
  a b c
1 1 2 2
2 1 2 2
3 1 1 2
4 1 2 2
5 2 3 3
6 2 3 3
7 2 1 3
8 3 1 2
9 3 2 2
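
The result above shows 2 for group a = 3, whereas the Mode function gives 1, because max.col breaks ties at random unless told otherwise. A possible variant, sketched under the assumption that taking the first tied column (the smallest b value) is acceptable, sets ties.method = "first" and looks up each row's group by name instead of relying on rle, so the rows need not be sorted. The names group_mode and c2 are illustrative only.

```r
df <- data.frame(a = c(1, 1, 1, 1, 2, 2, 2, 3, 3),
                 b = c(2, 2, 1, 2, 3, 3, 1, 1, 2))

# cross tab of group (rows) by value (columns)
myTab <- table(df$a, df$b)

# mode of each group; ties go to the first column, so the result is deterministic
group_mode <- colnames(myTab)[max.col(myTab, ties.method = "first")]

# look up each row's group by name and convert the labels back to numbers
df$c2 <- as.numeric(group_mode[match(as.character(df$a), rownames(myTab))])
df$c2
# 2 2 2 2 3 3 3 1 1
```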

This is closely related; thanks @akrun! To exclude NAs from the Mode function, change the second line to “ux”
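
The replacement line is cut off in the text above, so the following is only one plausible way to ignore NAs, not necessarily what was originally suggested. It is a minimal sketch that drops missing values before counting; the function name Mode_na_rm is hypothetical.

```r
# Assumption: NA handling is done by filtering x up front
Mode_na_rm <- function(x) {
  x <- x[!is.na(x)]                       # drop missing values first
  ux <- unique(x)                         # distinct non-missing values
  ux[which.max(tabulate(match(x, ux)))]   # most frequent remaining value
}

Mode_na_rm(c(NA, NA, 1, 2, 2))
# 2
```

If every value in a group is NA, this version returns NA.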