Create a rank column in a data frame that combines two other columns in R
I want to add a rank column to my data frame. My data frame looks like this:
df <- data.frame(category = rep(c('c1','c2','c3'), each =3),
id = seq(1:9),
count = c(10,10,10,9,8,8,7,6,4))
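I want the rank to cycle through the categories first and then follow the counts, from high to low. Here is one way to do it, but it needs two sorts: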
library(data.table)
df <- data.table(category = rep(c('c1','c2','c3'), each =3),
id = seq(1:9),
count = c(10,10,10,9,8,8,7,6,4))
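setorder(df, category, -count)           # first sort: by category, then count descending
df[, r1 := seq_len(.N), by = category]   # position of each row within its category
setorder(df, r1)                         # second sort: by within-category position
df[, rank := rev(seq_len(.N))]           # overall rank, counting down from the number of rows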
We can try:
library(data.table)
setDT(df)[order(-count), N:=1:.N, by = category]
df[order(N)][, rank:=.N:1][, N:= NULL][]
Here is a possible implementation that combines rank and order:
library(data.table)
indx <- setDT(df)[, frank(-count, ties.method = "first"), by = category]$V1
df[order(indx)][, Rank := .N:1][]
# category id count Rank
# 1: c1 1 10 9
# 2: c2 4 9 8
# 3: c3 7 7 7
# 4: c1 2 10 6
# 5: c2 5 8 5
# 6: c3 8 6 4
# 7: c1 3 10 3
# 8: c2 6 8 2
# 9: c3 9 4 1
I would add that, ideally, the solution should be relatively fast, since it has to transform millions of records.
library(dplyr)
df %>%
group_by(category) %>%
arrange(desc(count)) %>%
mutate(n = row_number()) %>%
arrange(n) %>%
ungroup() %>%
mutate(rank = rev(row_number()))
# category id count n rank
# (fctr) (int) (dbl) (int) (int)
# 1 c1 1 10 1 9
# 2 c1 2 10 2 8
# 3 c1 3 10 3 7
# 4 c2 4 9 1 6
# 5 c2 5 8 2 5
# 6 c2 6 8 3 4
# 7 c3 7 7 1 3
# 8 c3 8 6 2 2
# 9 c3 9 4 3 1
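For comparison, the same idea (number the rows within each category by descending count, then count down across that ordering) can be sketched in base R with no extra packages. This is only a minimal sketch on the sample data from the question, not a benchmarked solution. The names `pos`, `ord` and `n` are introduced here for illustration, and ties are broken by original row order, as `ties.method = "first"` does in the data.table answer.

```r
df <- data.frame(category = rep(c('c1', 'c2', 'c3'), each = 3),
                 id = 1:9,
                 count = c(10, 10, 10, 9, 8, 8, 7, 6, 4))

# Position of each row within its category, highest count first;
# ties keep their original row order.
pos <- ave(-df$count, df$category,
           FUN = function(x) rank(x, ties.method = "first"))

# Row order for the final ranking: all first-placed rows, then all
# second-placed rows, and so on, each block ordered by category.
ord <- order(pos, df$category)

# Count down from the number of rows along that ordering.
n <- nrow(df)
df$rank <- integer(n)
df$rank[ord] <- rev(seq_len(n))

df[ord, ]
```

On the sample data, `df[ord, ]` lists the ids as 1, 4, 7, 2, 5, 8, 3, 6, 9 with ranks 9 down to 1, matching the data.table output in the earlier answer. Whether this holds up on millions of rows would need to be timed against the data.table versions.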