
R: pick one month of the year to rank, then assign the resulting rank to the rest of that year


Sample data:

df1 <- data.frame(id=c("A","A","A","A","B","B","B","B"),
                        year=c(2014,2014,2015,2015),
                        month=c(1,2),
                        new.employee=c(4,6,2,6,23,2,5,34))

  id year month new.employee
1  A 2014     1            4
2  A 2014     2            6
3  A 2015     1            2
4  A 2015     2            6
5  B 2014     1           23
6  B 2014     2            2
7  B 2015     1            5
8  B 2015     2           34

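Here is a dplyr-based solution. The idea is to reduce the data to the part you want to compare, do the comparison, and then join the result back onto the original data set, where it expands to fill every matching row. Note the edit to the code that creates the sample data: year and month are now built with rep() so that every column has eight values.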

df1 <- data.frame(id=c("A","A","A","A","B","B","B","B"),
                        year=rep(c(2014,2014,2015,2015), 2),
                        month=rep(c(1,2), 4),
                        new.employee=c(4,6,2,6,23,2,5,34))

library(dplyr)

df1 %>%
  # Reduce the data to the slices (months) you want to compare
  filter(month==2) %>%
  # Group the data by year, so the comparisons are within and not across years
  group_by(year) %>%
  # Create a variable that indicates the rankings within years in descending order
  mutate(rank = rank(-new.employee)) %>%
  # To prepare for merging, reduce the new data to just that ranking var plus id and year
  select(id, year, rank) %>%
  # Use left_join to merge the new data (.) with the original df, expanding the
  # new data to fill all rows with id-year matches
  left_join(df1, .) %>%
  # Order the data by id, year, and month to make it easier to review
  arrange(id, year, month)
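
The same reduce, rank, and join-back idea can also be sketched in base R, with no packages loaded. This is only an illustration of the steps described above, not part of the original answer; the names feb and out are made up for the example, and ave() and merge() stand in for group_by()/mutate() and left_join().

```r
df1 <- data.frame(id = c("A","A","A","A","B","B","B","B"),
                  year = rep(c(2014, 2014, 2015, 2015), 2),
                  month = rep(c(1, 2), 4),
                  new.employee = c(4, 6, 2, 6, 23, 2, 5, 34))

# Keep only the reference month (month 2)
feb <- df1[df1$month == 2, c("id", "year", "new.employee")]

# Rank within each year, largest value first; ave() applies rank() per group
feb$rank <- ave(-feb$new.employee, feb$year, FUN = rank)

# Join the ranks back onto every row that shares the same id and year
out <- merge(df1, feb[, c("id", "year", "rank")],
             by = c("id", "year"), all.x = TRUE)

# Order by id, year, and month to make the result easier to review
out <- out[order(out$id, out$year, out$month), ]
print(out)
```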

Since you have already tried a data.table solution, here is how to do it with data.table:

library(data.table) # V1.9.6+
temp <- setDT(df1)[month == 2L, .(id, frank(-new.employee)), by = year]
df1[temp, new.employee.rank := i.V2, on = c("year", "id")]
df1
#    id year month new.employee new.employee.rank
# 1:  A 2014     1            4                 1
# 2:  A 2014     2            6                 1
# 3:  A 2015     1            2                 2
# 4:  A 2015     2            6                 2
# 5:  B 2014     1           23                 2
# 6:  B 2014     2            2                 2
# 7:  B 2015     1            5                 1
# 8:  B 2015     2           34                 1
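
Three pieces of syntax in the snippet above tend to puzzle newcomers: the L in 2L, the .( ) wrapper, and the i. prefix. The variant below is a sketch that annotates each one. It names the rank column rnk (a name chosen here for illustration) instead of relying on the auto-generated V2, and it works on a copy called dt rather than converting df1 in place.

```r
library(data.table)

df1 <- data.frame(id = c("A","A","A","A","B","B","B","B"),
                  year = rep(c(2014, 2014, 2015, 2015), 2),
                  month = rep(c(1, 2), 4),
                  new.employee = c(4, 6, 2, 6, 23, 2, 5, 34))
dt <- as.data.table(df1)  # a copy; df1 itself is left untouched

# 2L is an integer literal, plain 2 is a double; both compare equal to month here
stopifnot(is.integer(2L), is.double(2))

# .() is data.table shorthand for list(); naming the second element
# gives the column the name rnk instead of the default V2
temp <- dt[month == 2L, .(id, rnk = frank(-new.employee)), by = year]

# In a join dt[temp, ...], the prefix i. refers to a column of the table
# passed in the i position (temp), so i.rnk is temp's rnk column
dt[temp, new.employee.rank := i.rnk, on = c("year", "id")]
print(dt)
```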

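Nice solution. Could you briefly explain why we put "L" after 2 and "i" in front of V2, and what the last format (id…) is? Sorry, I am a newbie. Thanks a lot.

@David, could you help me turn this into a custom function? I am trying to rank many items. I think it would look something like function(df, item_to_rank). I tried, but I cannot even return the "temp" object. Could you help me nest the working commands? Thanks.

When you can describe exactly what you need, consider posting a new question that links to this one. I can't help you based on comments alone.

I think I helped you on the new question, but I have not received any feedback from you.

Dear David, I am currently reinstalling Windows, so I have not had a chance to look at the new post. I don't know why I got no notification email for it. Thanks a lot, I will give feedback as soon as I can.

Dear ulfelder, it works on my sample, but I cannot translate it to my real case because I am a newbie and do not fully understand your syntax. I really appreciate it, though, and will come back to your suggestion another day. For now I will go with the data.table approach first.

If it helps, I will add some comments explaining what each step does.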
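
The function(df, item_to_rank) idea raised in the comments could look roughly like the sketch below. It is a hypothetical illustration, not the answer given in the follow-up question: the function name rank_by_month, the ref_month argument, and the ".rank" suffix are all assumptions, and the input is assumed to have id, year, and month columns as in the sample data.

```r
library(data.table)

# Hypothetical helper: rank `item_to_rank` within each year using only the
# rows of `ref_month`, then copy that rank to every row of the same id and year
rank_by_month <- function(df, item_to_rank, ref_month = 2) {
  dt <- as.data.table(df)                     # copy, so the caller's data is not modified
  new_col <- paste0(item_to_rank, ".rank")    # e.g. "new.employee.rank"
  # get() looks up the column whose name is stored in item_to_rank
  temp <- dt[month == ref_month,
             .(id, rnk = frank(-get(item_to_rank))),
             by = year]
  # (new_col) in parentheses tells := to use the value of new_col as the column name
  dt[temp, (new_col) := i.rnk, on = c("year", "id")]
  dt[]
}

df1 <- data.frame(id = c("A","A","A","A","B","B","B","B"),
                  year = rep(c(2014, 2014, 2015, 2015), 2),
                  month = rep(c(1, 2), 4),
                  new.employee = c(4, 6, 2, 6, 23, 2, 5, 34))

res <- rank_by_month(df1, "new.employee")
print(res)
```

Calling the function once per column name, for example in a loop over a character vector of items, would then rank several items in turn.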

For reference, the dplyr solution above prints the following:

Joining by: c("id", "year")
  id year month new.employee rank
1  A 2014     1            4    1
2  A 2014     2            6    1
3  A 2015     1            2    2
4  A 2015     2            6    2
5  B 2014     1           23    2
6  B 2014     2            2    2
7  B 2015     1            5    1
8  B 2015     2           34    1