R: How do I add a column that ranks values?
I asked a similar question a little while ago, but then realized that my problem is actually more complex. Sorry for asking again.
df <- data.frame(
comp_name = c("A","A","B","B","A","A","B","B","C","C","D","D","C","C","D","D"),
country = c("US","US","US","US","US","US","US","US","France","France","France","France","France","France","France","France"),
year = c("2018","2018","2018","2018","2019","2019","2019","2019","2018","2018","2018","2018","2019","2019","2019","2019"),
type = c("profit", "revenue","profit", "revenue","profit", "revenue","profit", "revenue","profit", "revenue","profit", "revenue","profit", "revenue","profit", "revenue"),
value = c(10,20,30,40,20,30,40,50,140,150,120,130,100,110,80,90)
)
I would like to add a column as shown below:
   comp_name country year    type value rank
1          A      US 2018  profit    10
2          A      US 2018 revenue    20
3          B      US 2018  profit    30
4          B      US 2018 revenue    40
5          A      US 2019  profit    20    2
6          A      US 2019 revenue    30
7          B      US 2019  profit    40    1
8          B      US 2019 revenue    50
9          C  France 2018  profit   140
10         C  France 2018 revenue   150
11         D  France 2018  profit   120
12         D  France 2018 revenue   130
13         C  France 2019  profit   100    1
14         C  France 2019 revenue   110
15         D  France 2019  profit    80    2
16         D  France 2019 revenue    90
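I want to rank the companies by profit within each country, considering only the 2019 profit rows.
When I asked this question before, @KarthikS provided the following solution: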
library(dplyr)
df %>% group_by(country) %>% mutate(rank = rank(desc(value)))
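However, I have now added more variables (year and type) that I also want to take into account.
Please let me know if the question is unclear. I am new to R, and any help would be greatly appreciated. Thank you all!

Compute the rank within every country, year, and type, then remove the values you don't need. (Or keep them.)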
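library(dplyr)
df %>%
  # rank within each country/year/type combination, largest value first
  group_by(country, year, type) %>%
  mutate(rank = rank(desc(value))) %>%
  ungroup() %>%
  # keep the rank only for 2019 profit rows (year is stored as character)
  mutate(rank = if_else(year == "2019" & type == "profit", rank, NA_real_))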
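# # A tibble: 16 x 6
#    comp_name country year  type    value  rank
#  1 A         US      2018  profit     10    NA
#  2 A         US      2018  revenue    20    NA
#  3 B         US      2018  profit     30    NA
#  4 B         US      2018  revenue    40    NA
#  5 A         US      2019  profit     20     2
#  6 A         US      2019  revenue    30    NA
#  7 B         US      2019  profit     40     1
#  8 B         US      2019  revenue    50    NA
#  9 C         France  2018  profit    140    NA
# 10 C         France  2018  revenue   150    NA
# 11 D         France  2018  profit    120    NA
# 12 D         France  2018  revenue   130    NA
# 13 C         France  2019  profit    100     1
# 14 C         France  2019  revenue   110    NA
# 15 D         France  2019  profit     80     2
# 16 D         France  2019  revenue    90    NA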
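If you only want the 2019 data, then filter(year == 2019) first.

By the way, a note on the code/data you provided: your sample data has an extra comma, and the code from your previous answer is missing a parenthesis. Both are easy to fix, but it makes me wonder whether we are missing something else in this question.

Thanks @r2evans. I think I will keep that as a last resort (filtering on year and type). However, the actual data set I have contains millions of rows, covering many companies and countries over 10 years. Ideally, I would like to just add a column to this data set rather than create a new one with a filter.

@r2evans Thanks for pointing that out, and sorry about that. I have fixed it!

This works. Thank you so much for your help, I really appreciate it!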
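On the point raised in the comments about adding the column in place without filtering the data set: the following is only a minimal base R sketch, not something proposed in the thread. It assumes the column names from the sample data and that `year` is stored as character. The helper vector `keep` is a name introduced here for illustration. Ranks are computed only for the 2019 profit rows, so nothing is ranked and then thrown away.

```r
# Sample data from the question (year is stored as character).
df <- data.frame(
  comp_name = c("A","A","B","B","A","A","B","B","C","C","D","D","C","C","D","D"),
  country   = rep(c("US", "France"), each = 8),
  year      = rep(rep(c("2018", "2019"), each = 4), times = 2),
  type      = rep(c("profit", "revenue"), times = 8),
  value     = c(10,20,30,40,20,30,40,50,140,150,120,130,100,110,80,90),
  stringsAsFactors = FALSE
)

# Rows that should receive a rank: 2019 profit only.
# `keep` is a hypothetical helper name, not from the original thread.
keep <- df$year == "2019" & df$type == "profit"

# Start with NA everywhere, then fill in ranks for the selected rows.
# ave() applies rank() within each country; negating the value makes
# the largest profit receive rank 1.
df$rank <- NA_real_
df$rank[keep] <- ave(-df$value[keep], df$country[keep], FUN = rank)

print(df[keep, ])
```

Whether this is faster than the grouped dplyr pipeline on millions of rows would need to be measured on the real data.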