R: How do I make all values in one column the same based on a single occurrence of a value in another column?
I have a data frame with species names and a grade from A to E assigned to each species. Sometimes the same species appears with different grades, but I want the following: if a species occurs even once with grade X, then all other occurrences of that species must also be grade X. This is my data frame:
species | grade |
-----------------------------------
Tilapia guineensis | B |
Tilapia guineensis | E |
Tilapia zillii | A |
Fundulus rubrifrons | A |
Eutrigla gurnardus | D |
Sprattus sprattus | A |
Gadus morhua | E |
Gadus morhua | B |
Tilapia zillii | C |
Gadus morhua | B |
Eutrigla gurnardus | C |
So far I have tried the following, using grade E as an example:
df <- df %>%
  left_join(df %>%
              group_by(species) %>%
              summarize(sum_e = sum(grade == 'E')),
            by = 'species') %>%
  mutate(grade = ifelse(sum_e > 0, "E", grade))
The output I want is basically this:
species | grade |
-----------------------------------
Tilapia guineensis | E |
Tilapia guineensis | E |
Tilapia zillii | C |
Fundulus rubrifrons | A |
Eutrigla gurnardus | D |
Sprattus sprattus | A |
Gadus morhua | E |
Gadus morhua | E |
Tilapia zillii | C |
Gadus morhua | B |
Eutrigla gurnardus | D |
Here is how I would do this with the data.table package. I think the steps would be similar if you switched to dplyr, just written differently.
# solution using data.table package
library(data.table)
# fake data, replace with yours
df <- data.frame(species=c("a", "a", "b", "b"),
grade=c("A", "E", "B", "C"))
# select your grade
dominant_grade <- "E"
# convert to data.table
dt <- as.data.table(df)
# search over species, add a column that checks if any of the grades is equal
# to the dominant one
dt[, contains_dominant := any(grade == dominant_grade), by=species]
# For cases where the dominant one is present, set all the grades to the dominant
# one
dt[contains_dominant == TRUE, grade := dominant_grade]
# convert back to data frame and trim for output
out <- setDF(dt[, .(species, grade)])
out
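For comparison, the same steps could be written with dplyr roughly as follows. This is only a sketch, assuming dplyr is installed and that grade is stored as character rather than factor; the toy data and the name dominant_grade are carried over from the data.table example above.

```r
# dplyr sketch of the same idea (assumes grade is a character column)
library(dplyr)

# fake data, replace with yours
df <- data.frame(species = c("a", "a", "b", "b"),
                 grade   = c("A", "E", "B", "C"),
                 stringsAsFactors = FALSE)

# select your grade
dominant_grade <- "E"

out <- df %>%
  group_by(species) %>%
  # if any row of this species has the dominant grade, overwrite the whole
  # group with it; otherwise leave the grades of the group untouched
  mutate(grade = if (any(grade == dominant_grade)) dominant_grade else grade) %>%
  ungroup() %>%
  as.data.frame()

out
```

Unlike the left_join attempt in the question, this does not leave a helper column such as sum_e behind, because the check happens inside the grouped mutate.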
Thanks, I think this works! But then, how do I update this code: ...
Oh, right, you can use out ...
Why does the grade for Tilapia zillii change from A, C to C, C?
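One possible reading of the desired output in the question is that, within each species, the grade latest in the alphabet wins (A, C becomes C, C and D, C becomes D, D). That rule is an assumption on my part, not something the question states. Under that assumption, a minimal data.table sketch could look like this, using a few rows taken from the question's data:

```r
# sketch: per species, set every grade to the alphabetically last one
# (the "last letter wins" rule is an assumption, not stated in the question)
library(data.table)

# a few rows from the question's data, replace with yours
df <- data.frame(species = c("Tilapia zillii", "Tilapia zillii",
                             "Eutrigla gurnardus", "Eutrigla gurnardus",
                             "Sprattus sprattus"),
                 grade   = c("A", "C", "D", "C", "A"),
                 stringsAsFactors = FALSE)

dt <- as.data.table(df)

# max() on a character vector compares strings, so "C" > "A" and "D" > "C"
dt[, grade := max(grade), by = species]

# convert back to a plain data frame
out <- as.data.frame(dt)
out
```

Note that this rule would set all three Gadus morhua rows to E, whereas the desired output posted in the question still shows one of them as B, so check that it matches what you actually intend.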