R: How can I make all values in one column the same, based on a single occurrence of a value in another column? [r, dplyr]


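I have a data frame with species names and a grade from A to E assigned to each species. The same species sometimes occurs with different grades, but I want the following: if a species occurs even once with grade X, then all other occurrences of that species must also be grade X. This is my data frame: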

     species        |    grade      | 
-----------------------------------
Tilapia guineensis  | B |
Tilapia guineensis  | E |
Tilapia zillii      | A |
Fundulus rubrifrons | A |
Eutrigla gurnardus  | D |
Sprattus sprattus   | A |
Gadus morhua        | E |
Gadus morhua        | B |
Tilapia zillii      | C |
Gadus morhua        | B | 
Eutrigla gurnardus  | C |
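To make the example reproducible, the table above could be entered in R roughly as follows. This is only a sketch: the name `df` matches the code used in the question, and `stringsAsFactors = FALSE` is an assumption made here so that `grade` stays a character column rather than a factor.

```r
# Sample data transcribed from the table above
df <- data.frame(
  species = c("Tilapia guineensis", "Tilapia guineensis", "Tilapia zillii",
              "Fundulus rubrifrons", "Eutrigla gurnardus", "Sprattus sprattus",
              "Gadus morhua", "Gadus morhua", "Tilapia zillii",
              "Gadus morhua", "Eutrigla gurnardus"),
  grade = c("B", "E", "A", "A", "D", "A", "E", "B", "C", "B", "C"),
  stringsAsFactors = FALSE  # assumption: keep grade as character, not factor
)
```
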
So far I have tried the following, using grade E as an example:

 df <- df %>%
   left_join(df %>%
               group_by(species) %>%
               summarize(sum_e = sum(grade == 'E')),
             by = 'species') %>%
   mutate(grade = ifelse(sum_e > 0, "E", grade))
My desired output is basically this:

     species        |    grade      | 
-----------------------------------
Tilapia guineensis  | E |
Tilapia guineensis  | E |
Tilapia zillii      | C |
Fundulus rubrifrons | A |
Eutrigla gurnardus  | D |
Sprattus sprattus   | A |
Gadus morhua        | E |
Gadus morhua        | E |
Tilapia zillii      | C |
Gadus morhua        | B | 
Eutrigla gurnardus  | D |

Here is how I would do this with the data.table package. I think the steps would be similar if you switched to dplyr, just written differently.

# solution using data.table package
library(data.table)

# fake data, replace with yours
df <- data.frame(species=c("a", "a", "b", "b"),
                 grade=c("A", "E", "B", "C"))

# select your grade
dominant_grade <- "E"
# convert to data.table
dt <- as.data.table(df)
# search over species, add a column that checks if any of the grades is equal
# to the dominant one
dt[, contains_dominant := any(grade == dominant_grade), by=species]
# For cases where the dominant one is present, set all the grades to the dominant
# one
dt[contains_dominant == TRUE, grade := dominant_grade]

# convert back to data frame and trim for output
out <- setDF(dt[, .(species, grade)])
out
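For comparison, the same steps might look like this in dplyr. This is a sketch rather than part of the original answer: the name `out_dplyr` is made up for illustration, and `grade` is assumed to be a character column (hence `stringsAsFactors = FALSE`).

```r
library(dplyr)

# fake data, same as in the data.table example above
df <- data.frame(species = c("a", "a", "b", "b"),
                 grade = c("A", "E", "B", "C"),
                 stringsAsFactors = FALSE)  # assumption: character, not factor

# select your grade
dominant_grade <- "E"

out_dplyr <- df %>%
  group_by(species) %>%
  # within each species: if any row has the dominant grade, overwrite all
  # grades of that species with it; otherwise keep the grades unchanged
  mutate(grade = if (any(grade == dominant_grade)) dominant_grade else grade) %>%
  ungroup() %>%
  as.data.frame()

out_dplyr
```

The grouped `mutate()` plays the role of `by=species` in the data.table version, so no helper column or join is needed.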
Thanks, I think this works! But next, how would I update this code: ...

Oh, right, you can use ...

Why did the grade of Tilapia zillii change from A, C to C, C?