R: changing values within a group based on a condition

I start with the following data:

df <- data.frame(Person=c("Ada","Ada","Bob","Bob","Carl","Carl"), Day=c(1,2,2,1,1,2), Fruit=c("Apple","X","Apple","X","X","Orange"))

  Person Day  Fruit
1    Ada   1  Apple
2    Ada   2      X
3    Bob   2  Apple
4    Bob   1      X
5   Carl   1      X
6   Carl   2 Orange
Any suggestions on which direction to take?

Another solution using dplyr and case_when:

library(dplyr)

# Changing datatypes to character instead of factor
df[] <- lapply(df, as.character)

# Optional, but this line will convert all columns to appropriate datatype, eg. Day will be integer
df <- readr::type_convert(df)

df %>%
  group_by(Person) %>%
  mutate(
    Contains_Apple = any(Fruit == "Apple"),
    Contains_Orange = any(Fruit == "Orange"),
    Fruit = case_when(
      # "X" becomes whichever fruit the person does not already have
      Fruit == "X" & !Contains_Apple ~ "Apple",
      Fruit == "X" & !Contains_Orange ~ "Orange",
      TRUE ~ Fruit
    )
  )

# A tibble: 6 x 5
# Groups: Person [3]
  Person   Day Fruit  Contains_Apple Contains_Orange
  <chr>  <int> <chr>  <lgl>          <lgl>          
1 Ada        1 Apple  TRUE           FALSE
2 Ada        2 Orange TRUE           FALSE
3 Bob        2 Apple  TRUE           FALSE
4 Bob        1 Orange TRUE           FALSE
5 Carl       1 Apple  FALSE          TRUE
6 Carl       2 Orange FALSE          TRUE
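
To make the per-group flags easier to inspect, here is a minimal base R sketch (not part of the original answer) that computes the same "does this person already have an Apple / an Orange" information with tapply(). It assumes the example data from the question; the names has_apple and has_orange are chosen for illustration only.

```r
# Example data from the question, with character columns
df <- data.frame(Person = c("Ada", "Ada", "Bob", "Bob", "Carl", "Carl"),
                 Day = c(1, 2, 2, 1, 1, 2),
                 Fruit = c("Apple", "X", "Apple", "X", "X", "Orange"),
                 stringsAsFactors = FALSE)

# For each Person, check whether any of their rows is "Apple" / "Orange"
has_apple  <- tapply(df$Fruit == "Apple",  df$Person, any)
has_orange <- tapply(df$Fruit == "Orange", df$Person, any)

has_apple
has_orange
```

If a person's flags do not look as expected (for example, every flag is TRUE), that suggests the grouping is not being applied.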

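Here is an idea using case_when: check whether each group already has "Apple" or "Orange", then assign the opposite value where Fruit is "X".

Note that I added stringsAsFactors = FALSE when creating the example data frame, to avoid creating factor columns.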

library(dplyr)
library(tidyr)

df %>%
  group_by(Person) %>%
  mutate(Fruit = case_when(
    Fruit %in% "X" & any(Fruit %in% "Apple")  ~ "Orange",
    Fruit %in% "X" & any(Fruit %in% "Orange") ~ "Apple",
    TRUE                                      ~ Fruit
  )) %>%
  ungroup()   

# # A tibble: 6 x 3
#   Person   Day Fruit 
#   <chr>  <dbl> <chr> 
# 1 Ada     1.00 Apple 
# 2 Ada     2.00 Orange
# 3 Bob     2.00 Apple 
# 4 Bob     1.00 Orange
# 5 Carl    1.00 Apple 
# 6 Carl    2.00 Orange
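
To see what the case_when() call does inside a single group, the same first-match-wins logic can be sketched in base R with nested ifelse(). This is only an illustration for one person's rows (Carl's, taken from the example data); the variable names are assumptions.

```r
# Carl's Fruit values from the example data
fruit <- c("X", "Orange")

# Group-level checks, evaluated once for the whole group
has_apple  <- any(fruit %in% "Apple")
has_orange <- any(fruit %in% "Orange")

# Conditions are tried in order, like the branches of case_when()
result <- ifelse(fruit %in% "X" & has_apple, "Orange",
          ifelse(fruit %in% "X" & has_orange, "Apple", fruit))
result
```

Because has_apple and has_orange are single values, they are recycled across every row of the group, which is also how any() behaves inside a grouped mutate().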
Data

df <- data.frame(Person=c("Ada","Ada","Bob","Bob","Carl","Carl"), 
                 Day=c(1,2,2,1,1,2), 
                 Fruit=c("Apple","X","Apple","X","X","Orange"),
                 stringsAsFactors = FALSE)
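
As a rough illustration of why factor columns can get in the way here (before R 4.0.0, data.frame() converted strings to factors by default), the sketch below assigns a replacement value that is not an existing factor level. The vectors are made up for the example and are not the question's data.

```r
# A factor whose levels are only "Apple" and "X"
fruit_factor <- factor(c("Apple", "X", "Apple"))
levels(fruit_factor)

# "Orange" is not an existing level, so R stores NA and warns
# ("invalid factor level, NA generated"); the warning is silenced
# here only to keep the output short
broken <- fruit_factor
suppressWarnings(broken[broken == "X"] <- "Orange")
is.na(broken[2])

# With a character vector the replacement works as intended
fruit_char <- as.character(fruit_factor)
fruit_char[fruit_char == "X"] <- "Orange"
fruit_char
```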
A simple loop:

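fruity_loop <- function(frame) {
  ops <- c('Apple', 'Orange')
  for (x in seq_len(nrow(frame))) {
    if (frame$Fruit[x] == 'X') {
      # Assign the opposite of the previous row's fruit (Person is not checked);
      # the first row has no previous row, so it falls back to ops[1]
      if (x > 1 && frame$Fruit[x - 1] == ops[1]) {
        frame$Fruit[x] <- ops[2]
      } else {
        frame$Fruit[x] <- ops[1]
      }
    }
  }
  return(frame)
}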
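
The loop compares each "X" with the row directly before it, so its result depends on row order rather than on Person. A group-aware variant in base R might look like the sketch below; it is an alternative technique (ave() instead of a loop), and fill_missing_fruit is a hypothetical helper introduced only for this example.

```r
df <- data.frame(Person = c("Ada", "Ada", "Bob", "Bob", "Carl", "Carl"),
                 Day = c(1, 2, 2, 1, 1, 2),
                 Fruit = c("Apple", "X", "Apple", "X", "X", "Orange"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: within one person's Fruit values, replace "X"
# with the fruit that person does not already have
fill_missing_fruit <- function(fruit) {
  if (any(fruit == "Apple")) {
    fruit[fruit == "X"] <- "Orange"
  } else if (any(fruit == "Orange")) {
    fruit[fruit == "X"] <- "Apple"
  }
  fruit
}

# ave() applies the helper to each Person's values and keeps row order
df$Fruit <- ave(df$Fruit, df$Person, FUN = fill_missing_fruit)
df
```

A group with neither "Apple" nor "Orange" is left unchanged by this sketch; how such groups should be handled is not stated in the question.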

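Thanks, I like this approach! For some reason, though, I cannot reproduce your result; "Carl" has Orange on both days. I am trying to find the difference between the example dataset and my real dataset.

Thanks a lot! I cannot reproduce your result (I am starting to think the problem is on my side, since I ran into the same reproduction issue with another reply). The first three columns stay unchanged, and the two new "Contains_..." columns are filled with TRUE.

Hmm, I don't know what is causing this. When in doubt, restart your R session and run the code again. Otherwise, could you update dplyr, if possible? Also, try updating your question with the code you are using to get this result. I will see whether I run into the same problem.

A simple restart actually did it, thanks! Works like a charm!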
fruity_loop(df)