R: Calculate the minimum item value increase needed to reach an average threshold by group


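I have a sample of data with values grouped by ID and Item. For each ID, I need to determine the minimum amount by which a single item's value must increase so that the overall average for that ID meets a 0.90 threshold.

Data:

I can get the difference for each individual item with the following syntax: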

library(dplyr)
SampDF2 <- SampDF %>%
  group_by(ID, Item, CurrentAvg) %>%
  mutate(Value.1.Increase = 0.90 - Value.1)
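But this result is not the item value increase needed to raise CurrentAvg to the 0.90 threshold for each ID.

Is there a way to do this and add two new columns: a ValueIncrease column showing the required increase, and a NewAvg column confirming that the new average meets the 0.90 threshold?

If my manual calculation is correct, this would be the ideal result: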

structure(list(ID = structure(c(1L, 2L, 2L), .Label = c("A1", 
"A2"), class = "factor"), Item = structure(c(1L, 2L, 1L), .Label = c("Item1", 
"Item2"), class = "factor"), Value.1 = c(0.7894, 0.95, 0.8393
), CurrentAvg = c(0.7894, 0.8697, 0.8697), ValueIncrease = c(0.1106, 
0.04999, 0.04999), NewAvg = c(0.9, 0.89465, 0.89465)), class = "data.frame",  row.names = c(NA, 
-3L))

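@Rui Barradas, it looks like the value is 0.90 in every row. I edited my question to clarify that I need to know how much an item's value must increase to get an overall average of 0.90 for the ID. Ideally there would be two extra columns, one for the increase and one for the new average.

The result is 0.9 in every row because, after grouping, each group contains only one data point, and if length(x) is 1 then mean(x)=0.9=>x==0.9. When a data point equals its own mean, Value.1 - mean(Value.1) is zero, and then 0.9 is added. With a real, larger dataset this would not happen.

@Rui Barradas, does what I just added make the expected result clear? I tried to calculate by hand the increase in either item of ID A2 that would bring the overall average up to the 0.90 threshold for that ID.

Yes, that helps. However, you only need to group by ID. Something like:
SampDF2 <- SampDF %>% group_by(ID) %>% mutate(ValueIncrease = 0.9 - mean(Value.1), NewAvg = mean(Value.1) + ValueIncrease)

@Rui Barradas, we are close, but the ValueIncrease results, when added to any single item's Value.1 score, do not raise the current average to 0.90. CurrentAvg + ValueIncrease equals 0.90, but that does not tell me how much any single Value.1 score must increase for CurrentAvg to reach 0.90. I could increase a Value.1 score by hand until CurrentAvg reaches 0.90, but I have more than 1,500 IDs.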
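The last comment points at the missing step: raising one item by d lifts the mean of an ID with n items by only d / n, so the single-item increase would be n * (0.90 - mean) rather than 0.90 - mean. Below is a minimal base-R sketch of that arithmetic, not an answer posted in the thread. It assumes the sample data from the question, and the names threshold, grp_mean and grp_n are made up for illustration. The same expression should also work in dplyr as group_by(ID) %>% mutate(ValueIncrease = n() * (0.9 - mean(Value.1))).

```r
# Sample data taken from the question's dput output (Value.1 per ID/Item).
SampDF <- data.frame(
  ID      = c("A1", "A2", "A2"),
  Item    = c("Item1", "Item2", "Item1"),
  Value.1 = c(0.7894, 0.95, 0.7894)
)

threshold <- 0.90  # target average per ID, as stated in the question

# Per-ID mean and item count, repeated on every row of the group.
grp_mean <- ave(SampDF$Value.1, SampDF$ID, FUN = mean)
grp_n    <- ave(SampDF$Value.1, SampDF$ID, FUN = length)

# Raising ONE item by d lifts the group mean by d / n,
# so the required increase is d = n * (threshold - mean).
# Assumption: IDs already at or above the threshold need no increase (pmax).
SampDF$CurrentAvg    <- grp_mean
SampDF$ValueIncrease <- pmax(0, grp_n * (threshold - grp_mean))

# Average of the ID after exactly one of its items is raised by ValueIncrease.
SampDF$NewAvg <- grp_mean + SampDF$ValueIncrease / grp_n

print(SampDF)
```

For ID A2 this gives an increase of 0.0606 on one item rather than the hand-calculated 0.04999, whose NewAvg of 0.89465 falls just short of 0.90. The sketch does not check whether the raised value stays within any upper bound the scores may have.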
structure(list(ID = structure(c(1L, 2L, 2L), .Label = c("A1", 
"A2"), class = "factor"), Item = structure(c(1L, 2L, 1L), .Label = c("Item1", 
"Item2"), class = "factor"), Value.1 = c(0.7894, 0.95, 0.7894
), CurrentAvg = c(0.7894, 0.8697, 0.8697), Value.1.Increase = c(0.1106, 
-0.0499999999999999, 0.1106)), class = c("grouped_df", "tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -3L), vars = c("ID", 
"Item", "CurrentAvg"), labels = structure(list(ID = structure(c(1L, 
2L, 2L), .Label = c("A1", "A2"), class = "factor"), Item = structure(c(1L, 
1L, 2L), .Label = c("Item1", "Item2"), class = "factor"), CurrentAvg = c(0.7894, 
0.8697, 0.8697)), class = "data.frame", row.names = c(NA, -3L
), vars = c("ID", "Item", "CurrentAvg"), drop = TRUE), indices = list(
0L, 2L, 1L), drop = TRUE, group_sizes = c(1L, 1L, 1L), biggest_group_size = 1L)