R dplyr: perform interpolation for each group with different min and max


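Each segment has a different time range; for example, A runs from 1 to 3 while C runs from 1 to 7. Within each segment some time points may be missing, and I want to interpolate the values at those times (linear, spline, etc.).

How can I do this within dplyr?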

have <- data.frame(time   =c(1,3,1,2,5,1,3,5,7),
                 segment=c('A','A','B','B','B','C','C','C','C'),
                 toInterpolate= c(0.12,0.31,0.15,0.24,0.55,0.11,0.35,0.53,0.79))
have


want <- data.frame(time   =c(1,2,3,1,2,3,4,5,1,2,3,4,5,6,7),
                   segment=c('A','A','A','B','B','B','B','B','C','C','C','C','C','C','C'),
                   Interpolated= c(0.12,0.21,0.31,0.15,0.24,0.34,0.41,0.55,0.11,0.28,0.35,0.45,0.53,0.69,0.79))
# Note: the interpolated values here are made up for illustration (not computed by an actual linear/spline interpolation)

want

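We can use complete() from tidyr to fill in the missing time values within each segment, and na.spline() from zoo to interpolate the resulting NA values.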
library(dplyr)
library(tidyr)
library(zoo)

have %>%
  group_by(segment) %>%
  complete(time = min(time):max(time)) %>%
  mutate(toInterpolate = na.spline(toInterpolate))

#  segment  time toInterpolate
#   <chr>   <dbl>         <dbl>
# 1 A           1         0.12 
# 2 A           2         0.215
# 3 A           3         0.31 
# 4 B           1         0.15 
# 5 B           2         0.24 
# 6 B           3         0.337
# 7 B           4         0.44 
# 8 B           5         0.55 
# 9 C           1         0.11 
#10 C           2         0.246
#11 C           3         0.35 
#12 C           4         0.439
#13 C           5         0.53 
#14 C           6         0.641
#15 C           7         0.79 
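Since the question also mentions linear interpolation, here is a minimal sketch of the same pipeline with na.approx() from zoo in place of na.spline(). Passing x = time is a choice made for this sketch so that the interpolation uses the actual time values rather than row positions, and the name linear is only illustrative.

```r
library(dplyr)
library(tidyr)
library(zoo)

have <- data.frame(time = c(1, 3, 1, 2, 5, 1, 3, 5, 7),
                   segment = c('A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C'),
                   toInterpolate = c(0.12, 0.31, 0.15, 0.24, 0.55, 0.11, 0.35, 0.53, 0.79))

linear <- have %>%
  group_by(segment) %>%
  # add one row per missing integer time within each segment's own range
  complete(time = min(time):max(time)) %>%
  # linear interpolation of the NA values, using time as the x coordinate
  mutate(toInterpolate = na.approx(toInterpolate, x = time)) %>%
  ungroup()

linear
```

Because every segment's first and last time points are observed, there are no leading or trailing NA values for na.approx() to drop, so the result has the same number of rows as the completed data.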

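Thanks @Ronak Shah. Is there a way to control the granularity? For example, if I want to increase the granularity to one decimal place, then between 1 and 2 we would have 1.0, 1.1, 1.2, 1.3, ... 1.9, 2.0, and so on.

You can use seq, e.g. seq(1, 2, 0.1), or something similar within each group, such as seq(min(toInterpolate), max(toInterpolate), length.out = n()).

Can you update the answer above with a seq example? I tried using full_seq, but I am not sure how to include it correctly within the dplyr verbs.

How do you want seq to behave? What is your expected output? For the first group, the value is 0.12 at time = 1 and 0.31 at time = 3. What should the value be at time = 2?

I would let the function interpolate the values, but I understand that there would be many values to interpolate, and I guess some interpolation methods do not support gaps like that?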
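have %>%
  group_by(segment) %>%
  # build a 0.1-step time grid within each segment's own range
  complete(time = seq(min(time), max(time), 0.1)) %>%
  # interpolate after expanding the grid, so the new rows are filled instead of left as NA
  mutate(toInterpolate = na.spline(toInterpolate))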
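Another way to control the granularity is to evaluate the interpolant directly on a finer grid, which avoids relying on the new grid values matching the original times exactly. The following is a minimal sketch under that assumption, using stats::spline() inside group_modify(); the names step, grid, fit, and fine are illustrative and not part of the original answer.

```r
library(dplyr)

have <- data.frame(time = c(1, 3, 1, 2, 5, 1, 3, 5, 7),
                   segment = c('A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C'),
                   toInterpolate = c(0.12, 0.31, 0.15, 0.24, 0.55, 0.11, 0.35, 0.53, 0.79))

step <- 0.1  # illustrative grid spacing; change to control the granularity

fine <- have %>%
  group_by(segment) %>%
  group_modify(~ {
    # grid spanning this segment's own min and max time
    grid <- seq(min(.x$time), max(.x$time), by = step)
    # spline() returns a list with components x (the grid) and y (interpolated values)
    fit <- spline(.x$time, .x$toInterpolate, xout = grid, method = "fmm")
    tibble(time = fit$x, Interpolated = fit$y)
  }) %>%
  ungroup()

fine
```

Replacing spline() with approx(.x$time, .x$toInterpolate, xout = grid) would give linear interpolation on the same grid.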