R dplyr: interpolate within each group using a different minimum and maximum per group
Each segment has a different range; for example, A runs from 1 to 3, while C runs from 1 to 7.
Within each segment, some time points may be missing, and I want to interpolate the values at those times (linear, spline, etc.).
How can I do this within dplyr?
have <- data.frame(time =c(1,3,1,2,5,1,3,5,7),
segment=c('A','A','B','B','B','C','C','C','C'),
toInterpolate= c(0.12,0.31,0.15,0.24,0.55,0.11,0.35,0.53,0.79))
have
want <- data.frame(time =c(1,2,3,1,2,3,4,5,1,2,3,4,5,6,7),
segment=c('A','A','A','B','B','B','B','B','C','C','C','C','C','C','C'),
Interpolated= c(0.12,0.21,0.31,0.15,0.24,0.34,0.41,0.55,0.11,0.28,0.35,0.45,0.53,0.69,0.79))
# note: the interpolated values here are arbitrary placeholders (not the result of an actual linear/spline interpolation)
want
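We can use complete from tidyr to fill in the missing time points within each segment, and then na.spline from zoo to interpolate the missing values.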
library(dplyr)
library(tidyr)
library(zoo)
have %>%
group_by(segment) %>%
complete(time = min(time):max(time)) %>%
mutate(toInterpolate = na.spline(toInterpolate))
# segment time toInterpolate
# <chr> <dbl> <dbl>
# 1 A 1 0.12
# 2 A 2 0.215
# 3 A 3 0.31
# 4 B 1 0.15
# 5 B 2 0.24
# 6 B 3 0.337
# 7 B 4 0.44
# 8 B 5 0.55
# 9 C 1 0.11
#10 C 2 0.246
#11 C 3 0.35
#12 C 4 0.439
#13 C 5 0.53
#14 C 6 0.641
#15 C 7 0.79
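Since the question also mentions linear interpolation, here is a minimal sketch that swaps na.spline for zoo's na.approx. It assumes the completed grid has unit steps, so interpolating over row position is the same as interpolating over time. The result name `linear` is only illustrative.

```r
library(dplyr)
library(tidyr)
library(zoo)

# Same input as in the question
have <- data.frame(time = c(1, 3, 1, 2, 5, 1, 3, 5, 7),
                   segment = c('A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C'),
                   toInterpolate = c(0.12, 0.31, 0.15, 0.24, 0.55, 0.11, 0.35, 0.53, 0.79))

linear <- have %>%
  group_by(segment) %>%
  # fill in every integer time between the group's own min and max
  complete(time = min(time):max(time)) %>%
  # straight-line interpolation between the neighbouring observed values
  mutate(toInterpolate = na.approx(toInterpolate)) %>%
  ungroup()

linear
# e.g. segment A, time 2 is the midpoint of 0.12 and 0.31, i.e. 0.215
```

The first and last time of each group are always observed here, so na.approx never meets a leading or trailing NA and the column keeps its length.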
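Thanks @Ronak Shah. Is there a way to control the granularity? For example, if I want to increase it to one decimal place, then between 1 and 2 we would have 1.0, 1.1, 1.2, 1.3, ..., 1.9, 2.0, and so on.

You can use seq, for example seq(1,2,0.1), or something similar within each group, such as seq(min(toInterpolate),max(toInterpolate),length.out=n()).

Could you update the answer above with a seq example? I tried using full_seq, but I am not sure how to include it correctly in the dplyr verbs.

How do you want seq to behave? What is your expected output? For the first group, the value is 0.12 at time = 1 and 0.31 at time = 3. What should the value at time = 2 be?

I would let the function interpolate the values, but I understand there would be many values to interpolate, and I guess some interpolation methods do not support gaps like that?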
have %>%
group_by(segment) %>%
complete(time = min(time):max(time)) %>%
mutate(toInterpolate = na.spline(toInterpolate)) %>%
complete(time = seq(min(time), max(time), 0.1))
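Note that the snippet above expands the grid after interpolating, so the rows added at the 0.1 steps are left as NA. A possible variation, offered only as a sketch, is to expand to the fine grid first and interpolate afterwards, passing the time column to na.spline through its x argument. The grid is built from integer tenths (an assumption made here so that the original integer times match exactly in the join), and the name `fine` is illustrative.

```r
library(dplyr)
library(tidyr)
library(zoo)

# Same input as in the question
have <- data.frame(time = c(1, 3, 1, 2, 5, 1, 3, 5, 7),
                   segment = c('A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C'),
                   toInterpolate = c(0.12, 0.31, 0.15, 0.24, 0.55, 0.11, 0.35, 0.53, 0.79))

fine <- have %>%
  group_by(segment) %>%
  # expand each group to steps of 0.1 between its own min and max;
  # dividing integer tenths by 10 keeps whole-number times exact
  complete(time = seq(min(time) * 10, max(time) * 10) / 10) %>%
  # interpolate against the actual time values rather than row position
  mutate(toInterpolate = na.spline(toInterpolate, x = time)) %>%
  ungroup()

fine
```

With only two observed points (segment A), the spline reduces to a straight line, so the choice of method matters mainly for segments with more observations.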