Linear interpolation between columns in R

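I am working with temperature data measured at specific depths, e.g. 0.9 m, 2.5 m and 5 m. I would like to interpolate these values to get the temperature at every whole metre, e.g. 1 m, 2 m and 3 m. The original data look like this: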

df
# A tibble: 5 x 3
  date                d_0.9 d_2.5  
  <dttm>              <dbl> <dbl> 
1 2004-01-05 03:00:00  7     8        
2 2004-01-05 04:00:00  7.5   9      
3 2004-01-05 05:00:00  7     8        
4 2004-01-05 06:00:00  6.92  NA      
What I would like to get is:

df_int
# A tibble: 5 x 5
  date                 d_0.9   d_1      d_2      d_2.5  
  <dttm>              <dbl>   <dbl>     <dbl>    <dbl>
1 2004-01-05 03:00:00  7       7.0625   7.6875   8     
2 2004-01-05 04:00:00  7.5     7.59375  8.53125  9      
3 2004-01-05 05:00:00  7       7.0625   7.6875   8  
4 2004-01-05 06:00:00  6.92    NA       NA       NA 
I have to do this on a very large data frame. Is there an efficient way to do it?

Thanks a lot in advance.

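One option is to convert the data to long format, use a join to add rows for the depths we want to interpolate at, and then interpolate with approx: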

library(tidyverse)

# Data
df = tibble(date=seq(as.POSIXct("2004-01-05 03:00:00"),
                     as.POSIXct("2004-01-05 06:00:00"),
                     by="1 hour"),
            d_0.9 = c(7,7.5,7,6.92),
            d_2.5 = c(8,NA,8,NA),
            d_5.0 = c(10,10.5,9.4,NA))

# Create a data frame with all of the times and depths we want to interpolate at
depths = sort(unique(c(c(0.9, 2.5, 5), seq(ceiling(0.9), floor(5), 1))))
depths = crossing(date=unique(df$date), depth = depths)

# Convert data to long format, join to add interpolation depths, then interpolate
df.interp = df %>% 
  gather(depth, value, -date) %>% 
  mutate(depth = as.numeric(gsub("d_", "", depth))) %>% 
  full_join(depths, by = c("date", "depth")) %>% 
  arrange(date, depth) %>% 
  group_by(date) %>% 
  mutate(value.interp = if(length(na.omit(value)) > 1) {
    approx(depth, value, xout=depth)$y
  } else {
    value
  })
In the code above, the if statement is included to prevent approx from throwing an error when a given date has fewer than two non-missing values.
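The guard can be checked in isolation with base R. The sketch below is only an illustration: the helper name safe_approx is hypothetical (it is not part of the answer's code) and simply mirrors the if/else logic for the depths and values of a single date.

```r
# Hypothetical helper mirroring the if/else guard (not from the original answer)
safe_approx <- function(depth, value) {
  if (sum(!is.na(value)) > 1) {
    # At least two known points: interpolate linearly at every depth
    approx(depth, value, xout = depth)$y
  } else {
    # Fewer than two known points: approx() would stop with an error,
    # so return the values unchanged
    value
  }
}

depth <- c(0.9, 1, 2, 2.5)

# Two known values (0.9 m and 2.5 m): the gaps at 1 m and 2 m are filled
safe_approx(depth, c(7, NA, NA, 8))

# Only one known value: nothing to interpolate between, values come back as-is
safe_approx(depth, c(6.92, NA, NA, NA))

# Calling approx() directly on the second case raises an error
res <- try(approx(depth, c(6.92, NA, NA, NA), xout = depth), silent = TRUE)
inherits(res, "try-error")
```

For the first call, the value at 1 m is 7 + (1 - 0.9) / (2.5 - 0.9) * (8 - 7) = 7.0625, which matches the d_1 column requested in the question.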

df.interp
                  date depth value value.interp
1  2004-01-05 03:00:00   0.9  7.00     7.000000
2  2004-01-05 03:00:00   1.0    NA     7.062500
3  2004-01-05 03:00:00   2.0    NA     7.687500
4  2004-01-05 03:00:00   2.5  8.00     8.000000
5  2004-01-05 03:00:00   3.0    NA     8.400000
6  2004-01-05 03:00:00   4.0    NA     9.200000
7  2004-01-05 03:00:00   5.0 10.00    10.000000
8  2004-01-05 04:00:00   0.9  7.50     7.500000
9  2004-01-05 04:00:00   1.0    NA     7.573171
10 2004-01-05 04:00:00   2.0    NA     8.304878
11 2004-01-05 04:00:00   2.5    NA     8.670732
12 2004-01-05 04:00:00   3.0    NA     9.036585
13 2004-01-05 04:00:00   4.0    NA     9.768293
14 2004-01-05 04:00:00   5.0 10.50    10.500000
15 2004-01-05 05:00:00   0.9  7.00     7.000000
16 2004-01-05 05:00:00   1.0    NA     7.062500
17 2004-01-05 05:00:00   2.0    NA     7.687500
18 2004-01-05 05:00:00   2.5  8.00     8.000000
19 2004-01-05 05:00:00   3.0    NA     8.280000
20 2004-01-05 05:00:00   4.0    NA     8.840000
21 2004-01-05 05:00:00   5.0  9.40     9.400000
22 2004-01-05 06:00:00   0.9  6.92     6.920000
23 2004-01-05 06:00:00   1.0    NA           NA
24 2004-01-05 06:00:00   2.0    NA           NA
25 2004-01-05 06:00:00   2.5    NA           NA
26 2004-01-05 06:00:00   3.0    NA           NA
27 2004-01-05 06:00:00   4.0    NA           NA
28 2004-01-05 06:00:00   5.0    NA           NA
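The question asks for one column per depth rather than a long table. As a rough sketch of an alternative (not part of the answer above, and not benchmarked), the same interpolation can be applied row by row to a plain numeric matrix in base R, which skips the reshape and join. The names known, target, temps, interp_row and wide are made up for this illustration, and the values are the example data from the answer.

```r
# Measured depths and target depths (measured depths plus every whole metre between them)
known  <- c(0.9, 2.5, 5)
target <- sort(unique(c(known, seq(ceiling(min(known)), floor(max(known)), 1))))

# Example values: one row per timestamp, one column per measured depth
temps <- rbind(c(7,    8,  10),
               c(7.5,  NA, 10.5),
               c(7,    8,  9.4),
               c(6.92, NA, NA))

# Interpolate a single row; rows with fewer than two known values are only copied over
interp_row <- function(v) {
  if (sum(!is.na(v)) > 1) {
    approx(known, v, xout = target)$y
  } else {
    v[match(target, known)]  # NA at every depth that was not measured
  }
}

# apply() returns one column per input row, so transpose back to rows = timestamps
wide <- t(apply(temps, 1, interp_row))
colnames(wide) <- paste0("d_", target)
wide
```

The resulting matrix can be bound back to the date column with cbind or tibble::as_tibble if a data frame is needed. Whether this is actually faster than the long-format approach on a very large data frame would have to be measured on the real data.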