Hadoop: Sampling a time series by time interval with Hive QL and calculating jumps (tags: Hadoop, Hive, Time Series, HiveQL)


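I have time series data in a table. Basically, each row has a timestamp and a value. The frequency of the data is completely random.

I would like to sample it at a given frequency and, for each interval, extract the relevant information: min, max, last, change (relative to the previous last), return (change / previous last), and possibly more (count...).

Here is my input: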

08:00:10, 1
08:01:20, 2
08:01:21, 3
08:01:24, 5
08:02:24, 2
I would like to get the following result for 1-minute sampling (ts, min, max, last, change, return):


You can do it like this (comments inline):

SELECT
    min
  , mn
  , mx
  , l
  , l - LAG(l, 1) OVER (ORDER BY min) c
    -- This might not be the right calculation. Unsure how -0.25 was derived in question.
  , (l - LAG(l, 1) OVER (ORDER BY min)) / (LAG(l, 1) OVER (ORDER BY min)) r
FROM
(
  SELECT
      min
    , MIN(val) mn
    , MAX(val) mx
    -- We can take MAX here because all l's (last values) for the minute are the same.
    , MAX(l) l
  FROM
  (
    SELECT
        min
      , val
      -- The last value of the minute, ordered by the timestamp, using all rows.
      , LAST_VALUE(val) OVER (PARTITION BY min ORDER BY ts ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) l
    FROM
    (
      SELECT
          ts
        -- Drop the seconds and move forward one minute, so that each row is
        -- labeled with the end of its minute (08:00:10 becomes 08:01:00), by
        -- converting to seconds, adding 60, and then going back to a shorter
        -- string format. 2000-01-01 is a dummy date just to enable the conversion.
        , CONCAT(FROM_UNIXTIME(UNIX_TIMESTAMP(CONCAT("2000-01-01 ", ts), "yyyy-MM-dd HH:mm:ss") + 60, "HH:mm"), ":00") min
        , val
      FROM
        -- As from the question.
        21908430_input a
    ) val_by_min
  ) val_by_min_with_l
  GROUP BY min
) min_with_l_m_M
ORDER BY min
;

Result:

+----------+----+----+---+------+------+
| min      | mn | mx | l | c    | r    |
+----------+----+----+---+------+------+
| 08:01:00 | 1  | 1  | 1 | NULL | NULL |
| 08:02:00 | 2  | 5  | 5 | 4    | 4    |
| 08:03:00 | 2  | 2  | 2 | -3   | -0.6 |
+----------+----+----+---+------+------+
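
To sanity-check the bucketing and the change/return arithmetic outside Hive, the same logic can be sketched in a few lines of Python. This is only an illustrative cross-check, not part of the original answer: the names `rows`, `bucket_end` and `sample` are made up for this example, and it assumes, like the query, that each row is labeled with the end of its minute and that all timestamps fall on the same day.

```python
from datetime import datetime, timedelta

# Input rows copied from the question: (timestamp, value).
rows = [
    ("08:00:10", 1),
    ("08:01:20", 2),
    ("08:01:21", 3),
    ("08:01:24", 5),
    ("08:02:24", 2),
]


def bucket_end(ts):
    """Label a timestamp with the end of its minute, e.g. 08:00:10 -> 08:01:00."""
    t = datetime.strptime(ts, "%H:%M:%S")
    return (t.replace(second=0) + timedelta(minutes=1)).strftime("%H:%M:%S")


def sample(rows):
    # Group values per bucket in timestamp order, so the last value is well defined.
    buckets = {}
    for ts, val in sorted(rows):
        buckets.setdefault(bucket_end(ts), []).append(val)

    out = []
    prev_last = None
    for label in sorted(buckets):
        vals = buckets[label]
        last = vals[-1]
        # Like LAG in the query, the first bucket has no previous value.
        change = None if prev_last is None else last - prev_last
        ret = None if prev_last is None else change / prev_last
        out.append((label, min(vals), max(vals), last, change, ret))
        prev_last = last
    return out


for row in sample(rows):
    print(row)
```

For the five input rows this yields the same three buckets as the table above (08:01:00, 08:02:00, 08:03:00). Unlike Hive, which returns NULL when dividing by zero, this sketch would raise an error if a previous last value were 0, so a real implementation would need to guard that case.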