Google BigQuery: distribute an amount over the rows in an array and plug the difference to match
I need to distribute an aggregated amount over the rows of an array. I start with this:
WITH a AS (
SELECT 1 AS key, 8.55 AS tot_tax
, ARRAY(
SELECT AS STRUCT 'item1' AS descr, 9.6 AS amt
UNION ALL
SELECT AS STRUCT 'item2', 183.5
UNION ALL
SELECT AS STRUCT 'item3', 26.5
) items
)
--query:
SELECT * EXCEPT(items)
, ARRAY(
SELECT AS STRUCT *
, ROUND(amt/(SELECT SUM(amt) FROM UNNEST(items)) * tot_tax, 2) AS tax
FROM UNNEST(items)
) items
FROM a
However, because of the (required) rounding, SUM(tax) <> tot_tax. So I want to plug the tiny difference into the largest tax amount to make the two match.
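To make the mismatch concrete, here is a minimal check query against the sample data from the question. It is only a sketch: the column name sum_rounded_tax is made up for illustration, and the ratio cross join is borrowed from the pattern used later in this thread.

```sql
#standardSQL
WITH a AS (
  SELECT 1 AS key, 8.55 AS tot_tax,
    ARRAY(
      SELECT AS STRUCT 'item1' AS descr, 9.6 AS amt
      UNION ALL SELECT AS STRUCT 'item2', 183.5
      UNION ALL SELECT AS STRUCT 'item3', 26.5
    ) AS items
)
SELECT
  key,
  tot_tax,
  -- Sum of the per-item taxes after each one has been rounded to 2 decimals.
  -- sum_rounded_tax is a hypothetical name used only in this check.
  (SELECT SUM(ROUND(amt * ratio, 2))
   FROM UNNEST(items),
     (SELECT tot_tax / SUM(amt) AS ratio FROM UNNEST(items))
  ) AS sum_rounded_tax
FROM a
```

Worked by hand, the rounded taxes are 0.37, 7.14 and 1.03, which add up to about 8.54 rather than 8.55, so 0.01 has to be plugged somewhere.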
I can do that in another query:
SELECT * EXCEPT(items)
, ARRAY(
SELECT AS STRUCT * EXCEPT(tax)
, IF(o = 0, ROUND(tax + (SELECT tot_tax - SUM(tax) FROM UNNEST(items)), 2), tax) AS tax
FROM UNNEST(items) WITH OFFSET o
) items
FROM
(SELECT * EXCEPT(items)
, ARRAY(
SELECT AS STRUCT *
, ROUND(amt/(select SUM(amt) FROM UNNEST(items)) * tot_tax, 2) AS tax
FROM UNNEST(items) ORDER BY amt DESC
) items
FROM a)
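This works fine, but it is cumbersome.
Is there a better way to do this (clarity + performance) in a single query or with a UDF (JS/SQL)?

Below is for BigQuery Standard SQL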
#standardSQL
SELECT * REPLACE(
ARRAY(
SELECT AS STRUCT * EXCEPT(tax, pos),
IF(ROW_NUMBER() OVER(ORDER BY amt DESC) = 1,
ROUND(tax + tot_tax - SUM(tax) OVER(), 2), tax
) AS tax
FROM (
SELECT * EXCEPT(ratio), ROUND(amt * ratio, 2) AS tax
FROM UNNEST(items) WITH OFFSET AS pos,
(SELECT tot_tax / SUM(amt) AS ratio FROM UNNEST(items))
)
ORDER BY pos
) AS items)
FROM a
When applied to the sample data from the question, the result is
Row  key  tot_tax  items.descr  items.amt  items.tax
1    1    8.55     item1        9.6        0.37
                   item2        183.5      7.15
                   item3        26.5       1.03
Below is a further simplified/refactored version (though possibly less readable), with exactly the same output
#standardSQL
SELECT * REPLACE(
ARRAY(
SELECT AS STRUCT * EXCEPT(ratio, pos),
ROUND(ROUND(amt * ratio, 2) +
IF(ROW_NUMBER() OVER(ORDER BY amt DESC) = 1, tot_tax - SUM(ROUND(amt * ratio, 2)) OVER(), 0)
, 2) AS tax
FROM UNNEST(items) WITH OFFSET AS pos,
(SELECT tot_tax / SUM(amt) AS ratio FROM UNNEST(items))
ORDER BY pos) AS items)
FROM a
Thanks, Mikhail. I like the window functions in place of the unneeded array; worth remembering. Since the recordset is very large, I removed the ORDER BY pos to speed things up.
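Since the question also asks about a UDF, the same logic could probably be wrapped in a temporary SQL function. This is only a sketch under assumptions: the function name distribute_tax is hypothetical, and the parameter types assume that descr is a STRING and that amt and tot_tax are FLOAT64, as the literals in the sample data would be.

```sql
#standardSQL
-- Hypothetical helper, not part of the original answer: distributes tot_tax
-- over items in proportion to amt, rounds each share to 2 decimals, and adds
-- the rounding remainder to the item with the largest amt.
CREATE TEMP FUNCTION distribute_tax(
  items ARRAY<STRUCT<descr STRING, amt FLOAT64>>,
  tot_tax FLOAT64
) AS (
  ARRAY(
    SELECT AS STRUCT descr, amt,
      ROUND(ROUND(amt * ratio, 2) +
        IF(ROW_NUMBER() OVER(ORDER BY amt DESC) = 1,
           tot_tax - SUM(ROUND(amt * ratio, 2)) OVER(), 0)
      , 2) AS tax
    FROM UNNEST(items) WITH OFFSET AS pos,
      (SELECT tot_tax / SUM(amt) AS ratio FROM UNNEST(items))
    ORDER BY pos  -- keeps the original item order; can be dropped if order does not matter
  )
);

WITH a AS (
  SELECT 1 AS key, 8.55 AS tot_tax,
    ARRAY(
      SELECT AS STRUCT 'item1' AS descr, 9.6 AS amt
      UNION ALL SELECT AS STRUCT 'item2', 183.5
      UNION ALL SELECT AS STRUCT 'item3', 26.5
    ) AS items
)
SELECT * REPLACE(distribute_tax(items, tot_tax) AS items)
FROM a
```

The body is the same expression as in the simplified query, so this mainly buys readability at the call site. Whether it performs any differently on a very large recordset is something to measure rather than assume.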