MySQL: select the MAX across multiple tables without counting twice
I am writing a query that lets me order recipes by score.
Table structure
The structure is: a flyer contains one or more flyer_items, each of which can contain one or more ingredients (the ingredient_to_flyer_item table links ingredients to flyer items). Another table, ingredient_to_recipe, links the same ingredients to one or more recipes. A link to the .sql file is included at the end.
Sample query
I want to get the recipe_id and the sum of the MAX price_weight of each ingredient in the recipe (linked through ingredient_to_recipe), but if several of a recipe's ingredients belong to the same flyer item, that flyer item should be counted only once.
SELECT itr.recipe_id,
SUM(itr.weight),
SUM(max_price_weight),
SUM(itr.weight + max_price_weight) AS score
FROM
( SELECT MAX(itf.max_price_weight) AS max_price_weight,
itf.flyer_item_id,
itf.ingredient_id
FROM
(SELECT ifi.ingredient_id,
MAX(i.price_weight) AS max_price_weight,
ifi.flyer_item_id
FROM flyer_items i
JOIN ingredient_to_flyer_item ifi ON i.id = ifi.flyer_item_id
WHERE i.flyer_id IN (1,
2)
GROUP BY ifi.ingredient_id ) itf
GROUP BY itf.flyer_item_id) itf2
JOIN `ingredient_to_recipe` AS itr ON itf2.`ingredient_id` = itr.`ingredient_id`
WHERE recipe_id = 5730
GROUP BY itr.`recipe_id`
ORDER BY score DESC
LIMIT 0,10
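The query almost works, in that most results are right, but for some rows some ingredients are ignored and left out of the score.
Test cases
I am having a hard time finding a simple way to explain this. Let me know if anything else would help.
Here is a link to a demo database for running the query, the test examples, and the test cases:
Thanks, everyone.
Update (as requested by Rick James):
This is the farthest I could get. The results are always right, in the subquery too, but I have completely removed the GROUP BY on flyer_item_id. With this query I get good scores, but if several of a recipe's ingredients belong to the same flyer item, they are counted multiple times (for example, the score for recipe_id = 10557 is 59 instead of the correct 56, because 2 ingredients worth 3 are in the same flyer item). The only thing I still need is to count one MAX(price_weight) per flyer_item_id per recipe (which I originally attempted by grouping by flyer_item_id rather than by ingredient_id).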
SELECT itr.recipe_id,
SUM(itr.weight) as total_ingredient_weight,
SUM(itf.price_weight) as total_price_weight,
SUM(itr.weight+itf.price_weight) as score
FROM
(SELECT fi1.id, MAX(fi1.price_weight) as price_weight, ingredient_to_flyer_item.ingredient_id as ingredient_id, recipe_id
FROM flyer_items fi1
INNER JOIN (
SELECT flyer_items.id as id, MAX(price_weight) as price_weight, ingredient_to_flyer_item.ingredient_id as ingredient_id
FROM flyer_items
JOIN ingredient_to_flyer_item ON flyer_items.id = ingredient_to_flyer_item.flyer_item_id
GROUP BY id
) fi2 ON fi1.id = fi2.id AND fi1.price_weight = fi2.price_weight
JOIN ingredient_to_flyer_item ON fi1.id = ingredient_to_flyer_item.flyer_item_id
JOIN ingredient_to_recipe ON ingredient_to_flyer_item.ingredient_id = ingredient_to_recipe.ingredient_id
GROUP BY ingredient_to_flyer_item.ingredient_id) AS itf
INNER JOIN `ingredient_to_recipe` AS `itr` ON `itf`.`ingredient_id` = `itr`.`ingredient_id`
GROUP BY `itr`.`recipe_id`
ORDER BY `score` DESC
LIMIT 10
Here is the EXPLAIN, though I am not sure it is useful, since the last working part is still missing:
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | |
|----|-------------|--------------------------|------------|--------|-------------------------------|---------------|---------|-------------------------------------------------------|--------|----------|---------------------------------|---|
| 1 | PRIMARY | itr | NULL | ALL | recipe_id,ingredient_id | NULL | NULL | NULL | 151800 | 100.00 | Using temporary; Using filesort | |
| 1 | PRIMARY | <derived2> | NULL | ref | <auto_key0> | <auto_key0> | 4 | metadata3.itr.ingredient_id | 10 | 100.00 | NULL | |
| 2 | DERIVED | ingredient_to_flyer_item | NULL | ALL | NULL | NULL | NULL | NULL | 249 | 100.00 | Using temporary; Using filesort | |
| 2 | DERIVED | fi1 | NULL | eq_ref | id_2,id,price_weight | id_2 | 4 | metadata3.ingredient_to_flyer_item.flyer_item_id | 1 | 100.00 | NULL | |
| 2 | DERIVED | <derived3> | NULL | ref | <auto_key0> | <auto_key0> | 9 | metadata3.ingredient_to_flyer_item.flyer_item_id,m... | 10 | 100.00 | NULL | |
| 2 | DERIVED | ingredient_to_recipe | NULL | ref | ingredient_id | ingredient_id | 4 | metadata3.ingredient_to_flyer_item.ingredient_id | 40 | 100.00 | NULL | |
| 3 | DERIVED | ingredient_to_flyer_item | NULL | ALL | NULL | NULL | NULL | NULL | 249 | 100.00 | Using temporary; Using filesort | |
| 3 | DERIVED | flyer_items | NULL | eq_ref | id_2,id,flyer_id,price_weight | id_2 | 4 | metadata3.ingredient_to_flyer_item.flyer_item_id | 1 | 100.00 | NULL | |
Here is the EXPLAIN:
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | |
|----|-------------|--------------------------|------------|-------|----------------------------------------------|---------------|---------|---------------------|------|----------|---------------------------------|---|
| 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 1318 | 100.00 | Using temporary; Using filesort | |
| 2 | DERIVED | <derived4> | NULL | ALL | NULL | NULL | NULL | NULL | 37 | 100.00 | Using temporary | |
| 2 | DERIVED | itr | NULL | ref | ingredient_id | ingredient_id | 4 | itfin.ingredient_id | 35 | 100.00 | NULL | |
| 4 | DERIVED | <derived5> | NULL | ALL | NULL | NULL | NULL | NULL | 249 | 100.00 | Using temporary; Using filesort | |
| 4 | DERIVED | ifi1 | NULL | ref | ingredient_id,itx_full,price_weight,flyer_id | ingredient_id | 4 | ifi2.ingredient_id | 1 | 12.50 | Using where | |
| 5 | DERIVED | ingredient_to_flyer_item | NULL | index | ingredient_id,itx_full | ingredient_id | 4 | NULL | 249 | 100.00 | NULL | |
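I am not sure I fully understood the question. It seems to me that you are grouping by the wrong column, flyer_items.id. You should group by the ingredient_id column instead. If you do that, it makes more sense (to me). Here is my take: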
select
itr.recipe_id,
sum(itr.weight),
sum(max_price_weight),
sum(itr.weight + max_price_weight) as score
from (
select
ifi.ingredient_id,
max(price_weight) as max_price_weight
from flyer_items i
join ingredient_to_flyer_item ifi on i.id = ifi.flyer_item_id
where flyer_id in (1, 2)
group by ifi.ingredient_id
) itf
join `ingredient_to_recipe` as itr on itf.`ingredient_id` = itr.`ingredient_id`
group by itr.`recipe_id`
order by score desc
limit 0,10;
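I hope it helps.
It sounds like "explode-implode". This is what happens when a query does a JOIN and a GROUP BY: the JOIN gathers the appropriate combinations of rows from the joined tables; then the GROUP BY applies COUNT, SUM, etc. to those combinations, which gives inflated values for the aggregates.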
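The effect is easy to reproduce on a toy dataset. The following is a minimal sketch, not the asker's schema: it uses Python's built-in sqlite3 module rather than MySQL, and the tables t1, t2, t3 and their rows are made up for illustration.

```python
import sqlite3

# Hypothetical toy tables: one parent row in t1, two child rows in t2,
# three child rows in t3. None of this comes from the question's database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (grp INTEGER PRIMARY KEY);
    CREATE TABLE t2 (grp INTEGER, x INTEGER);
    CREATE TABLE t3 (grp INTEGER, note TEXT);
    INSERT INTO t1 VALUES (1);
    INSERT INTO t2 VALUES (1, 10), (1, 20);
    INSERT INTO t3 VALUES (1, 'a'), (1, 'b'), (1, 'c');
""")

# "Explode": the JOIN pairs each of the 2 rows in t2 with each of the 3 rows in t3.
exploded_rows = conn.execute("""
    SELECT COUNT(*)
    FROM t1
    JOIN t2 ON t2.grp = t1.grp
    JOIN t3 ON t3.grp = t1.grp
""").fetchone()[0]

# "Implode": GROUP BY aggregates over the combined rows, so each x is added 3 times.
inflated_sum = conn.execute("""
    SELECT SUM(t2.x)
    FROM t1
    JOIN t2 ON t2.grp = t1.grp
    JOIN t3 ON t3.grp = t1.grp
    GROUP BY t1.grp
""").fetchone()[0]

# The true total, computed from t2 alone.
true_sum = conn.execute("SELECT SUM(x) FROM t2 WHERE grp = 1").fetchone()[0]

print(exploded_rows, inflated_sum, true_sum)  # 6 90 30
```

The sum is inflated by exactly the number of matching rows in t3, which is why the error only shows up for groups that have more than one match on the other side of the join.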
There are two common fixes, both of which involve doing the aggregation separately from the JOIN.
Case 1:
SELECT ...
( SELECT SUM(x) FROM t2 WHERE id = ... ) AS sum_x,
...
FROM t1 ...
If you need more than one aggregate from t2, this becomes clumsy, because each subquery yields only one aggregate.
Case 2:
SELECT ...
FROM ( SELECT grp,
SUM(x) AS sum_x,
COUNT(*) AS ct
FROM t2
GROUP BY grp ) AS s
JOIN t1 ON t1.grp = s.grp
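Both cases can be tried out on a small made-up dataset. This is a rough sketch using Python's sqlite3 module instead of MySQL; the tables t1, t2, t3 and their contents are assumptions chosen so that a plain three-way join would inflate SUM(x) from 30 to 90.

```python
import sqlite3

# Hypothetical toy data: t2 has 2 rows and t3 has 3 rows for the same grp.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (grp INTEGER PRIMARY KEY);
    CREATE TABLE t2 (grp INTEGER, x INTEGER);
    CREATE TABLE t3 (grp INTEGER, note TEXT);
    INSERT INTO t1 VALUES (1);
    INSERT INTO t2 VALUES (1, 10), (1, 20);
    INSERT INTO t3 VALUES (1, 'a'), (1, 'b'), (1, 'c');
""")

# Case 1: a correlated scalar subquery computes the t2 aggregate on its own,
# so joining t3 in the outer query cannot inflate it.
case1 = conn.execute("""
    SELECT t1.grp,
           (SELECT SUM(x) FROM t2 WHERE t2.grp = t1.grp) AS sum_x,
           COUNT(*) AS t3_rows
    FROM t1
    JOIN t3 ON t3.grp = t1.grp
    GROUP BY t1.grp
""").fetchall()

# Case 2: a derived table aggregates t2 once per grp (several aggregates at
# once), and only the already-aggregated row takes part in the later joins.
case2 = conn.execute("""
    SELECT t1.grp, s.sum_x, s.ct, COUNT(*) AS t3_rows
    FROM (SELECT grp, SUM(x) AS sum_x, COUNT(*) AS ct
          FROM t2
          GROUP BY grp) AS s
    JOIN t1 ON t1.grp = s.grp
    JOIN t3 ON t3.grp = t1.grp
    GROUP BY t1.grp, s.sum_x, s.ct
""").fetchall()

print(case1)  # [(1, 30, 3)]
print(case2)  # [(1, 30, 2, 3)]
```

In both variants sum_x stays at 30 even though t3 still contributes three rows, because the aggregation over t2 happens before (or outside of) the join that multiplies rows.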
You have 2 JOINs and 3 GROUP BYs, so I suggest you debug (and rewrite) your query from the inside out.
SELECT ifi.ingredient_id,
MAX(price_weight) as max_price_weight,
flyer_item_id
from flyer_items i
join ingredient_to_flyer_item ifi ON i.id = ifi.flyer_item_id
where flyer_id in (1, 2)
group by ifi.ingredient_id
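But I can't help you further, because you have not qualified price_weight with the table (or alias) it belongs to. (The same goes for some other columns.)
(Actually, MAX and MIN won't get inflated values; AVG will get slightly wrong values; COUNT and SUM will get "wrong" values.)
So I'll leave the rest as an "exercise" for the reader.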
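The aside about which aggregates survive a row-multiplying join can be checked with a small experiment. This is a sketch under stated assumptions: it runs on Python's sqlite3 rather than MySQL, and the tables a and b are hypothetical, with the second row of a matched three times so that the fan-out is uneven.

```python
import sqlite3

# Hypothetical data: a.id = 1 matches one row in b, a.id = 2 matches three.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, x INTEGER);
    CREATE TABLE b (a_id INTEGER);
    INSERT INTO a VALUES (1, 10), (2, 20);
    INSERT INTO b VALUES (1), (2), (2), (2);
""")

aggregates = "MAX(a.x), MIN(a.x), AVG(a.x), COUNT(*), SUM(a.x)"

# Aggregates over the base table alone.
base = conn.execute(f"SELECT {aggregates} FROM a").fetchone()

# The same aggregates after the join has repeated the x = 20 row three times.
joined = conn.execute(
    f"SELECT {aggregates} FROM a JOIN b ON b.a_id = a.id"
).fetchone()

print(base)    # (20, 10, 15.0, 2, 30)
print(joined)  # (20, 10, 17.5, 4, 70)
```

MAX and MIN are unchanged because repeating a value cannot change the extremes; AVG drifts toward the over-represented row; COUNT and SUM grow with every duplicate.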
Indexes
itr: (ingredient_id, recipe_id) -- for the JOIN and WHERE and GROUP BY
itr: (recipe_id, ingredient_id, weight) -- for 1st Update
(There is no optimization available for the ORDER BY and LIMIT)
flyer_items: (flyer_id, price_weight) -- unless flyer_id is the PRIMARY KEY
ifi: (flyer_item_id, ingredient_id)
ifi: (ingredient_id, flyer_item_id) -- for 1st Update
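Please provide SHOW CREATE TABLE for the relevant tables.
Please provide EXPLAIN SELECT ...
If ingredient_to_flyer_item is a many:many mapping table, please follow the tips for such tables. The same goes for ingredient_to_recipe.
GROUP BY itf.flyer_item_id is probably invalid, because it does not include the non-aggregated ifi.ingredient_id. See ONLY_FULL_GROUP_BY.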
Reformulate
After you have finished assessing the indexes, try the following. Caveat: I don't know whether it will work correctly. Change
JOIN `ingredient_to_recipe` AS itr ON itf2.`ingredient_id` = itr.`ingredient_id`
to
JOIN ( SELECT recipe_id,
ingredient_id,
SUM(weight) AS sum_weight
FROM ingredient_to_recipe
GROUP BY recipe_id, ingredient_id ) AS itr ON itf2.`ingredient_id` = itr.`ingredient_id`
and change the initial SELECT to replace the SUMs with those computed sums. (I suspect I did not handle ingredient_id correctly.)
What version of MySQL are you using?
SELECT recipe_id, SUM(weight) AS weight, SUM(max_price_weight) AS price_weight, SUM(weight + max_price_weight) AS score
FROM (SELECT recipe_id, ingredient_id, MAX(weight) AS weight, MAX(price_weight) AS max_price_weight
FROM (SELECT itr.recipe_id, MIN(itr.ingredient_id) AS ingredient_id, MAX(itr.weight) AS weight, fi.id, MAX(fi.price_weight) AS price_weight
FROM ingredient_to_recipe itr
JOIN ingredient_to_flyer_item itfi ON itfi.ingredient_id = itr.ingredient_id
JOIN flyer_items fi ON fi.id = itfi.flyer_item_id
GROUP BY itr.recipe_id, fi.id) ri
GROUP BY recipe_id, ingredient_id) r
GROUP BY recipe_id
ORDER BY score DESC
LIMIT 10
Results with HAVING recipe_id IN (8376, 3152, 4771, 10230, 8958, 4656, 11338) added after the final GROUP BY:
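| recipe_id | weight | price_weight | score |
|-----------|--------|--------------|-------|
| 8376 | 10 | 41 | 51 |
| 4771 | 5 | 40 | 45 |
| 10230 | 10 | 30 | 40 |
| 8958 | 15 | 24 | 39 |
| 4656 | 15 | 19 | 34 |
| 3152 | 0 | 18 | 18 |
| 11338 | 0 | 10 | 10 |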
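To see why aggregating per (recipe_id, flyer item) before summing avoids the double count, here is a minimal sketch on invented data. It uses Python's sqlite3 module rather than MySQL, reuses the table and column names from the question, and keeps only the price_weight part of the score; the rows themselves (recipe 100, flyer items 10 and 11) are assumptions, not data from the demo database.

```python
import sqlite3

# Hypothetical rows: ingredients 1 and 2 both belong to flyer item 10
# (price_weight 3); ingredient 3 belongs to flyer item 11 (price_weight 5).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE flyer_items (id INTEGER PRIMARY KEY, flyer_id INTEGER, price_weight INTEGER);
    CREATE TABLE ingredient_to_flyer_item (ingredient_id INTEGER, flyer_item_id INTEGER);
    CREATE TABLE ingredient_to_recipe (recipe_id INTEGER, ingredient_id INTEGER, weight INTEGER);
    INSERT INTO flyer_items VALUES (10, 1, 3), (11, 1, 5);
    INSERT INTO ingredient_to_flyer_item VALUES (1, 10), (2, 10), (3, 11);
    INSERT INTO ingredient_to_recipe VALUES (100, 1, 0), (100, 2, 0), (100, 3, 0);
""")

# Summing straight over the joins counts flyer item 10 once per ingredient.
naive = conn.execute("""
    SELECT itr.recipe_id, SUM(fi.price_weight)
    FROM ingredient_to_recipe itr
    JOIN ingredient_to_flyer_item ifi ON ifi.ingredient_id = itr.ingredient_id
    JOIN flyer_items fi ON fi.id = ifi.flyer_item_id
    GROUP BY itr.recipe_id
""").fetchall()

# Collapsing to one row per (recipe, flyer item) first, then summing per
# recipe, counts each flyer item once.
deduped = conn.execute("""
    SELECT recipe_id, SUM(price_weight)
    FROM (SELECT itr.recipe_id, fi.id AS flyer_item_id,
                 MAX(fi.price_weight) AS price_weight
          FROM ingredient_to_recipe itr
          JOIN ingredient_to_flyer_item ifi ON ifi.ingredient_id = itr.ingredient_id
          JOIN flyer_items fi ON fi.id = ifi.flyer_item_id
          GROUP BY itr.recipe_id, fi.id) AS per_item
    GROUP BY recipe_id
""").fetchall()

print(naive)    # [(100, 11)]
print(deduped)  # [(100, 8)]
```

The difference of 3 between the two totals mirrors the 59 versus 56 discrepancy described in the question's update, where two ingredients worth 3 shared one flyer item.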