MySQL: SQL join on large data
I have this query:
SELECT a.*
FROM `entities_instance` a
INNER JOIN `instance_fields_data` artists
ON ( a.entity_instance_id = artists.entity_instance_id
AND artists.`dynamic_field_name` = 'artists'
AND artists.`field_value` like '%8605:%')
INNER JOIN `instance_fields_data` _enddate
ON (a.entity_instance_id = _enddate.entity_instance_id)
AND (_enddate.`dynamic_field_name` = 'enddate')
WHERE a.`entity_id` = 67
GROUP BY a.entity_instance_id
ORDER BY _enddate.field_value DESC
LIMIT 100
It should return about 5 results, but I get the following error:
The SELECT would examine more than MAX_JOIN_SIZE rows;
check your WHERE and use SET SQL_BIG_SELECTS=1 or
SET MAX_JOIN_SIZE=# if the SELECT is okay
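Can anyone help me improve this query?
I don't want to use the SQL_BIG_SELECTS command.
Thanks.
Many resources on the web suggest running
SET SQL_BIG_SELECTS=1
before the main query.
However, I was thinking that you could try using aliased derived tables so that the joins operate on less data. I'm not sure, but it might help.
Example: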
select a.* from
(select * from `entities_instance` test WHERE test.`entity_id` = 67) as a
INNER JOIN `instance_fields_data` artists
ON ( a.entity_instance_id = artists.entity_instance_id
AND artists.`dynamic_field_name` = 'artists'
AND artists.`field_value` like '%8605:%')
INNER JOIN `instance_fields_data` _enddate
ON (a.entity_instance_id = _enddate.entity_instance_id)
AND (_enddate.`dynamic_field_name` = 'enddate')
GROUP BY a.entity_instance_id
ORDER BY _enddate.field_value DESC
LIMIT 100;
A sample of the data and the table structure would make it easier to help further.
We don't even know which table holds the most data.
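One way to find out which table dominates, and how many rows the optimizer expects to examine, is sketched below. It only uses the table and column names from the question; the actual limits and row counts depend on the server, so no output is shown.

```sql
-- Current limits that trigger the error in the question
SHOW VARIABLES LIKE 'max_join_size';
SHOW VARIABLES LIKE 'sql_big_selects';

-- Row counts for the two tables involved
SELECT COUNT(*) FROM `entities_instance`;
SELECT COUNT(*) FROM `instance_fields_data`;

-- Estimated rows per join step: look at the `rows` and `key` columns
EXPLAIN
SELECT a.*
FROM `entities_instance` a
INNER JOIN `instance_fields_data` artists
        ON a.entity_instance_id = artists.entity_instance_id
       AND artists.`dynamic_field_name` = 'artists'
       AND artists.`field_value` LIKE '%8605:%'
INNER JOIN `instance_fields_data` _enddate
        ON a.entity_instance_id = _enddate.entity_instance_id
       AND _enddate.`dynamic_field_name` = 'enddate'
WHERE a.`entity_id` = 67
GROUP BY a.entity_instance_id
ORDER BY _enddate.field_value DESC
LIMIT 100;
```

If the `key` column is NULL for `instance_fields_data`, the join is probably scanning that table for every matching row, which would explain the large estimate.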
Example 2:
Here I apply a WHERE clause to each of the 3 tables and wrap each one in an aliased derived table:
SELECT a.*
FROM
(SELECT *
FROM `entities_instance` test
WHERE test.`entity_id` = 67) AS a,
(SELECT *
FROM `instance_fields_data` test2
WHERE test2.`dynamic_field_name` = 'artists'
AND test2.`field_value` LIKE '%8605:%') AS artists,
(SELECT *
FROM `instance_fields_data` test3
WHERE test3.`dynamic_field_name` = 'enddate') AS _enddate
WHERE a.entity_instance_id = artists.entity_instance_id
AND a.entity_instance_id = _enddate.entity_instance_id
GROUP BY a.entity_instance_id
ORDER BY _enddate.field_value DESC LIMIT 100;
Example 3:
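Finally, as a further optimization, we use EXISTS to narrow the results of the two subqueries on instance_fields_data even more. I believe this can be simplified and optimized further. Also make sure your keys are indexed (for example entity_instance_id), otherwise this will be slower.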
SELECT a.*
FROM
(SELECT *
FROM `entities_instance` test
WHERE test.`entity_id` = 67) AS a,
(SELECT *
FROM `instance_fields_data` test2
WHERE test2.`dynamic_field_name` = 'artists'
AND test2.`field_value` LIKE '%8605:%'
AND EXISTS
(SELECT *
FROM `entities_instance` z
WHERE z.entity_instance_id = test2.entity_instance_id AND
z.`entity_id` = 67)) AS artists,
(SELECT *
FROM `instance_fields_data` test3
WHERE test3.`dynamic_field_name` = 'enddate'
AND EXISTS
(SELECT *
FROM `entities_instance` z
WHERE z.entity_instance_id = test3.entity_instance_id AND
z.`entity_id` = 67)) AS _enddate
WHERE a.entity_instance_id = artists.entity_instance_id
AND a.entity_instance_id = _enddate.entity_instance_id
GROUP BY a.entity_instance_id
ORDER BY _enddate.field_value DESC LIMIT 100;
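One possible simplification, as a rough sketch: keep a single join for the sort value and move the artists filter into an EXISTS predicate. This assumes each instance has at most one 'enddate' row, so the GROUP BY is no longer needed to remove duplicates; if that does not hold, the result differs from the query above. Whether it stays under MAX_JOIN_SIZE depends on the indexes available.

```sql
SELECT a.*
FROM `entities_instance` a
INNER JOIN `instance_fields_data` _enddate
        ON _enddate.entity_instance_id = a.entity_instance_id
       AND _enddate.`dynamic_field_name` = 'enddate'
WHERE a.`entity_id` = 67
  -- Semi-join: only checks that a matching 'artists' row exists
  AND EXISTS
      (SELECT 1
       FROM `instance_fields_data` artists
       WHERE artists.entity_instance_id = a.entity_instance_id
         AND artists.`dynamic_field_name` = 'artists'
         AND artists.`field_value` LIKE '%8605:%')
ORDER BY _enddate.field_value DESC
LIMIT 100;
```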
I would build a pseudo-normalized materialized view and then work from that. You know: SELECT a.*, MAX(CASE … WHEN … THEN … END) artists, etc.
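A minimal sketch of that idea follows. MySQL has no native materialized views, so this uses an ordinary table that has to be rebuilt or refreshed by hand. The names `instance_fields_pivot` and `idx_pivot_instance` are hypothetical, and the pivot assumes at most one row per field name per instance (MAX collapses any extras).

```sql
-- Pivot the key/value rows into one row per instance (hypothetical table name)
CREATE TABLE `instance_fields_pivot` AS
SELECT d.entity_instance_id,
       MAX(CASE WHEN d.`dynamic_field_name` = 'artists'
                THEN d.`field_value` END) AS artists,
       MAX(CASE WHEN d.`dynamic_field_name` = 'enddate'
                THEN d.`field_value` END) AS enddate
FROM `instance_fields_data` d
GROUP BY d.entity_instance_id;

CREATE INDEX `idx_pivot_instance`
    ON `instance_fields_pivot` (entity_instance_id);

-- The original query then needs a single join and no GROUP BY
SELECT a.*
FROM `entities_instance` a
INNER JOIN `instance_fields_pivot` p
        ON p.entity_instance_id = a.entity_instance_id
WHERE a.`entity_id` = 67
  AND p.artists LIKE '%8605:%'
ORDER BY p.enddate DESC
LIMIT 100;
```

The trade-off is staleness: the pivot table only reflects `instance_fields_data` as of the last rebuild.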
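Applying GROUP BY usually calls for some aggregation on the selected columns, such as min(), max(), avg(), etc. That aside, what indexes do you have on the two tables? Your WHERE and JOIN conditions look fine. A composite index on (entity_instance_id, dynamic_field_name) in instance_fields_data might help. If you give us the table structure and sample data, we can help more.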
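Both suggestions could look roughly like this. The index name `idx_ifd_instance_field` is made up for illustration, and the query assumes entity_instance_id is the primary key of entities_instance, which is what lets `a.*` pass under ONLY_FULL_GROUP_BY in MySQL 5.7.5 and later.

```sql
-- Composite index covering the join column and the field-name filter
CREATE INDEX `idx_ifd_instance_field`
    ON `instance_fields_data` (entity_instance_id, `dynamic_field_name`);

-- Same query as in the question, with an explicit aggregate for the sort key
SELECT a.*, MAX(_enddate.field_value) AS latest_enddate
FROM `entities_instance` a
INNER JOIN `instance_fields_data` artists
        ON a.entity_instance_id = artists.entity_instance_id
       AND artists.`dynamic_field_name` = 'artists'
       AND artists.`field_value` LIKE '%8605:%'
INNER JOIN `instance_fields_data` _enddate
        ON a.entity_instance_id = _enddate.entity_instance_id
       AND _enddate.`dynamic_field_name` = 'enddate'
WHERE a.`entity_id` = 67
GROUP BY a.entity_instance_id
ORDER BY latest_enddate DESC
LIMIT 100;
```

The leading-wildcard LIKE on field_value cannot use an index, so the composite index helps only by narrowing each join to the rows for one instance and one field name.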