How can I perform a grouped aggregation in MySQL without a nested query?
My table looks like this:
CREATE TABLE USER_TRANSACTIONS (
    START_TIME BIGINT UNSIGNED NOT NULL,
    APPLICATION_ID CHAR(64) BINARY NOT NULL,
    ENTRY_POINT CHAR(255) BINARY NOT NULL,
    USER_ID CHAR(64) BINARY NOT NULL,
    ERROR_VIOLATION BIT(1) NOT NULL,
    LATENCY_VIOLATION BIT(1) NOT NULL,
    PRIMARY KEY (START_TIME, APPLICATION_ID, ENTRY_POINT, USER_ID)
);
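What I want is a summary like the following: for each entry point, how many unique users there were, and how many of those users experienced error and latency problems.
For example: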
ENTRY_POINT | TOTAL_USERS | TOTAL_ERRORS | TOTAL_LATENCY
page1 | 2 | 2 | 1
page2 | 1 | 1 | 1
I can achieve this with the following query:
SELECT UT.ENTRY_POINT,
       COUNT(USER_ID) AS TOTAL_USERS,
       SUM(EXP_ERRORS) AS TOTAL_ERRORS,
       SUM(EXP_LATENCY) AS TOTAL_LATENCY
FROM (
    SELECT ENTRY_POINT, USER_ID,
           BIT_OR(ERROR_VIOLATION) AS EXP_ERRORS,
           BIT_OR(LATENCY_VIOLATION) AS EXP_LATENCY
    FROM user_transactions
    GROUP BY ENTRY_POINT, USER_ID
) AS UT
GROUP BY UT.ENTRY_POINT;
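The nested query collapses each user's transactions into a single flag saying whether that user ever hit an error or a latency problem, but on a table with a large amount of data I run into performance problems.
My question is: how can I optimize this query to avoid the inner subquery?
Couldn't you just use the following: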
SELECT
ENTRY_POINT
,COUNT(USER_ID) AS TOTAL_USERS
,SUM(EXP_ERRORS) AS TOTAL_ERRORS
,SUM(EXP_LATENCY) AS TOTAL_LATENCY
FROM user_transactions
GROUP BY ENTRY_POINT
Use COUNT(DISTINCT). Here is one way to write the query:
SELECT ENTRY_POINT, COUNT(DISTINCT USER_ID) AS TOTAL_USERS,
SUM(ERROR_VIOLATION > 0) AS TOTAL_ERRORS,
SUM(LATENCY_VIOLATION > 0) AS TOTAL_LATENCY
FROM user_transactions
GROUP BY ENTRY_POINT;
If you want the number of users who had errors rather than the total number of errors:
SELECT ENTRY_POINT, COUNT(DISTINCT USER_ID) AS TOTAL_USERS,
COUNT(DISTINCT CASE WHEN ERROR_VIOLATION > 0 THEN USER_ID END) AS TOTAL_ERRORS,
COUNT(DISTINCT CASE WHEN LATENCY_VIOLATION > 0 THEN USER_ID END) AS TOTAL_LATENCY
FROM user_transactions
GROUP BY ENTRY_POINT;
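To see why only the COUNT(DISTINCT CASE ...) form matches the original nested query, here is a minimal sketch that runs the three variants on a few rows. It is an illustration under assumptions, not a MySQL benchmark: it uses Python's built-in sqlite3 module with an in-memory database, the column types are simplified, MAX stands in for MySQL's BIT_OR (which SQLite does not have), and the sample rows (users alice, bob, carol) are made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_transactions (
        START_TIME        INTEGER NOT NULL,
        APPLICATION_ID    TEXT    NOT NULL,
        ENTRY_POINT       TEXT    NOT NULL,
        USER_ID           TEXT    NOT NULL,
        ERROR_VIOLATION   INTEGER NOT NULL,
        LATENCY_VIOLATION INTEGER NOT NULL,
        PRIMARY KEY (START_TIME, APPLICATION_ID, ENTRY_POINT, USER_ID)
    )
""")

# Hypothetical sample rows: alice hits page1 twice, so rows != users.
conn.executemany(
    "INSERT INTO user_transactions VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "app", "page1", "alice", 1, 1),
        (2, "app", "page1", "alice", 1, 0),
        (3, "app", "page1", "bob",   1, 0),
        (4, "app", "page2", "carol", 1, 1),
    ],
)

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Original approach: collapse to one row per (entry point, user), then sum.
nested = run("""
    SELECT UT.ENTRY_POINT, COUNT(USER_ID), SUM(EXP_ERRORS), SUM(EXP_LATENCY)
    FROM (
        SELECT ENTRY_POINT, USER_ID,
               MAX(ERROR_VIOLATION)   AS EXP_ERRORS,   -- BIT_OR in MySQL
               MAX(LATENCY_VIOLATION) AS EXP_LATENCY   -- BIT_OR in MySQL
        FROM user_transactions
        GROUP BY ENTRY_POINT, USER_ID
    ) AS UT
    GROUP BY UT.ENTRY_POINT
    ORDER BY UT.ENTRY_POINT
""")

# Single-pass variant that counts distinct users with at least one violation.
distinct_case = run("""
    SELECT ENTRY_POINT, COUNT(DISTINCT USER_ID),
           COUNT(DISTINCT CASE WHEN ERROR_VIOLATION > 0 THEN USER_ID END),
           COUNT(DISTINCT CASE WHEN LATENCY_VIOLATION > 0 THEN USER_ID END)
    FROM user_transactions
    GROUP BY ENTRY_POINT
    ORDER BY ENTRY_POINT
""")

# Single-pass variant that counts violating rows (transactions), not users.
row_counts = run("""
    SELECT ENTRY_POINT, COUNT(DISTINCT USER_ID),
           SUM(ERROR_VIOLATION > 0), SUM(LATENCY_VIOLATION > 0)
    FROM user_transactions
    GROUP BY ENTRY_POINT
    ORDER BY ENTRY_POINT
""")

print(nested)         # [('page1', 2, 2, 1), ('page2', 1, 1, 1)]
print(distinct_case)  # [('page1', 2, 2, 1), ('page2', 1, 1, 1)]
print(row_counts)     # [('page1', 2, 3, 1), ('page2', 1, 1, 1)]
```

On this data the nested query and the COUNT(DISTINCT CASE ...) query agree, while the SUM(... > 0) query reports 3 errors for page1 because alice's two failing transactions are counted separately.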
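To me, your query already looks like the most appropriate one. Since you have to process all rows, a large amount of data will be expensive no matter what. If the underlying data doesn't change much, you could use a materialized view.
This is definitely not equivalent.
Erwin is right. This is not the total number of errors per user. The second one is, not the first.
But would this be faster?
The second one is. It is equivalent, and about 20% faster. Thank you, Gordon.
The first query is wrong and should be ignored.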
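On the materialized-view suggestion in the comments: MySQL has no built-in materialized views, so in practice this would mean maintaining a summary table yourself. The sketch below shows the idea under assumptions: it again uses Python's sqlite3 in place of MySQL, MAX in place of BIT_OR, made-up sample rows, and a hypothetical table name user_entry_summary. How and when the summary is refreshed (scheduled job, triggers, application code) is left open.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_transactions (
        START_TIME        INTEGER NOT NULL,
        APPLICATION_ID    TEXT    NOT NULL,
        ENTRY_POINT       TEXT    NOT NULL,
        USER_ID           TEXT    NOT NULL,
        ERROR_VIOLATION   INTEGER NOT NULL,
        LATENCY_VIOLATION INTEGER NOT NULL
    )
""")
# Hypothetical sample rows, not taken from the thread.
conn.executemany(
    "INSERT INTO user_transactions VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "app", "page1", "alice", 1, 1),
        (2, "app", "page1", "alice", 1, 0),
        (3, "app", "page1", "bob",   1, 0),
        (4, "app", "page2", "carol", 1, 1),
    ],
)

# Precompute one row per (entry point, user). The table name
# user_entry_summary is hypothetical; this step is the expensive pass
# and would be rerun whenever the summary needs refreshing.
conn.execute("""
    CREATE TABLE user_entry_summary AS
    SELECT ENTRY_POINT, USER_ID,
           MAX(ERROR_VIOLATION)   AS EXP_ERRORS,   -- BIT_OR in MySQL
           MAX(LATENCY_VIOLATION) AS EXP_LATENCY   -- BIT_OR in MySQL
    FROM user_transactions
    GROUP BY ENTRY_POINT, USER_ID
""")

# The report now reads only the much smaller summary table.
summary = conn.execute("""
    SELECT ENTRY_POINT,
           COUNT(*)         AS TOTAL_USERS,
           SUM(EXP_ERRORS)  AS TOTAL_ERRORS,
           SUM(EXP_LATENCY) AS TOTAL_LATENCY
    FROM user_entry_summary
    GROUP BY ENTRY_POINT
    ORDER BY ENTRY_POINT
""").fetchall()

print(summary)  # [('page1', 2, 2, 1), ('page2', 1, 1, 1)]
```

The trade-off is freshness: the report is only as current as the last refresh of the summary table, which is why the suggestion is conditioned on the underlying data not changing much.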