Python MySQL: aggregating the results of an aggregate query
I have a table containing millions of call records, and I need to run aggregate statistics on these records. The table looks like this:
+----------+----------+------------------+------------------+------+-------------+---------------------+---------------------+----------+-----------+-------------------------------------+-------------+
| id | calltype | client_id | extension_number | flow | partyid | start | answer | duration | disposion | sipcallid | did |
+----------+----------+------------------+------------------+------+-------------+---------------------+---------------------+----------+-----------+-------------------------------------+-------------+
| 35080566 | out | 139 | 2222*050 | in | 01123334455 | 2015-11-12 17:11:10 | 2015-11-12 17:11:10 | 4 | ANSWERED | 20202911-3656337069-994458@sip.example.com | 01932855644 |
| 35077822 | out | 139 | 2222*603 | in | 02114455784 | 2015-11-12 16:37:41 | 2015-11-12 16:37:41 | 27 | ANSWERED | 20138716-3656335055-417971@sip.example.com | 01123334455 |
| 35077821 | out | 139 | 2222*603 | in | 02114455784 | 2015-11-12 16:38:08 | 2015-11-12 16:38:08 | 80 | ANSWERED | 20138716-3656335055-417971@sip.example.com | 01123334455 |
| 35077820 | local | 139 | 2222*747 | in | 2222*605 | 2015-11-12 16:38:09 | 2015-11-12 16:38:09 | 79 | ANSWERED | 20138716-3656335055-417971@sip.example.com | 01123334455 |
| 35077346 | out | 139 | 2222*603 | in | 07841254789 | 2015-11-12 16:26:15 | 2015-11-12 16:26:15 | 27 | ANSWERED | 20113840-3656334365-407195@sip.example.com | 01123334455 |
| 35077345 | out | 139 | 2222*603 | in | 07841254789 | 2015-11-12 16:26:42 | 2015-11-12 16:26:42 | 527 | ANSWERED | 20113840-3656334365-407195@sip.example.com | 01123334455 |
| 35077344 | local | 139 | 2222*746 | in | 2222*609 | 2015-11-12 16:26:43 | 2015-11-12 16:26:43 | 526 | ANSWERED | 20113840-3656334365-407195@sip.example.com | 01123334455 |
| 35065079 | out | 139 | 2222*603 | in | 02415785414 | 2015-11-12 14:37:21 | 2015-11-12 14:37:21 | 21 | ANSWERED | 19848872-3656327834-411032@sip.example.com | 01123334455 |
| 35065078 | out | 139 | 2222*603 | in | 02415785414 | 2015-11-12 14:37:42 | 2015-11-12 14:37:42 | 776 | ANSWERED | 19848872-3656327834-411032@sip.example.com | 01123334455 |
| 35065077 | local | 139 | 2222*744 | in | 2222*604 | 2015-11-12 14:37:42 | 2015-11-12 14:37:42 | 776 | ANSWERED | 19848872-3656327834-411032@sip.example.com | 01123334455 |
+----------+----------+------------------+------------------+------+-------------+---------------------+---------------------+----------+-----------+-------------------------------------+-------------+
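I need to run a query every day to aggregate the data. This should be simple, but as you can see from the data, a single call spans multiple rows. For example, the bottom three rows are different legs of the same call; that is clear because they all share the same SIP call ID. The time the call started (i.e. began ringing) is the start time, and the time it was answered is the answer time.

I need to produce the following statistics:

Total number of calls per DID
Number of calls answered in < 5 seconds
Number of calls answered in > 10 seconds

I have a query that computes MAX(answer) - MIN(start), which gives me the figures for the calls answered within the relevant period, but I can't work out how to aggregate its output to produce daily numbers.

My query looks like this: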
SELECT ch.did "Inbound DDI",
DATE(ch.start) Date,
IF((MAX(answer)-MIN(start)) < 5, 1 , 0),
IF((MAX(answer)-MIN(start)) BETWEEN 5 AND 10, 1 , 0),
IF((MAX(answer)-MIN(start)) > 10, 1 , 0),
sipcallid
FROM
call_history ch
WHERE
flow = 'in'
AND ch.did <> ""
AND ch.client_client_id = 1207
AND ch.duration > 0
AND ch.disposition = "ANSWERED"
AND DATE(start) = DATE_SUB(CURDATE(), INTERVAL 2 DAY)
GROUP BY
ch.sipcallid;
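A note on the `MAX(answer)-MIN(start)` expression in the query above: when MySQL subtracts two DATETIME values with `-`, it first converts each to a number of the form YYYYMMDDHHMMSS, so the result stops being a count of seconds as soon as a minute boundary is crossed. `TIMESTAMPDIFF(SECOND, MIN(start), MAX(answer))` returns elapsed seconds instead. The Python sketch below only mimics that arithmetic on two timestamps from the sample rows; the helper names `numeric_form` and `elapsed_seconds` are made up for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def numeric_form(ts: str) -> int:
    """Mimic a DATETIME in numeric context: the integer YYYYMMDDHHMMSS."""
    return int(datetime.strptime(ts, FMT).strftime("%Y%m%d%H%M%S"))


def elapsed_seconds(start: str, answer: str) -> int:
    """True elapsed time in seconds between two timestamps."""
    delta = datetime.strptime(answer, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


# Earliest start and latest answer of one call in the sample table.
start = "2015-11-12 16:37:41"
answer = "2015-11-12 16:38:09"

naive = numeric_form(answer) - numeric_form(start)  # digit-wise difference
real = elapsed_seconds(start, answer)               # actual seconds

print(naive, real)
```

The two values differ for this call, which matters here because the 5 and 10 second thresholds are applied to the result.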
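Is there a way to aggregate the output of this query, or do I have to write a script? For example, I could do this in Python.

As suggested in the link, you can use a subquery to achieve your goal and avoid a Python script. The query might look like this: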
SELECT
agg.Date,
SUM(agg.short),
SUM(agg.long)
FROM (
SELECT ch.did "Inbound DDI",
DATE(ch.start) Date,
IF((MAX(answer)-MIN(start)) < 5, 1 , 0) as 'short',
IF((MAX(answer)-MIN(start)) BETWEEN 5 AND 10, 1 , 0) as 'avg',
IF((MAX(answer)-MIN(start)) > 10, 1 , 0) as 'long',
sipcallid
FROM
call_history ch
WHERE
flow = 'in'
AND ch.did <> ""
AND ch.duration > 0
AND ch.disposion = "ANSWERED"
GROUP BY
ch.sipcallid
) agg
GROUP BY agg.Date
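You can try this query here.

Argh. More searching revealed the following. I'm afraid a script may be the only way out.

Another thing I found that looks helpful is MySQL temporary tables, so: I can foresee dumping the first-pass aggregated data into a temporary table and then querying that temporary table, hence no script.

Thanks, that's useful. My only concern is that, given the size of call_history, the subquery may be too slow to be usable. I have tried it, but so far it has failed.

You could also use a bit of magic and try a count on the sipcallid column like this, but I'm not sure about this solution.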
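For comparison, the scripted route mentioned in the question could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a tested production job: it assumes the rows have already been fetched (and filtered to inbound, answered legs) as dictionaries with `sipcallid`, `did`, `start` and `answer` keys, and the function name `daily_answer_stats` and the bucket labels are invented for illustration. It makes the same two passes as the nested query: first collapse the legs of each call by `sipcallid`, then count calls per day and DID.

```python
from collections import defaultdict
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def daily_answer_stats(rows):
    # Pass 1: one entry per call -> [did, earliest start, latest answer].
    calls = {}
    for r in rows:
        start = datetime.strptime(r["start"], FMT)
        answer = datetime.strptime(r["answer"], FMT)
        call = calls.setdefault(r["sipcallid"], [r["did"], start, answer])
        call[1] = min(call[1], start)
        call[2] = max(call[2], answer)

    # Pass 2: bucket each call by time-to-answer, keyed by (date, DID).
    stats = defaultdict(lambda: {"total": 0, "short": 0, "medium": 0, "long": 0})
    for did, start, answer in calls.values():
        wait = (answer - start).total_seconds()
        bucket = "short" if wait < 5 else "medium" if wait <= 10 else "long"
        day = stats[(start.date().isoformat(), did)]
        day["total"] += 1
        day[bucket] += 1
    return dict(stats)


# Four legs copied from the sample table: one single-leg call, one three-leg call.
rows = [
    {"sipcallid": "20202911-3656337069-994458@sip.example.com", "did": "01932855644",
     "start": "2015-11-12 17:11:10", "answer": "2015-11-12 17:11:10"},
    {"sipcallid": "20138716-3656335055-417971@sip.example.com", "did": "01123334455",
     "start": "2015-11-12 16:37:41", "answer": "2015-11-12 16:37:41"},
    {"sipcallid": "20138716-3656335055-417971@sip.example.com", "did": "01123334455",
     "start": "2015-11-12 16:38:08", "answer": "2015-11-12 16:38:08"},
    {"sipcallid": "20138716-3656335055-417971@sip.example.com", "did": "01123334455",
     "start": "2015-11-12 16:38:09", "answer": "2015-11-12 16:38:09"},
]

result = daily_answer_stats(rows)
for key, counts in sorted(result.items()):
    print(key, counts)
```

With millions of rows it would still make sense to let MySQL do the filtering (for instance restricting to a single day) so that only the relevant legs reach Python.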