Python / MySQL: aggregating the results of an aggregate query


I have a table containing millions of call records, and I need to run aggregate statistics on them. The table looks like this:

+----------+----------+-----------+------------------+------+-------------+---------------------+---------------------+----------+-----------+--------------------------------------------+-------------+
| id       | calltype | client_id | extension_number | flow | partyid     | start               | answer              | duration | disposion | sipcallid                                  | did         |
+----------+----------+-----------+------------------+------+-------------+---------------------+---------------------+----------+-----------+--------------------------------------------+-------------+
| 35080566 | out      |       139 | 2222*050         | in   | 01123334455 | 2015-11-12 17:11:10 | 2015-11-12 17:11:10 |        4 | ANSWERED  | 20202911-3656337069-994458@sip.example.com | 01932855644 |
| 35077822 | out      |       139 | 2222*603         | in   | 02114455784 | 2015-11-12 16:37:41 | 2015-11-12 16:37:41 |       27 | ANSWERED  | 20138716-3656335055-417971@sip.example.com | 01123334455 |
| 35077821 | out      |       139 | 2222*603         | in   | 02114455784 | 2015-11-12 16:38:08 | 2015-11-12 16:38:08 |       80 | ANSWERED  | 20138716-3656335055-417971@sip.example.com | 01123334455 |
| 35077820 | local    |       139 | 2222*747         | in   | 2222*605    | 2015-11-12 16:38:09 | 2015-11-12 16:38:09 |       79 | ANSWERED  | 20138716-3656335055-417971@sip.example.com | 01123334455 |
| 35077346 | out      |       139 | 2222*603         | in   | 07841254789 | 2015-11-12 16:26:15 | 2015-11-12 16:26:15 |       27 | ANSWERED  | 20113840-3656334365-407195@sip.example.com | 01123334455 |
| 35077345 | out      |       139 | 2222*603         | in   | 07841254789 | 2015-11-12 16:26:42 | 2015-11-12 16:26:42 |      527 | ANSWERED  | 20113840-3656334365-407195@sip.example.com | 01123334455 |
| 35077344 | local    |       139 | 2222*746         | in   | 2222*609    | 2015-11-12 16:26:43 | 2015-11-12 16:26:43 |      526 | ANSWERED  | 20113840-3656334365-407195@sip.example.com | 01123334455 |
| 35065079 | out      |       139 | 2222*603         | in   | 02415785414 | 2015-11-12 14:37:21 | 2015-11-12 14:37:21 |       21 | ANSWERED  | 19848872-3656327834-411032@sip.example.com | 01123334455 |
| 35065078 | out      |       139 | 2222*603         | in   | 02415785414 | 2015-11-12 14:37:42 | 2015-11-12 14:37:42 |      776 | ANSWERED  | 19848872-3656327834-411032@sip.example.com | 01123334455 |
| 35065077 | local    |       139 | 2222*744         | in   | 2222*604    | 2015-11-12 14:37:42 | 2015-11-12 14:37:42 |      776 | ANSWERED  | 19848872-3656327834-411032@sip.example.com | 01123334455 |
+----------+----------+-----------+------------------+------+-------------+---------------------+---------------------+----------+-----------+--------------------------------------------+-------------+
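
I need to run a query every day to aggregate the data. This should be simple, but as you can see from the data, a single call has multiple rows. For example, the bottom three rows are different legs of the same call, which is evident because they all share the same SIP call ID. The time the call started (i.e. began ringing) is the start column, and the time it was answered is the answer column.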

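I need to produce the following statistics:

Total number of calls per DID

Number of calls answered in < 5 seconds

Number of calls answered in > 10 seconds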

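I have a query that computes MAX(answer) - MIN(start) for each call, which gives me the per-call results for calls answered in the relevant period, but I can't work out how to roll up its output into daily figures.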

My query looks like this:

SELECT ch.did "Inbound DDI", 
  DATE(ch.start) Date, 
  IF((MAX(answer)-MIN(start)) < 5, 1 , 0),
  IF((MAX(answer)-MIN(start)) BETWEEN 5 AND 10, 1 , 0),
  IF((MAX(answer)-MIN(start)) > 10, 1 , 0),
  sipcallid
FROM 
  call_history ch
WHERE 
  flow = 'in' 
  AND ch.did <> ""
  AND ch.client_client_id = 1207 
  AND ch.duration > 0
  AND ch.disposition = "ANSWERED"
  AND DATE(start) = DATE_SUB(CURDATE(), INTERVAL 2 DAY) 
GROUP BY 
  ch.sipcallid;

Is there a way to aggregate the output of this query, or do I have to write a script? For example, I could do this in Python.
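For reference, the script route mentioned in the question could look roughly like the sketch below. It is only an illustration under assumptions: the function name `daily_answer_stats` is made up, the rows are assumed to be already filtered (inbound, answered, non-empty DID) and supplied as dicts with string timestamps, and the bucket boundaries copy the ones in the query above. It collapses the legs of each call by `sipcallid` first, then counts per date and DID.

```python
from collections import defaultdict
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def daily_answer_stats(rows):
    """Aggregate call legs into per-(date, DID) counts.

    rows: iterable of dicts with keys sipcallid, did, start, answer
    (timestamps as 'YYYY-MM-DD HH:MM:SS' strings). Hypothetical helper.
    """
    # First pass: one record per call, keeping the earliest start and
    # the latest answer across all legs (MIN(start) / MAX(answer)).
    calls = {}
    for row in rows:
        start = datetime.strptime(row["start"], FMT)
        answer = datetime.strptime(row["answer"], FMT)
        call = calls.get(row["sipcallid"])
        if call is None:
            calls[row["sipcallid"]] = {
                "did": row["did"], "start": start, "answer": answer,
            }
        else:
            call["start"] = min(call["start"], start)
            call["answer"] = max(call["answer"], answer)

    # Second pass: bucket each call by how long it rang before being answered.
    stats = defaultdict(lambda: {"total": 0, "short": 0, "avg": 0, "long": 0})
    for call in calls.values():
        wait = (call["answer"] - call["start"]).total_seconds()
        bucket = stats[(call["start"].date().isoformat(), call["did"])]
        bucket["total"] += 1
        if wait < 5:
            bucket["short"] += 1
        elif wait <= 10:
            bucket["avg"] += 1
        else:
            bucket["long"] += 1
    return dict(stats)
```

The first pass keeps one dictionary entry per call, so with millions of rows it would make sense to feed it one day's rows at a time from a database cursor rather than loading the whole table.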

As the linked suggestion indicates, you can use a subquery to achieve your goal and avoid a Python script. The query could look like this:

SELECT
  agg.Date,
  SUM(agg.short),
  SUM(agg.long)
FROM (
SELECT ch.did "Inbound DDI", 
  DATE(ch.start) Date, 
  IF((MAX(answer)-MIN(start)) < 5, 1 , 0) as 'short',
  IF((MAX(answer)-MIN(start)) BETWEEN 5 AND 10, 1 , 0) as 'avg',
  IF((MAX(answer)-MIN(start)) > 10, 1 , 0) as 'long',
  sipcallid
FROM 
  call_history ch
WHERE 
  flow = 'in' 
  AND ch.did <> ""
  AND ch.duration > 0
  AND ch.disposion = "ANSWERED"
GROUP BY 
  ch.sipcallid
) agg
GROUP BY agg.Date
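
One thing worth checking before relying on the thresholds: as far as I know, when MySQL subtracts DATETIME values as in `MAX(answer)-MIN(start)`, it first converts each value to a number of the form YYYYMMDDHHMMSS, so the result is only a true count of seconds when both timestamps fall in the same minute. `TIMESTAMPDIFF(SECOND, MIN(start), MAX(answer))` is the usual way to get elapsed seconds. The small Python sketch below reproduces that arithmetic for one call from the sample table (the helper `as_mysql_number` is made up for illustration).

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def as_mysql_number(ts):
    """Return the YYYYMMDDHHMMSS integer form of a timestamp string."""
    return int(datetime.strptime(ts, FMT).strftime("%Y%m%d%H%M%S"))


# Earliest start and latest answer for call 20138716-... in the sample data.
start = "2015-11-12 16:37:41"
answer = "2015-11-12 16:38:09"

# What subtracting the numeric forms yields.
numeric_diff = as_mysql_number(answer) - as_mysql_number(start)

# The real elapsed time in seconds.
elapsed = int(
    (datetime.strptime(answer, FMT) - datetime.strptime(start, FMT)).total_seconds()
)

print(numeric_diff)  # 68
print(elapsed)       # 28
```

Here both values land in the "long" bucket, but a call that rings from 16:37:58 to 16:38:02 (4 seconds) would give 44 under numeric subtraction and be misclassified.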

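Argh. More searching suggests that a script may be the only way out.

Another thing I found that looks helpful is MySQL temporary tables. I can foresee dumping the first-pass aggregated data into a temporary table and then querying that temporary table, so no script is needed.

Thanks, this is useful. My only concern is that, given the size of call_history, the subquery might be too slow to be usable.

I have tried that, but it has failed so far. You could also use a bit of magic and try a count on the sipcallid column, but I'm not sure about that solution.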
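
The temporary-table idea from the comments can be sketched as two passes. The example below uses Python's built-in `sqlite3` as a stand-in for MySQL purely so it is self-contained; the temporary table name `per_call` and the reduced `call_history` schema are assumptions, and in MySQL the elapsed seconds would come from `TIMESTAMPDIFF` rather than SQLite's `strftime('%s', ...)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE call_history (sipcallid TEXT, did TEXT, start TEXT, answer TEXT)"
)
# A few legs taken from the sample table; call ids shortened for readability.
conn.executemany(
    "INSERT INTO call_history VALUES (?, ?, ?, ?)",
    [
        ("a", "01932855644", "2015-11-12 17:11:10", "2015-11-12 17:11:10"),
        ("b", "01123334455", "2015-11-12 16:37:41", "2015-11-12 16:37:41"),
        ("b", "01123334455", "2015-11-12 16:38:08", "2015-11-12 16:38:08"),
        ("b", "01123334455", "2015-11-12 16:38:09", "2015-11-12 16:38:09"),
        ("c", "01123334455", "2015-11-12 16:26:15", "2015-11-12 16:26:15"),
        ("c", "01123334455", "2015-11-12 16:26:42", "2015-11-12 16:26:42"),
        ("c", "01123334455", "2015-11-12 16:26:43", "2015-11-12 16:26:43"),
    ],
)

# Pass 1: collapse the legs of each call into one row in a temporary table.
conn.execute("""
    CREATE TEMP TABLE per_call AS
    SELECT sipcallid,
           MIN(did) AS did,
           DATE(MIN(start)) AS call_date,
           CAST(strftime('%s', MAX(answer)) AS INTEGER)
             - CAST(strftime('%s', MIN(start)) AS INTEGER) AS wait_seconds
    FROM call_history
    GROUP BY sipcallid
""")

# Pass 2: daily figures per DID, computed from the much smaller table.
daily = conn.execute("""
    SELECT call_date,
           did,
           COUNT(*) AS total,
           SUM(CASE WHEN wait_seconds < 5 THEN 1 ELSE 0 END) AS short_calls,
           SUM(CASE WHEN wait_seconds > 10 THEN 1 ELSE 0 END) AS long_calls
    FROM per_call
    GROUP BY call_date, did
    ORDER BY did
""").fetchall()

for row in daily:
    print(row)
```

Whether this is faster than the nested subquery on the real table would need to be measured; the sketch only shows the shape of the two passes.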