Python: from rows to columns using the Peewee ORM

Tags: python, sql, sqlite, peewee

I have a Peewee model that looks like this:

class Status(peewee.Model):
    host = peewee.ForeignKeyField(
        Host,
        backref='checks',
        on_delete='CASCADE')
    check_date = peewee.DateTimeField()
    status = peewee.TextField()
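This records the results of running certain service checks across multiple hosts. Each row contains a single result for a single host, including the check date and the status. The resulting table looks like this: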

+----+---------+----------------------------+---------+
| id | host_id | check_date                 | status  |
+----+---------+----------------------------+---------+
|  1 | 123     | 2020-02-04 17:52:28.716036 | UP      |
|  2 | 321     | 2020-02-04 17:52:28.716036 | REFUSED |
|  3 | 555     | 2020-02-04 17:52:28.716036 | UP      |
...
| 50 | 123     | 2020-02-04 21:21:48.319062 | TIMEOUT |
| 51 | 321     | 2020-02-04 21:21:48.319062 | UNKNOWN |
| 52 | 555     | 2020-02-04 21:21:48.319062 | UP      |
+----+---------+----------------------------+---------+
I would like to generate a summary view like this:

+----------------------------+-----+---------+---------+---------+-------+
| check_date                 | UP  | REFUSED | TIMEOUT | UNKNOWN | TOTAL |
+----------------------------+-----+---------+---------+---------+-------+
| 2020-02-04 17:52:28.716036 | 221 | 34      | 10      | 2       | 267   |
| 2020-02-04 21:21:48.319062 | 230 | 30      | 15      | 4       | 279   |
+----------------------------+-----+---------+---------+---------+-------+
I can do this in SQL:

select
  check_date, 
  count(*) filter (where status = 'UP') as UP,
  count(*) filter (where status = 'REFUSED') as REFUSED,
  count(*) filter (where status = 'TIMEOUT') as TIMEOUT,
  count(*) filter (where status = 'UNKNOWN') as UNKNOWN,
  count(*) as TOTAL
from status
group by check_date
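How would I construct a similar query using Peewee? I know that SQL functions can be accessed through the peewee.fn namespace, but I'm not sure whether those FILTER clauses can be built with that syntax.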

For now, I have worked around this by first running:

status_summary = (
    Status.select(Status.check_date,
                  Status.status,
                  peewee.fn.Count(Status.id).alias('count'))
    .group_by(Status.check_date, Status.status)
    .order_by(Status.check_date, Status.status)
)
This gets me:

+----------------------------+---------+-------+
| check_date                 | status  | count |
+----------------------------+---------+-------+
| 2020-02-04 17:52:28.716036 | UP      | 221   |
| 2020-02-04 17:52:28.716036 | REFUSED | 34    |
| 2020-02-04 17:52:28.716036 | TIMEOUT | 10    |
| 2020-02-04 17:52:28.716036 | UNKNOWN | 2     |
| 2020-02-04 21:21:48.319062 | UP      | 230   |
| 2020-02-04 21:21:48.319062 | REFUSED | 30    |
| 2020-02-04 21:21:48.319062 | TIMEOUT | 15    |
| 2020-02-04 21:21:48.319062 | UNKNOWN | 4     |
+----------------------------+---------+-------+
Then I use itertools.groupby in Python:

import itertools

# groupby only merges adjacent rows, so this relies on the order_by above
status_summary = itertools.groupby(status_summary, lambda x: x.check_date)
status_summary = [
    {'date': date, 'summary': {x.status: x.count for x in results}}
    for date, results in status_summary
]
This gets me:

[
  {
    "date": "2020-02-04 17:52:28.716036",
    "summary": {
      "OPEN": 538,
      "REFUSED": 13,
      "TIMEOUT": 41,
      "UNKNOWN": 4,
      "UNREACHABLE": 2
    }
  },
  {
    "date": "2020-02-04 17:55:22.655965",
    "summary": {
      "OPEN": 533,
      "REFUSED": 15,
      "TIMEOUT": 42,
      "UNKNOWN": 5,
      "UNREACHABLE": 3
    }
  },
  {
    "date": "2020-02-04 18:51:31.937254",
    "summary": {
      "OPEN": 541,
      "REFUSED": 11,
      "TIMEOUT": 41,
      "UNKNOWN": 4,
      "UNREACHABLE": 1
    }
  },
  {
    "date": "2020-02-04 21:21:48.319062",
    "summary": {
      "OPEN": 544,
      "REFUSED": 9,
      "TIMEOUT": 39,
      "UNKNOWN": 4,
      "UNREACHABLE": 2
    }
  },
  {
    "date": "2020-02-05 00:11:23.377746",
    "summary": {
      "OPEN": 547,
      "REFUSED": 8,
      "TIMEOUT": 37,
      "UNKNOWN": 5,
      "UNREACHABLE": 1
    }
  }
]

This is effectively what I want, but the process of getting here feels unnecessarily complicated.

I would use CASE, but COUNT/FILTER should work on Postgres:

select
  check_date, 
  count(*) filter (where status = 'UP') as UP,
  count(*) filter (where status = 'REFUSED') as REFUSED,
  count(*) filter (where status = 'TIMEOUT') as TIMEOUT,
  count(*) filter (where status = 'UNKNOWN') as UNKNOWN,
  count(*) as TOTAL
from status
group by check_date
Peewee should support the following:

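from peewee import fn

# date_trunc is a PostgreSQL function; here it buckets the timestamps by day.
(Status
 .select(
     fn.date_trunc('day', Status.check_date).alias('date'),
     # .filter() adds a FILTER (WHERE ...) clause to the aggregate
     fn.COUNT(Status.id).filter(Status.status == 'UP').alias('up'),
     fn.COUNT(Status.id).filter(Status.status == 'REFUSED').alias('refused'))
 .group_by(fn.date_trunc('day', Status.check_date)))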
But really, you can rely on GROUP BY to get this more simply:

(Status
 .select(
     fn.date_trunc('day', Status.check_date).alias('date'),
     Status.status,
     fn.COUNT(Status.id).alias('count'))
 .group_by(fn.date_trunc('day', Status.check_date), Status.status))

This gives you one row per date and status, but it is more flexible (because you don't have to hardcode all the statuses).
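If one entry per date is still wanted, rows of this shape can be pivoted in plain Python without hardcoding the statuses. The sketch below is one possible way to do it. The pivot helper is hypothetical, and the sample rows are typed in by hand from the tables in the question (with UNKNOWN taken as 2 for the first run, matching the summary view); with Peewee, such tuples could presumably come from calling .tuples() on the grouped query.

```python
from collections import defaultdict

# Hand-typed (check_date, status, count) rows, as the grouped query would return them.
rows = [
    ('2020-02-04 17:52:28.716036', 'UP', 221),
    ('2020-02-04 17:52:28.716036', 'REFUSED', 34),
    ('2020-02-04 17:52:28.716036', 'TIMEOUT', 10),
    ('2020-02-04 17:52:28.716036', 'UNKNOWN', 2),
    ('2020-02-04 21:21:48.319062', 'UP', 230),
    ('2020-02-04 21:21:48.319062', 'REFUSED', 30),
    ('2020-02-04 21:21:48.319062', 'TIMEOUT', 15),
    ('2020-02-04 21:21:48.319062', 'UNKNOWN', 4),
]


def pivot(rows):
    """Collapse (date, status, count) rows into one dict per date."""
    by_date = defaultdict(dict)
    for check_date, status, count in rows:
        by_date[check_date][status] = count
    # Dicts keep insertion order, so dates appear in first-seen order.
    return [
        {'date': date, 'summary': counts, 'total': sum(counts.values())}
        for date, counts in by_date.items()
    ]


result = pivot(rows)
for entry in result:
    print(entry)
```

Unlike itertools.groupby, this does not depend on the rows arriving sorted by date, since each row is filed under its date key directly.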

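Thanks for the answer! But if I understand correctly, this is the same as the query in my question (status_summary = (Status.select(Status.check_date, Status.status, peewee.fn.Count(Status.id).alias('count')).group_by(Status.check_date, Status.status).order_by(Status.check_date, Status.status))), except that it uses date_trunc to change how the date is displayed.

The first query puts the summary for each date in a single row; the second query gives one row per date and status. Just use the first query.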