From rows to columns with Peewee ORM

python sql sqlite

I have a Peewee model that looks like this:

class Status(peewee.Model):
    host = peewee.ForeignKeyField(
        Host,
        backref='checks',
        on_delete='CASCADE')
    check_date = peewee.DateTimeField()
    status = peewee.TextField()

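This records the results of running some service checks on multiple hosts. Each row holds a single result for a single host, including the check date and the status. The resulting table looks like this: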

+----+---------+----------------------------+---------+
| id | host_id | check_date                 | status  |
+----+---------+----------------------------+---------+
|  1 | 123     | 2020-02-04 17:52:28.716036 | UP      |
|  2 | 321     | 2020-02-04 17:52:28.716036 | REFUSED |
|  3 | 555     | 2020-02-04 17:52:28.716036 | UP      |
...
| 50 | 123     | 2020-02-04 21:21:48.319062 | TIMEOUT |
| 51 | 321     | 2020-02-04 21:21:48.319062 | UNKNOWN |
| 52 | 555     | 2020-02-04 21:21:48.319062 | UP      |
+----+---------+----------------------------+---------+

I would like to generate a summary view that looks like this:

+----------------------------+-----+---------+---------+---------+-------+
| check_date                 | UP  | REFUSED | TIMEOUT | UNKNOWN | TOTAL |
+----------------------------+-----+---------+---------+---------+-------+
| 2020-02-04 17:52:28.716036 | 221 | 34      | 10      | 2       | 267   |
| 2020-02-04 21:21:48.319062 | 230 | 30      | 15      | 4       | 279   |
+----------------------------+-----+---------+---------+---------+-------+

I can do this in SQL like this:

select
  check_date, 
  count(*) filter (where status = 'UP') as UP,
  count(*) filter (where status = 'REFUSED') as REFUSED,
  count(*) filter (where status = 'TIMEOUT') as TIMEOUT,
  count(*) filter (where status = 'UNKNOWN') as UNKNOWN,
  count(*) as TOTAL
from status
group by check_date

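How would I construct a similar query using Peewee? I know that SQL functions are available through the peewee.fn namespace, but I'm not sure whether that syntax can express those filter subqueries.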

For now, I have worked around this by first running:

status_summary = (
    Status.select(Status.check_date,
                  Status.status,
                  peewee.fn.Count(Status.id).alias('count'))
    .group_by(Status.check_date, Status.status)
    .order_by(Status.check_date, Status.status)
)

This gives me:

+----------------------------+---------+-------+
| check_date                 | status  | count |
+----------------------------+---------+-------+
| 2020-02-04 17:52:28.716036 | UP      | 221   |
| 2020-02-04 17:52:28.716036 | REFUSED | 34    |
| 2020-02-04 17:52:28.716036 | TIMEOUT | 10    |
| 2020-02-04 17:52:28.716036 | UNKNOWN | 2     |
| 2020-02-04 21:21:48.319062 | UP      | 230   |
| 2020-02-04 21:21:48.319062 | REFUSED | 30    |
| 2020-02-04 21:21:48.319062 | TIMEOUT | 15    |
| 2020-02-04 21:21:48.319062 | UNKNOWN | 4     |
+----------------------------+---------+-------+

Then I use itertools.groupby in Python:

status_summary = itertools.groupby(status_summary, lambda x: x.check_date)
status_summary = [
    {'date': date, 'summary': {x.status: x.count for x in results}}
    for date, results in status_summary
]

This gives me:

[
  {
    "date": "2020-02-04 17:52:28.716036",
    "summary": {
      "OPEN": 538,
      "REFUSED": 13,
      "TIMEOUT": 41,
      "UNKNOWN": 4,
      "UNREACHABLE": 2
    }
  },
  {
    "date": "2020-02-04 17:55:22.655965",
    "summary": {
      "OPEN": 533,
      "REFUSED": 15,
      "TIMEOUT": 42,
      "UNKNOWN": 5,
      "UNREACHABLE": 3
    }
  },
  {
    "date": "2020-02-04 18:51:31.937254",
    "summary": {
      "OPEN": 541,
      "REFUSED": 11,
      "TIMEOUT": 41,
      "UNKNOWN": 4,
      "UNREACHABLE": 1
    }
  },
  {
    "date": "2020-02-04 21:21:48.319062",
    "summary": {
      "OPEN": 544,
      "REFUSED": 9,
      "TIMEOUT": 39,
      "UNKNOWN": 4,
      "UNREACHABLE": 2
    }
  },
  {
    "date": "2020-02-05 00:11:23.377746",
    "summary": {
      "OPEN": 547,
      "REFUSED": 8,
      "TIMEOUT": 37,
      "UNKNOWN": 5,
      "UNREACHABLE": 1
    }
  }
]

This is effectively what I want, but the process of getting there feels needlessly complicated.

I would use CASE, but COUNT/FILTER should work on Postgres.

Peewee should support the following:

(Status
 .select(
     fn.date_trunc('day', Status.check_date).alias('date'),
     fn.COUNT(Status.id).filter(Status.status == 'UP').alias('up'),
     fn.COUNT(Status.id).filter(Status.status == 'REFUSED').alias('refused'))
 .group_by(fn.date_trunc('day', Status.check_date)))

But really, you can rely on GROUP BY to give you something simpler:

(Status
 .select(
     fn.date_trunc('day', Status.check_date).alias('date'),
     Status.status,
     fn.COUNT(Status.id).alias('count'))
 .group_by(fn.date_trunc('day', Status.check_date), Status.status))

This gives you one row per date and status, but it is more flexible, because you don't have to hard-code all the statuses.
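If one row per date is still wanted on the Python side, the long-format rows can be pivoted without hard-coding the statuses. The following is a minimal sketch that assumes the rows arrive as (date, status, count) tuples, for example from calling .tuples() on the query; pivot_counts is a hypothetical helper name and the sample values are copied from the summary table in the question.

```python
from collections import defaultdict


def pivot_counts(rows):
    """Turn (date, status, count) rows into one summary dict per date."""
    by_date = defaultdict(dict)
    for date, status, count in rows:
        by_date[date][status] = count
    # Sort by date so the output order does not depend on the input order.
    return [
        {'date': date, 'summary': counts, 'total': sum(counts.values())}
        for date, counts in sorted(by_date.items())
    ]


# Sample rows in the shape produced by the GROUP BY query above.
rows = [
    ('2020-02-04 17:52:28.716036', 'UP', 221),
    ('2020-02-04 17:52:28.716036', 'REFUSED', 34),
    ('2020-02-04 17:52:28.716036', 'TIMEOUT', 10),
    ('2020-02-04 17:52:28.716036', 'UNKNOWN', 2),
    ('2020-02-04 21:21:48.319062', 'UP', 230),
    ('2020-02-04 21:21:48.319062', 'REFUSED', 30),
    ('2020-02-04 21:21:48.319062', 'TIMEOUT', 15),
    ('2020-02-04 21:21:48.319062', 'UNKNOWN', 4),
]

pivoted = pivot_counts(rows)
for entry in pivoted:
    print(entry['date'], entry['summary'], entry['total'])
```

Unlike itertools.groupby, this does not require the rows to be sorted by date beforehand, and a status that never occurs on a given date is simply absent from that date's summary.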

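Thanks for the answer! But if I understand correctly, this is the same as the query in my question (status_summary = (Status.select(Status.check_date, Status.status, peewee.fn.Count(Status.id).alias('count')).group_by(Status.check_date, Status.status).order_by(Status.check_date, Status.status))), except that it uses date_trunc to change how the date is displayed.

The first query puts the summary in a single row; the second produces one row per status. Just use the first query.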