Python: extending the Diamond PostgresqlCollector


I'm getting two different results from the same query.

I'm extending the Diamond PostgresqlCollector to track new metrics.

Specifically, I'm trying to implement the bloat estimation query specified here:

The problem I'm running into is that when I run the query from the psql command prompt, I get results that include the "public" schema name. But when the query is run by diamond, no results containing "public" show up; entries only appear for pg_catalog and information_schema. I can see this by inspecting the log at /var/log/upstart/diamond.log.

The only cause I can imagine is bad permissions for the "diamond" user, but from the psql command line I can see that the diamond user exists and has superuser privileges, and I do get results from pg_catalog. So I can collect some statistics, just not from the public schema of the database I'm most interested in.

Has anyone extended the postgresql collector and seen this behavior, or does anyone have suggestions for what to try next?
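The logs alone don't show which database, role, or search_path the collector's session ends up with, and schema visibility depends on exactly those. One way to narrow it down is to run the same session-context query from psql and from the collector's connection and compare. The snippet below only assembles that diagnostic SQL (the connection itself is left out, since it depends on your diamond credentials); `has_schema_privilege` is a standard Postgres function:

```python
# Session-context diagnostic to run both from psql and from the
# collector's connection (e.g. by temporarily wiring it in as an extra
# QueryStats subclass). A difference in database, user, or search_path
# would explain why the 'public' rows vanish.
DIAGNOSTIC_SQL = """
SELECT current_database(),
       current_user,
       current_setting('search_path') AS search_path,
       has_schema_privilege(current_user, 'public', 'USAGE') AS can_use_public;
"""

print(DIAGNOSTIC_SQL)
```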

Adding the relevant files here. The system I'm testing on is a Vagrant machine, but I use a Puppet manifest to replicate the production environment as closely as possible.

/etc/diamond/diamond.conf /etc/diamond/configs/postgres-service.conf /usr/local/share/diamond/Collector/custompg/custompg.py
The collector's configuration files and source are provided below.
[server]
pid_file = /var/run/diamond.pid

collectors_path = /usr/share/diamond/collectors/, /usr/local/share/diamond/collectors/
collectors_config_path = /etc/diamond/collectors/

handlers_path = /usr/share/diamond/handlers/
handlers_config_path = /etc/diamond/handlers/

handlers = diamond.handler.archive.ArchiveHandler


[handlers]

# logging handlers
keys = console

[[default]]

[[GraphitePickleHandler]]
host = graphite-01.local
port = 2014
timeout = 15
batch = 10


# ArchiveHandler writes stats to a local logfile.
# Much easier for testing and debugging.
[[ArchiveHandler]]
keys = watched_file
# File to write archive log files
log_file = /var/log/diamond/archive.log


[collectors]
[[default]]
hostname_method = fqdn_rev

interval = 60

[[CPUCollector]]
enabled = True
percore = True

[[DiskSpaceCollector]]
enabled = False

[[DiskUsageCollector]]
enabled = False

[[LoadAverageCollector]]
enabled = True

[[MemoryCollector]]
enabled = True

[[VMStatCollector]]
enabled = False

[[UserScriptsCollector]]
enabled = True

[loggers]
keys = root

[formatters]
keys = default

[logger_root]
level = INFO
handlers = console

[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = default

[handler_watched_file]
class = handlers.WatchedFileHandler
level = DEBUG
formatter = default

[formatter_default]
format = [%(asctime)s] [%(levelname)s] [%(threadName)s] %(message)s

[configs]
path = "/etc/diamond/configs/"
extension = ".conf"
[collectors]
# Custom internal Postgresql collector. See diamond-service/files/collectors/custompg/custompg.py
[[CustomPostgresqlCollector]]
enabled = True
interval = 10
extended = True
metrics_blacklist = [^.]+\.indexes.*
pg_version = 9.3
user = diamond
# has_admin currently only controls if diamond should report how many WAL
# files exist on disk (although the query has a bug in it). However, as an
# unprivileged user, diamond can only see queries that are running as the same
# user. So in order to get the full picture of running queries on a multi-user
# system, diamond should have superuser privileges.
has_admin = False
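One thing worth double-checking in this config: assuming Diamond applies metrics_blacklist as a regex against the full metric path, the pattern above only filters paths whose second segment is "indexes", so it should not be what hides the public-schema table metrics. A quick check with hypothetical metric paths in the shapes the custom QueryStats classes emit:

```python
import re

# The blacklist pattern from postgres-service.conf above.
blacklist = re.compile(r'[^.]+\.indexes.*')

# Hypothetical metric paths (names are illustrative).
filtered = blacklist.match('mydb.indexes.idx_foo.size')          # blocked
kept = blacklist.match('mydb.tables.public.mytable.bloat_size')  # collected

print(filtered is not None, kept is None)
```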
import os
import sys
# Make sure we can import the existing postgres collector
try:
    import postgres
    from postgres import QueryStats, metrics_registry, registry
except ImportError:
    # It's likely that this is being imported in a test or script
    #   outside of the normal diamond runpath.
    # In these instances, add the conventional collector path and try
    #   the import again, equivalent to:
    #   export PYTHONPATH=$PYTHONPATH:/usr/share/diamond/collectors/postgres
    sys.path.append('/usr/share/diamond/collectors/postgres')
    try:
        import postgres
        from postgres import QueryStats, metrics_registry, registry
    except ImportError:
        raise ImportError("Unable to import built-in postgres collector. "
                          "Make sure the collector path is added to PYTHONPATH.")

class CustomPostgresqlCollector(postgres.PostgresqlCollector):
    """
    Collector subclass to differentiate enabling/disabling
    company-specific Postgres metric collection.
    """
    # Even though nothing is being extended, this class is
    # still needed for the additional queries to get picked up
    # by Diamond.
    pass
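For anyone puzzled by the empty subclass: a minimal sketch of the registry pattern the built-in collector appears to use (names mirror the snippet, but the internals here are illustrative, not diamond's real code). Queries are looked up by name at collection time, so the module's job is just to get imported and populate the registries:

```python
# Simplified stand-ins for the structures the postgres collector module
# exposes; the real ones live in /usr/share/diamond/collectors/postgres.
metrics_registry = {}                      # name -> QueryStats-like class
registry = {'basic': [], 'extended': []}

class QueryStats(object):
    query = 'SELECT 1'

class DemoStats(QueryStats):
    query = 'SELECT 2'

# Registering mirrors what custompg.py does at import time.
metrics_registry['DemoStats'] = DemoStats
registry['extended'].append('DemoStats')

# At collection time the collector iterates over the registered names
# ('basic' always, 'extended' only when `extended = True` in the config).
queries_to_run = [metrics_registry[name].query for name in registry['extended']]
print(queries_to_run)
```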

class NonVacuumLongestRunningQueries(QueryStats):
    """
    Differentiate between vacuum and non-vacuum queries.

    The built-in longest running queries metric collection
    doesn't account for/filter vacuum operations.
    """
    path = "%(datname)s.non_vacuum.longest_running.%(metric)s"
    multi_db = True
    # This query is a modified version of
    #  https://github.com/python-diamond/Diamond/blob/0fda1835308255e3ac4b287724340baf16b27bb1/src/collectors/postgres/postgres.py#L506-L519
    base_query = """
    SELECT 'query',
        COALESCE(max(extract(epoch FROM CURRENT_TIMESTAMP-query_start)),0)
    FROM pg_stat_activity
    WHERE %s
    AND %s
    UNION ALL
    SELECT 'transaction',
        COALESCE(max(extract(epoch FROM CURRENT_TIMESTAMP-xact_start)),0)
    FROM pg_stat_activity
    WHERE 1=1
    AND %s
    """
    exclude_vacuum_queries = "query NOT LIKE '%VACUUM%'"

    # Two query versions in case collector needs to run on Postgres < 9.2
    query = base_query % ("current_query NOT LIKE '<IDLE%'",
                          exclude_vacuum_queries,
                          exclude_vacuum_queries)

    post_92_query = base_query % ("state NOT LIKE 'idle%'",
                                  exclude_vacuum_queries,
                                  exclude_vacuum_queries)
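The two class attributes above come from plain %-formatting of base_query. A runnable check (with a shortened stand-in for the real template) shows the vacuum filter lands in both branches, and why the literal % signs inside the predicates are safe: they are substitution values, not part of the template:

```python
# Shortened stand-in for NonVacuumLongestRunningQueries.base_query.
base_query = "SELECT ... WHERE %s AND %s UNION ALL SELECT ... WHERE 1=1 AND %s"
exclude_vacuum = "query NOT LIKE '%VACUUM%'"

post_92_query = base_query % ("state NOT LIKE 'idle%'",
                              exclude_vacuum,
                              exclude_vacuum)
print(post_92_query.count("NOT LIKE '%VACUUM%'"))
```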


class UserTableVacuumStats(QueryStats):
    """Additional per-table vacuuming stats."""
    path = "%(datname)s.tables.%(schemaname)s.%(relname)s.vacuum.%(metric)s"
    multi_db = True
    # http://www.postgresql.org/docs/9.3/static/monitoring-stats.html#PG-STAT-ALL-TABLES-VIEW
    #  Also filter out generally non-volatile system tables.
    base_query = """
       SELECT relname, schemaname, vacuum_count, autovacuum_count
       FROM pg_stat_all_tables
       WHERE schemaname NOT IN ('pg_catalog', 'information_schema');
    """
    query = base_query


class TableBloatSize(QueryStats):
    """ Track estimated table bloat size using modified query written by ioguix:

        https://github.com/ioguix/pgsql-bloat-estimation/blob/master/table/table_bloat.sql
        WARNING: executed with a non-superuser role, the query inspects only
        tables you are granted to read.

    """
    path = "%(datname)s.tables.%(schemaname)s.%(relname)s.%(metric)s"
    multi_db = True
    query = """
        SELECT schemaname, relname, (tblpages-est_tblpages_ff)*bs AS bloat_size
        FROM (
          SELECT ceil( reltuples / ( (bs-page_hdr)/tpl_size ) ) + ceil( toasttuples / 4 ) AS est_tblpages,
            ceil( reltuples / ( (bs-page_hdr)*fillfactor/(tpl_size*100) ) ) + ceil( toasttuples / 4 ) AS est_tblpages_ff,
            tblpages, fillfactor, bs, tblid, schemaname, relname, heappages, toastpages
          FROM (
            SELECT
              ( 4 + tpl_hdr_size + tpl_data_size + (2*ma)
                - CASE WHEN tpl_hdr_size%ma = 0 THEN ma ELSE tpl_hdr_size%ma END
                - CASE WHEN ceil(tpl_data_size)::int%ma = 0 THEN ma ELSE ceil(tpl_data_size)::int%ma END
              ) AS tpl_size, (heappages + toastpages) AS tblpages, heappages,
              toastpages, reltuples, toasttuples, bs, page_hdr, tblid, schemaname, relname, fillfactor
            FROM (
              SELECT
                tbl.oid AS tblid, ns.nspname AS schemaname, tbl.relname AS relname, tbl.reltuples,
                tbl.relpages AS heappages, coalesce(toast.relpages, 0) AS toastpages,
                coalesce(toast.reltuples, 0) AS toasttuples,
                coalesce(substring(
                  array_to_string(tbl.reloptions, ' ')
                  FROM '%fillfactor=#"__#"%' FOR '#')::smallint, 100) AS fillfactor,
                current_setting('block_size')::numeric AS bs,
                CASE WHEN version()~'mingw32' OR version()~'64-bit|x86_64|ppc64|ia64|amd64' THEN 8 ELSE 4 END AS ma,
                24 AS page_hdr,
                23 + CASE WHEN MAX(coalesce(null_frac,0)) > 0 THEN ( 7 + count(*) ) / 8 ELSE 0::int END
                  + CASE WHEN tbl.relhasoids THEN 4 ELSE 0 END AS tpl_hdr_size,
                sum( (1-coalesce(s.null_frac, 0)) * coalesce(s.avg_width, 1024) ) AS tpl_data_size
              FROM pg_attribute AS att
                JOIN pg_class AS tbl ON att.attrelid = tbl.oid
                JOIN pg_namespace AS ns ON ns.oid = tbl.relnamespace
                JOIN pg_stats AS s ON s.schemaname=ns.nspname
                  AND s.tablename = tbl.relname AND s.inherited=false AND s.attname=att.attname
                LEFT JOIN pg_class AS toast ON tbl.reltoastrelid = toast.oid
              WHERE att.attnum > 0 AND NOT att.attisdropped
                AND tbl.relkind = 'r'
              GROUP BY 1,2,3,4,5,6,7,8,9,10, tbl.relhasoids
              ORDER BY 2,3
            ) AS s
          ) AS s2
        ) AS s3
        WHERE schemaname='public';
    """

class BtreeBloatSize(QueryStats):
    """ Track estimated index bloat size using modified query written by ioguix:

        https://github.com/ioguix/pgsql-bloat-estimation/blob/master/btree/btree_bloat.sql
        WARNING: executed with a non-superuser role, the query inspects only indexes on tables you are granted to read.
        WARNING: rows with is_na = 't' are known to have bad statistics ("name" type is not supported). Not relevant to the public schema.
    """
    path = "%(datname)s.tables.%(schemaname)s.%(relname)s.%(indexrelname)s.%(metric)s"
    multi_db = True
    query = """
        SELECT nspname AS schemaname, relname, indexrelname,
            bs*(relpages-est_pages_ff) AS bloat_size
        FROM (
          SELECT coalesce(1 +
               ceil(reltuples/floor((bs-pageopqdata-pagehdr)*fillfactor/(100*(4+nulldatahdrwidth)::float))), 0
            ) AS est_pages_ff,
            bs, nspname, relname, indexrelname, relpages, fillfactor
          FROM (
            SELECT maxalign, bs, nspname, relname, indexrelname, reltuples, relpages, relam, fillfactor,
              ( index_tuple_hdr_bm +
                  maxalign - CASE -- Add padding to the index tuple header to align on MAXALIGN
                    WHEN index_tuple_hdr_bm%maxalign = 0 THEN maxalign
                    ELSE index_tuple_hdr_bm%maxalign
                  END
                + nulldatawidth + maxalign - CASE -- Add padding to the data to align on MAXALIGN
                    WHEN nulldatawidth = 0 THEN 0
                    WHEN nulldatawidth::integer%maxalign = 0 THEN maxalign
                    ELSE nulldatawidth::integer%maxalign
                  END
              )::numeric AS nulldatahdrwidth, pagehdr, pageopqdata
            FROM (
              SELECT
                i.nspname, i.relname, i.indexrelname, i.reltuples, i.relpages, i.relam,
                current_setting('block_size')::numeric AS bs, fillfactor,
                CASE
                  -- MAXALIGN: 4 on 32bits, 8 on 64bits (and mingw32 ?)
                  WHEN version() ~ 'mingw32' OR version() ~ '64-bit|x86_64|ppc64|ia64|amd64' THEN 8
                  ELSE 4
                END AS maxalign,
                /* per page header, fixed size: 20 for 7.X, 24 for others */
                24 AS pagehdr,
                /* per page btree opaque data */
                16 AS pageopqdata,
                /* per tuple header: add IndexAttributeBitMapData if some cols are null-able */
                CASE WHEN max(coalesce(s.null_frac,0)) = 0
                  -- IndexTupleData size
                  THEN 2
                  /* IndexTupleData size + IndexAttributeBitMapData size ( max num fields per index + 8 - 1 /8) */
                  ELSE 2 + (( 32 + 8 - 1 ) / 8)
                END AS index_tuple_hdr_bm,
                /* data length: discount the space saved by null values, using the null fraction from stats */
                sum( (1-coalesce(s.null_frac, 0)) * coalesce(s.avg_width, 1024)) AS nulldatawidth
              FROM pg_attribute AS a
                JOIN (
                  SELECT nspname, tbl.relname AS relname, idx.relname AS indexrelname, idx.reltuples, idx.relpages, idx.relam,
                    indrelid, indexrelid, indkey::smallint[] AS attnum,
                    coalesce(substring(
                      array_to_string(idx.reloptions, ' ')
                       from 'fillfactor=([0-9]+)')::smallint, 90) AS fillfactor
                  FROM pg_index
                    JOIN pg_class idx ON idx.oid=pg_index.indexrelid
                    JOIN pg_class tbl ON tbl.oid=pg_index.indrelid
                    JOIN pg_namespace ON pg_namespace.oid = idx.relnamespace
                  WHERE pg_index.indisvalid AND tbl.relkind = 'r' AND idx.relpages > 0
                ) AS i ON a.attrelid = i.indexrelid
                JOIN pg_stats AS s ON s.schemaname = i.nspname
                  AND ((s.tablename = i.relname AND s.attname = pg_catalog.pg_get_indexdef(a.attrelid, a.attnum, TRUE)) -- stats from tbl
                  OR   (s.tablename = i.indexrelname AND s.attname = a.attname))-- stats from functionnal cols
                JOIN pg_type AS t ON a.atttypid = t.oid
              WHERE a.attnum > 0
              GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9
            ) AS s1
          ) AS s2
            JOIN pg_am am ON s2.relam = am.oid WHERE am.amname = 'btree'
        ) AS sub
        WHERE nspname='public'
        ORDER BY 1,2,3;
    """

# Add the new metric queries to the
# registered set used by the collecting method.
metrics_registry.update({
    'NonVacuumLongestRunningQueries': NonVacuumLongestRunningQueries,
    'UserTableVacuumStats': UserTableVacuumStats,
    'TableBloatSize': TableBloatSize,
    'BtreeBloatSize': BtreeBloatSize,
})

registry['extended'] += ['NonVacuumLongestRunningQueries',
                         'UserTableVacuumStats',
                         'TableBloatSize',
                         'BtreeBloatSize']
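If the new metrics never show up at all, one quick sanity check is that every name appended to registry['extended'] actually resolves in metrics_registry; a mismatch between the two would presumably leave that metric uncollected. A stand-alone sketch of the check, with placeholder dicts standing in for the real registries:

```python
# Stand-ins for the registries imported from the postgres module; the
# metric names mirror the ones registered above.
metrics_registry = {}
registry = {'basic': [], 'extended': []}

new_metrics = ['NonVacuumLongestRunningQueries', 'UserTableVacuumStats',
               'TableBloatSize', 'BtreeBloatSize']
metrics_registry.update({name: object for name in new_metrics})
registry['extended'] += new_metrics

# Any name listed here would be one the collector cannot look up.
missing = [m for m in registry['extended'] if m not in metrics_registry]
print(missing)
```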