Performance: time the query on the server side, and note the following (PostgreSQL, JDBC, psycopg2)

performance postgresql jdbc

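Is the SQL reported at the server the same in each case?

If it is the same, the server-side timings should be the same too.

Is the client generating a cursor rather than passing the SQL straight through?

Is one driver doing a lot of casting or converting between character sets, or implicitly converting other types such as dates or timestamps?

And so on.

For completeness, include the plan data as well; it may show whether there are gross differences in the SQL submitted by the clients.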
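One way to see where the time goes on the client is to split wall-clock time into the `execute()` call and the fetch loop, then compare both against the durations the server logs. The following is a minimal sketch, not a definitive benchmark: `time_query` and `demo` are hypothetical names, and `sqlite3` is used only as a stand-in so the example runs without a PostgreSQL server. Any DB-API 2.0 connection (a psycopg2 connection, for instance) could be passed instead.

```python
import sqlite3
import time


def time_query(conn, sql, fetch_size=10000):
    """Time execute() and the fetch loop separately on a DB-API 2.0 connection."""
    cur = conn.cursor()
    t0 = time.perf_counter()
    cur.execute(sql)
    t1 = time.perf_counter()
    rows = 0
    while True:
        batch = cur.fetchmany(fetch_size)
        if not batch:
            break
        rows += len(batch)
    t2 = time.perf_counter()
    cur.close()
    return {"rows": rows, "execute_s": t1 - t0, "fetch_s": t2 - t1}


def demo():
    # sqlite3 is an assumption made purely so the sketch is self-contained.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, payload TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?)",
        ((i, "x" * 50) for i in range(1000)),
    )
    result = time_query(conn, "SELECT id, payload FROM t", fetch_size=100)
    conn.close()
    return result


if __name__ == "__main__":
    print(demo())
```

Note that with psycopg2's default (unnamed) cursor, `execute()` already transfers the whole result set to the client, so the split is mainly informative with a named, server-side cursor, where each `fetchmany()` goes back to the server.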

Have you tried other ORMs, such as Storm or the DAL in Web2Py?

If you look at the timings, most of the duration falls under {method 'fetchmany' of 'psycopg2._psycopg.cursor' objects}. That could be down to the ORM (SQLAlchemy) calls, but I really think it is the psycopg2 C code/library that wraps Postgres's libpq. Based on the other thread I linked, my gut tells me this is a psycopg2 issue and comes from the way it uses libpq.

Have you considered Jython? With it you can write your code in Python and still use the JDBC driver. Also, an ETL tool (Informatica or Talend) seems like a better solution than either Python or Java.

Regarding the memory consumption of the JDBC solution: by default the driver loads the complete result into memory before you can iterate over it with next(). This behaviour can be changed, and you should then see the JDBC driver use less memory.

Have you tried benchmarking psycopg2 itself rather than an ORM like SQLAlchemy?

We do use COPY where appropriate. Unfortunately, most of our SQL scripts are far more complex and require database scans. We will eventually need to optimize them. So while COPY may work for a very basic "move", we are doing much more than that and need a general solution.

@Brian, just to confirm: you know that you can feed arbitrary SELECT statements (including JOINs, IIRC) to COPY, right?

Yes. However, COPY is optimized for moving data to and from files, and that is what we use it for. It only gets us so far. We need something that can also be used as an ORM. The data I presented above comes from plain SQL queries, with no ORM overhead. COPY does not solve my original problem: we need a Python solution that works.

Actually, I come from an embedded software background, and writing code in C and assembly is something I do regularly. In this case, however, I included the Psql row as a reference point for "pure" C code. So your suggestion to remove the "interface" stuff does not make much sense to me... The goal of our group is to move away from compiled languages entirely, because they are limiting. If we went with Jython this would not be a problem, because Jython would use the same library as Java... which is faster than C on big data (see the Psql row).

    Records            | JDBC  | SQLAlchemy[1] | SQLAlchemy[2] | Psql
    ---------------------------------------------------------------------
    1 (4kB)            | 200ms | 300ms         | 250ms         | 10ms
    10 (8kB)           | 200ms | 300ms         | 250ms         | 10ms
    100 (88kB)         | 200ms | 300ms         | 250ms         | 10ms
    1,000 (600kB)      | 300ms | 300ms         | 370ms         | 100ms
    10,000 (6MB)       | 800ms | 830ms         | 730ms         | 850ms
    100,000 (50MB)     | 4s    | 5s            | 4.6s          | 8s
    1,000,000 (510MB)  | 30s   | 50s           | 50s           | 1m32s
    10,000,000 (5.1GB) | 4m44s | 7m55s         | 6m39s         | n/a
    ---------------------------------------------------------------------
    5,000,000 (2.6GB)  | 2m30s | 4m45s         | 3m52s         | 14m22s
    ---------------------------------------------------------------------
    [1] - With the processrow function
    [2] - Without the processrow function (direct dump)

The test script (Python 2):

    #!/usr/bin/env python
    # testSqlAlchemy.py
    import sys
    try:
        import cdecimal
        sys.modules["decimal"]=cdecimal
    except ImportError,e:
        print >> sys.stderr, "Error: cdecimal didn't load properly."
        raise SystemExit
    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker

    def processrow (row,delimiter="|",null="\N"):
        newrow = []
        for x in row:
            if x is None:
                x = null
            newrow.append(str(x))
        return delimiter.join(newrow)

    fetchsize = 10000
    connectionString = "postgresql+psycopg2://usr:pass@server:port/db"
    eng = create_engine(connectionString, server_side_cursors=True)
    session = sessionmaker(bind=eng)()

    with open("test.sql","r") as queryFD:
        with open("/dev/null","w") as nullDev:
            query = session.execute(queryFD.read())
            cur = query.cursor
            while cur.statusmessage not in ['FETCH 0','CLOSE CURSOR']:
                for row in query.fetchmany(fetchsize):
                    print >> nullDev, processrow(row)

Profiler output, run with the processrow function:

    Fri Mar  4 13:49:45 2011    sqlAlchemy.prof

             415757706 function calls (415756424 primitive calls) in 563.923 CPU seconds

       Ordered by: cumulative time

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.001    0.001  563.924  563.924 {execfile}
            1   25.151   25.151  563.924  563.924 testSqlAlchemy.py:2(<module>)
         1001    0.050    0.000  329.285    0.329 base.py:2679(fetchmany)
         1001    5.503    0.005  314.665    0.314 base.py:2804(_fetchmany_impl)
     10000003    4.328    0.000  307.843    0.000 base.py:2795(_fetchone_impl)
        10011    0.309    0.000  302.743    0.030 base.py:2790(__buffer_rows)
        10011  233.620    0.023  302.425    0.030 {method 'fetchmany' of 'psycopg2._psycopg.cursor' objects}
     10000000  145.459    0.000  209.147    0.000 testSqlAlchemy.py:13(processrow)

Profiler output, run without the processrow function (direct dump):

    Fri Mar  4 14:03:06 2011    sqlAlchemy.prof

             305460312 function calls (305459030 primitive calls) in 536.368 CPU seconds

       Ordered by: cumulative time

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.001    0.001  536.370  536.370 {execfile}
            1   29.503   29.503  536.369  536.369 testSqlAlchemy.py:2(<module>)
         1001    0.066    0.000  333.806    0.333 base.py:2679(fetchmany)
         1001    5.444    0.005  318.462    0.318 base.py:2804(_fetchmany_impl)
     10000003    4.389    0.000  311.647    0.000 base.py:2795(_fetchone_impl)
        10011    0.339    0.000  306.452    0.031 base.py:2790(__buffer_rows)
        10011  235.664    0.024  306.102    0.031 {method 'fetchmany' of 'psycopg2._psycopg.cursor' objects}
     10000000   32.904    0.000  172.802    0.000 base.py:2246(__repr__)
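The point raised in the comments, that COPY accepts an arbitrary SELECT (joins included), can be sketched from Python roughly as follows. This is an illustration under stated assumptions, not the poster's method: `build_copy_statement` and `dump_query` are hypothetical helper names, `conn` is assumed to be an open psycopg2 connection, and the parenthesized option list assumes PostgreSQL 9.0 or later. `cursor.copy_expert()` is the psycopg2 call that streams COPY output into a file-like object. In text format the NULL marker defaults to `\N`, which matches the `null` default used by `processrow`.

```python
def build_copy_statement(select_sql, delimiter="|"):
    """Wrap a SELECT in COPY (...) TO STDOUT using the text format."""
    # COPY does not accept a trailing semicolon inside the parentheses.
    query = select_sql.strip().rstrip(";").strip()
    if not query:
        raise ValueError("empty query")
    if len(delimiter) != 1:
        raise ValueError("COPY requires a single-character delimiter")
    quoted = delimiter.replace("'", "''")  # escape for the SQL string literal
    return "COPY ({}) TO STDOUT WITH (FORMAT text, DELIMITER '{}')".format(
        query, quoted
    )


def dump_query(conn, select_sql, out_file, delimiter="|"):
    """Stream the query result into out_file via psycopg2's copy_expert()."""
    statement = build_copy_statement(select_sql, delimiter)
    cur = conn.cursor()
    try:
        cur.copy_expert(statement, out_file)
    finally:
        cur.close()


if __name__ == "__main__":
    print(build_copy_statement("SELECT a, b FROM t JOIN u USING (id);"))
```

Unlike `processrow`, COPY's text format backslash-escapes delimiter characters that occur inside the data, so the two outputs are not guaranteed to be byte-identical. As the poster notes, this covers bulk dumps only and does not replace an ORM.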

log_min_duration_statement = 0          # log the duration of every statement
log_destination = 'csvlog'              # Valid values are combinations of      
logging_collector = on                # Enable capturing of stderr and csvlog 
log_directory = 'pg_log'                # directory where log files are written,
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,        
debug_print_parse = on
debug_print_rewritten = on
debug_print_plan = on
log_min_messages = info                 # debug1 for all server versions prior to 8.4
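With these settings in place, the server log shows what each client actually sent. The sketch below classifies the message text of logged statements so that a plain statement, an extended-protocol parse/bind/execute, and a cursor DECLARE/FETCH can be told apart. It is an assumption-laden illustration: `parse_message` and `describe` are hypothetical names, the input is assumed to be the message column already extracted from the csvlog, and the sample messages in the usage section are hand-written examples of the `duration: ... ms  statement: ...` form, not captured log output.

```python
import re

# Matches messages such as "duration: 1.234 ms  statement: SELECT 1"
# or "duration: 0.045 ms  execute <unnamed>: SELECT 1".
_DURATION_RE = re.compile(
    r"duration: (?P<ms>\d+(?:\.\d+)?) ms\s+"
    r"(?P<kind>statement|parse|bind|execute)(?: (?P<name>[^:]*))?: (?P<sql>.*)",
    re.DOTALL,
)


def parse_message(message):
    """Return duration, message kind and whitespace-normalized SQL, or None."""
    match = _DURATION_RE.search(message)
    if match is None:
        return None
    sql = " ".join(match.group("sql").split())
    return {"ms": float(match.group("ms")), "kind": match.group("kind"), "sql": sql}


def describe(entry):
    """Label how the client submitted the SQL."""
    head = entry["sql"].upper()
    if head.startswith("DECLARE") and " CURSOR " in head:
        return "server-side cursor declaration"
    if head.startswith("FETCH "):
        return "cursor fetch"
    if entry["kind"] != "statement":
        return "extended-protocol " + entry["kind"]
    return "plain statement"


if __name__ == "__main__":
    samples = [  # illustrative messages only
        'duration: 3.100 ms  statement: DECLARE "c1" CURSOR WITHOUT HOLD FOR SELECT * FROM big_table',
        'duration: 12.500 ms  statement: FETCH FORWARD 10000 FROM "c1"',
        "duration: 0.045 ms  execute <unnamed>: SELECT * FROM big_table",
    ]
    for text in samples:
        entry = parse_message(text)
        print(describe(entry), entry["ms"], entry["sql"])
```

Summing `ms` per label for each client gives a rough server-side total to set against the client-side wall-clock figures in the table.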