Python: downloading a table in batches using cx_Oracle


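I need to download a large table from an Oracle database to a Python server using cx_Oracle. However, RAM on the Python server is limited, so I need to do this in batches.

I already know how to do it for a whole table in one go: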

import cx_Oracle
import pandas as pd

usr = ''
pwd = ''
tns = '(DESCRIPTION=...'
orcl = cx_Oracle.connect(usr, pwd, tns)
curs = orcl.cursor()
printHeader = True
tabletoget = 'BIGTABLE'
sql = "SELECT * FROM " + "SCHEMA." + tabletoget
curs.execute(sql)
data = pd.read_sql(sql, orcl)
data.to_csv(tabletoget + '.csv')

What I don't know how to do is load one batch of, say, 10,000 rows at a time, save it to a CSV, and then join the pieces back together.

You can do this kind of batching directly with cx_Oracle:

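curs.arraysize = 10000          # number of rows returned by each fetchmany() call
curs.execute(sql)
while True:
    rows = curs.fetchmany()     # fetches curs.arraysize rows by default
    if rows:
        write_to_csv(rows)
    if len(rows) < curs.arraysize:
        break                   # a short batch means the result set is exhausted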
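
The loop above calls write_to_csv, which the answer leaves undefined. Here is a minimal sketch of one way to fill it in, assuming the standard csv module and a hypothetical helper named export_cursor_to_csv (not part of cx_Oracle). It writes the header from cursor.description once and then appends each batch to the same file. It uses only standard DB-API cursor methods, so it should behave the same with a cx_Oracle cursor.

```python
import csv


def export_cursor_to_csv(cursor, sql, csv_path, batch_size=10000):
    """Run sql on a DB-API cursor and stream the rows to csv_path in batches.

    Hypothetical helper for illustration; returns the number of rows written.
    """
    cursor.arraysize = batch_size
    cursor.execute(sql)
    total = 0
    # newline="" lets the csv module control line endings itself.
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        # Column names are the first item of each cursor.description entry.
        writer.writerow([col[0] for col in cursor.description])
        while True:
            rows = cursor.fetchmany(batch_size)
            if not rows:
                break
            writer.writerows(rows)
            total += len(rows)
    return total
```

Because every batch is appended to one open file, only batch_size rows are held in memory at a time and no separate step is needed to rejoin the pieces.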


If you are using Oracle Database 12c or later, you can also use the OFFSET and FETCH NEXT ROWS options, like this:
offset = 0
numRowsInBatch = 10000
while True:
    curs.execute("select * from tabletoget offset :offset rows fetch next :nrows rows only",
            offset=offset, nrows=numRowsInBatch)
    rows = curs.fetchall()
    if rows:
        write_to_csv(rows)
    if len(rows) < numRowsInBatch:
        break
    offset += len(rows)


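This option is less efficient than the first one and gives the database more work to do, but depending on your circumstances it may suit you better.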
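
One caveat with the OFFSET approach: each batch is a separate query, and without an ORDER BY the database does not guarantee the same row order from one query to the next, so pages can overlap or skip rows. Here is a sketch that adds an ORDER BY, assuming a hypothetical generator named fetch_pages and a unique key column supplied by the caller (both are illustrative, not from the answer above):

```python
def fetch_pages(cursor, table, order_by, batch_size=10000):
    """Yield lists of rows from table, one OFFSET/FETCH page at a time.

    table and order_by are interpolated into the SQL text, so they must come
    from trusted code, never from user input. order_by should name a unique
    key (hypothetical here) so that page boundaries are stable.
    """
    sql = (
        f"select * from {table} order by {order_by} "
        "offset :offset rows fetch next :nrows rows only"
    )
    offset = 0
    while True:
        # Bind variables are passed as keyword arguments.
        cursor.execute(sql, offset=offset, nrows=batch_size)
        rows = cursor.fetchall()
        if rows:
            yield rows
        if len(rows) < batch_size:
            break  # a short page means there is nothing left to fetch
        offset += len(rows)
```

Each yielded page can be passed to whatever routine writes the CSV, for example `for rows in fetch_pages(curs, "bigtable", "id"): write_to_csv(rows)`.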
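Neither of these examples uses pandas directly. I am not particularly familiar with that package, but if you (or someone else) can adapt this appropriately, hopefully it helps!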
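
For readers who want to stay in pandas, one possible adaptation is the chunksize parameter of pandas.read_sql, which makes it return an iterator of DataFrames rather than a single large one. The sketch below assumes a hypothetical helper named table_to_csv_in_chunks. pandas officially supports SQLAlchemy connectables and sqlite3 connections, so passing a raw cx_Oracle connection may produce a warning even though it generally works.

```python
import pandas as pd


def table_to_csv_in_chunks(sql, con, csv_path, chunksize=10000):
    """Stream a query result to one CSV file, chunksize rows at a time.

    Hypothetical helper; returns the total number of rows written.
    """
    total = 0
    # With chunksize set, read_sql returns an iterator of DataFrames
    # instead of one large DataFrame.
    for i, chunk in enumerate(pd.read_sql(sql, con, chunksize=chunksize)):
        first = i == 0
        chunk.to_csv(
            csv_path,
            mode="w" if first else "a",  # overwrite once, then append
            header=first,                # write column names only once
            index=False,
        )
        total += len(chunk)
    return total
```

Appending every chunk to the same file avoids having to rejoin separate CSV files afterwards.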
You can get the results like this. I am loading the data into a DataFrame:
import cx_Oracle
import time
import pandas

user = "test"
pw = "test"
dsn="localhost:port/TEST"

con = cx_Oracle.connect(user,pw,dsn)
start = time.time()
cur = con.cursor()
cur.arraysize = 10000
try:
    cur.execute( "select * from test_table" )
    names = [ x[0] for x in cur.description]
    rows = cur.fetchall()
    df=pandas.DataFrame( rows, columns=names)
    print(df.shape)
    print(df.head())
finally:
    if cur is not None:
        cur.close()

elapsed = (time.time() - start)
print(elapsed, "seconds")

For general background, cx_Oracle 8 now has a prefetchrows setting that can be tuned along with arraysize. See the cx_Oracle tuning documentation.
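
As a rough illustration of where these two settings live, both are attributes of the cursor and are set before execute() is called. The helper name tune_fetch and the values below are assumptions for the sake of the example, not recommendations; which values work best depends on the workload.

```python
def tune_fetch(cursor, batch_size=10000):
    """Set both fetch-tuning attributes on a cursor before execute().

    Hypothetical helper. prefetchrows is available on cx_Oracle cursors
    from version 8; the values chosen here are purely illustrative.
    """
    cursor.arraysize = batch_size      # rows fetched per round trip
    cursor.prefetchrows = batch_size   # rows returned along with execute()
    return cursor
```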