Python: appending a list of tuples to a DataFrame
Tags: python, pandas
This is the kind of data I want to load:
tlist = [("a1", "a2","a3"),("b1", "b2","b3"),("c1", "c2","c3")]
I can build a DataFrame from it like this:
df=pd.DataFrame([["a1","a2","a3"],["b1","b2","b3"],["c1","c2","c3"]])
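However, the list of tuples is pulled from a database, so I have a loop that fetches one chunk at a time and then appends it.
What is the best way to do this?
The data currently runs to as many as 1 billion rows, and it will keep growing.
Thanks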
df2 = pd.DataFrame(tlist, columns=['col1', 'col2', 'col3'])
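When the tuples arrive chunk by chunk, one common pattern is to collect them in a plain Python list and build the DataFrame once at the end, since extending a list is cheap while growing a DataFrame copies it each time. A minimal sketch, assuming the whole result fits in memory; the `chunks` list is a hypothetical stand-in for whatever the database loop returns:

```python
import pandas as pd

labels = ["col1", "col2", "col3"]

# Hypothetical stand-in for the database: each inner list is one fetched chunk.
chunks = [
    [("a1", "a2", "a3"), ("b1", "b2", "b3")],
    [("c1", "c2", "c3")],
]

all_rows = []
for chunk in chunks:
    # Extending a list does not copy the rows collected so far.
    all_rows.extend(chunk)

# Build the DataFrame a single time, after the loop.
df = pd.DataFrame.from_records(all_rows, columns=labels)
print(df)
```

With data on the order of a billion rows this will most likely not fit in RAM, so it is only a starting point for smaller extracts.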
I found a simpler solution:
import cx_Oracle
import pandas as pd

conn_str = "scott/tiger@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=db1.org.waaw.com)(PORT=1234))(CONNECT_DATA=(SERVICE_NAME=hatsx)))"
con = cx_Oracle.connect(conn_str)
cursor = con.cursor()
cursor.arraysize = 10000  # fetchmany() returns this many rows per call by default

labels = ['col1', 'col2', 'col3']
# placeholder seed row; it needs exactly one value per label
rowsx = [("xxxxxxxxxx", "xxxxxxxxxx", "xxxxxxxxxx")]
df = pd.DataFrame(rowsx, columns=labels)

# verify the connection
print(con.version)

# very big table
sql2 = """Select col1,col2,col3 from bigT"""
try:
    cursor.execute(sql2)
except cx_Oracle.DatabaseError:
    print('Failed \n' + sql2)
    raise  # nothing to fetch if the query failed

# need to do it in chunks: not enough memory to fetch everything at once
while True:
    rows = cursor.fetchmany()
    if not rows:
        break
    df2 = pd.DataFrame.from_records(rows, columns=labels)
    # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
    df = pd.concat([df, df2])
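The chunked loop above can also be wrapped in a generator so that all per-chunk frames are concatenated in one call instead of growing `df` on every iteration. This is only a sketch: `iter_frames` is a hypothetical helper, and an in-memory `sqlite3` database stands in for the Oracle connection so the example runs anywhere. Any DB-API cursor with `fetchmany()` should behave similarly.

```python
import sqlite3

import pandas as pd


def iter_frames(cursor, labels, size=10000):
    """Yield one DataFrame per fetchmany() batch until the cursor is exhausted."""
    while True:
        rows = cursor.fetchmany(size)
        if not rows:
            break
        yield pd.DataFrame.from_records(rows, columns=labels)


# Stand-in database (assumption: the real code would use the Oracle connection).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE bigT (col1 TEXT, col2 TEXT, col3 TEXT)")
con.executemany(
    "INSERT INTO bigT VALUES (?, ?, ?)",
    [(f"a{i}", f"b{i}", f"c{i}") for i in range(25)],
)

labels = ["col1", "col2", "col3"]
cursor = con.cursor()
cursor.execute("SELECT col1, col2, col3 FROM bigT")

# Concatenate once; 25 rows arrive here in batches of 10, 10 and 5.
df = pd.concat(list(iter_frames(cursor, labels, size=10)), ignore_index=True)
con.close()
print(len(df))
```

A single `pd.concat` avoids re-copying the accumulated rows on each pass, but the final frame still has to fit in memory.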
I don't understand. Please clarify; this post is unclear. Please see the pseudocode I added.
import pandas as pd
print(con.version)
query = """select * from all_tab_columns"""
df_ora = pd.read_sql(query, con=con)
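For a table too large to hold in memory, `pd.read_sql` also accepts a `chunksize` argument, in which case it returns an iterator of DataFrames rather than one big frame. A hedged sketch follows, again using an in-memory `sqlite3` database as a stand-in for Oracle; the table contents and the running sum are made up for illustration. Note that pandas documents SQLAlchemy connectables and `sqlite3` connections as supported, so a raw `cx_Oracle` connection may work but can emit a warning.

```python
import sqlite3

import pandas as pd

# Stand-in database with a small table (assumption: real data lives in Oracle).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE bigT (col1 TEXT, col2 INTEGER)")
con.executemany("INSERT INTO bigT VALUES (?, ?)", [("x", i) for i in range(10)])

total = 0
n_chunks = 0
# With chunksize set, read_sql yields DataFrames of at most 4 rows each.
for chunk in pd.read_sql("SELECT col1, col2 FROM bigT", con, chunksize=4):
    total += int(chunk["col2"].sum())  # process each chunk, then let it go
    n_chunks += 1

con.close()
print(total, n_chunks)
```

Reducing or writing out each chunk as it arrives keeps memory use bounded by the chunk size instead of the table size.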