Python: update one Postgres table from another Postgres table

python python-3.x postgresql

I am using Python to load a batch CSV file into Postgres (say, table A). I upload the data in chunks with pandas, which is quite fast:

for chunk in pd.read_csv(csv_file, sep='|', chunksize=chunk_size, low_memory=False):

Now I want to use table A to update another table (say, table B) according to the following rules:

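If table A contains a record that is not in table B (matched on the Id field), insert it into table B as a new record.
If a record in table A has the same Id as a record in table B, update the record in table B with the values from table A (several tables need to be updated from table A in this way).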
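
The two rules amount to an upsert keyed on Id. The sketch below only pins down those semantics on small in-memory data, for example to check what a SQL solution should produce. It assumes each row is a dict with an `id` key, and the names `apply_rules`, `table_a` and `table_b` are hypothetical. It is not meant as a way to process 1.8 million rows.

```python
def apply_rules(table_a, table_b):
    """Return table B after applying the two rules, matching rows on 'id'."""
    # Index table B by id so each lookup is a constant-time dict access.
    merged = {row["id"]: dict(row) for row in table_b}
    for row in table_a:
        if row["id"] in merged:
            # Rule 2: same Id in both tables, so A's values overwrite B's.
            merged[row["id"]].update(row)
        else:
            # Rule 1: Id only in A, so the row is inserted into B.
            merged[row["id"]] = dict(row)
    return list(merged.values())


# Hypothetical sample data.
table_a = [{"id": 1, "col1": "new"}, {"id": 3, "col1": "added"}]
table_b = [{"id": 1, "col1": "old"}, {"id": 2, "col1": "kept"}]
result = apply_rules(table_a, table_b)
```

Here Id 1 is updated, Id 2 is left untouched, and Id 3 is inserted.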

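I can do this with the approach below, looping over every row, but table A always has around 1,825,172 records and the loop is very slow. Can anyone help speed this up or suggest an alternative way to achieve it?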

cursor.execute(sql)
records = cursor.fetchall()

for row in records:
    # Replace NULLs with 0. The first column is the Id used to match
    # against table B and decide between insert and update.
    id = 0 if row[0] is None else row[0]
    id2 = 0 if row[1] is None else row[1]
    id3 = 0 if row[2] is None else row[2]

You can take advantage of the Postgres upsert syntax, for example:

insert into tableB (id, col1, col2)
select ta.id, ta.col1, ta.col2 from tableA ta
on conflict (id) do update
    set col1 = excluded.col1, col2 = excluded.col2;
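
When the column list is long, the statement can be generated rather than typed by hand. The following is a minimal sketch, not a definitive implementation: `build_upsert_sql` is a hypothetical helper, and it assumes table and column names are plain unquoted identifiers. It uses `EXCLUDED`, which in `ON CONFLICT ... DO UPDATE` refers to the row that was proposed for insertion.

```python
import re

# Conservative pattern for unquoted identifiers (an assumption of this sketch).
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_upsert_sql(target, source, key, columns):
    """Build an INSERT ... SELECT ... ON CONFLICT DO UPDATE statement."""
    # Identifiers cannot be sent as query parameters, so validate them here.
    for name in (target, source, key, *columns):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    all_cols = ", ".join([key, *columns])
    # Each non-key column takes the value from the incoming (EXCLUDED) row.
    assignments = ", ".join(f"{c} = EXCLUDED.{c}" for c in columns)
    return (
        f"INSERT INTO {target} ({all_cols}) "
        f"SELECT {all_cols} FROM {source} "
        f"ON CONFLICT ({key}) DO UPDATE SET {assignments}"
    )


sql = build_upsert_sql("tableB", "tableA", "id", ["col1", "col2"])
print(sql)
```

The resulting string can be passed to `cursor.execute` followed by a commit. Note that Postgres rejects the statement if the source rows contain the same key more than once, because a single command cannot update the same target row twice.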

You should do this entirely inside the DBMS instead of looping over the records in a Python script. That lets the DBMS optimize the work much better.

UPDATE TableB
SET    x = TableA.y
FROM TableA
WHERE TableA.id = TableB.id;

INSERT INTO TableB(id,x)
SELECT id, y
FROM TableA
WHERE TableA.id NOT IN ( SELECT id FROM TableB );
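
Both statements can be issued from Python so that they succeed or fail together. This is a sketch under stated assumptions: `conn` is a DB-API connection such as one from psycopg2 with autocommit off, and `sync_b_from_a` is a hypothetical name. The insert is written with `NOT EXISTS` rather than `NOT IN`, because `NOT IN` matches no rows at all if the subquery returns a NULL id.

```python
UPDATE_SQL = """
UPDATE TableB
SET    x = TableA.y
FROM   TableA
WHERE  TableA.id = TableB.id
"""

INSERT_SQL = """
INSERT INTO TableB (id, x)
SELECT a.id, a.y
FROM   TableA a
WHERE  NOT EXISTS (SELECT 1 FROM TableB b WHERE b.id = a.id)
"""


def sync_b_from_a(conn):
    """Update matching rows, then insert missing ones, in one transaction."""
    cur = conn.cursor()
    try:
        cur.execute(UPDATE_SQL)
        updated = cur.rowcount      # rows in B that had a match in A
        cur.execute(INSERT_SQL)
        inserted = cur.rowcount     # rows from A that were new to B
        conn.commit()
    except Exception:
        # Undo the partial work so B is never left half-synchronized.
        conn.rollback()
        raise
    finally:
        cur.close()
    return updated, inserted
```

Running the update before the insert avoids immediately re-updating rows that were just inserted.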

Thanks for sharing the upsert statement, it is really nice, but I cannot define the constraint right now because table B may contain duplicate rows.
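
`ON CONFLICT (id)` needs a unique index or constraint on `id`, so any duplicate ids in table B would have to be found and resolved before the upsert can be used. A hedged sketch of such a check follows. It assumes a psycopg2-style connection (which uses `%s` placeholders) and a table literally named `tableB`, and `find_duplicate_ids` is a hypothetical helper.

```python
DUPLICATE_IDS_SQL = """
SELECT id, count(*) AS n
FROM   tableB
GROUP  BY id
HAVING count(*) > 1
ORDER  BY n DESC
LIMIT  %s
"""


def find_duplicate_ids(conn, limit=20):
    """Return up to `limit` (id, count) pairs for ids occurring more than once."""
    cur = conn.cursor()
    try:
        # The limit is passed as a query parameter, not formatted into the SQL.
        cur.execute(DUPLICATE_IDS_SQL, (limit,))
        return cur.fetchall()
    finally:
        cur.close()
```

An empty result means a unique index on `id` can be created. Otherwise the separate UPDATE and INSERT statements remain an option, since they do not depend on a constraint.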