Excel: How to catch duplicates?
I'm fairly new to Python and am importing an Excel sheet into PostgreSQL. The Address column in the Excel file contains duplicates, and I want to catch them. What is the most workable way to do that?
Tags: excel, postgresql, python-2.7, psycopg2
import psycopg2
import xlrd
book = xlrd.open_workbook("data.xlsx")
sheet = book.sheet_by_name("List")
database = psycopg2.connect (database = "Excel", user="SQL", password="PASS", host="YES", port="DB")
cursor = database.cursor()
delete = """Drop table if exists "Python".list"""
print (delete)
mydata = cursor.execute(delete)
cursor.execute('''CREATE TABLE "Python".list
(DCAD_Prop_ID varchar(50),
Address varchar(50),
Addition varchar(50),
Block varchar(50),
Lot integer,
Project_ID integer
);''')
print "Table created successfully"
query = """INSERT INTO "Python".list (DCAD_Prop_ID, Address,Addition,Block ,Lot,Project_ID)
VALUES (%s, %s, %s, %s, %s, %s)"""
for r in range(1, sheet.nrows):
    DCAD_Prop_ID = sheet.cell(r,0).value
    Address = sheet.cell(r,1).value
    Addition = sheet.cell(r,2).value
    Block = sheet.cell(r,3).value
    Lot = sheet.cell(r,4).value
    Project_ID = sheet.cell(r,5).value
    values = (DCAD_Prop_ID, Address,Addition,Block ,Lot,Project_ID)
    cursor.execute(query, values)
cursor.close()
database.commit()
database.close()
print ""
print "All Done! Bye, for now."
print ""
columns = str(sheet.ncols)
rows = str(sheet.nrows)
print "I just imported Excel into postgreSQL"
This returns the duplicated rows:
select DCAD_Prop_ID, Address,Addition,Block,Lot,Project_ID, count(*)
from "Python".list
group by 1,2,3,4,5,6
having count(*) > 1
To eliminate the duplicates in Python, use a set.
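A minimal sketch of the set-based idea, assuming each Excel row has already been read into a tuple of its six cell values. The sample rows below are made up for illustration. A set remembers which rows have been seen, so any later repeat can be caught instead of inserted:

```python
# Hypothetical sample rows in the table's column order:
# (DCAD_Prop_ID, Address, Addition, Block, Lot, Project_ID)
rows = [
    ("A1", "100 Main St", "Oak", "B", 1, 10),
    ("A2", "200 Elm St", "Oak", "C", 2, 10),
    ("A1", "100 Main St", "Oak", "B", 1, 10),  # exact repeat of the first row
]

seen = set()       # rows encountered so far
unique_rows = []   # first occurrence of each row, in the original order
duplicates = []    # every later repeat

for row in rows:
    if row in seen:
        duplicates.append(row)
    else:
        seen.add(row)
        unique_rows.append(row)

print(len(unique_rows))  # 2
print(duplicates)
```

Like the `group by 1,2,3,4,5,6` query, this only treats rows as duplicates when all six values match exactly.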
Thanks for replying, Clodoaldo. This works at the SQL level, but I'd like to catch the duplicates while the program is running in Python. Is that possible in Python?
@PLearner Sure. Use the set, as in the edit.
Thanks, Clodoaldo. How do I apply the set in my original script? Would it be something like Address = set(sheet.cell(r,1).value)?
You first need to read all the rows from Excel, adding them to the set. Then insert into PostgreSQL in a separate loop.
Thanks for not downvoting my question, @Clodoaldo.
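The two-pass approach described in the comments could look roughly like the sketch below. It is not the answerer's code. The helper names `read_rows`, `split_by_address` and `insert_rows` are hypothetical, and it assumes `sheet`, `cursor` and `query` are the objects from the question's script, with Address in column index 1. Note that `set(sheet.cell(r,1).value)` would build a set of the characters of one address string, which is why the set has to live outside the loop and collect one address per row.

```python
def read_rows(sheet, ncols=6):
    """First pass: read every data row (skipping header row 0) into a tuple."""
    return [
        tuple(sheet.cell(r, c).value for c in range(ncols))
        for r in range(1, sheet.nrows)
    ]


def split_by_address(rows, address_index=1):
    """Separate rows whose Address was already seen from first occurrences."""
    seen = set()            # addresses encountered so far
    unique_rows = []        # rows to insert
    duplicate_rows = []     # rows caught as duplicates
    for row in rows:
        address = row[address_index]
        if address in seen:
            duplicate_rows.append(row)
        else:
            seen.add(address)
            unique_rows.append(row)
    return unique_rows, duplicate_rows


def insert_rows(cursor, query, rows):
    """Second pass: insert only the rows that were kept."""
    for row in rows:
        cursor.execute(query, row)
```

In the question's script, the `for r in range(1, sheet.nrows)` loop would be replaced by `unique_rows, duplicate_rows = split_by_address(read_rows(sheet))` followed by `insert_rows(cursor, query, unique_rows)`, and `duplicate_rows` can then be printed or logged. The comparison is an exact string match, so addresses that differ only in case or spacing are not caught unless they are normalized first.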