Python: writing rows from a Postgres table to a CSV file using Pandas

python pandas postgresql

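I want to write rows from a Postgres table to a given NAS (mnt) folder, but only after performing some data hygiene checks.

Below is the code that actually works, but it only selects columns from the table and saves the data to a .txt file.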

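import csv
import os
import psycopg2
import time
import pandas as pd

# File path and name.
filePath = '/mnt/nasdatavant/datavant/covid_implementation/postgres_files/'
timestr = time.strftime("%Y-%m-%d-%H%M%S")
fileName = 'covid-' + timestr + '.txt'

# Database connection variable.
connect = None

# Check if the file path exists.
if os.path.exists(filePath):
    connect = psycopg2.connect(...)  # supply your connection parameters
    with connect.cursor() as cursor:
        sqlSelect = "select patnt_last_nm as cust_last_nm, \
                     patnt_frst_nm as cust_frst_nm, \
                     patnt_brth_dt as cust_brth_dt, \
                     patnt_gendr_cd as cust_gendr_cd, \
                     patnt_postl_cd as cust_postl_cd, \
                     indiv_entpr_id \
                     from datavant_o.covid_patnt_registry"
        cursor.execute(sqlSelect)
        results = cursor.fetchall()
        print(results)

        # Column names of the result set, used as the header row.
        headers = [i[0] for i in cursor.description]
        csvFile = csv.writer(open(filePath + fileName, 'w', newline=''),
                             delimiter='|', lineterminator='\r\n', escapechar='\\')
        csvFile.writerow(headers)
        csvFile.writerows(results)

    connect.commit()
    connect.close()
else:
    print("File path does not exist.")

Note: print(results) produces a list of tuples, for example:

[('TOBY', 'MICHEAL', datetime.date(1986, 8, 23), 'M', '06472', '872956'),
('CARLIE', 'NAHLA', datetime.date(1979, 9, 29),..etc]

So I loaded the results into a DataFrame:

df = pd.DataFrame(results)

What I really want now is to add data hygiene checks like the following before writing to the .txt file.

What I tried:

csvFile = csv.writer(open(filePath + fileName, 'w', newline=''),
                     delimiter='|', lineterminator='\r\n', escapechar='\\')
df = pd.DataFrame(results)
print(df)
df = df.dropna(axis=0)
df = df.loc[
    (df[0].astype('str').str.len() > 1) &
    (df[1].astype('str').str.len() > 1) &
    (df[4].astype('str').str.len() > 4) &
    (df[4].astype('str').str.len() ...

Error:
csvFile.writerows(df)
_csv.Error: iterable expected, not int

Output file format (pipe-delimited, with a header row):

cust_last_nm|cust_frst_nm|cust_brth_dt|cust_gendr_cd|cust_postl_cd|indiv_entpr_id
TOBY|MICHEAL|1986-08-23|M|06472|872956
CARLIE|NAHLA|1979-09-29|F|06757|499666 ...etc

Try this, and adapt it to your needs.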

import os
import psycopg2
import time
import pandas as pd

# File path and name.
filePath = '/mnt/nasdatavant/datavant/covid_implementation/postgres_files/'
timestr = time.strftime("%Y-%m-%d-%H%M%S")
fileName = 'covid-' + timestr + '.csv'

# Check if the file path exists.
if os.path.exists(filePath):
    connect = psycopg2.connect("dbname=postgres user=postgres")
    with connect.cursor() as cursor:

        # create test table
        cursor.execute("create table if not exists test_table (test_column int)")
        cursor.execute("insert into test_table (test_column) values (10);")

        # query and import as Dataframe
        df = pd.read_sql("select * from test_table", connect)

        # add here your cleaning operations
        df = df.dropna(axis=0)

        # export
        df.to_csv(os.path.join(filePath, fileName), index=False)

        connect.commit()
        connect.close()
else:
    print("File path does not exist.")
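
The cleaning step marked in the code above could be filled in along the following lines. This is only a minimal sketch under some assumptions: instead of a live query it uses the two sample rows from the question plus two invented rows that should be rejected, it names the columns after the header of the question's output file, and it leaves out the upper bound on the postal code length because that part of the check is cut off in the question. The helper names `clean` and `write_pipe_delimited` are hypothetical, and an in-memory buffer stands in for the file on the NAS.

```python
import csv
import datetime
import io

import pandas as pd

# Column names taken from the header of the question's output file.
COLUMNS = ["cust_last_nm", "cust_frst_nm", "cust_brth_dt",
           "cust_gendr_cd", "cust_postl_cd", "indiv_entpr_id"]

# Stand-in for cursor.fetchall(): the two sample rows from the question
# plus two invented rows that should fail the checks.
results = [
    ("TOBY", "MICHEAL", datetime.date(1986, 8, 23), "M", "06472", "872956"),
    ("CARLIE", "NAHLA", datetime.date(1979, 9, 29), "F", "06757", "499666"),
    ("DOE", None, datetime.date(2000, 1, 1), "F", "06000", "111111"),  # invented: missing value
    ("LEE", "ANN", datetime.date(2001, 2, 3), "F", "123", "222222"),   # invented: short postal code
]


def clean(df):
    """Drop rows with missing values, then apply the question's length checks."""
    df = df.dropna(axis=0)
    keep = (
        (df["cust_last_nm"].astype(str).str.len() > 1)
        & (df["cust_frst_nm"].astype(str).str.len() > 1)
        & (df["cust_postl_cd"].astype(str).str.len() > 4)
    )
    return df.loc[keep]


def write_pipe_delimited(df, target):
    """Write header and rows with the csv.writer settings used in the question."""
    writer = csv.writer(target, delimiter="|", lineterminator="\r\n",
                        escapechar="\\")
    writer.writerow(df.columns)
    # writerows needs one sequence per row, so pass the rows as plain tuples
    # rather than the DataFrame itself.
    writer.writerows(df.itertuples(index=False, name=None))


df = pd.DataFrame(results, columns=COLUMNS)
cleaned = clean(df)

buffer = io.StringIO()  # stands in for open(path, "w", newline="")
write_pipe_delimited(cleaned, buffer)
print(buffer.getvalue())
```

Passing the DataFrame itself to `writerows`, as in the question, fails because iterating over a DataFrame yields its column labels (the integers 0 to 5 when no column names are given) rather than its rows; `itertuples` hands the writer one tuple per row instead. If the `csv.writer` settings are not needed, `df.to_csv(path, sep='|', index=False)` is the pandas-native alternative.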