Python 3.x: How to efficiently automate SQL code using Python
Tags: python-3.x, pandas, group-by
I have SQL code as shown below.
Database: RedShift
WITH
X as
(
    SELECT distinct pn , pg , ic, sr , cm, fq , m1 , m2 , m3 , m4
    FROM table1 ORDER BY 1,2,3
),
table2 AS
(
    Select g,p,t , avg(ss) as ss , avg(ce) as ce , sum(av) as ps
    from
    (
        select distinct ic AS g , pn AS p , cm AS t , ss , cast((sum_m1/nullif(sum_m2,0)) as decimal(3,2)) as ce , av
        from
        (
            select *
            , cast((sum(m3) over (partition by ic, pn,cm)) as decimal) as ss
            , sum(m1) over (partition by ic, pn,cm) as sum_m1
            , sum(m2) over (partition by ic, pn,cm) as sum_m2
            , cast((avg(m2) over (partition by ic, pn,cm)) as decimal) as av
            from X
            ORDER BY 1,2,3
        )
        order by 1,2,3
    )
    where ss is not null
    group by 1,2,3
    order by 1,2,3
)
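The group-by values g, p, t change each time, so a table is created for every new combination of g, p, t values.
One way to automate this is to dump the SQL code into Python, which may be inefficient.
Below is my approach: I replace the values with curly braces and fill them in from lists.
For example, I store all possible group-by values in lists:
G = [g1, g2, g3]
P = [p1, p2, p3]
T = [t1, t2, t3]
Connect to the database: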
import pandas as pd
import psycopg2

# db, host, port, user and password are assumed to be defined elsewhere
c = psycopg2.connect(database=db, host=host, port=port, user=user, password=password, sslmode='require')
data2 = {}
for g in G:
    for p in P:
        for t in T:
            # The three {} placeholders are filled positionally by .format(g, p, t)
            sqlstr = (""" WITH
            X as
            (
                SELECT distinct pn , pg , ic, sr , cm, fq , m1 , m2 , m3 , m4
                FROM table1 ORDER BY 1,2,3
            ),
            table2 AS
            (
                Select {},{},{} , avg(ss) as ss , avg(ce) as ce , sum(av) as ps
                from
                (
                    select distinct ic AS g , pn AS p , cm AS t , ss , cast((sum_m1/nullif(sum_m2,0)) as decimal(3,2)) as ce , av
                    from
                    (
                        select *
                        , cast((sum(m3) over (partition by ic, pn,cm)) as decimal) as ss
                        , sum(m1) over (partition by ic, pn,cm) as sum_m1
                        , sum(m2) over (partition by ic, pn,cm) as sum_m2
                        , cast((avg(m2) over (partition by ic, pn,cm)) as decimal) as av
                        from X
                        ORDER BY 1,2,3
                    )
                    order by 1,2,3
                )
                where ss is not null
                group by 1,2,3
                order by 1,2,3
            )
            select * from table2 """).format(g, p, t)
            # One DataFrame per (g, p, t) combination, keyed as "g_p_t"
            data2[g + "_" + p + "_" + t] = pd.read_sql_query(sqlstr, c)
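Is there a better way to pass the parameters? In the code above, the order of the {} placeholders has to be maintained so that the parameters are passed in the right order.
Can we use an approach other than SQL? Something Pythonic?

How about from itertools import product, and then using for g, p, t in product(G, P, T) instead of the nested for loops: …

That's cool. But how do I pass the parameters into the SQL code? Any alternative way of passing parameters would help.

Using format is fine. But you can use f-strings, since they look cleaner (Python 3.6+).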
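
The two suggestions in the comments can be combined in a small sketch. It assumes a shortened, hypothetical SQL_TEMPLATE standing in for the full query, and a hypothetical helper build_queries; neither name comes from the original post. Named placeholders ({g}, {p}, {t}) remove the need to keep the arguments in positional order, and itertools.product replaces the three nested loops.

```python
from itertools import product

# Hypothetical, shortened stand-in for the full query in the question.
# Named placeholders replace the positional {} so argument order no longer matters.
SQL_TEMPLATE = """
SELECT {g}, {p}, {t}, avg(ss) AS ss, avg(ce) AS ce, sum(av) AS ps
FROM table2
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3
"""


def build_queries(G, P, T):
    """Return a dict mapping "g_p_t" to the SQL text for every combination."""
    queries = {}
    for g, p, t in product(G, P, T):  # replaces the three nested for loops
        # Keyword arguments can be given in any order
        queries[f"{g}_{p}_{t}"] = SQL_TEMPLATE.format(t=t, g=g, p=p)
    return queries


queries = build_queries(["g1", "g2"], ["p1", "p2"], ["t1"])
for key, sql in queries.items():
    print(key)
    # With a live connection c, as in the question:
    # data2[key] = pd.read_sql_query(sql, c)
```

String formatting of this kind is only reasonable when the lists come from trusted code. If the substituted items are data values rather than column names, pd.read_sql_query also accepts a params argument, which with psycopg2 pairs with %s or %(name)s placeholders and lets the driver handle quoting.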