Python 3.x: How to efficiently automate SQL code with Python

I have a SQL query, shown below. Database: RedShift

WITH
X AS
(
  SELECT DISTINCT pn, pg, ic, sr, cm, fq, m1, m2, m3, m4
  FROM table1
  ORDER BY 1,2,3
),

table2 AS
(
  SELECT g, p, t, avg(ss) AS ss, avg(ce) AS ce, sum(av) AS ps
  FROM
  (
    SELECT DISTINCT ic AS g, pn AS p, cm AS t, ss,
           cast((sum_m1/nullif(sum_m2,0)) AS decimal(3,2)) AS ce, av
    FROM
    (
      SELECT *
        , cast((sum(m3) OVER (PARTITION BY ic, pn, cm)) AS decimal) AS ss
        , sum(m1) OVER (PARTITION BY ic, pn, cm) AS sum_m1
        , sum(m2) OVER (PARTITION BY ic, pn, cm) AS sum_m2
        , cast((avg(m2) OVER (PARTITION BY ic, pn, cm)) AS decimal) AS av
      FROM X
      ORDER BY 1,2,3
    )
    ORDER BY 1,2,3
  )
  WHERE ss IS NOT NULL
  GROUP BY 1,2,3
  ORDER BY 1,2,3
)

The group-by values g, p, t change every time, so the query creates a table for each new combination of g, p, t values.

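One way to automate this is to dump the SQL code into Python, which may be inefficient. Here is my approach: I put curly-brace placeholders in the query and fill them with the values from the lists. For example:

I store all possible group-by values in lists:

G = [g1, g2, g3]

P = [p1, p2, p3]

T = [t1, t2, t3]

Connect to the database and run the query for every combination: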

import pandas as pd
import psycopg2

# db, host, port, user and password are assumed to be defined elsewhere
c = psycopg2.connect(database=db, host=host, port=port, user=user,
                     password=password, sslmode='require')
data2 = {}

for g in G:
    for p in P:
        for t in T:
            # The three {} placeholders are filled positionally with g, p, t
            sqlstr = ("""
                WITH
                X AS
                (
                  SELECT DISTINCT pn, pg, ic, sr, cm, fq, m1, m2, m3, m4
                  FROM table1 ORDER BY 1,2,3
                ),
                table2 AS
                (
                  SELECT {},{},{} , avg(ss) AS ss, avg(ce) AS ce, sum(av) AS ps
                  FROM
                  (
                    SELECT DISTINCT ic AS g, pn AS p, cm AS t, ss,
                           cast((sum_m1/nullif(sum_m2,0)) AS decimal(3,2)) AS ce, av
                    FROM
                    (
                      SELECT *
                        , cast((sum(m3) OVER (PARTITION BY ic, pn, cm)) AS decimal) AS ss
                        , sum(m1) OVER (PARTITION BY ic, pn, cm) AS sum_m1
                        , sum(m2) OVER (PARTITION BY ic, pn, cm) AS sum_m2
                        , cast((avg(m2) OVER (PARTITION BY ic, pn, cm)) AS decimal) AS av
                      FROM X
                      ORDER BY 1,2,3
                    )
                    ORDER BY 1,2,3
                  )
                  WHERE ss IS NOT NULL
                  GROUP BY 1,2,3
                  ORDER BY 1,2,3
                )
                SELECT * FROM table2
                """).format(g, p, t)
            # Store one DataFrame per (g, p, t) combination
            data2[g + "_" + p + "_" + t] = pd.read_sql_query(sqlstr, c)

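Is there a better way to pass the parameters? In the code above, the order of the {} placeholders has to be maintained so that the arguments are passed in sequence.

Can we use something other than SQL for this? A more Pythonic way?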
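
On the "Pythonic" side, one possible direction is to run the grouped query once and split the result in pandas, instead of issuing one query per combination. The following is only a sketch under assumptions: it supposes the single query returns columns named g, p and t (the aliases used in the SQL above), and the sample rows and the `split_by_group` helper are made up for illustration.

```python
import pandas as pd


def split_by_group(df, keys=("g", "p", "t")):
    """Return a dict mapping 'g_p_t' to the rows of that combination."""
    tables = {}
    # groupby over several columns yields (key_tuple, sub_frame) pairs
    for key_values, sub_frame in df.groupby(list(keys)):
        name = "_".join(str(value) for value in key_values)
        tables[name] = sub_frame.reset_index(drop=True)
    return tables


# Hypothetical stand-in for the result of pd.read_sql_query(...)
result = pd.DataFrame(
    {
        "g": ["g1", "g1", "g2"],
        "p": ["p1", "p1", "p2"],
        "t": ["t1", "t1", "t3"],
        "ss": [1.0, 2.0, 3.0],
    }
)

data2 = split_by_group(result)
```

Only combinations that actually occur in the data get an entry, so there is no need to enumerate G, P and T up front.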

How about `from itertools import product` and then `for g, p, t in product(G, P, T):` instead of the nested for loops?
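
A small sketch of that suggestion, with placeholder string values standing in for the real group-by values (the `combination_keys` helper is hypothetical). `itertools.product` visits the combinations in the same order as the three nested loops, with the last list varying fastest:

```python
from itertools import product

# Placeholder values; the real lists would hold the actual group-by values
G = ["g1", "g2", "g3"]
P = ["p1", "p2", "p3"]
T = ["t1", "t2", "t3"]


def combination_keys(gs, ps, ts):
    """Build one 'g_p_t' key per combination, in nested-loop order."""
    return [f"{g}_{p}_{t}" for g, p, t in product(gs, ps, ts)]


keys = combination_keys(G, P, T)
```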
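That's cool. But how do I pass the parameters into the SQL code? Any alternative way of passing the parameters would help.
Using format is fine. But you can use f-strings, since they look cleaner (Python 3.6+).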
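
To make the ordering concern concrete, here is a hedged sketch comparing named `str.format` fields with an f-string. The query is a shortened stand-in for the full SQL, and `some_subquery`, `build_query` and `build_query_fstring` are hypothetical names. With named fields, the order of the arguments no longer matters:

```python
# Shortened stand-in for the full query; some_subquery is hypothetical
SQL_TEMPLATE = """
SELECT {g}, {p}, {t}, avg(ss) AS ss
FROM some_subquery
GROUP BY 1, 2, 3
"""


def build_query(g, p, t):
    """Fill the named placeholders; keyword order is irrelevant."""
    return SQL_TEMPLATE.format(t=t, g=g, p=p)


def build_query_fstring(g, p, t):
    """Same result with an f-string, which reads names from local scope."""
    return f"""
SELECT {g}, {p}, {t}, avg(ss) AS ss
FROM some_subquery
GROUP BY 1, 2, 3
"""


query = build_query("ic", "pn", "cm")

# If g, p, t are data values rather than column names, driver-side binding
# is usually safer than string formatting: psycopg2 accepts %(name)s markers,
# and pandas.read_sql_query takes the values through its params argument.
```

A template with named fields can be defined once and reused across the loop, whereas an f-string has to be written where the variables are in scope.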