Python: find the date/fiscal quarter (yyyymmdd) in a file name, then change the date in the file name to yyyymmdd + three months
Problem: there are more than 100 tables to combine, and the only difference between their names is the date. Rather than writing 100+ code blocks by hand and changing only the date, we would like a Python script that outputs the SQL code. Our SQL query reads one table (i.e. one fiscal quarter), and the naming convention of each file name includes the date (fiscal quarter). We want to query every fiscal quarter from 19921231 up to the present:
19921231
19930331
19930630
19930930
19931231
19940331
19940630
19940930
19941231
19950331
19950630
19950930
19951231
…..
….
20180930
The script would:
Step one: find the yyyy/mm/dd in the file name (e.g. 19921231)
Step two: copy the query
Step three: change the yyyy/mm/dd in the copied file name(s)
IF 1231 change to 0331 and add 1 to the year (e.g. 19921231 to 19930331)
IF 0331 change to 0630 (e.g. 19930331 to 19930630)
IF 0630 change to 0930 (e.g. 19930630 to 19930930)
IF 0930 change to 1231 (e.g. 19930930 to 19931231)
IF 1231 change to +1 0331 (e.g. 19931231 to 19940331)
…..
…..
…..
IF 91231 change to 00331 (e.g. 19991231 to 20000331)
….
IF 91231 change to 00331 (e.g. 20091231 to 20100331)
Step four: print new code block after UNION ALL
Step five: repeat step three
Step six: repeat step four
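The date-rolling rules in step three can be sketched as a small lookup on the mmdd suffix. This is a minimal illustration, not part of the original post; the function name `next_quarter_end` is hypothetical, and it assumes every date is one of the four quarter-end dates listed above.

```python
def next_quarter_end(stamp: str) -> str:
    """Return the quarter-end date string (yyyymmdd) that follows `stamp`."""
    # Map each quarter-end suffix (mmdd) to the suffix of the next quarter.
    following = {"0331": "0630", "0630": "0930", "0930": "1231", "1231": "0331"}
    year, suffix = int(stamp[:4]), stamp[4:]
    if suffix not in following:
        raise ValueError(f"not a quarter-end date: {stamp}")
    # Only the December quarter rolls over into the next year.
    if suffix == "1231":
        year += 1
    return f"{year:04d}{following[suffix]}"


print(next_quarter_end("19921231"))  # 19930331
print(next_quarter_end("19930331"))  # 19930630
print(next_quarter_end("19991231"))  # 20000331
```

Because the year is handled as an integer, the century rollover (19991231 to 20000331) needs no special case.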
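The input would be the query for a single fiscal quarter, and the output is that query repeated 100+ times, with only the yyyymmdd date changed in each table name. The blocks are joined with UNION ALL: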
SELECT
(SELECT
PCR.repdte
FROM
All_Reports_19921231_Performance_and_Condition_Ratios as
PCR) AS Quarter,
(SELECT
Round(AVG(PCR.lnlsdepr))
FROM
All_Reports_19921231_Performance_and_Condition_Ratios as
PCR) AS NetLoansAndLeasesToDeposits,
(SELECT sum(CAST(LD.IDdepsam as int))
FROM
'All_Reports_19921231_Deposits_Based_on_the_Dollars250,000
_Reporting_Threshold' AS LD) AS
DepositAccountsWith$LessThan$250k
UNION ALL
SELECT
(SELECT
PCR.repdte
FROM
All_Reports_19930331_Performance_and_Condition_Ratios as
PCR) AS Quarter,
(SELECT
Round(AVG(PCR.lnlsdepr))
FROM All_Reports_19930331_Performance_and_Condition_Ratios
as PCR) AS NetLoansAndLeasesToDeposits,
(SELECT sum(CAST(LD.IDdepsam as int))
FROM
'All_Reports_19930331_Deposits_Based_on_the_Dollars250,000
_Reporting_Threshold' AS LD) AS
DepositAccountsWith$LessThan$250k
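The first step is to write a generator that yields date strings in three-month increments, from a given date string up to the present. We can store the initial date in a datetime object, use that object to produce a new date string at each step, and stop once the date passes the current date. We can then use calendar.monthrange to find the last day of a given month: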
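from datetime import datetime
from calendar import monthrange

def dates_from(start):
    # Parse the starting yyyymmdd string, e.g. "19921231".
    date = datetime.strptime(start, "%Y%m%d")
    today = datetime.today()
    # Yield quarter-end strings until the date passes today.
    while date < today:
        yield date.strftime("%Y%m%d")
        # Advance three months, rolling over into the next year if needed.
        month = date.month + 3
        year = date.year
        if month > 12:
            year += 1
            month -= 12
        # monthrange returns (weekday of the first day, number of days in the month).
        _, day = monthrange(year, month)
        date = datetime(year, month, day)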
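To check the sequence against the list in the question without depending on today's date, here is a variant sketch with an explicit end date. The name `quarter_ends` and its `end` parameter are assumptions for illustration, not part of the answer; it also assumes the start string falls on a month end.

```python
from calendar import monthrange
from datetime import date


def quarter_ends(start: str, end: str):
    """Yield yyyymmdd strings every three months from start through end (inclusive)."""
    year, month = int(start[:4]), int(start[4:6])
    last = date(int(end[:4]), int(end[4:6]), int(end[6:8]))
    while True:
        # Snap to the last day of the current month.
        current = date(year, month, monthrange(year, month)[1])
        if current > last:
            break
        yield current.strftime("%Y%m%d")
        # Advance three months, rolling over into the next year if needed.
        month += 3
        if month > 12:
            year += 1
            month -= 12


print(list(quarter_ends("19921231", "19931231")))
# ['19921231', '19930331', '19930630', '19930930', '19931231']
print(len(list(quarter_ends("19921231", "20180930"))))  # 104
```

The count of 104 quarters between 19921231 and 20180930 matches the "100+ tables" mentioned in the question.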
# SQL for one quarter; each {0} is replaced by the yyyymmdd date string.
sql_template = """\
SELECT
(SELECT
PCR.repdte
FROM
All_Reports_{0}_Performance_and_Condition_Ratios as
PCR) AS Quarter,
(SELECT
Round(AVG(PCR.lnlsdepr))
FROM
All_Reports_{0}_Performance_and_Condition_Ratios as
PCR) AS NetLoansAndLeasesToDeposits,
(SELECT sum(CAST(LD.IDdepsam as int))
FROM
'All_Reports_{0}_Deposits_Based_on_the_Dollars250,000
_Reporting_Threshold' AS LD) AS
DepositAccountsWith$LessThan$250k"""
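# Fill the template once per quarter-end date, then join the blocks with UNION ALL.
queries = map(sql_template.format, dates_from("19921231"))
output_string = "\nUNION ALL\n".join(queries)
print(output_string)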
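If only the text of an existing single-quarter query is available rather than a template, steps one and three from the question (find the date, then change it) can be approximated with a regular expression. This is a sketch under the assumption that every table name starts with `All_Reports_` followed by an eight-digit date; the names `DATE_IN_TABLE_NAME`, `find_dates` and `with_date` are hypothetical.

```python
import re

# Matches the eight-digit date that directly follows "All_Reports_" in a table name.
DATE_IN_TABLE_NAME = re.compile(r"(?<=All_Reports_)\d{8}")


def find_dates(query: str) -> list:
    """Return the distinct yyyymmdd dates found in the table names of `query`."""
    return sorted(set(DATE_IN_TABLE_NAME.findall(query)))


def with_date(query: str, new_date: str) -> str:
    """Return `query` with every table-name date replaced by `new_date`."""
    return DATE_IN_TABLE_NAME.sub(new_date, query)


query = (
    "SELECT PCR.repdte FROM "
    "All_Reports_19921231_Performance_and_Condition_Ratios AS PCR"
)
print(find_dates(query))  # ['19921231']
print(with_date(query, "19930331"))
```

The lookbehind keeps the match anchored to table names, so other eight-digit numbers in the query are left alone.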