Python: optimizing a query and sorting the results
I have the following two tables:
Table 1
ID | Date1 | Date 2
------------------------
1 | 20180201| 20180201
1 | 20190201| 20190120
1 | 20200201| 20200129
2 | 20200810| 20200731
3 | 20191121| 20191023
3 | 20201030| 20201024
.
.
.
Table 2
ID | Tag | rel_ID
------------------------
1 | related | 3
1 | related | 8
1 | related | 10
2 | related | 4
2 | related | 5
.
.
.
The desired output is as follows:
ID|Date1|Date2|rel_ID_1|Date1|Date2|rel_ID_2|Date1|Date2|rel_ID_3|Date1|Date2|Prim ID|Date1|Date2
1|20200201|20200129|3|20201030|20201024|8|20201104|20201030|10|20200301|20200229|1|20200201|20200129
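For each ID and each rel_ID, the Date1 and Date2 fields must be max(Date1) and max(Date2) for that ID. Once I have those values, I need to decide whether the ID or one of its rel_IDs is the primary ID: the primary is the one whose Date1 is the smallest among all the related IDs.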
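The rule above can be sketched in plain Python to make the expected result concrete. This is only an illustration under stated assumptions, not the query being asked for: the function names `latest_dates` and `primary_ids` are hypothetical, dates are treated as YYYYMMDD integers, and the rows for IDs 8 and 10 are taken from the desired output row because they are not in the posted Table 1 sample.

```python
from collections import defaultdict

# Table 1 rows as (ID, Date1, Date2). The rows for IDs 8 and 10 are assumed
# from the desired output row; they are not in the posted sample.
table1 = [
    (1, 20180201, 20180201),
    (1, 20190201, 20190120),
    (1, 20200201, 20200129),
    (2, 20200810, 20200731),
    (3, 20191121, 20191023),
    (3, 20201030, 20201024),
    (8, 20201104, 20201030),
    (10, 20200301, 20200229),
]
# Table 2 rows as (ID, rel_ID); the Tag column is omitted.
table2 = [(1, 3), (1, 8), (1, 10), (2, 4), (2, 5)]


def latest_dates(rows):
    """Return {ID: (max Date1, max Date2)}; each maximum is taken independently."""
    latest = {}
    for id_, d1, d2 in rows:
        old1, old2 = latest.get(id_, (d1, d2))
        latest[id_] = (max(old1, d1), max(old2, d2))
    return latest


def primary_ids(latest, relations):
    """Return {ID: (primary ID, Date1, Date2)}, picking the smallest latest Date1."""
    groups = defaultdict(list)
    for id_, rel_id in relations:
        groups[id_].append(rel_id)
    result = {}
    for id_, rel_ids in groups.items():
        # IDs with no rows in Table 1 cannot be compared, so they are skipped.
        candidates = [i for i in [id_, *rel_ids] if i in latest]
        if candidates:
            prim = min(candidates, key=lambda i: latest[i][0])
            result[id_] = (prim, *latest[prim])
    return result


latest = latest_dates(table1)
primary = primary_ids(latest, table2)
print(primary[1])
```

For ID 1 this picks ID 1 itself as the primary, which agrees with the Prim ID columns of the desired output row.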
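I tried several subqueries to do this, but they take a very long time to run. Table 2 has about one million rows and Table 1 has about 800k rows.
Can this be done with a single query, or does it need code wrapped around the SQL?
One query I tried, which runs for a very long time, is: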
select t2.ID, t1.date1, t1.date2, t2.rel_ID,
       rel_date1 = (select max(date1) from t1 where ID = rel_ID),
       rel_date2 = (select max(date2) from t1 where ID = rel_ID)
from
    (select ID, max(date1) as date1, max(date2) as date2 from t1 group by ID) as t1,
    (select distinct ID, rel_ID from t2) as t2
where t1.ID = t2.ID
The following may help:
-- Latest Date1 and Date2 per ID, computed once instead of per row
select t1.id, max(date1) as date1, max(date2) as date2
into new_t1
from t1
group by t1.id;

-- One row per (id, rel_ID) pair, carrying the dates of both sides
select t2.id, t1.date1 as date1_max, t1.date2 as date2_max, t2.rel_ID, t1_1.date1, t1_1.date2
into #temp
from t2
left join new_t1 t1 on t1.id = t2.id
left join new_t1 t1_1 on t1_1.id = t2.rel_ID;

-- Build three output columns per distinct rel_ID, then pivot with dynamic SQL
DECLARE @cols AS NVARCHAR(MAX),
        @query AS NVARCHAR(MAX);

SET @cols = STUFF((select distinct ',
max(CASE WHEN [rel_ID]=''' + CAST([rel_ID] as varchar(10)) + ''' then ''' + CAST([rel_ID] as varchar(10)) + ''' end) as [rel_ID],
max(CASE WHEN [rel_ID]=''' + CAST([rel_ID] as varchar(10)) + ''' THEN CONVERT(VARCHAR(10), [date1]) ELSE ''0'' END) AS [' + CAST([rel_ID] as varchar(10)) + '_date1],
max(CASE WHEN [rel_ID]=''' + CAST([rel_ID] as varchar(10)) + ''' THEN CONVERT(VARCHAR(10), [date2]) ELSE ''0'' END) AS [' + CAST([rel_ID] as varchar(10)) + '_date2]'
FROM #temp
FOR XML PATH(''),type).value('.','varchar(max)'),1,2,'');

SET @query = 'SELECT id, date1_max as date1, date2_max as date2,' + @cols + ' FROM #temp group by id, date1_max, date2_max';
print (@query);
exec (@query);
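The aggregate-once-then-join pattern can also be driven from Python, with the widening step done in code rather than in dynamic SQL. The sketch below is an illustration under assumptions, not a tested SQL Server solution: it uses the standard library's in-memory SQLite as a stand-in for SQL Server, takes the table names `t1` and `t2` from the query in the question, and takes the rows for IDs 8 and 10 from the desired output row since they are not in the posted sample.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (ID INTEGER, date1 INTEGER, date2 INTEGER);
    CREATE TABLE t2 (ID INTEGER, Tag TEXT, rel_ID INTEGER);
""")
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?)", [
    (1, 20180201, 20180201), (1, 20190201, 20190120), (1, 20200201, 20200129),
    (2, 20200810, 20200731),
    (3, 20191121, 20191023), (3, 20201030, 20201024),
    # Rows for IDs 8 and 10 are assumed from the desired output row.
    (8, 20201104, 20201030), (10, 20200301, 20200229),
])
conn.executemany("INSERT INTO t2 VALUES (?, ?, ?)", [
    (1, "related", 3), (1, "related", 8), (1, "related", 10),
    (2, "related", 4), (2, "related", 5),
])

# Aggregate t1 once, then join the aggregate to t2 twice:
# once for the ID side and once for the rel_ID side.
QUERY = """
WITH agg AS (
    SELECT ID, MAX(date1) AS date1, MAX(date2) AS date2
    FROM t1
    GROUP BY ID
)
SELECT t2.ID, a.date1, a.date2, t2.rel_ID, r.date1, r.date2
FROM t2
LEFT JOIN agg AS a ON a.ID = t2.ID
LEFT JOIN agg AS r ON r.ID = t2.rel_ID
ORDER BY t2.ID, t2.rel_ID
"""

# Widen in Python: one list per ID, extended by three values per related ID.
wide = {}
for id_, d1, d2, rel_id, rel_d1, rel_d2 in conn.execute(QUERY):
    row = wide.setdefault(id_, [id_, d1, d2])
    row.extend([rel_id, rel_d1, rel_d2])

print(wide[1])
```

Widening in client code keeps the SQL result a fixed six columns, so the number of related IDs per ID does not change the query. Related IDs with no rows in `t1` come back as `None` rather than `'0'`.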
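Please show us the results you expect for this sample data.
Please post the queries you have tried so far.
The expected results and the query I tried are now posted in the original question.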