SQL: Finding missing dates in data
I receive one txt data file per day from each of several antennas. The file naming convention is: unique antenna ID + year + month + day + random 3-digit number. I parsed the file names and created the following table:
AntennaID fileyear filemonth fileday filenumber filename
0000 2016 09 22 459 000020160922459.txt
0000 2016 09 21 981 000020160921981.txt
0000 2016 09 20 762 000020160920762.txt
0001 2016 09 22 635 000120160922635.txt
.
.
.
etc. (200k rows)
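Sometimes an antenna sends more than one file, or no file at all. When more than one file is sent, the unique 3-digit file number distinguishes them, but what I am trying to find is the dates on which no file was sent.
I tried a few GROUP BY statements that compare the number of data files in a month with the number of days in that month. The problem is that an antenna sometimes sends more than one file per day, so a plain count comparison can artificially make up for missing files.
I am looking for a more robust way to find the dates, or date ranges, of missing files. I have looked into PARTITION BY and the related functions and feel they might have potential, but I am not sure how to use them because I am still fairly new to SQL.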
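I am using Microsoft SQL Server 2016.

You can use a common table expression (CTE for short) to create a table of dates. You can then join from this table to your antenna data and look for dates that return a null value (see the query below that left-joins Dates to AntennaData):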
You can use NOT EXISTS (see the query below that reads from master.dbo.spt_values):
This is another example of the so-called gaps-and-islands problem. Searching for that term turns up many solutions.
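To make the gaps-and-islands idea concrete, here is a minimal sketch that reports missing date ranges per antenna with LAG. The table variable, its column types, and the sample rows are assumptions made up for illustration; only the column names come from the question. It requires SQL Server 2012 or later for LAG and DATEFROMPARTS.

```sql
-- Hypothetical sample data mirroring the question's table (column types are assumed).
declare @AntennaData table
(
    AntennaID  char(4),
    fileyear   char(4),
    filemonth  char(2),
    fileday    char(2),
    filenumber char(3)
);

insert into @AntennaData values
    ('0000', '2016', '09', '17', '101'),
    ('0000', '2016', '09', '18', '202'),
    ('0000', '2016', '09', '18', '303'),  -- second file on the same day
    ('0000', '2016', '09', '22', '459'),  -- nothing received on the 19th, 20th, 21st
    ('0001', '2016', '09', '21', '777'),
    ('0001', '2016', '09', '22', '635');

with FileDates as
(
    -- One row per antenna per day, so extra files on a day cannot hide a gap.
    select distinct
           AntennaID,
           datefromparts(cast(fileyear as int),
                         cast(filemonth as int),
                         cast(fileday as int)) as FileDate
    from @AntennaData
),
WithPrevious as
(
    -- Pair each file date with the previous file date of the same antenna.
    select AntennaID,
           FileDate,
           lag(FileDate) over (partition by AntennaID order by FileDate) as PreviousDate
    from FileDates
)
select AntennaID,
       dateadd(day, 1, PreviousDate) as GapStart,
       dateadd(day, -1, FileDate)    as GapEnd
from WithPrevious
where datediff(day, PreviousDate, FileDate) > 1  -- more than one day apart means a gap
order by AntennaID, GapStart;
```

With the sample rows above, this returns a single gap for antenna 0000, from 2016-09-19 to 2016-09-21. Note that this approach only sees gaps between two received files; days missing before an antenna's first file or after its last one need a date table like the ones in the queries that follow.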
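-- Date range to check (here, the last 50 days)
declare @MinDate date = dateadd(day, -50, getdate());
declare @MaxDate date = getdate();

-- Recursive CTE: one row per day from @MinDate to @MaxDate
with Dates as
(
    select @MinDate as DateValue
    union all
    select dateadd(d, 1, DateValue)
    from Dates
    where DateValue < @MaxDate
)
select d.DateValue
from Dates d
    left join AntennaData a
        -- Assumes the date parts are stored as zero-padded strings ('2016', '09', '22')
        on d.DateValue = cast(cast(a.fileyear as nvarchar(4)) + cast(a.filemonth as nvarchar(4)) + cast(a.fileday as nvarchar(4)) as date)
where a.AntennaID is null   -- keep only the dates with no matching file
option (maxrecursion 0);    -- lift the default limit of 100 recursion levels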

An alternative that avoids recursion builds the same date table from a tally table:

declare @MinDate date = dateadd(day, -50, getdate());
declare @MaxDate date = getdate();

-- Generate a table with 10 rows
with t(t) as (select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1)
-- Number the rows of the cross join and add each number to @MinDate.
-- row_number() - 1 starts at 0 so that @MinDate itself is kept;
-- top(datediff + 1) ensures both @MinDate and @MaxDate are included.
,d(d) as (select top(datediff(day, @MinDate, @MaxDate)+1) dateadd(day,row_number() over (order by (select null))-1,@MinDate)
          from t t1,t t2,t t3,t t4,t t5,t t6 -- cross join yields 10^6 = 1,000,000 rows
         )
select *
from d;
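
DECLARE @BeginDate DATE, @EndDate DATE;
SET @BeginDate = '20160101';
SET @EndDate = '20160922';

-- master.dbo.spt_values is an undocumented system table; its type = 'P' rows
-- hold the numbers 0 to 2047, which limits the range to about 5.6 years.
WITH Dates AS
(
    SELECT DATEADD(DAY,number,@BeginDate) [Date]
    FROM master.dbo.spt_values
    WHERE type = 'P'
    AND DATEADD(DAY,number,@BeginDate) <= @EndDate
)
SELECT *
FROM Dates A
-- Characters 5 to 12 of the file name are the yyyymmdd date,
-- which is implicitly converted to DATE for the comparison.
WHERE NOT EXISTS(SELECT 1 FROM dbo.Antenna
                 WHERE SUBSTRING([filename],5,8) = A.[Date]);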
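
The queries above list dates on which no antenna at all sent a file. Since the question asks about each antenna separately, the following sketch pairs every antenna with every date in the range and keeps the pairs that have no file. The table variable, its column types, the sample rows, and the CTE names Calendar and Antennas are assumptions for illustration only.

```sql
-- Hypothetical sample data mirroring the question's table (column types are assumed).
declare @AntennaData table
(
    AntennaID  char(4),
    fileyear   char(4),
    filemonth  char(2),
    fileday    char(2),
    filenumber char(3)
);

insert into @AntennaData values
    ('0000', '2016', '09', '20', '762'),
    ('0000', '2016', '09', '22', '459'),
    ('0001', '2016', '09', '20', '111'),
    ('0001', '2016', '09', '21', '222'),
    ('0001', '2016', '09', '22', '635');

declare @BeginDate date = '20160920';
declare @EndDate   date = '20160922';

with Calendar as
(
    -- Recursive CTE: one row per day in the range being checked.
    select @BeginDate as DateValue
    union all
    select dateadd(day, 1, DateValue)
    from Calendar
    where DateValue < @EndDate
),
Antennas as
(
    -- Every antenna that has sent at least one file.
    select distinct AntennaID
    from @AntennaData
)
select a.AntennaID,
       c.DateValue as MissingDate
from Antennas a
    cross join Calendar c   -- every (antenna, date) pair that should have a file
where not exists
(
    select 1
    from @AntennaData f
    where f.AntennaID = a.AntennaID
      and datefromparts(cast(f.fileyear as int),
                        cast(f.filemonth as int),
                        cast(f.fileday as int)) = c.DateValue
)
order by a.AntennaID, c.DateValue
option (maxrecursion 0);    -- lift the default limit of 100 recursion levels
```

With the sample rows above, the only row returned is antenna 0000 on 2016-09-21. An antenna that never sent any file will not appear in Antennas; if a master list of antennas exists, it would be a better source for that CTE.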