Generic procedure to perform SCD in SQL

Tags: mysql, sql, sql-server, tsql, scd

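I have 2 tables in MS SQL Server. I can perform SCD (slowly changing dimension) processing through custom INSERT/UPDATE/DELETE statements and also through the MERGE statement.

I want to know whether there is any generic procedure that can fulfil this purpose: we just pass it the 2 tables and it performs the SCD. Is there any such option in SQL Server 2008?
Thanks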

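No, there isn't, and there can't be a generic one that works for whatever table you pass to it. There are several reasons:

  • How do you know which SCD type is wanted? (Okay, that could be another parameter, but...)
  • How do you know which columns should be historized and which should be overwritten?
  • How do you determine which column is the business key, the surrogate key, the expiry column and so on?
  • To specify the columns in an UPDATE statement you would have to write dynamic SQL, which is possible, but the points above then come into play.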
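
To illustrate the last point, here is a minimal sketch of what the dynamic SQL might look like once the metadata is supplied by hand. It assumes the placeholder table and column names used in the steps of this answer, a hypothetical @historized list naming the columns to compare, and a sid surrogate key. It only prints the generated statement rather than running it.

```sql
-- Sketch only: every name below is a placeholder or an assumption.
DECLARE @dimTable sysname = N'D_yourDimensionName';
DECLARE @tmpTable sysname = N'tmpDimYourDimensionName';

-- The caller still has to say which columns are historized.
DECLARE @historized TABLE (columnName sysname NOT NULL);
INSERT INTO @historized (columnName)
VALUES (N'theColumnWhichHasChangedAndIsImportant'), (N'anotherColumn');

-- Build "d.[col1] <> t.[col1] OR d.[col2] <> t.[col2]".
-- FOR XML PATH does the concatenation because STRING_AGG
-- does not exist in SQL Server 2008.
DECLARE @predicate nvarchar(max);
SELECT @predicate = STUFF((
        SELECT N' OR d.' + QUOTENAME(columnName)
             + N' <> t.' + QUOTENAME(columnName)
        FROM @historized
        FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'), 1, 4, N'');  -- strip the leading " OR "

DECLARE @sql nvarchar(max) =
      N'UPDATE d SET validTo = DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()) - 1, 0)'
    + N' FROM ' + QUOTENAME(@dimTable) + N' AS d'
    + N' INNER JOIN ' + QUOTENAME(@tmpTable) + N' AS t ON d.sid = t.sid'
    + N' WHERE t.sid <> 0 AND (' + @predicate + N');';

PRINT @sql;                      -- inspect the statement first
-- EXEC sys.sp_executesql @sql;  -- run it once it looks right
```

Even in this sketch the column list, the key column and the date convention are inputs, which is exactly the information a fully generic procedure would not have.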
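Not a reason why it is impossible, but also worth considering: for a proper UPSERT one usually works with temporary tables, and the MERGE statement is a poor fit for SCDs except in special cases. That's because you can't use the MERGE statement in combination with INSERT/UPDATE without disabling foreign keys, since an UPDATE is implemented as a DELETE followed by an INSERT (or something like that; I can't remember clearly, but I ran into these problems when I tried).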
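
For reference, the MERGE-based type 2 pattern usually looks something like the sketch below, where an outer INSERT consumes the OUTPUT of a nested MERGE. SQL Server documents restrictions on this form: the target of the outer INSERT cannot take part in a primary key/foreign key relationship and cannot have enabled triggers, which is consistent with the foreign key problem described above. All names here (D_Customer, stgCustomer, customerId, city) are hypothetical, sid is assumed to be an IDENTITY column, and an open row is assumed to have validTo = NULL.

```sql
-- Sketch only: hypothetical dimension D_Customer and staging table stgCustomer.
DECLARE @today date = CAST(GETDATE() AS date);

INSERT INTO D_Customer (customerId, city, validFrom, validTo)
SELECT customerId, city, @today, NULL        -- new version of each changed row
FROM (
    MERGE D_Customer AS d
    USING stgCustomer AS s
        ON d.customerId = s.customerId
       AND d.validTo IS NULL                 -- match only the current version
    WHEN NOT MATCHED BY TARGET THEN          -- unknown business key: first version
        INSERT (customerId, city, validFrom, validTo)
        VALUES (s.customerId, s.city, @today, NULL)
    WHEN MATCHED AND d.city <> s.city THEN   -- changed: close the current version
        UPDATE SET d.validTo = DATEADD(DAY, -1, @today)
    OUTPUT $action AS mergeAction, s.customerId, s.city
) AS changes
WHERE mergeAction = 'UPDATE';                -- only closed rows get a new version
```

If the dimension is referenced by fact tables, this statement fails unless those foreign keys are dropped or the work is split into separate UPDATE and INSERT statements, as in the steps that follow.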

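I prefer doing it this way (SCD type 2 on SQL Server):

Step 1:

/* (Re)create a staging copy of the source data */
IF EXISTS (
SELECT * FROM sys.objects
WHERE name = 'tmpDimSource')
DROP TABLE tmpDimSource;
SELECT
*
INTO tmpDimSource
FROM
(
SELECT whatever
FROM yourTable
) AS src; /* SQL Server requires an alias on a derived table */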

Step 2:

/* (Re)create an empty working table with the dimension's structure */
IF EXISTS (
SELECT * FROM sys.objects
WHERE name = 'tmpDimYourDimensionName')
DROP TABLE tmpDimYourDimensionName;

SELECT * INTO tmpDimYourDimensionName FROM D_yourDimensionName WHERE 1 = 0;
INSERT INTO tmpDimYourDimensionName
(
sid, /*a surrogate id column*/
theColumnsYouNeedInYourDimension,
validFrom
)
SELECT
ISNULL(d.sid, 0), /*0 marks a row that is not in the dimension yet*/
ds.theColumnsYouNeedInYourDimension,
DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()), 0) /*the current date*/
FROM
tmpDimSource ds
LEFT JOIN D_yourDimensionName d ON ds.whateverId = d.whateverId /*the join must use alias d; c was never defined*/
;

The ISNULL(d.sid, 0) in step 2 is important: it returns the dimension's surrogate id if the entry already exists, otherwise 0.
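
A small illustration of that behaviour, using made-up sample data in table variables (the values 7, 100 and 200 are purely hypothetical):

```sql
-- Hypothetical sample data: one existing dimension row, two source rows.
DECLARE @dim TABLE (sid int NOT NULL, whateverId int NOT NULL);
DECLARE @src TABLE (whateverId int NOT NULL);

INSERT INTO @dim (sid, whateverId) VALUES (7, 100);
INSERT INTO @src (whateverId) VALUES (100), (200);

SELECT s.whateverId,
       ISNULL(d.sid, 0) AS sid
FROM @src AS s
LEFT JOIN @dim AS d ON s.whateverId = d.whateverId
ORDER BY s.whateverId;
-- whateverId = 100 -> sid = 7  (already in the dimension)
-- whateverId = 200 -> sid = 0  (no match, so treated as new)
```

Step 3 relies on exactly this marker: rows with sid <> 0 are candidates for closing, rows with sid = 0 are inserted.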

Step 3:

/* Close the existing version of every row whose tracked columns changed */
UPDATE d SET
validTo = DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()) - 1, 0) /*yesterday*/
FROM
D_yourDimensionName d
INNER JOIN tmpDimYourDimensionName t ON d.sid = t.sid
WHERE t.sid <> 0 AND
(
/* a comparison with NULL is never true, so nullable columns need explicit NULL handling */
d.theColumnWhichHasChangedAndIsImportant <> t.theColumnWhichHasChangedAndIsImportant OR
d.anotherColumn <> t.anotherColumn
)
;
/* Insert the rows that are not in the dimension yet */
INSERT INTO D_yourDimensionName
SELECT * FROM tmpDimYourDimensionName
WHERE sid = 0;

And that's it.

@user1765876 Updated my answer.

@user1765876 Any feedback?

Yes sir, doing it through INSERT/UPDATE I already knew. I wanted to know whether something can be done with the MERGE statement.

Yes, you can, like I said, but you have to disable the foreign keys (usually those from the fact tables, if it's a star schema), depending on which type of SCD you want to implement. And there is no reasonable way to write one procedure that can be applied to whatever table you throw at it. You can of course write one procedure per dimension. Feel free.