Data mining in a SQL Server database: finding the most likely combinations

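I have to build a query to analyze sales trends in a store. Basically, I need to get how often combinations of articles are purchased together. For example, when article 0001 is purchased, it is quite likely that article 0002 is purchased as well, so I would like to retrieve something like this: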

article a | article b | occurrences
--------- | --------- | -----------
0001      | 0002      | 1
0001      | 0003      | 0

In fact, I have a table TicketDetails that stores every ticket together with the article codes included on each ticket, something like:

store | station | document | consecutive | article
----- | ------- | -------- | ----------- | -------
w     | x       | y        | a           | 0001
w     | x       | y        | a           | 0002 (same ticket, different article)
w     | x       | y        | b           | 0003

Please give me any advice on how to build this query; I feel a bit lost.

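Note:
As shown above, each ticket is identified by the combination of the first four columns (store, station, document, consecutive; written w-x-y-z in the sample).

I think you just want a self-join. If you want all pairs of articles, and not only the pairs that actually occur together on the same ticket, then the SQL is a bit trickier.

Presumably you have a table called articles, so you can generate all pairs first: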

select a1.article, a2.article, count(td2.article) as occurrences
from articles a1 join
     articles a2
     on a1.article < a2.article left join -- (a, b) is the same as (b, a)
     ticketDetails td1
     on td1.article = a1.article left join
     ticketDetails td2
     on td2.article = a2.article and
        td2.store = td1.store and
        td2.station = td1.station and
        td2.document = td1.document and
        td2.consecutive = td1.consecutive
group by a1.article, a2.article;
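
To check what the query above returns, here is a minimal sketch that runs it against the three sample rows from the question. It uses Python's built-in `sqlite3` module only as a convenient test harness; the SQL itself is plain enough that it should behave the same way on SQL Server. The `articles` table contents, the column types, and the `article_a` / `article_b` aliases are assumptions added for illustration.

```python
import sqlite3

# In-memory database with the sample ticket rows from the question.
# The articles table and the TEXT column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (article TEXT PRIMARY KEY);
    CREATE TABLE ticketDetails (
        store TEXT, station TEXT, document TEXT, consecutive TEXT, article TEXT
    );
    INSERT INTO articles VALUES ('0001'), ('0002'), ('0003');
    INSERT INTO ticketDetails VALUES
        ('w', 'x', 'y', 'a', '0001'),
        ('w', 'x', 'y', 'a', '0002'),
        ('w', 'x', 'y', 'b', '0003');
""")

# Same joins as the answer's query, with column aliases and an ORDER BY
# added so the output is stable.
PAIR_QUERY = """
select a1.article as article_a, a2.article as article_b,
       count(td2.article) as occurrences
from articles a1 join
     articles a2
     on a1.article < a2.article left join
     ticketDetails td1
     on td1.article = a1.article left join
     ticketDetails td2
     on td2.article = a2.article and
        td2.store = td1.store and
        td2.station = td1.station and
        td2.document = td1.document and
        td2.consecutive = td1.consecutive
group by a1.article, a2.article
order by article_a, article_b
"""

rows = conn.execute(PAIR_QUERY).fetchall()
for row in rows:
    print(row)
```

On this sample the pair (0001, 0002) is counted once and the pairs that never share a ticket come back with 0, which matches the output requested in the question. Note that `count(td2.article)` counts matching detail rows, so it equals the number of shared tickets only if an article appears at most once per ticket.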

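Join TicketDetails to itself, matching rows on the same ticket but with different articles:

select t1.article as article_a
      ,t2.article as article_b
      ,count(*) as occurrences
from ticketdetails t1
join ticketdetails t2   -- inner join: only pairs that share at least one ticket
   on t1.store = t2.store
      and t1.station = t2.station
      and t1.document = t2.document
      and t1.consecutive = t2.consecutive
      and t1.article < t2.article   -- count (a, b) once, not also (b, a)
group by t1.article, t2.article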

Are you using MySQL or MS SQL Server? Don't tag products that aren't involved.

Thanks for your reply, Gordon. I ran your query for about 8 hours and got nothing back. The problem is that the ticketdetails table has almost 4 million records, and the articles table has several thousand records as well. Anyway, I think your approach is helpful! Any idea what to do on a large database?

This won't return the pairs with 0 occurrences.

Yes, taking another look, I see this would miss one of the specifically requested results.
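
On the large-table question, one thing that may be worth trying is to count the co-occurring pairs first with the plain self-join, and only then left join that much smaller result onto the full list of article pairs to fill in the zeros. The sketch below is an assumption-laden illustration, not a tuned solution: the index name `ix_ticket_article`, the CTE name `pair_counts`, the extra ticket `c`, and article `0004` are all made up for the example, and `sqlite3` is again used only as a harness for standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (article TEXT PRIMARY KEY);
    CREATE TABLE ticketDetails (
        store TEXT, station TEXT, document TEXT, consecutive TEXT, article TEXT
    );
    -- Hypothetical index so the self-join can seek on the ticket key.
    CREATE INDEX ix_ticket_article
        ON ticketDetails (store, station, document, consecutive, article);
    -- 0004 is an invented article that is never sold.
    INSERT INTO articles VALUES ('0001'), ('0002'), ('0003'), ('0004');
    -- Tickets a and b come from the question; ticket c is invented.
    INSERT INTO ticketDetails VALUES
        ('w', 'x', 'y', 'a', '0001'),
        ('w', 'x', 'y', 'a', '0002'),
        ('w', 'x', 'y', 'b', '0003'),
        ('w', 'x', 'y', 'c', '0001'),
        ('w', 'x', 'y', 'c', '0002'),
        ('w', 'x', 'y', 'c', '0003');
""")

ALL_PAIRS_QUERY = """
with pair_counts as (
    -- Step 1: count only the pairs that really share a ticket.
    select t1.article as article_a, t2.article as article_b,
           count(*) as occurrences
    from ticketDetails t1
    join ticketDetails t2
      on t2.store = t1.store
     and t2.station = t1.station
     and t2.document = t1.document
     and t2.consecutive = t1.consecutive
     and t1.article < t2.article
    group by t1.article, t2.article
)
-- Step 2: attach the counts to every possible pair, defaulting to 0.
select a1.article as article_a, a2.article as article_b,
       coalesce(pc.occurrences, 0) as occurrences
from articles a1
join articles a2
  on a1.article < a2.article
left join pair_counts pc
  on pc.article_a = a1.article
 and pc.article_b = a2.article
order by article_a, article_b
"""

rows = conn.execute(ALL_PAIRS_QUERY).fetchall()
for row in rows:
    print(row)
```

With several thousand articles, the list of all pairs alone runs into the millions of rows, so if the zero-occurrence pairs are not really needed, running only the `pair_counts` part is likely to be far cheaper. Whether this actually helps on the 4 million row table depends on the indexes and the execution plan, so it is worth checking the plan before committing to it.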