C# query for intersection data

I have 9 sample transactions containing 5 items:

[Table 1]
itemset | TID_set
--------+---------------------------------------
a       | 100, 400, 500, 700, 800, 900
b       | 100, 200, 300, 400, 600, 800, 900
c       | 300, 500, 600, 700, 800, 900
d       | 200, 400
e       | 100, 800

[Table 2]
itemset | TID_set
--------+----------------------
a, b    | 100, 400, 800, 900
a, c    | 500, 700, 800, 900
a, d    | 400
a, e    | 100, 800
b, c    | 300, 600, 800, 900
b, d    | 200, 400
b, e    | 100, 800
c, e    | 800

[Table 3]
itemset | TID_set
--------+-----------
a, b, c | 800, 900
a, b, e | 100, 800

I want to produce the data in Table 3 using a depth-first search algorithm, but my result differs from Table 3. Here is my source code:

string query = "INSERT INTO table" + (k) + " SELECT DISTINCT ";

for (int i = 1; i <= k - 1; i++)
{
    query = query + "P.itemset" + i + ", ";
}
query = query + "Q.itemset" + (k - 1) + ",(SELECT COUNT(DISTINCT table1.TID_set) FROM table1 WHERE table1.TID_set = ANY(SELECT table1.TID_set FROM table1 WHERE table1.itemset IN( ";

for (int i = 1; i <= k - 1; i++)
{
    query = query + "P.itemset" + i + ",";
}

query = query + "Q.itemset" + (k - 1) + ") GROUP BY table1.TID_set HAVING COUNT(DISTINCT table1.itemset)>=" + k + "))";
query = query + "FROM table" + (k - 1) + " P , table" + (k - 1) + " Q WHERE Q.itemset" + (k - 1) + " > P.itemset" + (k - 1) + " ";

for (int i = 2; i < k - 1; i++)
{
    query = query + "AND P.itemset" + i + " > P.itemset" + (i - 1) + " ";
}

query = query + "ORDER BY ";

for (int i = 1; i <= k - 1; i++)
{
    query = query + "P.itemset" + i + ",";
}

query = query + "Q.itemset" + (k - 1) + "";

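The well-known APRIORI algorithm does not query the database once per itemset combination. It scans the database only once per itemset length, and for good reason: that is already expensive enough.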
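To illustrate the "one scan per itemset length" idea, here is a minimal sketch of the counting step only. It assumes the transactions have already been read into memory as sets of items. The `LevelScan` class and `CountSupport` method are hypothetical names, not part of any library, and candidate generation is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not taken from the original post.
public static class LevelScan
{
    // Counts, in a single pass over the transactions, how many transactions
    // contain each candidate itemset of the current length.
    public static Dictionary<string, int> CountSupport(
        IEnumerable<ISet<string>> transactions,
        IReadOnlyList<string[]> candidates)
    {
        // Key each candidate by its items joined with ", " and start at zero.
        var counts = candidates.ToDictionary(c => string.Join(", ", c), _ => 0);

        foreach (var transaction in transactions)   // the single scan
        {
            foreach (var candidate in candidates)
            {
                // A candidate is supported when the transaction holds all its items.
                if (candidate.All(transaction.Contains))
                    counts[string.Join(", ", candidate)]++;
            }
        }
        return counts;
    }
}
```

Calling `LevelScan.CountSupport(transactions, candidates)` once with all 3-item candidates replaces one database round trip per combination.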

Trying to cram everything into one big SQL query does not help.

Because of the size involved, your approach will not scale to any meaningful dataset.


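It becomes much easier if you treat the database simply as a data store: read the transactions from it and run the actual algorithm in your C# program, instead of abusing SQL for something it was not designed for.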
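As a rough sketch of what "the actual algorithm in C#" could look like, the code below does a depth-first enumeration by intersecting TID sets (the Eclat idea, which is not named in the question). It assumes the data from Table 1 is already loaded into a dictionary from item to TID set. The `TidSetMiner` class, its method names, and the `minSupport` threshold are assumptions made for illustration; the post does not state a threshold.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not taken from the original post.
public static class TidSetMiner
{
    // vertical: item -> ids of the transactions that contain it (as in Table 1).
    // minSupport: assumed minimum number of shared transactions to keep an itemset.
    // Returns each kept itemset (items joined with ", ") with its TID set.
    public static Dictionary<string, SortedSet<int>> Mine(
        IDictionary<string, SortedSet<int>> vertical, int minSupport)
    {
        var result = new Dictionary<string, SortedSet<int>>();
        // Sorting the items ensures each combination is generated only once.
        var roots = vertical
            .Where(kv => kv.Value.Count >= minSupport)
            .OrderBy(kv => kv.Key, StringComparer.Ordinal)
            .Select(kv => (Items: new List<string> { kv.Key }, Tids: kv.Value))
            .ToList();
        Extend(roots, minSupport, result);
        return result;
    }

    private static void Extend(
        List<(List<string> Items, SortedSet<int> Tids)> siblings,
        int minSupport,
        Dictionary<string, SortedSet<int>> result)
    {
        for (int i = 0; i < siblings.Count; i++)
        {
            var (items, tids) = siblings[i];
            result[string.Join(", ", items)] = tids;

            // Pair this itemset with every later sibling; the TID set of the
            // combined itemset is the intersection of the two TID sets.
            var children = new List<(List<string> Items, SortedSet<int> Tids)>();
            for (int j = i + 1; j < siblings.Count; j++)
            {
                var shared = new SortedSet<int>(tids);
                shared.IntersectWith(siblings[j].Tids);
                if (shared.Count >= minSupport)
                {
                    var extended = new List<string>(items) { siblings[j].Items.Last() };
                    children.Add((extended, shared));
                }
            }

            // Descend into the longer itemsets before visiting the next sibling.
            Extend(children, minSupport, result);
        }
    }
}
```

With the Table 1 data and an assumed `minSupport` of 2, the only three-item results are `a, b, c` with 800, 900 and `a, b, e` with 100, 800, which matches Table 3. The database is read once to fill `vertical`; all intersections then happen in memory.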

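Storing data as comma-separated items in a column is usually a bad idea. It causes many problems when writing queries. Normally you store one value per row.

That is only an example. The actual data store does not separate the transaction ids with commas; I store one value per row. The tables are just a representation.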
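With one value per row, the TID sets shown in Table 1 can be rebuilt in memory by grouping the rows per item. This is a minimal sketch, assuming the rows arrive as (item, transaction id) pairs, for example from a plain `SELECT` over the table. The `VerticalLayout` class and `FromRows` method are hypothetical names, and the database access itself is not shown.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not taken from the original post.
public static class VerticalLayout
{
    // Groups (item, transaction id) rows into one TID set per item.
    public static Dictionary<string, SortedSet<int>> FromRows(
        IEnumerable<(string Item, int Tid)> rows)
    {
        var vertical = new Dictionary<string, SortedSet<int>>();
        foreach (var (item, tid) in rows)
        {
            // Create the TID set the first time an item is seen.
            if (!vertical.TryGetValue(item, out var tids))
            {
                tids = new SortedSet<int>();
                vertical[item] = tids;
            }
            tids.Add(tid);   // duplicate rows are ignored by the set
        }
        return vertical;
    }
}
```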