MySQL: Finding customers with similar taste while excluding certain customers


I have a table that records customer purchases, one row per purchase:

CustomerID  |  ProductID 
1           |  1000 
1           |  2000 
1           |  3000 
2           |  1000 
3           |  1000 
3           |  3000 
...         |  ...
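
For anyone who wants to experiment with the queries in this thread, the sample rows above could be loaded roughly like this. It is only a minimal sketch: the table name `purchases` is the one used in the queries, but the column types are assumptions, and only the six rows actually listed are inserted (the elided rows are left out).

```sql
-- Hypothetical schema: the INT column types are an assumption, not from the post.
CREATE TABLE purchases (
    CustomerID INT NOT NULL,
    ProductID  INT NOT NULL
);

-- Only the rows shown in the question; the "..." rows are unknown.
INSERT INTO purchases (CustomerID, ProductID) VALUES
    (1, 1000), (1, 2000), (1, 3000),
    (2, 1000),
    (3, 1000), (3, 3000);
```

With just these rows, customer 3 shares two products with customer 1 (1000 and 3000) and customer 2 shares one (1000).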

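I use the following query to find the ten customers whose purchases overlap most with those of customer 1 (the first result has the largest overlap, and so on):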

SELECT othercustomers.CustomerID, COUNT(DISTINCT othercustomers.ProductID)
FROM `purchases` AS thiscustomer
JOIN `purchases` AS othercustomers ON
    thiscustomer.CustomerID != othercustomers.CustomerID
    AND thiscustomer.ProductID = othercustomers.ProductID
WHERE thiscustomer.CustomerID = '1'
GROUP BY othercustomers.CustomerID
ORDER BY COUNT(DISTINCT othercustomers.ProductID) DESC
LIMIT 10

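The query produces the expected output: each customer ID plus the number of products that customer has in common with customer 1.

I now want the query to exclude customers who have bought more than 1000 distinct products. These are bulk buyers who purchase the entire inventory, so their purchase history is meaningless when searching for customers with similar taste.

In other words, if customer 500 has bought more than 1000 distinct products, I want them left out of the results when searching for customers with taste similar to customer 1, even if customer 500 bought all three products that customer 1 bought and would normally rank first for similarity/overlap.

I assume this calls for some kind of HAVING condition, but I can't work out what the right condition would be.

Thanks.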

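I don't think HAVING will do what you want, because it only sees the count of overlapping products, whereas you need the other customer's total product count.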

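You can filter with a correlated subquery in the WHERE clause:

SELECT othercustomers.CustomerID, COUNT(DISTINCT othercustomers.ProductID)
FROM `purchases` AS thiscustomer
JOIN `purchases` AS othercustomers ON
    thiscustomer.CustomerID != othercustomers.CustomerID
    AND thiscustomer.ProductID = othercustomers.ProductID
WHERE 
    thiscustomer.CustomerID = '1'
    -- Total distinct products bought by the candidate customer.
    -- Only customers with MORE than 1000 are excluded, hence <= 1000.
    AND (
        SELECT COUNT(DISTINCT ProductID) 
        FROM `purchases` AS p
        WHERE p.CustomerID = othercustomers.CustomerID
    ) <= 1000
GROUP BY othercustomers.CustomerID
ORDER BY COUNT(DISTINCT othercustomers.ProductID) DESC
LIMIT 10

For performance, you will want an index on purchases(CustomerID, ProductID).

Thanks, but this query takes a very long time to execute. It has been running for several minutes and has not finished yet. Edit: to clarify, the previous query usually took 1-3 seconds to execute.

@GummySQL: To get the additional information, you need another scan of purchases. I added an index suggestion to the answer.

Unfortunately, adding the index doesn't seem to help. Is there a way to add a third column to the result containing each user's number of purchases, using another join, and so get around the subquery's performance problem?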
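
The question raised in the comments, whether a join against per-customer purchase counts could replace the correlated subquery, might be approached as sketched below. This is not from the original answer: the derived table alias `totals` and the column aliases `overlap` and `total_products` are made up for illustration, and whether it actually runs faster depends on the data and the MySQL version, so it is worth checking with EXPLAIN.

```sql
SELECT
    othercustomers.CustomerID,
    COUNT(DISTINCT othercustomers.ProductID) AS overlap,   -- products shared with customer 1
    totals.total_products                                  -- third column: all distinct products bought
FROM `purchases` AS thiscustomer
JOIN `purchases` AS othercustomers
    ON thiscustomer.CustomerID != othercustomers.CustomerID
    AND thiscustomer.ProductID = othercustomers.ProductID
-- Derived table (hypothetical alias): one row per customer with their distinct product count.
JOIN (
    SELECT CustomerID, COUNT(DISTINCT ProductID) AS total_products
    FROM `purchases`
    GROUP BY CustomerID
) AS totals
    ON totals.CustomerID = othercustomers.CustomerID
WHERE thiscustomer.CustomerID = '1'
    AND totals.total_products <= 1000   -- drop bulk buyers with more than 1000 distinct products
GROUP BY othercustomers.CustomerID, totals.total_products
ORDER BY overlap DESC
LIMIT 10;
```

The counts are computed once per customer in the derived table instead of once per joined row, which is the idea behind the suggestion; the trade-off is a full aggregation pass over `purchases`, including customers who share nothing with customer 1.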