SQL: How can I find products with similar components?

sql sql-server

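I'm trying to find products that share at least 75% of their components, and we have thousands of products. My table has two columns, Item and Component. For example: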

+------+-----------+
| Item | Component |
+------+-----------+
| AAA  | screw     |
| AAA  | metal     |
| AAA  | bar       |
| AAA  | nut       |
| ABC  | screw     |
| ABC  | metal     |
| ABC  | bar       |
| CAA  | nut       |
| CAA  | cap       |
+------+-----------+

As the final result I'd like three columns: Item, Item2, and PercentageSimilar. So it would look like:

+------+-------+-------------------+
| Item | Item2 | PercentageSimilar |
+------+-------+-------------------+
| AAA  | ABC   | 75%               |
| AAA  | CAA   | 25%               |
| ABC  | AAA   | 100%              |
| ABC  | CAA   | 0%                |
| CAA  | AAA   | 50%               |
| CAA  | ABC   | 0%                |
+------+-------+-------------------+

Can this be done in SQL?

You can do this with a self-join:
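-- Pair every row of one item with every row of each other item,
-- then count, per pair of items, how many component names match.
select t1.item as item,t2.item as item2
,100.*count(case when t1.component=t2.component then 1 end)
 /count(distinct t1.component) as pct_similar
from t t1
join t t2 on t1.item<>t2.item
group by t1.item,t2.item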
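
The question asks only for pairs that are at least 75% similar, so one way to narrow the result is a HAVING clause on the same expression. This is a minimal sketch, assuming the table is named t as in the query above and that each (item, component) row appears only once. The 75 threshold comes from the question.

```sql
-- Sketch: keep only the pairs that share at least 75% of t1's components.
-- Assumes a table t(item, component) with no duplicate rows.
SELECT t1.item AS item,
       t2.item AS item2,
       100. * COUNT(CASE WHEN t1.component = t2.component THEN 1 END)
            / COUNT(DISTINCT t1.component) AS pct_similar
FROM t AS t1
JOIN t AS t2
  ON t1.item <> t2.item
GROUP BY t1.item, t2.item
-- HAVING filters after aggregation, so the percentage can be tested here
HAVING 100. * COUNT(CASE WHEN t1.component = t2.component THEN 1 END)
            / COUNT(DISTINCT t1.component) >= 75;
```

With the sample data from the question, only the pairs AAA/ABC (75%) and ABC/AAA (100%) satisfy the filter.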

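Here you go. This is a bit more information than you asked for, but the breakdown below shows how the result is reached:
Setup with the sample data: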
DECLARE @ItemsAndComponents TABLE 
    (
        Item VARCHAR(3), 
        Component VARCHAR(50)
    )

INSERT INTO @ItemsAndComponents
VALUES
('AAA', 'screw'),
('AAA', 'metal'),
('AAA', 'bar'),
('AAA', 'nut'),
('ABC', 'screw'),
('ABC', 'metal'),
('ABC', 'bar'),
('CAA', 'nut'),
('CAA', 'cap')

Query:
SELECT
       T1.Item AS [First Item], 
       T2.Item AS [Second Item],
       SUM(CASE WHEN T1.Component = T2.Component THEN 1 ELSE 0 END) AS [Matches], 
       COUNT(distinct T1.Component) AS [Total],
       CAST(100. * SUM(CASE WHEN T1.Component = T2.Component THEN 1 ELSE 0 END) / COUNT(distinct T1.Component) AS DECIMAL(18, 2)) AS [Percent Similar]
FROM @ItemsAndComponents T1
JOIN @ItemsAndComponents T2
    ON T1.Item <> T2.Item
GROUP BY T1.Item, T2.Item
ORDER BY T1.Item, T2.Item
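
Since the question mentions thousands of products, pairing every row with every row of every other item may produce a large intermediate result. A possible variant, sketched below under assumptions, joins on Component instead, so only pairs that share at least one component are produced (0% pairs are simply absent). The CTE names Items, Totals, and Shared are illustrative and not part of the answer above. The sample rows are inlined so the statement runs on its own. Whether this is actually faster depends on the data and indexes and has not been measured here.

```sql
-- Sketch only: same percentages as the query above, restricted to pairs >= 75%.
WITH Items AS (
    -- DISTINCT guards against duplicate (Item, Component) rows
    SELECT DISTINCT Item, Component
    FROM (VALUES
        ('AAA', 'screw'), ('AAA', 'metal'), ('AAA', 'bar'), ('AAA', 'nut'),
        ('ABC', 'screw'), ('ABC', 'metal'), ('ABC', 'bar'),
        ('CAA', 'nut'),   ('CAA', 'cap')
    ) AS v (Item, Component)
),
Totals AS (
    -- number of distinct components per item (the denominator)
    SELECT Item, COUNT(*) AS Total
    FROM Items
    GROUP BY Item
),
Shared AS (
    -- number of components two different items have in common (the numerator)
    SELECT T1.Item AS Item, T2.Item AS Item2, COUNT(*) AS Matches
    FROM Items AS T1
    JOIN Items AS T2
      ON T1.Component = T2.Component
     AND T1.Item <> T2.Item
    GROUP BY T1.Item, T2.Item
)
SELECT S.Item,
       S.Item2,
       CAST(100. * S.Matches / T.Total AS DECIMAL(18, 2)) AS PercentageSimilar
FROM Shared AS S
JOIN Totals AS T
  ON T.Item = S.Item
WHERE 100. * S.Matches / T.Total >= 75
ORDER BY S.Item, S.Item2;
```

For the sample rows this keeps AAA/ABC at 75.00 and ABC/AAA at 100.00. The percentage is relative to the first item's component count, which is why the two directions differ.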


How is 'ABC' 75% similar to 'AAA', 'CAA' only 25% similar to 'AAA', and (my favorite) 'ABC' 0% similar to 'CAA'? What are you trying to do? Your expected results make no sense... can you clarify?

@StanShaw, I noticed that too. I think the similarity is supposed to come from screw, nut, and so on.

@user6144226 But if that were the case, 'AAA' and 'ABC' would be 100% similar, because the component columns are the same.

Are you looking for something like a fuzzy lookup? If so, use Fuzzy Lookup in SSIS to get this.

This is perfect!! Thank you so much for your help.

This does make it easier to understand, thank you!!