MySQL: Get the first record for each foreign key from a join table, without duplicate primary keys

mysql sql sql-server tsql

I have the following table structure:

Tags:
Tag_ID | Name
1      | Tag1
2      | Tag2
3      | Tag3
4      | Tag4
5      | Tag5
6      | Tag6

Posts:
Post_ID | Title | Body
1       | Post1 | Post1
2       | Post2 | Post2
3       | Post3 | Post3
4       | Post4 | Post4
5       | Post5 | Post5
6       | Post6 | Post6
7       | Post7 | Post7
8       | Post8 | Post8
9       | Post9 | Post9
10      | Post10| Post10

TagsPosts:
Tag_ID | Post_ID
1      | 1
1      | 2
1      | 3
1      | 4
1      | 5
1      | 10
1      | 1
2      | 1
2      | 2
2      | 6
2      | 7
3      | 4
3      | 8
3      | 9
4      | 7
5      | 1
5      | 2
5      | 3
5      | 4
5      | 5
5      | 6
5      | 7
6      | 2

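What I need the query to return is the top 3 posts for the most common tag and the top 1 post for each of the remaining tags, without returning any duplicate posts.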

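So far, I have been able to determine the top 3 posts for the most common tag with a `SELECT Top(3)` query (listed further below, together with its result).

I have also determined the top 1 post for each of the remaining tags with a second query (also listed further below).

This is almost there but, as the second result shows, it returns a duplicate post: Post_ID 7 appears for both tag 2 and tag 4.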

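By the way, I am testing with SQL Server 2008 Express because I am not familiar with MySQL, but I have been asked to come up with a SQL query that can be applied to a MySQL database. I figured that if I got the basic query working in T-SQL, converting it to whatever SQL dialect MySQL uses would be fairly straightforward.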
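As a rough illustration of how the `TOP` pattern might carry over, the sketch below spells the top-3 query with `LIMIT`, which is what MySQL uses in place of `TOP`. It runs against an in-memory SQLite database through Python's `sqlite3` module purely so it can be executed without a server; that harness and the function name `top_posts_for_most_common_tag` are assumptions, not part of the question. The most common tag is joined in as a derived table rather than written as `IN (... LIMIT 1)`, because MySQL rejects `LIMIT` inside an `IN` subquery. The sketch also counts distinct posts per tag, since the listed data repeats the `1 | 1` row, and breaks ties by the lowest Tag_ID, which is an arbitrary choice.

```python
import sqlite3

# TagsPosts rows exactly as listed in the question (including the repeated 1|1 row).
TAGS_POSTS = [
    (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 10), (1, 1),
    (2, 1), (2, 2), (2, 6), (2, 7),
    (3, 4), (3, 8), (3, 9),
    (4, 7),
    (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6), (5, 7),
    (6, 2),
]

# LIMIT replaces TOP; the most common tag is a derived table, not an IN subquery.
QUERY = """
SELECT tp.Tag_ID, tp.Post_ID
FROM TagsPosts AS tp
INNER JOIN (
    SELECT Tag_ID
    FROM TagsPosts
    GROUP BY Tag_ID
    ORDER BY COUNT(DISTINCT Post_ID) DESC, Tag_ID
    LIMIT 1
) AS top_tag ON top_tag.Tag_ID = tp.Tag_ID
GROUP BY tp.Tag_ID, tp.Post_ID
ORDER BY tp.Post_ID
LIMIT 3
"""


def top_posts_for_most_common_tag():
    """Return the first three (Tag_ID, Post_ID) pairs of the most common tag."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE TagsPosts (Tag_ID INTEGER, Post_ID INTEGER)")
        conn.executemany("INSERT INTO TagsPosts VALUES (?, ?)", TAGS_POSTS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for tag_id, post_id in top_posts_for_most_common_tag():
        print(tag_id, post_id)
```

On the question's data this returns posts 1, 2 and 3 for tag 5, the same rows as the T-SQL result listed further down. The explicit `ORDER BY tp.Post_ID` is there because a row limit without an ordering leaves the choice of rows up to the engine.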

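I would use a windowed function, store it in a CTE, and then reference it in a predicate. The example below uses a simplified version of your data and can be run as-is from SSMS. You listed SQL Server but not a version. I believe windowed functions work on SQL Server 2005 and later, but I am not certain.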

declare @Tag table ( tagid int identity, name varchar(8));

insert into @Tag values ('Tag1'),('Tag2'),('Tag3'),('Tag4'),('Tag5'),('Tag6');

declare @Posts table (postid int identity, tagid int, postbody varchar(32));

insert into @Posts values (1,'Blah'),(1, 'Blahblah'),(2, 'Blahblah'),(3, 'Blahbodyblah'),(4, 'Blahblahblah'),(4, 'Blahbodyblah'),(4, 'Blah'),(5, 'Blah'),(5, 'Blahblah'),(6, 'Blahblah');

-- use a CTE
with a as 
    (
    select 
        p.postbody
    ,   count(t.tagid) as TimesTagged
        /* You stated you wanted posts returned based on how often they occur. I number the rows by
        the COUNT of tagid, descending (greatest first), starting from one. If you can have ties and want
        to keep them, I would consider DENSE_RANK. To see how RANK, DENSE_RANK, and ROW_NUMBER differ,
        you would have to insert more values so that a tie occurs. They all have their purposes, but
        you should know what you want before deciding which one to use.
        */
    ,   row_number() over(order by count(t.tagid) desc) as PositionOfCountsTaggedByGreatestOrderFirst
    ,   Rank() over(order by count(t.tagid) desc) as PositionOfCountsTaggedByGreatestOrderFirst_Ranking
    ,   Dense_Rank() over(order by count(t.tagid) desc) as PositionOfCountsTaggedByGreatestOrderFirst_DenseRanking
    from @Tag t 
        join @Posts p on t.tagid = p.tagid
    group by p.postbody
    )
select *
from a
-- I only filter on ROW_NUMBER here; you can switch to one of the other ranking columns above if you wish.
where PositionOfCountsTaggedByGreatestOrderFirst <= 3


/*
You stated that you only want the top three counts.
Windowed functions are better than TOP, IMHO, because you can specify 'in' lists, medians, and other
selections explicitly rather than repeating nested selects. The only downside is that you cannot
put a predicate on a windowed function directly. You must compute it first and then define the
predicate on it in a nested select, a CTE (as shown), a table variable, a temp table, etc.
*/
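The comment in the query above notes that `ROW_NUMBER`, `RANK`, and `DENSE_RANK` only differ once there is a tie. A minimal sketch of that difference follows, assuming an in-memory SQLite database (3.25 or later, for window function support) driven from Python's `sqlite3` module. The extra row `(6, 'Blah')` is hypothetical and was added only to force a tie, and the names `ranking_rows`, `row_num`, `rnk`, and `dense_rnk` are illustrative.

```python
import sqlite3

# (tagid, postbody) rows from the answer's simplified data, plus one
# hypothetical row, (6, 'Blah'), added only to create a tie on the count.
POSTS = [
    (1, "Blah"), (1, "Blahblah"), (2, "Blahblah"), (3, "Blahbodyblah"),
    (4, "Blahblahblah"), (4, "Blahbodyblah"), (4, "Blah"), (5, "Blah"),
    (5, "Blahblah"), (6, "Blahblah"),
    (6, "Blah"),
]

# Counts are computed in a CTE first; the ranking functions run over that result.
# ROW_NUMBER gets postbody as a tie-breaker so its numbering is deterministic.
QUERY = """
WITH counts AS (
    SELECT postbody, COUNT(tagid) AS times_tagged
    FROM Posts
    GROUP BY postbody
)
SELECT
    postbody,
    times_tagged,
    ROW_NUMBER() OVER (ORDER BY times_tagged DESC, postbody) AS row_num,
    RANK()       OVER (ORDER BY times_tagged DESC)           AS rnk,
    DENSE_RANK() OVER (ORDER BY times_tagged DESC)           AS dense_rnk
FROM counts
ORDER BY row_num
"""


def ranking_rows():
    """Return (postbody, times_tagged, row_num, rnk, dense_rnk) tuples."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE Posts (tagid INTEGER, postbody TEXT)")
        conn.executemany("INSERT INTO Posts VALUES (?, ?)", POSTS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in ranking_rows():
        print(row)
```

With the tie in place, 'Blah' and 'Blahblah' both have a count of 4. `ROW_NUMBER` still numbers them 1 and 2, while `RANK` and `DENSE_RANK` give both of them 1. The next row then gets 3 under `RANK` (a gap) and 2 under `DENSE_RANK` (no gap), so a filter such as `<= 3` can return a different number of rows depending on which function is used.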

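The current solution for MySQL cannot be tested on MS SQL. Do you want to show another Post_ID instead of the duplicate post, or just remove that row from the output?

@HamletHakobyan - I wondered whether that would be the case. As this is my first time working with MySQL, is there a simple, easy-to-understand MySQL solution? I would like to show another, non-duplicate Post_ID where possible. If that is not possible, then I would keep that tag in the results.

Then GROUP BY + FIRST could be a first-step solution.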
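The idea of showing another, non-duplicate Post_ID where possible can be sketched in application code, which is easier to follow than a single SQL statement. The version below is only an assumption about what is wanted: it takes the first three posts of the most common tag, then gives every other tag its highest unused Post_ID (mirroring the `Max(Post_ID)` in the question's query), and omits tags that have no unused post left. The function name `pick_posts` and the "most constrained tag first" ordering are illustrative choices and a heuristic, not an optimal matching.

```python
from collections import defaultdict

# (Tag_ID, Post_ID) pairs as listed in the question's TagsPosts table.
TAGS_POSTS = [
    (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 10), (1, 1),
    (2, 1), (2, 2), (2, 6), (2, 7),
    (3, 4), (3, 8), (3, 9),
    (4, 7),
    (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6), (5, 7),
    (6, 2),
]


def pick_posts(pairs, top_n=3):
    """Pick top_n posts for the most common tag and one unused post per other tag."""
    posts_by_tag = defaultdict(set)  # sets drop the repeated 1|1 row
    for tag_id, post_id in pairs:
        posts_by_tag[tag_id].add(post_id)

    # Most common tag = most distinct posts; ties go to the lowest Tag_ID (arbitrary).
    top_tag = min(posts_by_tag, key=lambda t: (-len(posts_by_tag[t]), t))
    picked = {top_tag: sorted(posts_by_tag[top_tag])[:top_n]}
    used = set(picked[top_tag])

    # Handle the tags with the fewest free posts first, so they are less likely
    # to lose their only candidate to another tag.
    others = sorted(
        (t for t in posts_by_tag if t != top_tag),
        key=lambda t: (len(posts_by_tag[t] - used), t),
    )
    for tag_id in others:
        free = posts_by_tag[tag_id] - used
        if free:
            post_id = max(free)  # highest unused Post_ID for this tag
            picked[tag_id] = [post_id]
            used.add(post_id)
    return picked


if __name__ == "__main__":
    for tag_id, post_ids in sorted(pick_posts(TAGS_POSTS).items()):
        print(tag_id, post_ids)
```

On the question's data, tag 4 keeps Post_ID 7 and tag 2 falls back to Post_ID 6, so no post is shown twice. Tag 6 is left out because its only post (2) already belongs to the most common tag's top three.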

Query from the question for the top 3 posts of the most common tag:

SELECT TOP(3) t.Tag_ID, p.Post_ID
FROM Tags AS t
INNER JOIN TagsPosts AS tp ON t.Tag_ID = tp.Tag_ID
INNER JOIN Posts AS p ON tp.Post_ID = p.Post_ID
WHERE t.Tag_ID IN (
    SELECT TOP(1) Tag_ID FROM TagsPosts GROUP BY Tag_ID ORDER BY COUNT(Tag_ID) DESC)

Result:
Tag_ID | Post_ID
5      | 1
5      | 2
5      | 3

Query from the question for the top 1 post of each remaining tag:

SELECT t.Tag_ID, p.Post_ID
FROM Tags AS t
INNER JOIN (
    SELECT t.Tag_ID, MAX(p.Post_ID) AS Post_ID
    FROM Tags AS t
    INNER JOIN TagsPosts AS tp ON t.Tag_ID = tp.Tag_ID
    INNER JOIN Posts AS p ON tp.Post_ID = p.Post_ID
    WHERE t.Tag_ID NOT IN (
            SELECT TOP(1) Tag_ID FROM TagsPosts GROUP BY Tag_ID ORDER BY COUNT(Tag_ID) DESC)
        AND p.Post_ID NOT IN (
            SELECT TOP(3) p.Post_ID
            FROM Tags AS t
            INNER JOIN TagsPosts AS tp ON t.Tag_ID = tp.Tag_ID
            INNER JOIN Posts AS p ON tp.Post_ID = p.Post_ID
            WHERE t.Tag_ID IN (
                SELECT TOP(1) Tag_ID FROM TagsPosts GROUP BY Tag_ID ORDER BY COUNT(Tag_ID) DESC))
    GROUP BY t.Tag_ID) AS s ON t.Tag_ID = s.Tag_ID
INNER JOIN Posts AS p ON s.Post_ID = p.Post_ID

Result:
Tag_ID | Post_ID
1      | 10
2      | 7
3      | 9
4      | 7
