MySQL: group by user to show results ordered by time


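I'm trying to build an inbox for users. I need to display all threads grouped by correspondent, with the threads ordered by the time of the most recent message in each correspondence. I'm stuck on this SQL and don't know how to continue: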

CREATE TABLE `user_mail` (
  `id` int(10) NOT NULL,
  `author` int(10) NOT NULL,
  `recipient` int(10) NOT NULL,
  `title` varchar(100) NOT NULL,
  `message` text NOT NULL,
  `date` int(100) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

SELECT * FROM user_mail t1 
        INNER JOIN 
        (SELECT author, recepient, MAX(date) AS Ordered FROM user_mail
        WHERE recepient = '$thisUser' OR author = '$thisUser' GROUP BY author) t2
        ON t1.author = t2.author
        WHERE t1.recepient = '$thisUser' OR t1.author = '$thisUser' 
        ORDER BY t2.Ordered DESC

Here is what I need to display:

Correspondence with User 1        

 Newest reply  - author: User 1    | time: 11:00
 Next reply    - author: This user | time: ...
 Reply         - author: User 1    | time: ...
 ...
 Original post - author: This user | time: 09:30
________________________________________________
Correspondence with User 2

 Newest reply  - author: This user | time: 10:30
 ...
 Original post - author: User 2    | time: 10:00
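You can see how the correspondence with User 1 takes the top spot because it has the newest reply, even though its original post is older than the other one.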


Also, all correspondences should be shown, regardless of whether this user or the other user started them.

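The following SQL statement produces a result that matches the example display. Note that it is written in SQL Server style (string concatenation with +, CONVERT(VARCHAR, ...)) and refers to a posteddate column, so it needs adapting before it runs on MySQL against the table above.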

SELECT id
      ,CASE WHEN rn_min = 1
            THEN 'Original Post - '
            WHEN rn_max = 1
            THEN 'Newest reply  - '
            WHEN rn_min = 2 AND rn_max != 2
            THEN 'Reply         - '
            ELSE 'Next reply    - '
        END +
       CASE WHEN author = @thisuser
            THEN 'author: This ' + CONVERT(VARCHAR, author) 
            ELSE 'author: User ' + CONVERT(VARCHAR, author) 
        END +
       CASE WHEN rn_min = 1 OR rn_max = 1
            THEN ' | time: '+ CONVERT(VARCHAR(8),posteddate,108)
            ELSE ''
        END value
  FROM (SELECT id
              ,author
              ,recipient
              ,message
              ,posteddate
              ,row_number() OVER (PARTITION BY id ORDER BY posteddate) rn_min
              ,row_number() OVER (PARTITION BY id ORDER BY posteddate desc) rn_max
          FROM user_mail
         WHERE author = @thisuser OR recipient = @thisuser
       ) t1

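Because your user can appear in either of the two columns, you have to use the values of both columns in your search and in your grouping.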

Try this:

select * 
from user_mail t1
join 
(  
  select max(date) as ConvMaxDate, 
    case when author = '$thisUser' then recipient 
         else author 
    end as OtherUser
  from user_mail
  where author = '$thisUser' or recipient = '$thisUser'
  group by case when author = '$thisUser' then recipient 
                else author 
           end
) ConversationMaxDate
on Author = '$thisUser' and OtherUser = recipient 
   or Recipient = '$thisUser' and OtherUser = Author
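order by ConvMaxDate desc, Date desc;

The inner query ConversationMaxDate first determines the conversation partner, then groups by this other user and computes the latest date for each thread. This works because, once $thisUser is given, you know for each individual message which of the two columns holds the other party.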
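As a rough illustration of the query above, here is a minimal sketch that runs the same logic against a few sample rows. It is not part of the original answer. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, hypothetical sample data in which user 9 plays the role of $thisUser, and a bound parameter (:me) instead of interpolating $thisUser into the SQL string.

```python
import sqlite3

# Hypothetical sample rows: (id, author, recipient, title, message, date).
# User 9 plays the role of $thisUser; dates are plain integers (HHMM).
SAMPLE_ROWS = [
    (1, 9, 1, "hi", "original post", 930),
    (2, 2, 9, "yo", "original post", 1000),
    (3, 9, 2, "yo", "reply", 1030),
    (4, 1, 9, "hi", "newest reply", 1100),
    (5, 1, 2, "other", "does not involve user 9", 1200),
]

INBOX_SQL = """
SELECT m.id
FROM user_mail AS m
JOIN (
    -- One row per conversation partner with the date of the latest message.
    SELECT MAX(date) AS conv_max_date,
           CASE WHEN author = :me THEN recipient ELSE author END AS other_user
    FROM user_mail
    WHERE author = :me OR recipient = :me
    GROUP BY CASE WHEN author = :me THEN recipient ELSE author END
) AS c
  ON (m.author = :me AND m.recipient = c.other_user)
  OR (m.recipient = :me AND m.author = c.other_user)
ORDER BY c.conv_max_date DESC, m.date DESC
"""


def make_db():
    """Create an in-memory table shaped like user_mail and load the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_mail (id INTEGER PRIMARY KEY, author INTEGER NOT NULL,"
        " recipient INTEGER NOT NULL, title TEXT NOT NULL,"
        " message TEXT NOT NULL, date INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO user_mail VALUES (?, ?, ?, ?, ?, ?)", SAMPLE_ROWS)
    return conn


def inbox_ids(conn, me):
    """Return message ids: newest conversation first, newest message first within each."""
    return [row[0] for row in conn.execute(INBOX_SQL, {"me": me})]


if __name__ == "__main__":
    print(inbox_ids(make_db(), 9))
```

With these sample rows the conversation with user 1 is listed first, because it contains the most recent message, and the message between users 1 and 2 is left out entirely.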


You will want indexes on (author, recipient, date) and (recipient, author, date), because MySQL can then use an index merge. Otherwise it will need a full table/index scan.
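The index definitions could look roughly like the following. The index names are hypothetical, and the sketch uses Python's sqlite3 module only to confirm that the statements are valid. Whether MySQL actually chooses an index merge depends on its optimizer, so it is worth checking with EXPLAIN on the real table.

```python
import sqlite3

# Hypothetical index names; the column order follows the suggestion above.
DDL = """
CREATE TABLE user_mail (
    id INTEGER PRIMARY KEY,
    author INTEGER NOT NULL,
    recipient INTEGER NOT NULL,
    title TEXT NOT NULL,
    message TEXT NOT NULL,
    date INTEGER NOT NULL
);
CREATE INDEX idx_mail_author_recipient_date ON user_mail (author, recipient, date);
CREATE INDEX idx_mail_recipient_author_date ON user_mail (recipient, author, date);
"""


def create_schema():
    """Create the table and both composite indexes in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn


def index_names(conn):
    """List the names of the indexes defined on user_mail, sorted alphabetically."""
    rows = conn.execute("PRAGMA index_list('user_mail')").fetchall()
    return sorted(row[1] for row in rows)


if __name__ == "__main__":
    print(index_names(create_schema()))
```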

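Because you don't know for each message whether $thisUser is the author or the recipient, you can use LEAST(author, recipient) and GREATEST(author, recipient) to identify a thread, and use them in the subquery's GROUP BY clause as well as in the join condition: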

SELECT m.* 
FROM user_mail m
JOIN (
    SELECT
        LEAST(author, recipient)    as user1,
        GREATEST(author, recipient) as user2,
        MAX(date) as date
    FROM user_mail
    WHERE $thisUser IN (author, recipient)
    GROUP BY user1, user2
) s ON  s.user1 = LEAST(m.author, m.recipient)
    AND s.user2 = GREATEST(m.author, m.recipient)
WHERE $thisUser IN (m.author, m.recipient)
ORDER BY
    s.date DESC,
    LEAST(m.author, m.recipient),
    GREATEST(m.author, m.recipient),
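    m.date DESC

But this will be slow on large data sets, because neither the GROUP BY clause nor the JOIN condition can use an index. I would make id an AUTO_INCREMENT primary key and use it instead of date. That way you can at least use the primary key index for the join, and the query also gets shorter: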

SELECT m.* 
FROM user_mail m
JOIN (
    SELECT MAX(id) as id
    FROM user_mail
    WHERE $thisUser IN (author, recipient)
    GROUP BY
        LEAST(author, recipient),
        GREATEST(author, recipient)
) s ON s.id = m.id
ORDER BY s.id DESC, m.id DESC

You can get even better performance with a UNION ALL optimization of the subquery:

SELECT m.* 
FROM user_mail m
JOIN (
    SELECT MAX(id) as id
    FROM (
        SELECT recipient as user, MAX(id) as id
        FROM user_mail
        WHERE author = $thisUser
        GROUP BY recipient
        UNION ALL
        SELECT author as user, MAX(id) as id
        FROM user_mail
        WHERE recipient = $thisUser
        GROUP BY author
    ) sub1
    GROUP BY user
) s ON s.id = m.id
ORDER BY s.id DESC, m.id DESC

For this query you should define composite indexes on (author, recipient) and (recipient, author).

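Update: Your comment is right: the last two queries only return the latest message of each conversation. The first one should return all messages, though.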
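The difference can be seen on a small data set. The sketch below is an illustration under stated assumptions rather than the original code: it uses Python's sqlite3 module in place of MySQL, SQLite's two-argument MIN/MAX in place of LEAST/GREATEST, a bound parameter :me in place of $thisUser, and hypothetical sample rows in which user 9 is the inbox owner.

```python
import sqlite3

# Hypothetical sample rows: (id, author, recipient, title, message, date).
SAMPLE_ROWS = [
    (1, 9, 1, "hi", "original post", 930),
    (2, 2, 9, "yo", "original post", 1000),
    (3, 9, 2, "yo", "reply", 1030),
    (4, 1, 9, "hi", "newest reply", 1100),
    (5, 1, 2, "other", "does not involve user 9", 1200),
]

# Joins on the (smaller id, larger id) pair, so every message of a thread is kept.
ALL_MESSAGES_SQL = """
SELECT m.id
FROM user_mail AS m
JOIN (
    SELECT MIN(author, recipient) AS user1,
           MAX(author, recipient) AS user2,
           MAX(date) AS last_date
    FROM user_mail
    WHERE :me IN (author, recipient)
    GROUP BY MIN(author, recipient), MAX(author, recipient)
) AS s
  ON s.user1 = MIN(m.author, m.recipient)
 AND s.user2 = MAX(m.author, m.recipient)
WHERE :me IN (m.author, m.recipient)
ORDER BY s.last_date DESC, s.user1, s.user2, m.date DESC
"""

# Joins on MAX(id), so only the single newest row of each thread survives.
LATEST_ONLY_SQL = """
SELECT m.id
FROM user_mail AS m
JOIN (
    SELECT MAX(id) AS id
    FROM user_mail
    WHERE :me IN (author, recipient)
    GROUP BY MIN(author, recipient), MAX(author, recipient)
) AS s ON s.id = m.id
ORDER BY m.id DESC
"""


def make_db():
    """Create an in-memory table shaped like user_mail and load the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_mail (id INTEGER PRIMARY KEY, author INTEGER NOT NULL,"
        " recipient INTEGER NOT NULL, title TEXT NOT NULL,"
        " message TEXT NOT NULL, date INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO user_mail VALUES (?, ?, ?, ?, ?, ?)", SAMPLE_ROWS)
    return conn


def message_ids(conn, sql, me):
    """Run one of the queries above and return the message ids in result order."""
    return [row[0] for row in conn.execute(sql, {"me": me})]


if __name__ == "__main__":
    db = make_db()
    print(message_ids(db, ALL_MESSAGES_SQL, 9))
    print(message_ids(db, LATEST_ONLY_SQL, 9))
```

Joining on the thread key keeps all four messages that involve user 9, while joining on MAX(id) keeps only one row per thread.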

However, here is the corrected version of the UNION ALL optimized query:

SELECT m.*, s.max_id
FROM user_mail m
JOIN (
    SELECT other_user, MAX(id) as max_id
    FROM (
        SELECT recipient as other_user, MAX(id) as id
        FROM user_mail
        WHERE author = $thisUser
        GROUP BY recipient
        UNION ALL
        SELECT author as other_user, MAX(id) as id
        FROM user_mail
        WHERE recipient = $thisUser
        GROUP BY author
    ) sub1
    GROUP BY other_user
) s ON s.other_user = m.recipient
WHERE m.author = $thisUser

UNION ALL

SELECT m.*, s.max_id
FROM user_mail m
JOIN (
    SELECT other_user, MAX(id) as max_id
    FROM (
        SELECT recipient as other_user, MAX(id) as id
        FROM user_mail
        WHERE author = $thisUser
        GROUP BY recipient
        UNION ALL
        SELECT author as other_user, MAX(id) as id
        FROM user_mail
        WHERE recipient = $thisUser
        GROUP BY author
    ) sub1
    GROUP BY other_user
) s ON s.other_user = m.author
WHERE m.recipient = $thisUser

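ORDER BY max_id DESC, id DESC

Although it looks huge, this query runs in under 20 ms on my test data set of one million rows, while the other solutions need 300-500 ms. Note that the subquery is identical in both parts. MySQL should be able to cache and reuse the result. To avoid code duplication, you can store the subquery in a string variable and reuse it. If you use MariaDB 10.2, you may also want to try a CTE.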
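Following up on the CTE suggestion, the duplicated subquery could be factored out roughly as shown below. This is a sketch rather than the answerer's code. It assumes SQLite (through Python's sqlite3 module) as a stand-in for a server with CTE support, a hypothetical CTE name partners, a bound parameter :me in place of $thisUser, and hypothetical sample rows in which user 9 is the inbox owner.

```python
import sqlite3

# Hypothetical sample rows: (id, author, recipient, title, message, date).
SAMPLE_ROWS = [
    (1, 9, 1, "hi", "original post", 930),
    (2, 2, 9, "yo", "original post", 1000),
    (3, 9, 2, "yo", "reply", 1030),
    (4, 1, 9, "hi", "newest reply", 1100),
    (5, 1, 2, "other", "does not involve user 9", 1200),
]

CTE_SQL = """
WITH partners AS (
    -- One row per conversation partner with the id of the newest message.
    SELECT other_user, MAX(id) AS max_id
    FROM (
        SELECT recipient AS other_user, MAX(id) AS id
        FROM user_mail
        WHERE author = :me
        GROUP BY recipient
        UNION ALL
        SELECT author AS other_user, MAX(id) AS id
        FROM user_mail
        WHERE recipient = :me
        GROUP BY author
    ) AS sub1
    GROUP BY other_user
)
SELECT m.id AS id, p.max_id AS max_id
FROM user_mail AS m
JOIN partners AS p ON p.other_user = m.recipient
WHERE m.author = :me
UNION ALL
SELECT m.id AS id, p.max_id AS max_id
FROM user_mail AS m
JOIN partners AS p ON p.other_user = m.author
WHERE m.recipient = :me
ORDER BY max_id DESC, id DESC
"""


def make_db():
    """Create an in-memory table shaped like user_mail and load the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_mail (id INTEGER PRIMARY KEY, author INTEGER NOT NULL,"
        " recipient INTEGER NOT NULL, title TEXT NOT NULL,"
        " message TEXT NOT NULL, date INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO user_mail VALUES (?, ?, ?, ?, ?, ?)", SAMPLE_ROWS)
    return conn


def inbox_ids(conn, me):
    """Return all message ids for `me`, threads ordered by their newest message."""
    return [row[0] for row in conn.execute(CTE_SQL, {"me": me})]


if __name__ == "__main__":
    print(inbox_ids(make_db(), 9))
```

The CTE is written once and referenced by both halves of the UNION ALL, which removes the duplicated subquery text. Whether a given server also evaluates it only once is up to its optimizer.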


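Also, don't forget to define the indexes on (author, recipient) and (recipient, author).

How do you define a thread? In the example you posted, it looks like any mail/message sent or received by User 1 would be mixed together.

Exactly, I just need grouping by user: mails to User 1 and from User 1, then mails to User 2 and from User 2, and so on. Not exactly in that order, though: if the User 2 correspondence/thread has a more recent mail, it should be on top.

Yes, but the number of replies is actually unknown. I only gave a schematic example, and there is no need to append strings to the result. The times are given to show which post should be on top; I will display the time for every post. I think the CASE statements except author.

By the way, since I didn't mention it: you used recepient in your query and recipient as the column name in the table definition. I chose the latter; if you actually use the other spelling, you will of course need to adjust it.

Thanks for your answer. I tried your last suggestion, but it didn't quite do what I expected: it only shows the latest message of each user thread. Your suggestion to use id instead of date is a good one, though, since I already have an auto-increment and unique index on id.

Sorry, I had to accept Solarflare's answer because it shows the results exactly as I asked.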