C#: Split comma-separated strings and select the top "SearchTags" ordered by count

Tags: c#, sql, sql-server, linq, linq-to-sql

I am working with the following data set:

ID  SearchTags
1   Cats,Birds,Dogs,Snakes,Roosters
2   Mice,Chickens,Cats,Lizards
3   Birds,Zebras,Sheep,Horses,Monkeys,Chimps
4   Lions,Tigers,Bears,Chickens
5   Cats,Goats,Pandas
6   Birds,Zebras,Sheep,Horses
7   Rats,Dogs,Hawks,Eagles,Tigers
8   Cats,Tigers,Dogs,Pandas
9   Dogs,Beavers,Sharks,Vultures
10  Cats,Bears,Bats,Leopards,Chickens
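I need to query for a list of the most popular search tags.

I have a query that returns the most popular SearchTags, but it counts each whole comma-separated list as a single value, which is what I would expect. Is it possible to split the SearchTags column on the comma and generate a list of the most-used tags, so that I end up with a list/count like this: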

Cats    5
Dogs    4
Chickens    3
Tigers  3
Bears   2
Sharks  1
etc...
instead of what I get now:

Cats,Birds,Dogs,Snakes,Roosters 1
Dogs,Beavers,Sharks,Vultures    1
Cats,Bears,Bats,Leopards,Chickens 1
etc...
Here is the query that returns the whole lists of words:

SELECT SearchTags, COUNT(*) AS TagCount
FROM Animals
GROUP BY SearchTags
ORDER BY TagCount DESC
I am using SQL Server. I would prefer a query, but I can create a stored procedure if needed.

Thanks for any help you can provide.

You have tagged the question with C# and LINQ. If you have the data in a DataTable, you can do the following:

DataTable dt = GetDataTableFromDB();
var query = dt.AsEnumerable()
               .Select(r => r.Field<string>("SearchTags").Split(','))
               .SelectMany(r => r)
               .GroupBy(r => r)
               .Select(grp => new
                   {
                       Key = grp.Key,
                       Count = grp.Count()
                   });
If you are using LINQ to SQL, you can do:

var query = db.YourTable
               .Select(r=> r.SearchTags)
               .AsEnumerable()
               .Where(r=> !string.IsNullOrWhiteSpace(r))
               .Select(r => r.Split(','))
               .SelectMany(r => r)
               .GroupBy(r => r)
               .Select(grp => new
                   {
                       Key = grp.Key,
                       Count = grp.Count()
                   });

This loads all the SearchTags values into memory, after which you can apply Split to them.
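To illustrate why the null/whitespace filter has to come before Split, here is a minimal in-memory sketch. The class name TagSplitter, the method SplitTags, and the extra trimming of individual tags are assumptions made for illustration; they are not part of the answer's code, and the input sequence stands in for whatever db.YourTable would return.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagSplitter
{
    // Hypothetical helper: flattens comma-separated rows into individual tags.
    // Null or whitespace-only rows are skipped first, because calling Split
    // on a null string throws a NullReferenceException.
    public static List<string> SplitTags(IEnumerable<string> rows)
    {
        return rows
            .Where(r => !string.IsNullOrWhiteSpace(r))
            .SelectMany(r => r.Split(','))
            .Select(tag => tag.Trim())       // tolerate input such as "Cats, Dogs"
            .Where(tag => tag.Length > 0)    // tolerate a trailing comma
            .ToList();
    }
}
```

Trimming each tag is optional for the sample data in the question, but it keeps "Cats" and " Cats" from being counted as different tags if the stored values contain spaces after the commas.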

You can also filter out null or empty SearchTags values on the database side, like this:

var query = db.YourTable
               .Where(r=> r.SearchTags != null && r.SearchTags.Trim() != "")
               .Select(r=> r.SearchTags)
               .AsEnumerable()
               .Select(r => r.Split(','))
               .SelectMany(r => r)
               .GroupBy(r => r)
               .Select(grp => new
                   {
                       Key = grp.Key,
                       Count = grp.Count()
                   });

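The query above filters out null, empty, and whitespace-only strings on the database side, so fewer rows are returned and it should work more efficiently.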

To also filter by date (here, the last 14 days), do the following:

DateTime dt = DateTime.Today.AddDays(-14);
var query = db.YourTable
               .Where(r=> r.SearchTags != null && 
                      r.SearchTags.Trim() != "" &&
                      r.MediaDate >= dt)
               .Select(r=> r.SearchTags)
               .AsEnumerable()
               .Select(r => r.Split(','))
               .SelectMany(r => r)
               .GroupBy(r => r)
               .Select(grp => new
                   {
                       Key = grp.Key,
                       Count = grp.Count()
                   });

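The queries above group and count the tags but do not sort them or limit the result, which is what the question asks for. A minimal sketch of that last step follows, using the question's sample rows held in memory. TagCounter, TopTags, SampleRows, and the tie-break on tag name are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagCounter
{
    // The sample rows from the question.
    public static readonly string[] SampleRows =
    {
        "Cats,Birds,Dogs,Snakes,Roosters",
        "Mice,Chickens,Cats,Lizards",
        "Birds,Zebras,Sheep,Horses,Monkeys,Chimps",
        "Lions,Tigers,Bears,Chickens",
        "Cats,Goats,Pandas",
        "Birds,Zebras,Sheep,Horses",
        "Rats,Dogs,Hawks,Eagles,Tigers",
        "Cats,Tigers,Dogs,Pandas",
        "Dogs,Beavers,Sharks,Vultures",
        "Cats,Bears,Bats,Leopards,Chickens"
    };

    // Hypothetical helper: counts each tag across all rows and returns the
    // n most frequent ones, highest count first. Ties are broken by tag name
    // (ordinal comparison) so that the result is deterministic.
    public static List<KeyValuePair<string, int>> TopTags(IEnumerable<string> rows, int n)
    {
        return rows
            .Where(r => !string.IsNullOrWhiteSpace(r))
            .SelectMany(r => r.Split(','))
            .GroupBy(tag => tag)
            .Select(grp => new KeyValuePair<string, int>(grp.Key, grp.Count()))
            .OrderByDescending(kv => kv.Value)
            .ThenBy(kv => kv.Key, StringComparer.Ordinal)
            .Take(n)
            .ToList();
    }
}
```

With the sample rows, TagCounter.TopTags(TagCounter.SampleRows, 3) yields Cats (5), Dogs (4), and Birds (3). In the LINQ to SQL versions, the same OrderByDescending and Take calls could be appended after the final Select, since that part of the query already runs in memory.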

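Assuming you want T-SQL:

There are many T-SQL functions for splitting strings, but in my experience the XQuery-based approach is by far the fastest compared with the many loop-based functions.

I use something similar in a production system against a table with 10-15K CSV values. It runs in a few seconds, whereas the old loop-based function sometimes took up to a minute.

Anyway, here is a quick demo to get you started.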

DECLARE @DATA TABLE (ID INT, SEARCHTAGS VARCHAR(100))
INSERT INTO @DATA
SELECT 1,'Cats,Birds,Dogs,Snakes,Roosters' UNION ALL
SELECT 2,'Mice,Chickens,Cats,Lizards' UNION ALL
SELECT 3,'Birds,Zebras,Sheep,Horses,Monkeys,Chimps' UNION ALL
SELECT 4,'Lions,Tigers,Bears,Chickens' UNION ALL
SELECT 5,'Cats,Goats,Pandas' UNION ALL
SELECT 6,'Birds,Zebras,Sheep,Horses' UNION ALL
SELECT 7,'Rats,Dogs,Hawks,Eagles,Tigers' UNION ALL
SELECT 8,'Cats,Tigers,Dogs,Pandas' UNION ALL
SELECT 9,'Dogs,Beavers,Sharks,Vultures' UNION ALL
SELECT 10,'Cats,Bears,Bats,Leopards,Chickens'

;WITH TagList AS
(
SELECT ID, Split.a.value('.', 'VARCHAR(max)') AS String
FROM  (SELECT ID, 
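              -- SEARCHTAGS is used directly: CAST(... AS VARCHAR) without a length defaults to 30 characters and would truncate longer rows
              CAST('<M>' + REPLACE(SEARCHTAGS, ',', '</M><M>') + '</M>' AS XML) AS String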
       FROM @DATA) AS A 
CROSS APPLY String.nodes ('/M') AS Split(a)
)

SELECT TOP (10) String, COUNT(*) AS [SearchCount]
FROM TagList
GROUP BY String
ORDER BY [SearchCount] DESC

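Note: anything involving string manipulation is almost always faster if you can do it in C#, so Habib's answer may well be more efficient than the T-SQL solution.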

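I tried to implement the LINQ to SQL example but I get "Object reference not set to an instance of an object". db.Animals.Select(r => r.SearchTags) returns results, so the first part of the example is OK. Any ideas?

@Maddhacker24, you probably have records in your database where SearchTags is null, so just add a filter after AsEnumerable. I have modified the LINQ to SQL code: .Where(r => !string.IsNullOrWhiteSpace(r))

Bingo, that was it. Thank you!

@Maddhacker24, you are welcome. I have also added another option that filters out null/empty string values on the database side. If many of your records have null search tags, this will give you better performance.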