Counting words in a column in Oracle SQL
How can I group this data by pattern? Is this possible in SQL?
CREATE TABLE ABC ("NAMES" VARCHAR2(50 BYTE));
INSERT INTO ABC (names) VALUES ('CA Apple 3');
INSERT INTO ABC (names) VALUES ('New Apple 4');
INSERT INTO ABC (names) VALUES ('Cra Apple 5');
INSERT INTO ABC (names) VALUES ('UK Apple 5c');
INSERT INTO ABC (names) VALUES ('Apple 6s');
INSERT INTO ABC (names) VALUES ('Apple 7');
INSERT INTO ABC (names) VALUES ('Apple x');
INSERT INTO ABC (names) VALUES ('az Apple xr');
INSERT INTO ABC (names) VALUES ('Apple xs');
INSERT INTO ABC (names) VALUES ('Motorola RIZR');
INSERT INTO ABC (names) VALUES ('eu Motorola RAZR');
INSERT INTO ABC (names) VALUES ('Motorola RoZR');
INSERT INTO ABC (names) VALUES ('Motorola RR');
INSERT INTO ABC (names) VALUES ('fin Motorola RIZ');
INSERT INTO ABC (names) VALUES ('Motorola R');
INSERT INTO ABC (names) VALUES ('sau Google Pixel');
INSERT INTO ABC (names) VALUES ('Google Pixel 2');
INSERT INTO ABC (names) VALUES ('Google Pixel 3');
INSERT INTO ABC (names) VALUES ('Samsung Galaxy');
INSERT INTO ABC (names) VALUES ('aus Samsung Galaxy 3');
INSERT INTO ABC (names) VALUES ('Samsung Small 2');
INSERT INTO ABC (names) VALUES ('Samsung Earth');
INSERT INTO ABC (names) VALUES ('ko Samsung Solar');
INSERT INTO ABC (names) VALUES ('Samsung Milky Way');
INSERT INTO ABC (names) VALUES ('Samsung Chill');
INSERT INTO ABC (names) VALUES ('Yi Apple Chill');
INSERT INTO ABC (names) VALUES ('In Apple');
INSERT INTO ABC (names) VALUES ('razy Motorola');
INSERT INTO ABC (names) VALUES ('Samsung');
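I have a table like this one; imagine it has 500,000 rows and 4,800 brand names.
Each of the 4,800 brand names can appear as the first, second, third, or last word.
One possible approach is to extract substrings and count them, then order by the count descending and keep only the rows where rownum < 4800.
What I need is the number of occurrences of each word, for example Apple, Samsung, Motorola.
The desired output looks like this:
If the pattern can be reduced to the first word of the name, then something like this: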
select
    case
        when names like '%_ %' then substring(names, 1, charindex(' ', names) - 1)
        else names
    end pattern,
    count(*) counter
from abc
group by
    case
        when names like '%_ %' then substring(names, 1, charindex(' ', names) - 1)
        else names
    end
This works in SQL Server.
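Since the question ended up tagged Oracle, where `SUBSTRING` and `CHARINDEX` are not available, a roughly equivalent first-word grouping could be sketched as follows. This is only an illustration, not part of the original answer. It assumes words are separated by spaces and uses `REGEXP_SUBSTR`, which returns the first run of non-space characters (the whole value when there is no space).

```sql
-- Oracle sketch: group rows by the first word of NAMES.
-- Assumption: words are separated by spaces.
SELECT REGEXP_SUBSTR(names, '[^ ]+') AS pattern,
       COUNT(*)                      AS counter
FROM   abc
GROUP  BY REGEXP_SUBSTR(names, '[^ ]+')
ORDER  BY counter DESC;
```

Like the SQL Server version, this only helps when the brand is the first word; rows such as 'CA Apple 3' are grouped under 'CA', not 'Apple'.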
If you have the keywords you want to look for, you can do a join like this:
select p.pat, count(*)
from abc join
     (select 'Motorola' as pat from dual union all
      select 'Samsung' from dual union all
      select 'Apple' from dual union all
      . . .
     ) p
     on abc.names like '%' || p.pat || '%'
group by p.pat
order by count(*) desc;
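Note that if a row matches more than one pattern, it is counted once for each pattern it matches.
Starting with SQL Server 2008, you can create a full-text index on the desired column. This assumes your table has a uniquely indexed column.
For example: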
CREATE UNIQUE INDEX uix_abc_id ON ABC(id);
CREATE FULLTEXT CATALOG ft AS DEFAULT;
CREATE FULLTEXT INDEX ON ABC(names)
KEY INDEX uix_abc_id
WITH STOPLIST = SYSTEM;
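This lets you query the number of occurrences of each word efficiently with the sys.dm_fts_index_keywords dynamic management function.
Common words are usually declared as stopwords; a stopword is not indexed and does not appear in the output of that function.
The answer for this dataset is as follows: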
select *
from (
  select x, count(*) as coun
  from (
    -- last word (or the whole value when there is no space)
    select substr(names, instr(names, ' ', -1, 1) + 1) as x
    from abc
    union all
    -- second word (NULL when the value has fewer than three words)
    select substr(names,
                  instr(names, ' ', 1, 1) + 1,
                  instr(names, ' ', 1, 2) - instr(names, ' ', 1, 1) - 1) as x
    from abc
    union all
    -- first word (NULL when there is no space)
    select substr(names, 1, instr(names, ' ', 1, 1) - 1) as x
    from abc
  )
  where x is not null
    and x not in ('1','2','3','4','5','6','7')
  group by x
  order by coun desc
)
where rownum < 4800;
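The query above only inspects the first, second, and last word of each name, so a brand sitting in the third position of a longer name would be missed. A more general variant, sketched here as one possible alternative and not as the original answer, splits every name into all of its words before counting. It assumes Oracle 11g or later (for `REGEXP_COUNT`), space-separated words, and at most 25 words per value, which is the most that fits in a `VARCHAR2(50 BYTE)` column.

```sql
-- Oracle sketch: count every space-separated word in ABC.NAMES.
-- Assumptions: Oracle 11g+ (REGEXP_COUNT), at most 25 words per value.
SELECT word, COUNT(*) AS cnt
FROM (
    -- one output row per (name, word position) pair
    SELECT REGEXP_SUBSTR(a.names, '[^ ]+', 1, n.pos) AS word
    FROM   abc a
    JOIN   (SELECT LEVEL AS pos FROM dual CONNECT BY LEVEL <= 25) n
      ON   n.pos <= REGEXP_COUNT(a.names, '[^ ]+')
)
GROUP  BY word
ORDER  BY cnt DESC;
```

The comparison is case-sensitive, and a word that appears twice in one row is counted twice. Model numbers such as '3' or '5c' also show up as words, so in practice the result would still need filtering, for instance against the list of known brand names.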
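Comments:
I don't have MySql on my system, but will it work if the keyword is the second or the last word?
I like your idea, but the Apple count is 11. I think another statement should be added.
As I said, it only works when the pattern is the first word, because the code cannot guess which part of the value is the pattern.
Which DBMS is this? You tagged both SQL Server and Oracle.
I would like to know the syntax for SQL Server or Oracle, because I use both. I have the table and can load it into either one.
SQL Server does not have varchar2, so I voted for Oracle and removed the SQL Server tag.
Is there any way to delete this question? I solved it in R (table, unlist, strsplit, and tolower on ABC$names), because the app will not let me post SQL questions.
@StackOne . . . You have to define the rules for finding the patterns and explain those rules in the question.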