SQL: How to query a set of data and count the matching strings from a given list of strings
I have built a simple question-and-answer system. In my database there are three tables:
CREATE TABLE question (
    id int,
    question varchar(200),
    answer_id int /* foreign key mapping to answer.id */
);
CREATE TABLE answer (
    id int,
    answer varchar(500)
);
CREATE TABLE question_elements (
    id int,
    seq int, /* position of the word within the question */
    question_id int, /* foreign key mapping to question.id */
    vocabulary varchar(40)
);
Now I have a question:
What approach should a company adopt when its debt ratio is higher than 50% and wanna continue to get funding ?
So, in the question table, the record is:
question {
id: 1,
question:"What approach should a company adopt when its debt ratio is higher than 50% and wanna continue to get funding ?",
answer_id:1
}
And in the question_elements table:
question_elements [
{
id: 1,
seq: 1,
question_id: 1,
vocabulary: "what"
},
{
id: 2,
seq: 2,
question_id: 1,
vocabulary: "approach"
},
{
id: 3,
seq: 3,
question_id: 1,
vocabulary: "should"
},
{
id: 4,
seq: 4,
question_id: 1,
vocabulary: "a"
},
{
id: 5,
seq: 5,
question_id: 1,
vocabulary: "company"
},
{
id: 6,
seq: 6,
question_id: 1,
vocabulary: "adopt"
},
{
id: 7,
seq: 7,
question_id: 1,
vocabulary: "when"
},
....
....
{
id: 19,
seq: 19,
question_id: 1,
vocabulary: "get"
},
{
id: 20,
seq: 20,
question_id: 1,
vocabulary: "funding"
}
]
Now, when a user enters:
What action does a company should do when it wanna get more funding with high debt ratio
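My idea is to split the sentence above into a list of strings and run a SQL query that, given that list, counts the matching strings in the question_elements table.
What would the SQL statement be in PostgreSQL?

If I understood correctly, you need a statement like this: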
WITH answer AS (
SELECT
'What action does a company should do when it wanna get more funding' AS a
),
question AS (
SELECT 'what' AS q
UNION ALL SELECT 'should'
UNION ALL SELECT 'a'
UNION ALL SELECT 'company'
UNION ALL SELECT 'do'
UNION ALL SELECT 'when'
)
SELECT COUNT(result)
FROM (
SELECT unnest(string_to_array(CAST(a AS VARCHAR),' ')) AS result
FROM answer
) AS tbaux
WHERE result IN (select CAST(q AS VARCHAR) FROM question);
A case-insensitive version of the final SELECT, with some explanation:
-- uses the same WITH clause (answer and question) as the query above
SELECT COUNT(result) -- counts the subquery rows that pass the WHERE filter
FROM (
    -- splits the user input into one word per row, using ' ' as the delimiter
    SELECT unnest(string_to_array(CAST(a AS VARCHAR), ' ')) AS result
    FROM answer
) AS tbaux -- alias of the subquery
-- upper() converts both sides to upper case, so only rows that match regardless of case are counted
WHERE upper(result) IN (SELECT upper(CAST(q AS VARCHAR)) FROM question);
This counts how many of the words entered by the user appear in the question table (question_elements in your case).
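The query above uses a hand-written question CTE as a stand-in for the real table. A minimal sketch of the same idea run against question_elements itself could look like the following. It assumes the table and column names from the question; lower-casing both sides and counting each distinct word once are choices made for this illustration, not something the question specifies.

```sql
-- Sketch: for each question, count how many distinct words of the user
-- input appear in question_elements.vocabulary.
WITH input_words AS (
    -- one lower-cased word per row; DISTINCT so a repeated input word counts once
    SELECT DISTINCT lower(w) AS word
    FROM unnest(string_to_array(
        'What action does a company should do when it wanna get more funding with high debt ratio',
        ' ')) AS t(w)
)
SELECT qe.question_id,
       COUNT(DISTINCT lower(qe.vocabulary)) AS matches
FROM question_elements AS qe
JOIN input_words AS iw ON lower(qe.vocabulary) = iw.word
GROUP BY qe.question_id
ORDER BY matches DESC;
```

Ordering by matches puts the stored question that shares the most words with the input first; joining the result to question and answer would then retrieve the corresponding answer.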

The question_elements table is not needed:
with ui(ui) as (
values ('What action does a company should do when it wanna get more funding with high debt ratio')
)
select id, count(*) as matches, question
from
(
select id, question, regexp_split_to_table(question, '\s+') as word
from question
) q
inner join
regexp_split_to_table((select ui from ui), '\s+') ui(word) using (word)
group by 1, 3
order by matches desc
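Splitting on '\s+' keeps case and punctuation, so tokens such as "50%" or a trailing "?" are compared literally and "what" would not match "What". A possible variant, sketched here under the assumption that case and punctuation should be ignored and that each shared word should count once per question, splits on runs of non-word characters instead. The CTE names q_words and ui_words are made up for this example.

```sql
WITH ui(ui) AS (
    VALUES ('What action does a company should do when it wanna get more funding with high debt ratio')
),
q_words AS (
    -- distinct lower-cased tokens per stored question; '\W+' splits on non-word characters
    SELECT DISTINCT q.id, q.question, lower(t.w) AS word
    FROM question AS q
    CROSS JOIN LATERAL regexp_split_to_table(q.question, '\W+') AS t(w)
    WHERE t.w <> ''  -- drop the empty token left by leading or trailing punctuation
),
ui_words AS (
    -- distinct lower-cased tokens of the user input
    SELECT DISTINCT lower(t.w) AS word
    FROM ui
    CROSS JOIN LATERAL regexp_split_to_table(ui.ui, '\W+') AS t(w)
    WHERE t.w <> ''
)
SELECT id, COUNT(*) AS matches, question
FROM q_words
JOIN ui_words USING (word)
GROUP BY id, question
ORDER BY matches DESC;
```

Because both word lists are made distinct before the join, a word that occurs several times in either sentence no longer inflates the match count.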
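
Are you using a JSON field, or is that just the way you are showing us the data?
It looks like you have two problems: one is performing the split, and the other is finding out how many matches there are.
What output columns are you looking for? Could they contain duplicates?
I think I understood what you mean; please try my answer when you can.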