SQL: How to query a set of data and count how many strings from a given list match

sql postgresql

I have built a simple Q&A system.

My database has three tables:

CREATE TABLE question (
  id         int,
  question   varchar(200),
  answer_id  int  /* foreign key mapping to answer.id */
);

CREATE TABLE answer (
  id      int,
  answer  varchar(500)
);

CREATE TABLE question_elements (
    id           int,
    seq          int,  /* position of the word within the question */
    question_id  int,  /* foreign key mapping to question.id */
    vocabulary   varchar(40)
);

Now I have a question:

What approach should a company adopt when its debt ratio is higher than 50% and wanna continue to get funding ?

So the record in the question table is:

question {
  id: 1,
  question:"What approach should a company adopt when its debt ratio is higher than 50% and wanna continue to get funding ?",
  answer_id:1
}

And the records in the question_elements table are:

question_elements [
  {
    id: 1,
    seq: 1,
    question_id: 1,
    vocabulary: "what"
  },
  {
    id: 2,
    seq: 2,
    question_id: 1,
    vocabulary: "approach"
  },
  {
    id: 3,
    seq: 3,
    question_id: 1,
    vocabulary: "should"
  },
  {
    id: 4,
    seq: 4,
    question_id: 1,
    vocabulary: "a"
  },
  {
    id: 5,
    seq: 5,
    question_id: 1,
    vocabulary: "company"
  },
  {
    id: 6,
    seq: 6,
    question_id: 1,
    vocabulary: "adopt"
  },
  {
    id: 7,
    seq: 7,
    question_id: 1,
    vocabulary: "when"
  },
  ....
  ....
  {
    id: 19,
    seq: 19,
    question_id: 1,
    vocabulary: "get"
  },
  {
    id: 20,
    seq: 20,
    question_id: 1,
    vocabulary: "funding"
  }
]

Now, when a user enters:

What action does a company should do when it wanna get more funding with high debt ratio

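My idea is to split the sentence above into a list of strings and run a SQL query that, given that list, counts how many of the strings match entries in the question_elements table.

What would the SQL statement be in PostgreSQL?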

If I understood correctly, you need a statement like this:

WITH answer AS (
    SELECT 
        'What action does a company should do when it wanna get more funding' AS a
),
question AS (
    SELECT 'what' AS q
    UNION ALL SELECT 'should'
    UNION ALL SELECT 'a'
    UNION ALL SELECT 'company'
    UNION ALL SELECT 'do'
    UNION ALL SELECT 'when'
)
SELECT COUNT(result)
FROM (
    SELECT unnest(string_to_array(CAST(a AS VARCHAR),' ')) AS result
    FROM answer
) AS tbaux
WHERE result IN (select CAST(q AS VARCHAR) FROM question);

A version that ignores letter case, with some explanation:

-- Uses the same "answer" and "question" CTEs as the statement above.
SELECT COUNT(result)                -- counts the subquery rows that survive the WHERE filter
FROM (
    SELECT unnest(string_to_array(CAST(a AS VARCHAR), ' ')) AS result    -- splits the user input into one word per row, dropping the ' ' separators
    FROM answer
) AS tbaux                          -- alias of the subquery
WHERE upper(result) IN (SELECT upper(CAST(q AS VARCHAR)) FROM question); -- upper() makes the comparison case-insensitive; only matching rows reach COUNT()

This counts how many of the words entered by the user appear in the question table (in your case, question_elements).
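
The same idea can be pointed at the real question_elements table instead of an inline word list. The following is only a sketch under some assumptions: the user input is passed as a literal (in an application it would be a bound parameter), matching ignores case, and each distinct word is counted once per question. The CTE name input_words and the aliases qe and iw are made up for illustration.

```sql
-- Sketch: count, per question, how many distinct words of the user input
-- also appear in question_elements.vocabulary (case-insensitive).
WITH input_words AS (
    -- one row per distinct lower-cased word of the user input
    SELECT DISTINCT lower(w) AS word
    FROM regexp_split_to_table(
             'What action does a company should do when it wanna get more funding with high debt ratio',
             '\s+') AS w
    WHERE w <> ''
)
SELECT qe.question_id,
       COUNT(DISTINCT lower(qe.vocabulary)) AS matches
FROM question_elements AS qe
JOIN input_words AS iw
  ON lower(qe.vocabulary) = iw.word
GROUP BY qe.question_id
ORDER BY matches DESC;
```

Ordering by matches puts the most similar stored question first; joining the result to question and answer on question_id and answer_id would then retrieve the answer text.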

The question_elements table is not needed:
with ui(ui) as (
    values ('What action does a company should do when it wanna get more funding with high debt ratio')
)
select id, count(*) as matches, question
from
    (
        select id, question, regexp_split_to_table(question, '\s+') as word
        from question
    ) q
    inner join
    regexp_split_to_table((select ui from ui), '\s+') ui(word) using (word)
group by 1, 3
order by matches desc
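
Note that the join above compares words exactly as written, so "What" and "what" would not match, and a word repeated in a question is counted once per repetition. A possible variant, assuming PostgreSQL 9.3 or later (for LATERAL) and assuming case-insensitive, de-duplicated matching is what is wanted, might look like the sketch below. The alias qw is made up for illustration.

```sql
-- Sketch: same approach, but lower-casing and de-duplicating words on both sides.
WITH ui(word) AS (
    SELECT DISTINCT lower(w)
    FROM regexp_split_to_table(
             'What action does a company should do when it wanna get more funding with high debt ratio',
             '\s+') AS w
    WHERE w <> ''
)
SELECT q.id, q.question, count(*) AS matches
FROM question AS q
CROSS JOIN LATERAL (
    -- distinct lower-cased words of this particular question
    SELECT DISTINCT lower(w) AS word
    FROM regexp_split_to_table(q.question, '\s+') AS w
    WHERE w <> ''
) AS qw
JOIN ui USING (word)
GROUP BY q.id, q.question
ORDER BY matches DESC;
```

Punctuation is still part of the tokens here (for example "50%" or a standalone "?"), so stripping it beforehand may be worthwhile depending on the data.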

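Are you using a JSON field, or is that just how you are showing us the data?

It looks like you have two problems: one is splitting the input into words, the other is counting how many of them match.

What output columns are you looking for?

Possible duplicate?

If you think I understood you correctly, please try my answer when you can.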