In PostgreSQL, how do I count values based on specific criteria within a string?

My table has a criteria column, and each row contains text like this:

Inclusion Criteria:

-  Female

-  > 40 years of age

-  Women who have first-degree relative suffered from breast cancer

-  Women who have first-degree relative suffered from ovarian cancer

-  Family history of male breast cancer

-  Family history of breast cancer (not necessarily first degree relatives) diagnosed before age of 40.

-  Family history of breast cancer (not necessarily first degree relatives) affecting 2 or more family members

-  Personal history of ovarian cancer

-  Personal history of premalignant conditions of breast and ovary

Exclusion Criteria:

     - Women with mammogram within one year
     -  adults aged 50-75

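I need to find the counts of inclusion and exclusion criteria in PostgreSQL. For example, here there are 9 inclusion criteria and 2 exclusion criteria.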

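Do you mean that all of the above appears in a single column?

If so, you can use regular-expression pattern matching to search from the string "Inclusion Criteria:" to the string "Exclusion Criteria:" and count the list items in between.

Regex will sort you out.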
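The regex suggestion above can be sketched as a plain query, without any stored procedure. This is only a minimal sketch under stated assumptions: the CTE name trials and its columns id and criteria are hypothetical, the sample text is shortened to two items per section, and every criterion is assumed to start on its own line with a dash followed by whitespace.

```sql
-- Hypothetical sample data; in practice this would be your own table.
WITH trials(id, criteria) AS (
    VALUES (1, E'Inclusion Criteria:\n\n-  Female\n\n-  > 40 years of age\n\nExclusion Criteria:\n\n     - Women with mammogram within one year\n     -  adults aged 50-75')
)
SELECT id,
       -- Text before the "Exclusion Criteria:" heading holds the inclusion list.
       -- 'g' returns every match, 'n' lets ^ match at the start of each line.
       (SELECT count(*)
          FROM regexp_matches(split_part(criteria, 'Exclusion Criteria:', 1),
                              '^\s*-\s', 'gn')) AS inclusion_count,
       -- Text after the heading holds the exclusion list.
       (SELECT count(*)
          FROM regexp_matches(split_part(criteria, 'Exclusion Criteria:', 2),
                              '^\s*-\s', 'gn')) AS exclusion_count
FROM trials;
```

Anchoring the pattern at the start of a line means hyphens inside the text, such as "first-degree" or "50-75", are not counted as list items.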

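You can create a stored procedure in PL/pgSQL to do the parsing and separation. Once you have it, you can call it with SELECT on a string or a cell, just like any other PostgreSQL function.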

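If you want to return both values (inclusions and exclusions) in a single operation, the simplest way is to create a table that defines their names and types, like this: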

CREATE TABLE condition_counts (
  num_of_inclusions INTEGER,
  num_of_exclusions INTEGER
);

You can then use it in the stored procedure definition, like this:

CREATE OR REPLACE FUNCTION parse_conditions(conditions VARCHAR) RETURNS condition_counts AS $$
DECLARE
    condition_matches VARCHAR[];
    parsed_conditions condition_counts%ROWTYPE;
BEGIN
    -- Capture the text after each heading: [1] = inclusion list, [2] = exclusion list.
    -- Without the 'g' flag, regexp_matches yields at most one row (NULL here if the headings are missing).
    condition_matches := regexp_matches(conditions,
        E'^Inclusion Criteria:\\s*(.*)\\s*Exclusion Criteria:\\s*(.*)$');
    -- Split each list on "newline, optional whitespace, dash" and count the pieces.
    SELECT array_length(regexp_split_to_array(condition_matches[1], E'\\n\\s*-\\s*'), 1),
           array_length(regexp_split_to_array(condition_matches[2], E'\\n\\s*-\\s*'), 1)
      INTO parsed_conditions.num_of_inclusions, parsed_conditions.num_of_exclusions;
    RETURN parsed_conditions;
END;
$$ LANGUAGE plpgsql;

Now you can call it on the sample string provided, like this:

SELECT * FROM parse_conditions('Inclusion Criteria:

-  Female

-  > 40 years of age

-  Women who have first-degree relative suffered from breast cancer

-  Women who have first-degree relative suffered from ovarian cancer

-  Family history of male breast cancer

-  Family history of breast cancer (not necessarily first degree relatives) diagnosed before age of 40.

-  Family history of breast cancer (not necessarily first degree relatives) affecting 2 or more family members

-  Personal history of ovarian cancer

-  Personal history of premalignant conditions of breast and ovary

Exclusion Criteria:

     - Women with mammogram within one year
     -  adults aged 50-75');

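and it will return counts of 9 and 2, as expected. You can also run SELECT parse_conditions(columnname) FROM tablename and various other combinations, as is normal for PostgreSQL functions.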
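To apply the function to a whole table and get the two counts as separate columns, something like the following might work. It is a sketch that assumes the parse_conditions function and the condition_counts table above already exist; the table name trials and its columns id and criteria are hypothetical stand-ins for your own schema.

```sql
-- trials(id, criteria) is a hypothetical table; substitute your own names.
-- LATERAL lets the function in FROM refer to the current row of t,
-- so parse_conditions is evaluated once per row.
SELECT t.id,
       c.num_of_inclusions,
       c.num_of_exclusions
FROM trials AS t
CROSS JOIN LATERAL parse_conditions(t.criteria) AS c;
```

Calling the function in the FROM clause exposes the fields of the returned row as ordinary columns, which tends to be easier to filter or aggregate than a single composite value.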
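
So this is really a text-processing/pattern-matching/parsing problem rather than a database problem as such. In your example, is the whole text in a single row? Or do the different lines here represent different rows?

@oto: Different lines represent different rows.

I used the code array_length(string_to_array(substring(lower(criteria) from 'inclusion(.+)exclusion'), '-'), 1) - 1 as cnt. Is there a better solution?

@user322101 - OK, do you have any other column (e.g. id or timestamp or something similar) that determines the order of these rows?

See the comment from @Feneric above.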