Split comma-separated values with line breaks and empty values into table columns via a PL/SQL procedure

sql oracle plsql regexp-substr

I have a string CLOB value in a table that needs to be split into columns. The source table query:

Insert into disp_data(id,data) values(100,
'"Project title as per the outstanding Requirements","The values are not with respect to the requirement and analysis done by the team. 
Also it is difficult to prepare a scenario notwithstanding the fact it is difficult. This user story is going to be slightly complex however it is up to the team","Active","Disabled","25 tonnes of fuel","www.examplesites.com/html.asp&net;","","","","","25"');

The CLOB column value also contains spaces, empty values, and line breaks. So when I try to use

select regexp_substr(data,'[^,]+',1,level) from disp_data 
connect by regexp_substr(data,'[^,]+',1,level) is not null;

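the problem is that, for large text containing line breaks, it splits the text into separate rows. I thought about combining the result set above with PIVOT, but could not get it to work.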

I need to get this data as columns and push it into the target table push_data_temp:

select pid,col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11 from push_data_temp;

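The CLOB column has 11 comma-separated values, which need to be pushed into that table as columns. The whole process has to be done through a PL/SQL procedure.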

The result in push_data_temp should hold one row per source id, with the 11 values in col1 to col11.

Any help would be much appreciated. The database is Oracle 19c.

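Your regular expression needs to allow for empty elements, i.e. consecutive commas; but hopefully you do not have commas inside any of the quoted strings. If you have multiple source rows, it is easier to do the split with a recursive CTE: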

with rcte (id, data, lvl, result) as (
  select id, data, 1, regexp_substr(data, '(.*?)(,|$)', 1, 1, null, 1)
  from disp_data
  union all
  select id, data, lvl + 1, regexp_substr(data, '(.*?)(,|$)', 1, lvl + 1, null, 1)
  from rcte
  where lvl <= regexp_count(data, ',')
)
select id, lvl, result
from rcte
order by id, lvl;
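
To see why the pattern matters, the sketch below compares the pattern from the question with the one used in the recursive CTE, on a made-up string that has one empty element. The literal 'a,,c' is only an illustration and is not taken from the question's data.

```sql
-- Hypothetical sample string 'a,,c': three elements, the middle one empty.
select level as lvl,
       -- pattern from the question: one or more non-comma characters
       regexp_substr('a,,c', '[^,]+', 1, level)               as skips_empty,
       -- pattern from the answer: lazy match up to a comma or end of string,
       -- returning only the first subexpression
       regexp_substr('a,,c', '(.*?)(,|$)', 1, level, null, 1) as keeps_empty
from dual
connect by level <= regexp_count('a,,c', ',') + 1;

-- lvl 1: skips_empty = a,      keeps_empty = a
-- lvl 2: skips_empty = c,      keeps_empty = (null)
-- lvl 3: skips_empty = (null), keeps_empty = c
```

With '[^,]+' the empty element disappears and every later value shifts one position to the left, which would put values into the wrong columns after a pivot.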

The dbfiddle demo leaves out the CLOB and connect by examples, because they break it.

"When I try to add commas within the text, it splits the data unevenly"

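That is why I said that hopefully you do not have commas inside any of the quoted strings. Since you do not have any truly empty elements (you have ...,"",... rather than ...,,...), I think you can sidestep those problems by using a different pattern: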

with rcte (id, data, lvl, result) as (
  select id, data, 1,
    cast(regexp_substr(data, '("[^"]*"|[^,]+)', 1, 1, null, 1) as varchar2(4000))
  from disp_data
  union all
  select id, data, lvl + 1,
    cast(regexp_substr(data, '("[^"]*"|[^,]+)', 1, lvl + 1, null, 1) as varchar2(4000))
  from rcte
  where lvl <= regexp_count(data, '("[^"]*"|[^,]+)')
)
...
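
As a small check of the quoted-string pattern, the following sketch runs it against a made-up string whose second value contains a comma inside the quotes. The CTE name sample and its contents are hypothetical and only serve the illustration.

```sql
-- Hypothetical sample row: the second value has a comma inside the quotes,
-- and the third value is an empty quoted string.
with sample (data) as (
  select '"a","b, with comma","","d"' from dual
)
select level as lvl,
       -- either a complete quoted string or a run of non-comma characters
       regexp_substr(data, '("[^"]*"|[^,]+)', 1, level, null, 1) as result
from sample
connect by level <= regexp_count(data, '("[^"]*"|[^,]+)');

-- lvl 1: "a"
-- lvl 2: "b, with comma"
-- lvl 3: ""
-- lvl 4: "d"
```

The quoted alternative consumes everything up to the closing quote, so the embedded comma no longer acts as a separator. A plain connect by is enough here only because the sample has a single row.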

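If you did have to handle truly empty elements, that would still be possible. This pattern also does not handle escaped double quotes inside the strings. At some point it becomes easier to write your own parser in PL/SQL, or even to write the data out to disk and read it back as an external table, which can handle all of that for you.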

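Another option is a polymorphic table function, which you can use to do the conversion dynamically (see the csv_pkg package and the csv_to_columns function further down).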

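Are you only stuck on the pivot, and splitting the data into rows is already fine?

No, the splitting does not work either.

If I am not mistaken, the answer already gives you the solution.

@Vini - I used varchar2 because dbfiddle breaks with CLOBs... you just need to cast the elements first. I have added that to my answer.

Hi, thanks. One more doubt, if you have the time: I need to do this for larger text, but when I try to add commas within the text, it splits the data unevenly. I have edited the data in the SQL Fiddle. Could you take a look?

For the col2 text, I added two commas inside the text. The result you see on the right is probably wrong because the commas cause the data to be split incorrectly. Can this be handled? I need to modify the query so that the text can contain commas.

@Vini - that is why I said that hopefully there are no commas inside any of the quoted strings, since there were none in the question's example. Updated with a version that works with the new example.

I know, they added the requirement at the last minute. Thanks a lot all the same!!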

insert into push_data_temp (pid,col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11)
with rcte (id, data, lvl, result) as (
  select id, data, 1, regexp_substr(data, '(.*?)(,|$)', 1, 1, null, 1)
  from disp_data
  union all
  select id, data, lvl + 1, regexp_substr(data, '(.*?)(,|$)', 1, lvl + 1, null, 1)
  from rcte
  where lvl <= regexp_count(data, ',')
)
select *
from (
  select id, lvl, result
  from rcte
)
pivot (max(result) as col for (lvl) in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11));
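
Since the question asks for the whole load to run from a PL/SQL procedure, the statement above could be wrapped roughly as follows. This is a sketch under assumptions: the procedure name load_push_data_temp is hypothetical, col1 to col11 are assumed to be string columns wide enough for each value, and the quoted-string pattern with the varchar2 cast from this thread is used so that the pivot does not aggregate a CLOB.

```sql
-- Sketch only: load_push_data_temp is a hypothetical name.
create or replace procedure load_push_data_temp as
begin
  insert into push_data_temp
    (pid, col1, col2, col3, col4, col5, col6, col7, col8, col9, col10, col11)
  with rcte (id, data, lvl, result) as (
    -- anchor member: first element of every source row
    select id, data, 1,
      cast(regexp_substr(data, '("[^"]*"|[^,]+)', 1, 1, null, 1) as varchar2(4000))
    from disp_data
    union all
    -- recursive member: next element, same stopping condition as in the answer
    select id, data, lvl + 1,
      cast(regexp_substr(data, '("[^"]*"|[^,]+)', 1, lvl + 1, null, 1) as varchar2(4000))
    from rcte
    where lvl <= regexp_count(data, '("[^"]*"|[^,]+)')
  )
  select *
  from (
    select id, lvl, result
    from rcte
  )
  -- one output column per element position; positions above 11 are ignored
  pivot (max(result) as col for (lvl) in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11));

  commit;
end load_push_data_temp;
/
```

It would then be called with `exec load_push_data_temp` from SQL*Plus or SQLcl. Whether to commit inside the procedure or leave that to the caller is a design choice; the commit is included here only to keep the sketch complete.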

with rcte (id, data, lvl, result) as (
  select id, data, 1,
    cast(regexp_substr(data, '(.*?)(,|$)', 1, 1, null, 1) as varchar2(4000))
  from disp_data
  union all
  select id, data, lvl + 1,
    cast(regexp_substr(data, '(.*?)(,|$)', 1, lvl + 1, null, 1) as varchar2(4000))
  from rcte
  where lvl <= regexp_count(data, ',')
)
...
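
The question's values contain line breaks, and according to the Oracle documentation the period in a regular expression does not match the newline character unless the 'n' match parameter is supplied. The sketch below is one way to check how the '(.*?)(,|$)' pattern behaves with and without that flag; the sample string built with chr(10) is made up for the test.

```sql
-- Hypothetical two-line value: chr(10) stands in for a line break in the data.
select
  -- no match parameter: '.' stops at the newline
  regexp_substr('"x' || chr(10) || 'y","z"', '(.*?)(,|$)', 1, 1, null, 1) as without_n,
  -- 'n' lets '.' match the newline as well
  regexp_substr('"x' || chr(10) || 'y","z"', '(.*?)(,|$)', 1, 1, 'n', 1)  as with_n
from dual;
```

If the two columns differ on your data, passing 'n' instead of null in the recursive query may be needed. The '("[^"]*"|[^,]+)' pattern is not affected, because a negated bracket expression matches newlines anyway.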

create table disp_data (
  id int, data varchar2(1000)
);
Insert into disp_data(id,data) values(100,
'"Project title as per the outstanding Requirements","The values are not with respect to the requirement and analysis done by the team. 
Also it is difficult to prepare a scenario notwithstanding the fact it is difficult. This user story is going to be slightly complex however it is up to the team","Active","Disabled","25 tonnes of fuel","www.examplesites.com/html.asp&net;","","","","","25"');
commit;

create or replace package csv_pkg as  
  /* The describe function defines the new columns */  
  function describe (  
    tab in out dbms_tf.table_t,  
    col_names varchar2  
  ) return dbms_tf.describe_t;  
  
  /* Fetch_rows sets the values for the new columns */  
  procedure fetch_rows (col_names varchar2);  
end csv_pkg;  
/

create or replace package body csv_pkg as  
  function describe(  
    tab in out dbms_tf.table_t,  
    col_names varchar2  
  )   
    return dbms_tf.describe_t as  
    new_cols dbms_tf.columns_new_t;  
    col_id   pls_integer := 2;  
  begin   
    
    /* Enable the source column for reading */  
    tab.column(1).pass_through := FALSE;  
    tab.column(1).for_read     := TRUE;  
    new_cols(1) := tab.column(1).description;  
      
    /* Extract the column names from the header string,  
       creating a new column for each   
     */  
    for j in 1 .. ( length(col_names) - length(replace(col_names,',')) ) + 1 loop   
      new_cols(col_id) := dbms_tf.column_metadata_t(  
        name=>regexp_substr(col_names, '[^,]+', 1, j),  
        type=>dbms_tf.type_varchar2  
      );  
      col_id := col_id + 1;  
    end loop;  
    
    return dbms_tf.describe_t( new_columns => new_cols );  
  end;  
  
  procedure fetch_rows (col_names varchar2) as   
    rowset    dbms_tf.row_set_t;  
    row_count pls_integer;  
  begin  
    /* read the input data set */  
    dbms_tf.get_row_set(rowset, row_count => row_count);  
      
    /* Loop through the input rows... */  
    for i in 1 .. row_count loop  
      /* ...and the defined columns, extracting the relevant value   
         start from 2 to skip the input string  
      */  
      for j in 2 .. ( length(col_names) - length(replace(col_names,',')) ) + 2 loop  
        rowset(j).tab_varchar2(i) :=   
          regexp_substr(rowset(1).tab_varchar2(i), '[^,]+', 1, j - 1);  
      end loop;  
    end loop;  
      
    /* Output the new columns and their values */  
    dbms_tf.put_row_set(rowset);  
      
  end;  
    
end csv_pkg; 
/

create or replace function csv_to_columns(  
  tab table, col_names varchar2  
) return table pipelined row polymorphic using csv_pkg; 
/

with rws as (
  select data from disp_data
)
select c1, c2, c4, c4, c5, c6, c11
from   csv_to_columns ( 
  rws, 'c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11'
);

C1                   C2                             C4         C4         C5         C6                   C11       
-------------------- ------------------------------ ---------- ---------- ---------- -------------------- ----------
"Project title as pe "The values are not with respe "Disabled" "Disabled" "25 tonnes "www.examplesites.co "25"      
r the outstanding Re ct to the requirement and anal                        of fuel"  m/html.asp&net;"               
quirements"          ysis done by the team.                                                                         
                     Also it is difficult to prepar                                                                 
                     e a scenario notwithstanding t                                                                 
                     he fact it is difficult. This                                                                  
                     user story is going to be slig                                                                 
                     htly complex however it is up                                                                  
                     to the team"
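
The values in the output above still carry their surrounding double quotes. If the target columns should hold the bare text, one possible approach is to wrap each selected column in TRIM; the sketch below shows the idea on literals rather than on the real columns.

```sql
-- TRIM removes the given single character from both ends of the string.
select trim(both '"' from '"Active"') as active_value,
       trim(both '"' from '""')       as empty_value
from dual;

-- active_value = Active
-- empty_value  = (null), because Oracle treats an empty string as NULL
```

Applied to the query above, this would look like `trim(both '"' from c1)` for each column. Note that it would also strip a quote character that legitimately starts or ends a value.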