How do I merge two lines of data into one line in a Groovy script / NiFi?


My data is unstructured: in some records the end of the last column is stored on a second line, as shown below.

UID|Name|ID|Mail
1|Ester|991|sd
gmail
2|Siva|992|siva
hotmail
3|Hari|993|hi gmail
Some records in the data are complete, but others are split across two lines. I want to convert every split record into a single line, as shown below.

UID|Name|ID|Mail
1|Ester|991|sd gmail
2|Siva|992|siva hotmail
3|Hari|993|hi gmail
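I don't know which NiFi processor could help with this transformation.

I tried the following Groovy script to read the lines, but I could not find a way to combine the split lines into one.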

import org.apache.nifi.processor.io.StreamCallback

def flowfile = session.get()
if (!flowfile) return
flowfile = session.write(flowfile, { rawIn, rawOut ->
    // wrap the raw streams in a reader and a writer
    rawIn.withReader("UTF-8") { reader ->
        rawOut.withWriter("UTF-8") { writer ->
            reader.eachLine { line, lineNum ->
                // copy every non-empty line through unchanged
                if (!line.isEmpty()) {
                    writer << line << '\n'
                }
            }
        }
    }
} as StreamCallback)
session.transfer(flowfile, REL_SUCCESS)
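I assume that the first line, the header, cannot contain a line break, so it gives the expected number of delimiters.

For each following line, I just check the delimiter count and decide whether to write a new line before it.

Note, however, that this algorithm works only when the line break falls inside the last column. A break in an earlier column leaves the first fragment with too few delimiters, so it would be appended to the previous record.

Code snippet: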

def reader = new StringReader('''UID|Name|ID|Mail
1|Ester|991|sd
gmail
2|Siva|992|siva
hotmail
3|Hari|993|hi gmail''')

def writer = new StringWriter()

def delimCount = 0
reader.eachWithIndex{line,id->
    if(id==0){
        //let's count delims in header
        delimCount = line.count('|')
        //write header as is
        writer << line
    }else{
        if( line.count('|')==delimCount ){
            writer << '\n' //write new line
        }else{
            writer << ' ' //write space to continue previous line
        }
        writer << line
    }
}

println writer.toString()
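
To use the same idea outside a standalone test, the delimiter check can be wrapped in a function that takes any Reader and Writer. The following is only a sketch under the same assumptions as above (the header has no line break, and breaks occur only in the last column). The name mergeSplitLines and the delim parameter are hypothetical, introduced here for illustration, and skipping blank lines is carried over from the question's script.

```groovy
// Hypothetical helper (not part of the original answer): merges continuation
// lines into the preceding record, based on the header's delimiter count.
def mergeSplitLines(Reader reader, Writer writer, String delim = '|') {
    int delimCount = -1                      // unknown until the header is read
    reader.eachLine { String line ->
        if (line.trim().isEmpty()) return    // skip blank lines
        if (delimCount < 0) {
            delimCount = line.count(delim)   // header defines the expected count
            writer << line
        } else if (line.count(delim) == delimCount) {
            writer << '\n' << line           // complete record: start a new line
        } else {
            writer << ' ' << line            // continuation: append to previous record
        }
    }
}

def input = '''UID|Name|ID|Mail
1|Ester|991|sd
gmail
2|Siva|992|siva
hotmail
3|Hari|993|hi gmail'''

def output = new StringWriter()
mergeSplitLines(new StringReader(input), output)
println output
```

Inside the ExecuteScript callback from the question, the helper could presumably replace the inner eachLine loop, for example rawIn.withReader("UTF-8") { r -> rawOut.withWriter("UTF-8") { w -> mergeSplitLines(r, w) } }. I have not verified this against a running NiFi instance.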
