Apache NiFi ExecuteScript: Groovy script to replace JSON values via a mapping file
I am using Apache NiFi 0.5.1 to write a Groovy script that replaces incoming JSON values with values contained in a mapping file. The mapping file is a simple .txt file with semicolon-separated columns (a sample row appears further down). I started with the following:
import groovy.json.JsonBuilder
import groovy.json.JsonSlurper
import org.apache.nifi.processor.io.StreamCallback
import java.nio.charset.StandardCharsets
def flowFile = session.get()
if (flowFile == null) {
    return
}
flowFile = session.write(flowFile,
    { inputStream, outputStream ->
        // Hardcoded input for now; the incoming flow file content is not read yet
        def content = """
        {
          "field1": "A",
          "field2": "A",
          "field3": "A"
        }"""
        def slurped = new JsonSlurper().parseText(content)
        def builder = new JsonBuilder(slurped)
        // Hardcoded replacement values
        builder.content.field1 = "A"
        builder.content.field2 = "some text"
        builder.content.field3 = "A2"
        outputStream.write(builder.toPrettyString().getBytes(StandardCharsets.UTF_8))
    } as StreamCallback)
session.transfer(flowFile, ExecuteScript.REL_SUCCESS)
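This first step works just fine, although it is hardcoded and far from ideal. My initial thought was to use ReplaceTextWithMapping to perform the substitutions, but it does not cope well with complex mapping files (e.g. multiple columns). I would like to take this further, but I am not sure how. First of all, instead of passing in the entire hardcoded JSON, I would like to read the incoming flow file. How is that possible in NiFi? Before running the script as part of ExecuteScript, I output a .Json file with the content via UpdateAttribute, where filename=myResultingJSON.Json. Furthermore, I know how to load a .txt file with Groovy (String mappingContent = new File('/path/to/file').getText('UTF-8')), but how do I use the loaded file to perform the substitutions so that the resulting JSON looks like this: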
{
  "field1": "A",
  "field2": "some text",
  "field3": "A2"
}
Thanks for your help,
I.
EDIT:
A first modification to the script allows me to read from the InputStream:
import groovy.json.JsonBuilder
import groovy.json.JsonSlurper
import org.apache.commons.io.IOUtils
import org.apache.nifi.processor.io.StreamCallback
import java.nio.charset.StandardCharsets
def flowFile = session.get()
if (flowFile == null) {
    return
}
flowFile = session.write(flowFile,
    { inputStream, outputStream ->
        // Read the whole incoming flow file content as UTF-8 text
        def content = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        def slurped = new JsonSlurper().parseText(content)
        def builder = new JsonBuilder(slurped)
        // Replacement values are still hardcoded at this stage
        builder.content.field1 = "A"
        builder.content.field2 = "some text"
        builder.content.field3 = "A2"
        outputStream.write(builder.toPrettyString().getBytes(StandardCharsets.UTF_8))
    } as StreamCallback)
session.transfer(flowFile, ExecuteScript.REL_SUCCESS)
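For what it is worth, the read-and-parse step can be tried outside NiFi before pasting it into ExecuteScript. The following is a minimal sketch, not NiFi code: `transform` is a hypothetical helper that takes plain Java streams in place of the callback arguments, the sample JSON is made up for illustration, and Groovy's own `getText` is used instead of IOUtils so that no extra library is needed.

```groovy
import groovy.json.JsonBuilder
import groovy.json.JsonSlurper
import java.nio.charset.StandardCharsets

// Hypothetical helper: the same read -> parse -> modify -> write steps as the
// callback, but on plain Java streams so that it runs without NiFi.
def transform(InputStream inputStream, OutputStream outputStream) {
    // Groovy adds getText(charset) to InputStream; it reads the whole stream.
    def content = inputStream.getText('UTF-8')
    def slurped = new JsonSlurper().parseText(content)
    def builder = new JsonBuilder(slurped)
    builder.content.field2 = 'some text'
    outputStream.write(builder.toPrettyString().getBytes(StandardCharsets.UTF_8))
}

// Simulate an incoming flow file with an in-memory stream.
def input = new ByteArrayInputStream('{"field1": "A", "field2": "A"}'.getBytes(StandardCharsets.UTF_8))
def output = new ByteArrayOutputStream()
transform(input, output)

// Parse what was written to check the round trip.
def result = new JsonSlurper().parseText(output.toString('UTF-8'))
println result
```

Once the helper behaves as expected, its body can be moved into the closure passed to session.write().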
Then I started testing the approach with ConfigSlurper and wrote a generic class before injecting the logic into the Groovy ExecuteScript:
import groovy.json.JsonBuilder
import groovy.json.JsonSlurper
class TestLoadingMappings {
    static void main(String[] args) {
        def content = '''
        {"field2":"A",
        "field3": "A"
        }
        '''
        println "This is the content of the JSON file" + content
        def slurped = new JsonSlurper().parseText(content)
        def builder = new JsonBuilder(slurped)
        println "This is the content of my builder " + builder
        def propertiesFile = new File("D:\\myFile.txt")
        Properties props = new Properties()
        props.load(new FileInputStream(propertiesFile))
        def config = new ConfigSlurper().parse(props).flatten()
        println "This is the content of my config " + config
        config.each { k, v ->
            // Note: this indexes the JsonBuilder itself, not builder.content
            if (builder[k]) {
                builder[k] = v
            }
        }
        println(builder.toPrettyString())
    }
}
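I get a groovy.lang.MissingPropertyException, and this is because the mapping is not as simple as that. All the fields/properties (field1 through field3) come into the InputStream with the same value (as in the example that follows), which means that whenever field2 has that value, you can be sure it is valid for the other two properties as well. However, I cannot have a mapping entry that maps "field2": "someText", because the actual mapping is driven by the first value in the mapping file. Here is an example: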
{
  "field1": "A",
  "field2": "A",
  "field3": "A"
}
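In the mapping file, I have:
A;some text;A2
However, field1 needs to map to the first value in the file, or stay the same if you prefer. field2 needs to map to the value in the last column (A2), and finally field3 needs to map to "some text" in the middle column. Can you help? Is this something I can achieve with Groovy and ExecuteScript? If needed, I can split the config file in two.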
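A side note on the MissingPropertyException mentioned above: in the test class, `builder[k]` indexes the JsonBuilder object itself, which Groovy treats as a property lookup, whereas the parsed JSON lives in `builder.content`. The sketch below, with made-up sample values, illustrates the difference. It is an observation about the test code, not a statement about how the mapping should be designed.

```groovy
import groovy.json.JsonBuilder
import groovy.json.JsonSlurper

def builder = new JsonBuilder(new JsonSlurper().parseText('{"field2": "A", "field3": "A"}'))

// Indexing the builder itself is a property lookup on JsonBuilder,
// and JsonBuilder has no property called "field2".
def failedOnBuilder = false
try {
    builder['field2']
} catch (MissingPropertyException e) {
    failedOnBuilder = true
}

// The parsed map is reachable through builder.content, which accepts [key].
builder.content['field2'] = 'some text'

println "builder[k] failed: ${failedOnBuilder}"
println builder.toString()
```

This matches the later versions of the test class in this thread, which go through `builder.content[k]` and no longer raise the exception.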
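PS: I also had a quick look at another option (PutDistributedMapCache), and I am not sure I understand how to load key-value pairs into the distributed map cache. It looks like you need a DistributedMapCache client, and I am not sure whether that is easy to implement.
Thanks!
EDIT 2:
I have made some more progress: the mapping now works, but I am not sure why it fails when reading the second line of the properties file:
"A" someText
"A2" anotherText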
import groovy.json.JsonBuilder
import groovy.json.JsonSlurper
class TestLoadingMappings {
    static void main(String[] args) {
        def content = '''
        {"field2":"A",
        "field3":"A"
        }
        '''
        println "This is the content of the JSON file" + content
        def slurper = new JsonSlurper().parseText(content)
        def builder = new JsonBuilder(slurper)
        println "This is the content of my builder " + builder
        assert builder.content.field2 == "A"
        assert builder.content.field3 == "A"
        def propertiesFile = new File('D:\\myTest.txt')
        Properties props = new Properties()
        props.load(new FileInputStream(propertiesFile))
        println "This is the content of the properties " + props
        def config = new ConfigSlurper().parse(props).flatten()
        config.each { k, v ->
            // Every entry overwrites both fields, so the entry iterated last wins
            if (builder.content.field2) {
                builder.content.field2 = config[k]
            }
            if (builder.content.field3) {
                builder.content.field3 = config[k]
            }
            println(builder.toPrettyString())
            println "This is my builder " + builder
        }
    }
}
The result returned is: This is my builder {"field2":"someText","field3":"someText"}
Any idea why?
Many thanks.
EDIT 3 (moved from below)
I wrote the following:
import groovy.json.JsonBuilder
import groovy.json.JsonSlurper
class TestLoadingMappings {
    static void main(String[] args) {
        def content =
        '''
        {"field2":"A",
        "field3":"A"
        }
        '''
        def slurper = new JsonSlurper().parseText(content)
        def builder = new JsonBuilder(slurper)
        println "This is the content of my builder " + builder
        def propertiesFile = new File('D:\\properties.txt')
        Properties props = new Properties()
        props.load(new FileInputStream(propertiesFile))
        def conf = new ConfigSlurper().parse(props).flatten()
        conf.each { k, v ->
            if (builder.content[k]) {
                builder.content[k] = v
            }
            println("This prints the resulting JSON :" + builder.toPrettyString())
        }
    }
}
However, I had to change the structure of the mapping file as follows:
"field1"="substitutionText"
"field2"="substitutionText2"
Then I "merged" the ConfigSlurper logic into the ExecuteScript script as follows:
import groovy.json.JsonBuilder
import groovy.json.JsonSlurper
import org.apache.commons.io.IOUtils
import org.apache.nifi.processor.io.StreamCallback
import java.nio.charset.StandardCharsets
def flowFile = session.get()
if (flowFile == null) {
    return
}
flowFile = session.write(flowFile,
    { inputStream, outputStream ->
        def content = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        def slurped = new JsonSlurper().parseText(content)
        def builder = new JsonBuilder(slurped)
        // Load the mapping file and overwrite the matching JSON fields
        def propertiesFile = new File('D:\\properties.txt')
        Properties props = new Properties()
        props.load(new FileInputStream(propertiesFile))
        def conf = new ConfigSlurper().parse(props).flatten()
        conf.each { k, v ->
            if (builder.content[k]) {
                builder.content[k] = v
            }
        }
        // Write the modified JSON once, after the substitutions
        outputStream.write(builder.toPrettyString().getBytes(StandardCharsets.UTF_8))
    } as StreamCallback)
session.transfer(flowFile, ExecuteScript.REL_SUCCESS)
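The problem seems to be that I cannot replicate the logic of the original mapping file by using logic similar to what I created in TestLoadingMappings. As mentioned in my previous comments/edits, the mapping should work as follows:
field2 = if A, then substitute "some text"
field3 = if A, then substitute A2
field2 = if B, then substitute "other text"
field3 = if B, then substitute B2
and so on.
In short, the mapping is driven by the incoming value in the InputStream (which varies), and that value maps conditionally to different values depending on the JSON attribute. Could you recommend a better way to achieve this mapping via Groovy/ExecuteScript? I am flexible about modifying the mapping file; can you see a way I could change it to achieve the desired mapping?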
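One possible shape for this, sketched under assumptions: keep the semicolon-delimited file, key each row by its first column, and let the incoming value of each field select the row. The `columnForField` map (which column feeds which field) and the `B;other text;B2` row are assumptions derived from the rules listed above, and the mapping text is inlined so that the sketch runs on its own.

```groovy
import groovy.json.JsonBuilder
import groovy.json.JsonSlurper

// Mapping rows: key;value for field2;value for field3.
// Inlined here; a real script could use new File(...).getText('UTF-8') instead.
def mappingText = '''A;some text;A2
B;other text;B2'''

// Assumption: which column (0-based) supplies the replacement for each field.
def columnForField = [field2: 1, field3: 2]

// Build a lookup keyed by the first column of every non-empty row.
def rowsByKey = [:]
mappingText.eachLine { line ->
    if (line.trim()) {
        def columns = line.split(';')*.trim()
        rowsByKey[columns[0]] = columns
    }
}

def json = new JsonSlurper().parseText('{"field1": "A", "field2": "A", "field3": "A"}')

columnForField.each { field, column ->
    // The incoming value of the field selects the row; unknown values are left alone.
    def row = rowsByKey[json[field]]
    if (row != null) {
        json[field] = row[column]
    }
}

println new JsonBuilder(json).toPrettyString()
```

With this layout, supporting another incoming value only requires another row in the file, and field1 stays unchanged because it is not listed in `columnForField`.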
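Thanks
I have some examples of how to read in a flow file containing JSON.
The structure above is correct; basically, you can use the "inputStream" variable in the closure to read the incoming flow file contents. If you want to read it all at once (which you will probably need to do for JSON), you can use IOUtils.toString() followed by a JsonSlurper, as those examples do.
For the mapping file, especially if your JSON is "flat", you could use a Java properties file that maps each field name to its new value:
field2=some text
field3=A2
There is existing Groovy support for reading properties files.
Once you have slurped the incoming JSON and read in the mapping file, you can get at the individual fields of the JSON using array notation instead of direct member notation. So let's say I read the properties into a ConfigSlurper, and I want to overwrite any existing property in my input JSON (called "json" for the example) with the one from the properties file. That might look like the following: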
config.parse(props).flatten().each { k, v ->
    if (json[k]) {
        json[k] = v
    }
}
You can then go on to use outputStream.write() to write the modified JSON back to the flow file.
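Putting those pieces together, a standalone sketch might look like this. It is not the ExecuteScript body itself: the properties are loaded from an inline string rather than a file, the `field9` entry is a made-up key that does not occur in the JSON, and `containsKey` is used in place of the truthiness check so that fields holding empty or false values would still be overwritten.

```groovy
import groovy.json.JsonBuilder
import groovy.json.JsonSlurper

// Stand-in for the mapping file; a real script might call
// props.load(new FileInputStream(...)) instead.
def props = new Properties()
props.load(new StringReader('field2=some text\nfield3=A2\nfield9=ignored'))

def json = new JsonSlurper().parseText('{"field1": "A", "field2": "A", "field3": "A"}')

// Overwrite only the fields that already exist in the incoming JSON.
new ConfigSlurper().parse(props).flatten().each { k, v ->
    if (json.containsKey(k)) {
        json[k] = v
    }
}

println new JsonBuilder(json).toPrettyString()
```

Inside ExecuteScript, the final println would be replaced by the outputStream.write() call on the pretty-printed string.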
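Instead of reading your mappings from a file, you could also load them into a distributed cache via a processor. You can read from a DistributedMapCacheServer in ExecuteScript; I have an example here:
If your mapping is complex, you may want to use the TransformJSON processor, which will be available in the next release of NiFi (0.7.0). The associated Jira case is here:
EDIT:
In response to your edits, I did not realize you had multiple rules for various values. In that case, a properties file is probably not the best way to represent the mappings. Instead, you could use JSON: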