Azure Data Factory: using a data flow cached lookup in a select transformation

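I'm trying to use a data flow cached lookup to implement dynamic column mapping from a SQL source to a SQL sink. The idea is to persist the mapping in a metadata table, read it into the data flow as a source, and store it in a cache sink for lookups (the CacheFieldsMap sink in the script below, with the key column set to sourceField).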

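Now we just need to use this mapping in a rule-based mapping on the select transformation, so that only the mapped columns are selected and they are renamed to their target names. The expression is the one used in the MapColumns select in the script below.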

This configuration causes a runtime error in the select transformation.

Any idea why? The error message is not helpful.
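
To make the intent concrete, here is a minimal sketch in plain Python (not data flow script) of what the rule-based mapping is meant to do: filter the metadata rows for one source table, build a sourceField to targetField map, then keep only the mapped columns and rename them. The metadata column names match the script; the sample column names and values (Id, RagioneSociale, company_id, company_name) and the helper function names are made up for illustration.

```python
# Illustration only: plain Python, not Azure Data Factory data flow script.
# Sample column names and values below are hypothetical.

def build_field_map(mapping_rows, source_schema, source_table):
    """Return {sourceField: targetField} for a single source table."""
    return {
        row["sourceField"]: row["targetField"]
        for row in mapping_rows
        if row["sourceSchema"] == source_schema
        and row["sourceTable"] == source_table
    }


def map_columns(record, field_map):
    """Keep only the mapped columns and rename them to their target names."""
    return {
        field_map[name]: value
        for name, value in record.items()
        if name in field_map
    }


# Hypothetical rows from the metadata table.
mapping_rows = [
    {"sourceSchema": "dbo", "sourceTable": "RiepilogoSocieta",
     "sourceField": "Id", "targetField": "company_id"},
    {"sourceSchema": "dbo", "sourceTable": "RiepilogoSocieta",
     "sourceField": "RagioneSociale", "targetField": "company_name"},
    {"sourceSchema": "dbo", "sourceTable": "OtherTable",
     "sourceField": "Id", "targetField": "other_id"},
]

field_map = build_field_map(mapping_rows, "dbo", "RiepilogoSocieta")

# A hypothetical source row; "Unmapped" has no entry and is dropped.
record = {"Id": 1, "RagioneSociale": "Acme", "Unmapped": "x"}
print(map_columns(record, field_map))
# {'company_id': 1, 'company_name': 'Acme'}
```

In the data flow, the cache sink plays the role of field_map and the select transformation plays the role of map_columns.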

Edit: the full script definition is below.

parameters{
    sourceSchema as string ("dbo"),
    sourceTable as string ("RiepilogoSocieta")
}
source(allowSchemaDrift: true,
    validateSchema: false,
    inferDriftedColumnTypes: true,
    isolationLevel: 'READ_UNCOMMITTED',
    format: 'table') ~> DatabookSource
source(output(
        {_id} as integer,
        sourceSchema as string,
        sourceTable as string,
        sourceField as string,
        targetField as string,
        targetType as string,
        targetSchema as string,
        targetTable as string
    ),
    allowSchemaDrift: true,
    validateSchema: false,
    isolationLevel: 'READ_UNCOMMITTED',
    format: 'table') ~> Objmetadata
DatabookSource select(mapColumn(
        each(match(!isNull(CacheFieldsMap#lookup(name).targetField)),
            CacheFieldsMap#lookup($$).targetField = $$)
    ),
    skipDuplicateMapInputs: true,
    skipDuplicateMapOutputs: true) ~> MapColumns
Objmetadata filter(sourceSchema == $sourceSchema && sourceTable == $sourceTable) ~> FilterForSourceTable
FilterForSourceTable select(mapColumn(
        sourceField,
        targetField
    ),
    skipDuplicateMapInputs: true,
    skipDuplicateMapOutputs: true) ~> SelectFieldsMap
MapColumns derive({_createdAt} = currentTimestamp(),
        {_updatedAt} = currentTimestamp()) ~> AddMetaColums
AddMetaColums sink(allowSchemaDrift: true,
    validateSchema: false,
    deletable:false,
    insertable:true,
    updateable:false,
    upsertable:false,
    truncate:true,
    format: 'table',
    skipDuplicateMapInputs: true,
    skipDuplicateMapOutputs: true,
    errorHandlingOption: 'stopOnFirstError') ~> AnalyticsSink
SelectFieldsMap sink(skipDuplicateMapInputs: true,
    skipDuplicateMapOutputs: true,
    keys:['sourceField'],
    store: 'cache',
    format: 'inline',
    output: false,
    saveOrder: 1) ~> CacheFieldsMap

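Please post the DSL script (click the Script button). I can't tell which pattern $$ is mapped to. Make sure the data types are compatible and that you are not passing an integer to a lookup that expects a string key.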

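Thanks Kiran, I've just updated my post with the script definition.

This is not supported at the moment, but it is on the immediate backlog. Until then, you should split it into a data flow activity (or a Lookup activity) followed by a second data flow activity, and pass this information to the second data flow as an array. The feature should be available in a few weeks.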