
How to use this REST API in Azure Data Factory


I need to call a REST API from Azure Data Factory and insert the data into a SQL table.

The API returns JSON in the following format:

{
    "serviceResponse": {
        "supportOffice": "EUKO",
        "totalPages": 5,
        "pageNo": 1,
        "recordsPerPage": 1000,
        "projects": [
            { "projectID":1 ...} , { "projectID":2 ...} ,...
        ]
    }
}
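The URL is in the format

I have managed to set up a REST service that calls the API and returns the JSON, plus a SQL sink that takes the JSON and passes it to a stored procedure, which then stores the data.

What I am struggling with, however, is how to handle the pagination.

I have tried: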

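  • The pagination option on the REST service: I do not think this will work, because it only allows an XPath that returns the complete next URL. I cannot see a way for it to compute the URL from totalPages and pageNo. (Or I could not get it to work.)

  • Adding a Web call to the API before processing and then computing the number of pages. Although not ideal, this did work until I hit the 1 MB / 1 min limit, because some of the responses are quite large. So this will not work.

  • Checking whether the API could be changed, but that is not possible.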


Does anyone have any ideas on how I could get this working, or has anyone successfully used a similar API?

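    The explanation below walks through creating a pipeline that combines Stored Procedure activities, Web activities, and a ForEach activity.

    First set up an Azure SQL DB, set the AAD administrator, and then grant the ADF MSI permissions in the database. Then create the following table and two stored procedures: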

    CREATE TABLE [dbo].[People](
        [id] [int] NULL,
        [email] [varchar](255) NULL,
        [first_name] [varchar](100) NULL,
        [last_name] [varchar](100) NULL,
        [avatar] [nvarchar](1000) NULL
    )
    
    GO
    /*
    sample call:
    exec uspInsertPeople @json = '{"page":1,"per_page":3,"total":12,"total_pages":4,"data":[{"id":1,"email":"george.bluth@reqres.in","first_name":"George","last_name":"Bluth","avatar":"https://s3.amazonaws.com/uifaces/faces/twitter/calebogden/128.jpg"},{"id":2,"email":"janet.weaver@reqres.in","first_name":"Janet","last_name":"Weaver","avatar":"https://s3.amazonaws.com/uifaces/faces/twitter/josephstein/128.jpg"},{"id":3,"email":"emma.wong@reqres.in","first_name":"Emma","last_name":"Wong","avatar":"https://s3.amazonaws.com/uifaces/faces/twitter/olegpogodaev/128.jpg"}]}'
    */
    create proc uspInsertPeople @json nvarchar(max)
    as
    begin
    insert into People (id, email, first_name, last_name, avatar)
    select d.*
    from OPENJSON(@json)
    WITH (
            [data] nvarchar(max) '$.data' as JSON
    )
    CROSS APPLY OPENJSON([data], '$')
        WITH (
            id int '$.id',
            email varchar(255) '$.email',
            first_name varchar(100) '$.first_name',
            last_name varchar(100) '$.last_name',
            avatar nvarchar(1000) '$.avatar'
        ) d;
    end
    
    GO
    
    create proc uspTruncatePeople
    as
    truncate table People
    
    
    
    Next, create a new pipeline in Azure Data Factory v2, rename it to ForEachPage, then switch to the code view and paste in the following JSON:

    {
        "name": "ForEachPage",
        "properties": {
            "activities": [
                {
                    "name": "GetTotalPages",
                    "type": "WebActivity",
                    "dependsOn": [
                        {
                            "activity": "Truncate SQL Table",
                            "dependencyConditions": [
                                "Succeeded"
                            ]
                        }
                    ],
                    "policy": {
                        "timeout": "7.00:00:00",
                        "retry": 0,
                        "retryIntervalInSeconds": 30,
                        "secureOutput": false,
                        "secureInput": false
                    },
                    "userProperties": [],
                    "typeProperties": {
                        "url": {
                            "value": "https://reqres.in/api/users?page=1",
                            "type": "Expression"
                        },
                        "method": "GET"
                    }
                },
                {
                    "name": "ForEachPage",
                    "type": "ForEach",
                    "dependsOn": [
                        {
                            "activity": "GetTotalPages",
                            "dependencyConditions": [
                                "Succeeded"
                            ]
                        }
                    ],
                    "userProperties": [],
                    "typeProperties": {
                        "items": {
                            "value": "@range(1,activity('GetTotalPages').output.total_pages)",
                            "type": "Expression"
                        },
                        "activities": [
                            {
                                "name": "GetPage",
                                "type": "WebActivity",
                                "dependsOn": [],
                                "policy": {
                                    "timeout": "7.00:00:00",
                                    "retry": 0,
                                    "retryIntervalInSeconds": 30,
                                    "secureOutput": false,
                                    "secureInput": false
                                },
                                "userProperties": [],
                                "typeProperties": {
                                    "url": {
                                        "value": "@concat('https://reqres.in/api/users?page=',item())",
                                        "type": "Expression"
                                    },
                                    "method": "GET"
                                }
                            },
                            {
                                "name": "uspInsertPeople stored procedure",
                                "type": "SqlServerStoredProcedure",
                                "dependsOn": [
                                    {
                                        "activity": "GetPage",
                                        "dependencyConditions": [
                                            "Succeeded"
                                        ]
                                    }
                                ],
                                "policy": {
                                    "timeout": "7.00:00:00",
                                    "retry": 0,
                                    "retryIntervalInSeconds": 30,
                                    "secureOutput": false,
                                    "secureInput": false
                                },
                                "userProperties": [],
                                "typeProperties": {
                                    "storedProcedureName": "[dbo].[uspInsertPeople]",
                                    "storedProcedureParameters": {
                                        "json": {
                                            "value": {
                                                "value": "@string(activity('GetPage').output)",
                                                "type": "Expression"
                                            },
                                            "type": "String"
                                        }
                                    }
                                },
                                "linkedServiceName": {
                                    "referenceName": "lsAzureDB",
                                    "type": "LinkedServiceReference"
                                }
                            }
                        ]
                    }
                },
                {
                    "name": "Truncate SQL Table",
                    "type": "SqlServerStoredProcedure",
                    "dependsOn": [],
                    "policy": {
                        "timeout": "7.00:00:00",
                        "retry": 0,
                        "retryIntervalInSeconds": 30,
                        "secureOutput": false,
                        "secureInput": false
                    },
                    "userProperties": [],
                    "typeProperties": {
                        "storedProcedureName": "[dbo].[uspTruncatePeople]"
                    },
                    "linkedServiceName": {
                        "referenceName": "lsAzureDB",
                        "type": "LinkedServiceReference"
                    }
                }
            ],
            "annotations": []
        }
    }
    
    Create a linked service named lsAzureDB that connects to the Azure SQL DB, and set it to authenticate using MSI.
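For orientation, a linked service definition along these lines might look like the sketch below. It assumes the Azure SQL Database connector with managed identity authentication, where the connection string carries no user name or password. The `<your-server>` and `<your-database>` placeholders are not from the original answer and must be replaced with real values.

```json
{
    "name": "lsAzureDB",
    "properties": {
        "description": "Sketch only: <your-server> and <your-database> are placeholders, not values from the original answer.",
        "type": "AzureSqlDatabase",
        "typeProperties": {
            "connectionString": "Data Source=tcp:<your-server>.database.windows.net,1433;Initial Catalog=<your-database>;Connection Timeout=30"
        }
    }
}
```

Because no credentials appear in the definition, the connection only succeeds once the data factory's managed identity has been granted access inside the database, as described in the setup step earlier.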


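    This pipeline calls the sample API at https://reqres.in/api/users (which works at the moment, but it is not an API I manage, so it may stop working at some point) to demonstrate how to loop, how to get the result of a Web activity, and how to insert it into a SQL table through a stored procedure call with JSON parsing inside the stored procedure. The loop runs in parallel, but you can of course change the settings on the ForEachPage activity to make it run serially.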
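As a rough sketch, the ForEachPage activity could be switched to serial execution and pointed at the response shape from the question as shown below. The `isSequential` flag is the ForEach setting that disables parallel iterations. The `serviceResponse.totalPages` path is taken from the JSON in the question, while the `https://example.com/api/projects?pageNo=` URL is a hypothetical placeholder, because the real URL format is not given. Only the GetPage activity is shown inside the loop; the stored procedure activity from the pipeline above would follow it unchanged.

```json
{
    "name": "ForEachPage",
    "description": "Sketch: serial loop over the question's totalPages field. The URL in GetPage is a hypothetical placeholder.",
    "type": "ForEach",
    "dependsOn": [
        {
            "activity": "GetTotalPages",
            "dependencyConditions": [
                "Succeeded"
            ]
        }
    ],
    "userProperties": [],
    "typeProperties": {
        "isSequential": true,
        "items": {
            "value": "@range(1,activity('GetTotalPages').output.serviceResponse.totalPages)",
            "type": "Expression"
        },
        "activities": [
            {
                "name": "GetPage",
                "description": "Hypothetical URL and query parameter name; replace with the real API address.",
                "type": "WebActivity",
                "dependsOn": [],
                "userProperties": [],
                "typeProperties": {
                    "url": {
                        "value": "@concat('https://example.com/api/projects?pageNo=',string(item()))",
                        "type": "Expression"
                    },
                    "method": "GET"
                }
            }
        ]
    }
}
```

Note that `range(startIndex, count)` takes a count rather than an end value, so `@range(1,5)` yields the page numbers 1 through 5. With this response shape, the stored procedure's `OPENJSON` path would presumably also need to change from `$.data` to `$.serviceResponse.projects`, with columns matching the project fields.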

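    This approach did not work for several reasons, but the main problem is that the pipeline's Copy Data activity cannot index into deeply nested arrays.

    I can wildcard the first level of the array, but anything deeper requires an actual integer index value. That is fine as long as the nested array contains only one entry; with more than one, we lose data.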

    {
        "source": {
            "path": "$['myObject']['element'][*]['externalUID'][0]['provider']"
        },
        "sink": {
            "name": "EXTERNALUID_PROVIDER"
        }
    },
    

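    Is it possible to stage the data? For example, use a Function to transfer the data from the API to Blob storage; once that is done, the function could trigger the pipeline to pick the data up from there.

    @SamaraSoucy MSFT I think that might be a possibility. We are very new to Azure, and I would like to avoid creating Functions outside of Azure Data Factory where possible, so that it does not become too complicated.

    Thank you for the comprehensive reply. This is similar to how I originally set it up. The problem with this approach, however, is the Web activity. There is a 1 MB / 1 min limit, and the responses returned by the API do not seem to stay within it (I am not quite sure why, since the JSON is now 700 KB and comes back within a few seconds, but that is another question).

    @crackly I would add a Wait activity inside the loop to make sure you do not go faster than allowed. You also need to configure the ForEach loop to avoid parallelism.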
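The Wait activity suggested in the last comment could be sketched as one more entry in the inner `activities` array of ForEachPage, chained after the stored procedure call. The activity name `WaitBetweenPages` and the five-second delay are illustrative assumptions, not values from the thread; the delay should be tuned to whatever rate the API actually allows.

```json
{
    "name": "WaitBetweenPages",
    "description": "Hypothetical name and delay: pauses after each page so the loop does not call the API faster than allowed.",
    "type": "Wait",
    "dependsOn": [
        {
            "activity": "uspInsertPeople stored procedure",
            "dependencyConditions": [
                "Succeeded"
            ]
        }
    ],
    "userProperties": [],
    "typeProperties": {
        "waitTimeInSeconds": 5
    }
}
```

The pause only throttles the loop if the ForEach runs its iterations one at a time, which is why the comment pairs it with turning off parallelism.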