Azure Data Factory Connector 2.0.0.1
Prerequisites
- Version 9.2.9 or later
- Connections component 1.0.0.3 or later. Note that the Connections component will be installed or updated automatically if necessary when you install this extension.
- Azure Connections Extension (automatically installed from the Catalog)
- Privileges Required to Use Azure Connections
- Privileges Required to Use the Azure Data Factory Connector
Installation
To install the Azure Data Factory Connector and create a connection to Azure AD:
- Locate the Azure DataFactory component in the Catalog, select version 2.0.0.1, and install it.
- Create an Azure AD Connection.
  Note: The Process Server for the connector must have the ServiceForRedwood_DataFactory Process Server service.
- To use the connector, you must first create an app registration with a service principal in Azure Active Directory (see https://docs.microsoft.com/en-gb/azure/active-directory/develop/howto-create-service-principal-portal#register-an-application-with-azure-ad-and-create-a-service-principal). This client application must be assigned the Data Factory Contributor role. Make note of the following settings from the Data Factory (a quick way to verify them is sketched after this list):
  - Resource Group Name
  - Factory Name
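To confirm that the app registration and the noted settings are correct before creating the Connection, you can query the factory directly with the Azure SDK for Python. This is a minimal sketch, independent of the connector itself; the tenant ID, client ID, client secret, subscription ID, resource group, and factory name are placeholders.

```python
# Minimal sketch: verify that a service principal with the Data Factory Contributor
# role can read the target Data Factory. All identifiers below are placeholders.
from azure.identity import ClientSecretCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

credential = ClientSecretCredential(
    tenant_id="<tenant-id>",          # from the app registration
    client_id="<client-id>",          # application (client) ID
    client_secret="<client-secret>",  # client secret created for the registration
)

client = DataFactoryManagementClient(credential, "<subscription-id>")

# Raises an error if the service principal cannot read the factory.
factory = client.factories.get("<resource-group-name>", "<factory-name>")
print(f"Found factory '{factory.name}' in location '{factory.location}'")
```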
Contents of the Component
Object Type | Name |
---|---|
Application | GLOBAL.Redwood.REDWOOD.DataFactory |
Process Definition | REDWOOD.Redwood_DataFactory_ImportJobTemplate |
Process Definition | REDWOOD.Redwood_DataFactory_ShowPipelines |
Process Definition | REDWOOD.Redwood_DataFactory_RunPipeline |
Process Definition | REDWOOD.Redwood_DataFactory_Template |
Library | REDWOOD.DataFactory |
Process Server Service | REDWOOD.ServiceForRedwood_DataFactory |
Running Data Factory Processes
The Resource Group name is defined on the Azure Subscription.
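If you are unsure which Resource Group the Data Factory belongs to, one way to look it up outside the connector is with the Azure SDK for Python. A minimal sketch, assuming a service principal with read access to the subscription; all identifiers are placeholders:

```python
# Minimal sketch: list the resource groups in a subscription to find the one
# that contains the Data Factory. Identifiers are placeholders.
from azure.identity import ClientSecretCredential
from azure.mgmt.resource import ResourceManagementClient

credential = ClientSecretCredential("<tenant-id>", "<client-id>", "<client-secret>")
client = ResourceManagementClient(credential, "<subscription-id>")

for group in client.resource_groups.list():
    print(group.name, group.location)
```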
Finding Data Factory Pipelines
To retrieve the list of pipelines available for scheduling, navigate to Applications > Redwood_DataFactory > DataFactory_ShowPipelines and submit it.
Select a Connection, the Resource Group Name, and the Factory Name you want to list the pipelines from. You can narrow the list by adding a Process Name filter.
Once the process has finished, open stdout.log to view the output.
Each entry in the output shows an index followed by the pipeline name; this pipeline name (the first element straight after the index) is the value you will use in later steps.
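For reference, the list that DataFactory_ShowPipelines prints corresponds to the pipelines defined in the factory. A minimal sketch of the same enumeration using the Azure SDK for Python follows; it is illustrative only, all identifiers are placeholders, and the connector's actual output format may differ.

```python
# Minimal sketch: enumerate the pipelines in a Data Factory, printing an index
# and the pipeline name, similar in spirit to the DataFactory_ShowPipelines output.
from azure.identity import ClientSecretCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

credential = ClientSecretCredential("<tenant-id>", "<client-id>", "<client-secret>")
client = DataFactoryManagementClient(credential, "<subscription-id>")

pipelines = client.pipelines.list_by_factory("<resource-group-name>", "<factory-name>")
for index, pipeline in enumerate(pipelines):
    # pipeline.name is the value to use as the pipeline name in later steps
    print(index, pipeline.name)
```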
Schedule a Data Factory Pipeline
In the Redwood_DataFactory application, choose DataFactory_RunPipeline and submit it.
Again, specify the Subscription ID, the Resource Group Name, and the Factory Name you want to run the pipeline from, as well as the name of the pipeline to execute.
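At the Data Factory API level, triggering a pipeline corresponds to a create-run call. The hedged sketch below shows the equivalent call via the Azure SDK for Python, including a simple status poll; it is illustrative only and does not reflect the connector's internal implementation, and all identifiers and the example parameter are placeholders.

```python
# Minimal sketch: start a pipeline run and poll its status, roughly what a
# "run pipeline" step amounts to at the Data Factory API level.
import time

from azure.identity import ClientSecretCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

credential = ClientSecretCredential("<tenant-id>", "<client-id>", "<client-secret>")
client = DataFactoryManagementClient(credential, "<subscription-id>")

run = client.pipelines.create_run(
    "<resource-group-name>",
    "<factory-name>",
    "<pipeline-name>",
    parameters={"exampleParameter": "exampleValue"},  # optional pipeline parameters
)

# Poll until the run reaches a terminal state.
while True:
    status = client.pipeline_runs.get(
        "<resource-group-name>", "<factory-name>", run.run_id
    ).status
    if status not in ("Queued", "InProgress"):
        break
    time.sleep(15)

print(f"Pipeline run {run.run_id} finished with status {status}")
```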
Import Pipelines as Process Definitions
Submit DataFactory_ImportJobTemplate to import a pipeline as a Process Definition.
Here, the pipeline name can be used to import only a selection of pipelines, and the Overwrite flag can be set to allow existing definitions to be overwritten. The Target tab lets you specify a target Partition, Application, and prefix for the generated definitions.
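The following is a purely conceptual sketch of how a pipeline-name filter, an Overwrite flag, and a prefix could interact when selecting pipelines for import; the connector's actual matching and naming rules may differ, and every value shown is hypothetical.

```python
# Purely illustrative: how a pipeline-name filter, Overwrite flag, and prefix
# might interact when generating definitions. The connector's real matching
# rules, naming, and import logic may differ.
from fnmatch import fnmatch

all_pipelines = ["CopySalesData", "CopyHrData", "TransformSalesData"]  # example names
existing_definitions = {"PFX_CopySalesData"}                           # example set
name_filter = "Copy*"   # hypothetical pipeline name filter
overwrite = False       # hypothetical Overwrite flag
prefix = "PFX_"         # hypothetical prefix from the Target tab

for pipeline in all_pipelines:
    if not fnmatch(pipeline, name_filter):
        continue
    definition = prefix + pipeline
    if definition in existing_definitions and not overwrite:
        print(f"Skipping {definition}: already exists and Overwrite is not set")
        continue
    print(f"Importing {pipeline} as {definition}")
```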
Troubleshooting
In the Control step of the Submit Wizard, where you select the Queue, you can add additional logging to stdout.log by selecting debug in the Out Log and Error Log fields on the Advanced Options tab.