Azure Data Factory Connector 2.0.0.1

Prerequisites

Installation

To install the Azure Data Factory Connector and create a connection to Azure AD:

  1. Locate the Azure DataFactory component in the Catalog, select version 2.0.0.1, and install it.

  2. Create an Azure AD Connection.

    Note: The Process Server for the connector must have the ServiceForRedwood_DataFactory Process Server service.

  3. To use the connector, you must first create an app registration with a service principal in Azure Active Directory (see https://docs.microsoft.com/en-gb/azure/active-directory/develop/howto-create-service-principal-portal#register-an-application-with-azure-ad-and-create-a-service-principal). This client application must be assigned the Data Factory Contributor role. Make note of the following settings from the Data Factory (a verification sketch follows this list):

    • Resource Group Name
    • Factory Name
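
To confirm that the app registration and role assignment work before configuring the connection, you can call the Data Factory management API directly. The following is a minimal sketch, assuming Python with the azure-identity and azure-mgmt-datafactory packages installed; all angle-bracketed values are placeholders for your own tenant, subscription, and factory details.

    from azure.identity import ClientSecretCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    # Authenticate as the app registration's service principal.
    credential = ClientSecretCredential(
        tenant_id="<tenant-id>",
        client_id="<application-client-id>",
        client_secret="<client-secret>",
    )
    adf = DataFactoryManagementClient(credential, "<subscription-id>")

    # This call succeeds only if the service principal can reach the
    # factory, e.g. via the Data Factory Contributor role.
    factory = adf.factories.get("<resource-group-name>", "<factory-name>")
    print(factory.name, factory.location)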

Contents of the Component

Object Type             Name
Application             GLOBAL.Redwood.REDWOOD.DataFactory
Process Definition      REDWOOD.Redwood_DataFactory_ImportJobTemplate
Process Definition      REDWOOD.Redwood_DataFactory_ShowPipelines
Process Definition      REDWOOD.Redwood_DataFactory_RunPipeline
Process Definition      REDWOOD.Redwood_DataFactory_Template
Library                 REDWOOD.DataFactory
Process Server Service  REDWOOD.ServiceForRedwood_DataFactory

Running Data Factory Processes

The Resource Group name is defined on the Azure subscription; you supply it, together with the Factory Name, when running each of the processes below.
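
If you are unsure which Resource Group name to use, you can enumerate the resource groups in the subscription. A minimal sketch, assuming the azure-mgmt-resource package and the same placeholder credentials as in the earlier sketch:

    from azure.identity import ClientSecretCredential
    from azure.mgmt.resource import ResourceManagementClient

    credential = ClientSecretCredential(
        tenant_id="<tenant-id>",
        client_id="<application-client-id>",
        client_secret="<client-secret>",
    )
    resources = ResourceManagementClient(credential, "<subscription-id>")

    # Each name printed here is a candidate value for the
    # Resource Group Name parameter.
    for group in resources.resource_groups.list():
        print(group.name)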

Finding Data Factory Pipelines

To retrieve the list of pipelines available for scheduling, navigate to Applications > Redwood_DataFactory > DataFactory_ShowPipelines and submit it.

DataFactory application

Select a Connection, then specify the Resource Group Name and the Factory Name you want to list the pipelines from. You can filter the list by adding a Process Name filter.

Retrieve DataFactory pipelines

Once the process has finished, open stdout.log; the output looks as follows:

Resulting pipeline list

Here you can find the value to use later as the pipeline name: the first element immediately after the index.
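
For comparison, the same pipeline list can be fetched directly from the Data Factory management API. The sketch below, assuming the azure-mgmt-datafactory package and placeholder credentials, illustrates the underlying call only; it is not the connector's own implementation.

    from azure.identity import ClientSecretCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    credential = ClientSecretCredential(
        tenant_id="<tenant-id>",
        client_id="<application-client-id>",
        client_secret="<client-secret>",
    )
    adf = DataFactoryManagementClient(credential, "<subscription-id>")

    # Print an indexed list of pipeline names, comparable to the
    # stdout.log output of DataFactory_ShowPipelines.
    for index, pipeline in enumerate(adf.pipelines.list_by_factory(
            "<resource-group-name>", "<factory-name>")):
        print(index, pipeline.name)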

Schedule a Data Factory Pipeline

In the Redwood_DataFactory application, choose DataFactory_RunPipeline and submit it.

Run a pipeline from the list

Again, specify the Subscription ID, the Resource Group Name, and the Factory Name you want to run the pipeline from, as well as the name of the pipeline to execute.
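
DataFactory_RunPipeline presumably maps to Data Factory's create-run API followed by status polling. The sketch below shows that API directly, with the azure-mgmt-datafactory package and placeholder values; it is an illustration under those assumptions, not the connector's own code.

    import time

    from azure.identity import ClientSecretCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    credential = ClientSecretCredential(
        tenant_id="<tenant-id>",
        client_id="<application-client-id>",
        client_secret="<client-secret>",
    )
    adf = DataFactoryManagementClient(credential, "<subscription-id>")

    # Trigger the pipeline; create_run returns immediately with a run ID.
    run = adf.pipelines.create_run(
        "<resource-group-name>", "<factory-name>", "<pipeline-name>",
        parameters={},  # pipeline parameters, if any
    )

    # Poll until the run leaves the in-progress states.
    while True:
        status = adf.pipeline_runs.get(
            "<resource-group-name>", "<factory-name>", run.run_id).status
        if status not in ("Queued", "InProgress", "Canceling"):
            break
        time.sleep(15)
    print("Pipeline finished with status:", status)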

Import Pipelines as Process Definitions

Submit DataFactory_ImportJobTemplate to import a pipeline as a Process Definition.

Import pipeline as process definition

Here, the pipeline name can be used to import only a selection of pipelines, and the Overwrite flag can be set to allow existing definitions to be overwritten. The Target tab lets you specify a target Partition, Application, and prefix for the generated definitions:

Customize the import
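
As an illustration of selective import, the sketch below filters a list of pipeline names with a wildcard pattern. Whether the connector's pipeline-name filter accepts this exact wildcard syntax is an assumption, and the names are made up.

    from fnmatch import fnmatch

    # Hypothetical pipeline names, as ShowPipelines might list them.
    pipeline_names = ["Sales_Daily_Load", "Sales_Weekly_Rollup", "HR_Sync"]

    # Keep only the pipelines matching the filter pattern.
    selected = [name for name in pipeline_names if fnmatch(name, "Sales_*")]
    print(selected)  # ['Sales_Daily_Load', 'Sales_Weekly_Rollup']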

Troubleshooting

In the Control step of the Submit Wizard, where you select the Queue, you can add additional logging to stdout.log by selecting debug in the Out Log and Error Log fields on the Advanced Options tab.

Troubleshooting the process by specifying advanced logging
