First workflow with Data Integration Platform


Introduction

This first workflow reads a text file and replaces specific content in it with a given value.

Objective

Read the given input file, replace the text “Arizona” with “Florida”, and store the transformed content in the given output directory under a specific file name.


List of processors used in this sample:

Processor        Comments
GetFile          Reads content from the given file and generates a flowfile.
ReplaceText      Replaces specific flowfile content with the given replacement value.
UpdateAttribute  Updates the output file name in a flowfile attribute.
PutFile          Stores the flowfile content in the given output directory.


Input data used in this sample:

ID  NAME     AGE  ADDRESS      SALARY
1   Clinton  32   Arizona      2000.00
2   Harry    25   Arizona      1500.00
3   George   23   Los Angeles  2000.00
4   Michael  25   Texas        6500.00
5   Hardik   27   Arizona      8500.00
6   Jeny     22   Austin       4500.00


Workflow screenshot


Step 1: Configure GetFile processor

Drag and drop the GetFile processor onto the canvas area. Configure the Input directory and the other required properties in the configuration dialog, as shown in the following screenshots.


Note: You must fill in the required properties, which appear in bold. Properties that are not in bold are optional.
Set the “Run Schedule” field as required, for example 1 day, 1 hour, 5 minutes, or 30 seconds. This sample uses a 1 day schedule.
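
To make the GetFile step more concrete, the following Python sketch shows the idea behind it: each file in an input directory becomes one flowfile, that is, the file content plus a set of attributes. It is only an illustration and not the platform's implementation. The FlowFile class, the get_files function, and the file name input.txt are hypothetical names chosen for this example.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class FlowFile:
    """Hypothetical stand-in for a flowfile: content plus attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


def get_files(input_dir):
    """Turn every regular file in input_dir into one FlowFile (illustration only)."""
    flowfiles = []
    for path in sorted(Path(input_dir).iterdir()):
        if path.is_file():
            flowfiles.append(
                FlowFile(content=path.read_bytes(),
                         attributes={"filename": path.name})
            )
    return flowfiles


# A temporary directory stands in for the configured Input directory.
with tempfile.TemporaryDirectory() as input_dir:
    Path(input_dir, "input.txt").write_bytes(b"1 Clinton 32 Arizona 2000.00\n")
    flowfiles = get_files(input_dir)

print(len(flowfiles), flowfiles[0].attributes["filename"])
```

In the real workflow the processor runs according to its Run Schedule; the sketch performs a single pass over the directory.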

Step 2: Configure ReplaceText processor

Add the ReplaceText processor and configure the Search Value and Replacement Value properties in the configuration dialog, as shown in the following screenshot.

Once configured, connect GetFile to ReplaceText with the ‘Success’ relationship.

Note: To ignore the failure relationship, auto-terminate it in the configuration dialog, as shown in the following screenshot.
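
Conceptually, this step is a search-and-replace on the flowfile content whose result is routed to either the success or the failure relationship. The Python sketch below illustrates that idea under two assumptions that are not taken from the product: the search value is matched literally, and content that cannot be decoded as UTF-8 is what ends up in failure. The function name replace_text is hypothetical.

```python
def replace_text(content: bytes, search_value: str, replacement_value: str):
    """Return (relationship, content) after a literal search-and-replace.

    Assumptions for this sketch: the content is UTF-8 text, the search value
    is matched literally, and undecodable content is routed to 'failure'.
    """
    try:
        text = content.decode("utf-8")
    except UnicodeDecodeError:
        return "failure", content
    replaced = text.replace(search_value, replacement_value)
    return "success", replaced.encode("utf-8")


relationship, result = replace_text(
    b"1 Clinton 32 Arizona 2000.00\n", "Arizona", "Florida"
)
print(relationship, result.decode("utf-8"), end="")
```

Auto-terminating the failure relationship corresponds to discarding any result tagged "failure" instead of passing it on to another processor.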

Step 3: Configure UpdateAttribute processor

Drag and drop the UpdateAttribute processor onto the canvas area. The UpdateAttribute processor can be used to store the resulting flowfile under a specific file name. Add a new property ‘filename’ with the required file name as its value in the configuration dialog. Then connect ReplaceText to UpdateAttribute with the ‘Success’ relationship.
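
The effect of this step can be pictured as changing one entry in the flowfile's attributes while leaving the content untouched. The following Python sketch shows that idea with a plain dictionary; the function name update_attribute and the file names input.txt and output.txt are hypothetical and used only for illustration.

```python
def update_attribute(attributes: dict, **new_values) -> dict:
    """Return a copy of the attributes with the given entries added or overwritten."""
    updated = dict(attributes)
    updated.update(new_values)
    return updated


original = {"filename": "input.txt"}
renamed = update_attribute(original, filename="output.txt")
print(original["filename"], renamed["filename"])
```

Returning a copy rather than modifying the dictionary in place keeps the sketch easy to reason about: the attributes before and after the step can be compared directly.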

Step 4: Configure PutFile processor

Add the PutFile processor and configure the required property fields in the configuration dialog. Connect UpdateAttribute to PutFile with the ‘Success’ relationship. Once configured, auto-terminate the Success and Failure relationships of PutFile.
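
As a rough picture of what this step does, the Python sketch below writes the flowfile content into an output directory under the name stored in the ‘filename’ attribute. It is an illustration only, not the platform's implementation; the function name put_file and the file name output.txt are hypothetical.

```python
import tempfile
from pathlib import Path


def put_file(content: bytes, attributes: dict, output_dir) -> Path:
    """Write content to output_dir under the name in the 'filename' attribute."""
    target = Path(output_dir) / attributes["filename"]
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(content)
    return target


# A temporary directory stands in for the configured output directory.
with tempfile.TemporaryDirectory() as output_dir:
    written = put_file(
        b"1 Clinton 32 Florida 2000.00\n", {"filename": "output.txt"}, output_dir
    )
    written_name = written.name
    written_content = written.read_bytes()

print(written_name)
```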

Step 5: Starting workflow

Once all processors are configured, start the workflow. You can see the data flow through the processors.

Output data:

ID  NAME     AGE  ADDRESS      SALARY
1   Clinton  32   Florida      2000.00
2   Harry    25   Florida      1500.00
3   George   23   Los Angeles  2000.00
4   Michael  25   Texas        6500.00
5   Hardik   27   Florida      8500.00
6   Jeny     22   Austin       4500.00
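
The expected output can be cross-checked against the input with a few lines of Python. The sketch below applies the same “Arizona” to “Florida” replacement to the sample rows and compares the result with the output data listed above. Writing the rows with single spaces as separators is an assumption about the file layout made for this example.

```python
INPUT_ROWS = [
    "1 Clinton 32 Arizona 2000.00",
    "2 Harry 25 Arizona 1500.00",
    "3 George 23 Los Angeles 2000.00",
    "4 Michael 25 Texas 6500.00",
    "5 Hardik 27 Arizona 8500.00",
    "6 Jeny 22 Austin 4500.00",
]

EXPECTED_ROWS = [
    "1 Clinton 32 Florida 2000.00",
    "2 Harry 25 Florida 1500.00",
    "3 George 23 Los Angeles 2000.00",
    "4 Michael 25 Texas 6500.00",
    "5 Hardik 27 Florida 8500.00",
    "6 Jeny 22 Austin 4500.00",
]

# Apply the same literal replacement that the workflow is configured with.
transformed = [row.replace("Arizona", "Florida") for row in INPUT_ROWS]

# IDs of the rows that the replacement actually changed.
changed_ids = [
    old.split()[0] for old, new in zip(INPUT_ROWS, transformed) if old != new
]

print(transformed == EXPECTED_ROWS, changed_ids)
```

Only the rows with IDs 1, 2, and 5 contain “Arizona”, so only those rows differ between the input and the output.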