Description:

Reads records from an incoming FlowFile using the provided Record Reader and writes those records to the specified Kudu table. The schema for the table must be provided in the processor properties or by your source. If an error occurs while reading records from the input or writing records to Kudu, the FlowFile is routed to failure.

Tags:

put, database, NoSQL, kudu, HDFS, record

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

Name    Default Value    Allowable Values    Description

Kudu Masters

Description: A comma-separated list of all Kudu masters, each given as an IP address with port (e.g. 7051).

Supports Expression Language: true
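
As a rough illustration of the expected format, the sketch below splits a comma-separated master list into (host, port) pairs. It is plain Python and not part of NiFi or Kudu. The function name `parse_kudu_masters` is hypothetical, and falling back to port 7051 when an entry has no port is an assumption made for this example.

```python
DEFAULT_MASTER_PORT = 7051  # port from the example above; using it as a fallback is an assumption


def parse_kudu_masters(value):
    """Split 'host:port,host:port' into a list of (host, port) tuples."""
    masters = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas and stray whitespace
        host, sep, port = entry.rpartition(":")
        if not sep:
            # No ':' present, so the whole entry is the host.
            host, port = entry, str(DEFAULT_MASTER_PORT)
        if not host or not port.isdigit():
            raise ValueError(f"invalid master address: {entry!r}")
        masters.append((host, int(port)))
    if not masters:
        raise ValueError("at least one Kudu master is required")
    return masters


print(parse_kudu_masters("10.0.0.1:7051, 10.0.0.2:7051"))
```

Validating the list up front like this surfaces a malformed address before any connection attempt is made.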



Table Name

Description: The name of the Kudu table to put data into.

Supports Expression Language: true



Skip head line

Default Value: true
Allowable Values:
* true
* false
Description: Set to true if the first line of the input is a header line, e.g. column names.
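
The effect of this setting can be pictured with a small, self-contained Python sketch. It only models the behavior on CSV text. How the processor itself treats the header depends on the configured Record Reader, and the function name `read_rows` is hypothetical.

```python
import csv
import io


def read_rows(text, skip_head_line=True):
    """Return data rows from CSV text, optionally dropping the first line."""
    rows = list(csv.reader(io.StringIO(text)))
    if skip_head_line and rows:
        rows = rows[1:]  # the first line holds column names, not data
    return rows


sample = "id,name\n1,alice\n2,bob\n"
print(read_rows(sample, skip_head_line=True))   # data rows only
print(read_rows(sample, skip_head_line=False))  # header row is treated as data
```

With the setting at false, the column names would be handed on as if they were a record, which is usually not what you want for input that carries a header.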

Record Reader

Controller Service API: RecordReaderFactory
Implementations:
* CSVReader
* GrokReader
* AvroReader
* JsonTreeReader
* JsonPathReader
* ScriptedReader
Description: The service for reading records from incoming FlowFiles.

Insert Operation

Default Value: INSERT
Allowable Values:
* INSERT
* INSERT_IGNORE
* UPSERT
Description: Specifies the operation type for this processor. INSERT_IGNORE ignores duplicate rows.
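
To make the difference between the three operations concrete, here is a toy in-memory model in Python. It uses a dict keyed by primary key in place of a Kudu table and does not call the Kudu client. `apply_operation` and `DuplicateKeyError` are hypothetical names, and replacing the whole row on UPSERT is a simplification.

```python
class DuplicateKeyError(Exception):
    """Raised by the toy table when INSERT hits an existing key."""


def apply_operation(table, operation, key, row):
    """Apply one write to `table`, a dict keyed by primary key."""
    if operation == "INSERT":
        if key in table:
            raise DuplicateKeyError(key)  # a plain insert rejects duplicates
        table[key] = row
    elif operation == "INSERT_IGNORE":
        table.setdefault(key, row)  # keep the existing row, drop the duplicate
    elif operation == "UPSERT":
        table[key] = row  # insert, or overwrite (simplified: replaces the whole row)
    else:
        raise ValueError(f"unknown operation: {operation}")


table = {}
apply_operation(table, "INSERT", 1, {"name": "alice"})
apply_operation(table, "INSERT_IGNORE", 1, {"name": "bob"})  # ignored, key 1 exists
print(table)
apply_operation(table, "UPSERT", 1, {"name": "carol"})  # overwrites key 1
print(table)
```

In this model, INSERT_IGNORE suits replayed input where duplicates are expected and harmless, while UPSERT suits input that carries newer versions of existing rows.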

Flush Mode

Default Value: AUTO_FLUSH_BACKGROUND
Allowable Values:
* AUTO_FLUSH_SYNC
* AUTO_FLUSH_BACKGROUND
* MANUAL_FLUSH
Description: Sets the flush mode for the Kudu session.
AUTO_FLUSH_SYNC: the call returns when the operation is persisted; otherwise it throws an exception.
AUTO_FLUSH_BACKGROUND: the call returns when the operation has been added to the buffer. This call normally performs only fast in-memory operations, but it may have to wait when the buffer is full and another buffer is being flushed.
MANUAL_FLUSH: the call returns when the operation has been added to the buffer; if the buffer is full, it throws a Kudu exception.
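
The three modes can be compared with a toy session written in plain Python. This is only a model of the behavior described above, not the Kudu client: `ToySession` and `BufferFullError` are hypothetical names, the `stored` list stands in for the table, and the background flush is simulated synchronously.

```python
class BufferFullError(Exception):
    """Stands in for the Kudu exception raised when a manual buffer is full."""


class ToySession:
    """In-memory stand-in for a session; `stored` plays the role of the table."""

    def __init__(self, flush_mode, buffer_size):
        self.flush_mode = flush_mode
        self.buffer_size = buffer_size
        self.buffer = []
        self.stored = []

    def apply(self, op):
        if self.flush_mode == "AUTO_FLUSH_SYNC":
            self.stored.append(op)  # persisted before the call returns
        elif self.flush_mode == "AUTO_FLUSH_BACKGROUND":
            if len(self.buffer) >= self.buffer_size:
                self.flush()  # a real client would wait on a background flush here
            self.buffer.append(op)
        elif self.flush_mode == "MANUAL_FLUSH":
            if len(self.buffer) >= self.buffer_size:
                raise BufferFullError("buffer is full; call flush() first")
            self.buffer.append(op)
        else:
            raise ValueError(f"unknown flush mode: {self.flush_mode}")

    def flush(self):
        """Move everything buffered so far into `stored`."""
        self.stored.extend(self.buffer)
        self.buffer.clear()


session = ToySession("MANUAL_FLUSH", buffer_size=2)
session.apply("row-1")
session.apply("row-2")
session.flush()  # in manual mode nothing is stored until this call
print(session.stored)
```

In this model the modes differ in when an operation becomes visible in `stored`: immediately for AUTO_FLUSH_SYNC, when the buffer fills for AUTO_FLUSH_BACKGROUND, and only on an explicit `flush()` for MANUAL_FLUSH.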

Batch Size

Default Value: 100
Description: The number of operations that can be buffered, between 2 and 100000. Choose a batch size appropriate to your available memory and the size of each row. Increase it gradually to find the value that gives the best performance.

Supports Expression Language: true
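
A minimal Python sketch of how a batch size in this range might be checked and applied to a list of records. It is illustrative only and not taken from the processor. The helper names are hypothetical, and treating both bounds as inclusive is an assumption.

```python
MIN_BATCH_SIZE = 2
MAX_BATCH_SIZE = 100000  # bounds from the description; inclusiveness is assumed


def validate_batch_size(value):
    """Return the batch size as an int, or raise if it is out of range."""
    size = int(value)
    if not MIN_BATCH_SIZE <= size <= MAX_BATCH_SIZE:
        raise ValueError(
            f"batch size must be between {MIN_BATCH_SIZE} and {MAX_BATCH_SIZE}, got {size}"
        )
    return size


def batches(records, batch_size):
    """Yield consecutive lists of at most `batch_size` records."""
    size = validate_batch_size(batch_size)
    for start in range(0, len(records), size):
        yield records[start:start + size]


print([len(b) for b in batches(list(range(250)), 100)])  # prints [100, 100, 50]
```

With the default of 100, a FlowFile of 250 records would be split into three groups in this model. A larger value means fewer groups but more rows held in memory at once.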



Relationships:

Name       Description
success    A FlowFile is routed to this relationship after it has been successfully stored in Kudu.
failure    A FlowFile is routed to this relationship if it cannot be sent to Kudu.

Reads Attributes:

None specified.

Writes Attributes:

Name            Description
record.count    Number of records written to Kudu.

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.