Description and usage of ScanHBase processor:

Scans and fetches rows from an HBase table. This processor may be used to fetch rows from an HBase table by specifying a range of rowkey values (start and/or end), a time range, a filter expression, or any combination of them. The order of records can be controlled by the Reversed order property. The number of rows retrieved by the processor can be limited.

Tags:

hbase, scan, fetch, get

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the Expression Language.

Name

Default Value

Allowable Values

Description

HBase Client Service

Controller Service API: HBaseClientService

Implementations: HBase_1_1_2_ClientService, HBase_2_ClientService

Specifies the Controller Service to use for accessing HBase.

Table Name

The name of the HBase Table to fetch from.

Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)


Authorizations The list of authorizations to pass to the scanner. This will be ignored if cell visibility labels are not in use.

Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)


Start rowkey The rowkey to start the scan from.

Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)


End rowkey The rowkey to end the scan at.

Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)


Time range min Minimum value of the time range. The min and max values must either both be provided or both be left blank.

Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)


Time range max Maximum value of the time range. The min and max values must either both be provided or both be left blank.

Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)
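
The both-or-neither rule for the time range can be expressed as a small pre-flight check. The sketch below is illustrative only: `time_range_is_valid` is a hypothetical helper, not part of the processor, and it assumes that blank values arrive as empty strings or `None`.

```python
from typing import Optional


def time_range_is_valid(time_min: Optional[str], time_max: Optional[str]) -> bool:
    """Return True when both bounds are set or both are blank (hypothetical helper)."""
    # Assumption: a blank property is None, empty, or whitespace only.
    min_blank = time_min is None or time_min.strip() == ""
    max_blank = time_max is None or time_max.strip() == ""
    # Valid only when the two properties agree: both blank or both provided.
    return min_blank == max_blank
```

A check like this could run before a flow is deployed, so that a configuration with only one bound set is caught early.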


Limit rows Limits the number of rows retrieved by the scan.

Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)


Reversed order false * true
* false
Sets whether this scan is a reversed one. This is false by default, which means a forward (normal) scan.

Max rows per flow file 0 Limits the number of rows in the content of a single flow file. Set to 0 to avoid multiple flow files.

Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)
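
To illustrate how Limit rows and Max rows per flow file might interact, here is a minimal sketch. It assumes that the row limit is applied to the scan first and that the retrieved rows are then split into batches, one per flow file. `split_into_flow_files` and its parameters are hypothetical names, not the processor's own code, and the sketch does not model what the processor emits when a scan finds no rows.

```python
from typing import List, Optional, Sequence


def split_into_flow_files(rows: Sequence[str],
                          max_rows_per_flow_file: int = 0,
                          limit_rows: Optional[int] = None) -> List[List[str]]:
    """Group scanned rows into per-flow-file batches (illustrative only)."""
    # Assumption: the row limit caps the scan before any batching happens.
    selected = list(rows) if limit_rows is None else list(rows[:limit_rows])
    if not selected:
        return []
    # 0 means "do not split": all rows go into a single batch.
    if max_rows_per_flow_file <= 0:
        return [selected]
    return [selected[i:i + max_rows_per_flow_file]
            for i in range(0, len(selected), max_rows_per_flow_file)]
```

Under these assumptions, five rows with a maximum of 2 rows per flow file yield three batches of sizes 2, 2, and 1, while the default of 0 keeps all five rows together.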


Filter expression An HBase filter expression that will be applied to the scan. This property cannot be used together with the Columns property. Example: "ValueFilter( =, 'binaryprefix:commit' )"

Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)


Columns An optional comma-separated list of "<colFamily>:<colQualifier>" pairs to fetch. To return all columns for a given family, leave off the qualifier such as "<colFamily1>,<colFamily2>".

Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)
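
The Columns syntax can be made concrete with a short parsing sketch. `parse_columns` is a hypothetical helper written for illustration; it is not how the processor itself parses the property. It assumes that surrounding whitespace is insignificant and that only the first colon separates family from qualifier.

```python
from typing import List, Optional, Tuple


def parse_columns(spec: str) -> List[Tuple[str, Optional[str]]]:
    """Parse "<colFamily>:<colQualifier>" pairs; qualifier is None for a whole family."""
    columns: List[Tuple[str, Optional[str]]] = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty entries such as a trailing comma
        # Split on the first colon only, so a qualifier may itself contain colons.
        family, sep, qualifier = item.partition(":")
        columns.append((family, qualifier if sep else None))
    return columns
```

For example, `parse_columns("cf1:name,cf2")` gives one family-and-qualifier pair for `cf1:name` and one family-only entry for `cf2`.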


JSON Format

full-row * full-row  
* col-qual-and-val  
Specifies how to represent the HBase row as a JSON document.
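
The two JSON Format values are not spelled out here. As a rough illustration, the sketch below assumes the layouts described for NiFi's related FetchHBaseRow processor: full-row emits the row key plus a list of cells, while col-qual-and-val emits a flat object that maps each column qualifier to its value. The field names (`row`, `cells`, `fam`, `qual`, `val`, `ts`) and the helper `render_row` are assumptions for illustration and should be checked against actual processor output.

```python
import json
from typing import Dict, List


def render_row(row_key: str, cells: List[Dict], json_format: str = "full-row") -> str:
    """Render one HBase row as JSON in the assumed layout for each format."""
    if json_format == "full-row":
        # Assumed shape: the row key plus every cell with family, qualifier, value, timestamp.
        doc = {"row": row_key, "cells": cells}
    elif json_format == "col-qual-and-val":
        # Assumed shape: qualifier -> value only; row key, family and timestamp are dropped.
        doc = {cell["qual"]: cell["val"] for cell in cells}
    else:
        raise ValueError(f"unsupported JSON Format: {json_format}")
    return json.dumps(doc)
```

If these assumptions hold, col-qual-and-val is the more compact choice when downstream steps only need values, and full-row is the one to pick when the row key, column family, or timestamp matters.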

Encode Character Set

UTF-8 The character set used to encode the JSON representation of the row.

Decode Character Set

UTF-8 The character set used to decode data from HBase.

Relationships:

Name

Description

success All successful fetches are routed to this relationship.
failure All failed fetches are routed to this relationship.
original The original input file will be routed to this destination, even if no rows are retrieved based on provided conditions.

Reads Attributes:

None specified.

Writes Attributes:

Name

Description

hbase.table The name of the HBase table that the row was fetched from
mime.type Set to application/json when using a Destination of flowfile-content, not set or modified otherwise
hbase.rows.count Number of rows in the content of the given flow file
scanhbase.results.found Indicates whether at least one row has been found in the given HBase table with the provided conditions. Could be null (not present) if transferred to FAILURE

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

System Resource Considerations:

None specified.