Description and usage of CalculateRecordStats processor:
A processor that can count the number of items in a record set, as well as provide counts based on user-defined criteria on subsets of the record set.
Tags:
record, stats, metrics
Properties:
In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.
Name | Default Value | Allowable Values | Description |
Record Reader | | Controller Service API: RecordReaderFactory. Implementations: AvroReader, SyslogReader, ScriptedReader, XMLReader, GrokReader, Syslog5424Reader, CSVReader, JsonTreeReader, JsonPathReader | A record reader to use for reading the records. |
record-stats-limit | 10 | | Limit the number of individual stats that are returned for each record path to the top N results. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) |
Relationships:
Name | Description |
success | If a flowfile is successfully processed, it goes here. |
failure | If a flowfile fails to be processed, it goes here. |
Reads Attributes:
None specified.
Writes Attributes:
Name | Description |
record.count | A count of the records in the record set in the flowfile. |
recordStats.<User Defined Property Name>.count | A count of the records that contain a value for the user defined property. |
recordStats.<User Defined Property Name>.<value>.count | Each value discovered for the user defined property will have its own count attribute. The number of per-value count attributes added is capped at the top N values, where N is set by the record-stats-limit property. |
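The naming pattern and the top N cap described in the table can be illustrated with a short Python sketch. This is not the processor's implementation: the function `top_n_attributes`, its parameters, the property name `sport`, and the sample counts are all hypothetical, and the tie-breaking rule (first-seen order) is an assumption. The attribute keys simply follow the patterns listed in the table above.

```python
from collections import Counter


def top_n_attributes(prop_name, value_counts, limit=10):
    """Build attribute names and values for one user-defined property.

    Hypothetical helper for illustration only; not part of the NiFi API.
    Attribute values are strings, as flowfile attributes are.
    """
    # Overall count: every record that produced a value for the property.
    attrs = {f"recordStats.{prop_name}.count": str(sum(value_counts.values()))}
    # most_common(limit) keeps the `limit` highest counts;
    # values with equal counts stay in first-seen order.
    for value, count in Counter(value_counts).most_common(limit):
        attrs[f"recordStats.{prop_name}.{value}.count"] = str(count)
    return attrs


# Made-up counts with three distinct values, limited to the top 2.
attrs = top_n_attributes("sport", {"Soccer": 3, "Football": 2, "Tennis": 1}, limit=2)
```

With a limit of 2, the least frequent value gets no attribute of its own, while the overall count in this sketch still reflects all six records.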
State management:
This component does not store state.
Restricted:
This component is not restricted.
Input requirement:
This component requires an incoming relationship.
System Resource Considerations:
None specified.
Summary:
This processor takes in a record set and produces an overall record count, plus additional counts defined by dynamic properties. Each dynamic property maps a property name to a record path. Record path counts are provided at two levels:
- The overall count of all records that successfully evaluated a record path.
- A breakdown of counts of unique values that matched the record path operation.
Consider the following record structure:
{
  "sport": "Soccer",
  "name": "John Smith"
}
A valid mapping here would be sport => /sport, that is, a dynamic property named sport whose value is the record path /sport.
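To make the mapping concrete, here is a minimal Python sketch of how a property name could be tied to a value pulled out of a record. It is a simplified stand-in, not NiFi's RecordPath engine: `resolve_path` is a hypothetical helper that handles only plain slash-separated field names on nested dictionaries, with none of the filters or functions a real record path supports.

```python
def resolve_path(record, path):
    """Resolve a slash-separated path such as "/sport" against nested dicts.

    Hypothetical, simplified stand-in for record path evaluation.
    Returns None when any step of the path is missing.
    """
    current = record
    for step in path.strip("/").split("/"):
        if not isinstance(current, dict) or step not in current:
            return None
        current = current[step]
    return current


record = {"sport": "Soccer", "name": "John Smith"}
mappings = {"sport": "/sport"}  # property name => record path

# Evaluate every configured mapping against the record.
values = {name: resolve_path(record, path) for name, path in mappings.items()}
```

A record for which the path resolves to nothing would not contribute to that property's counts in this sketch.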
For a record set of five such records, three with sport set to Soccer and two with sport set to Football, the processor would set the following attributes:
- record_count: 5
- sport: 5
- sport.Soccer: 3
- sport.Football: 2
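The arithmetic behind these numbers can be checked with a small Python sketch. It is an illustration under stated assumptions rather than the processor's code: the helper `count_field` is hypothetical, the lookup is a plain top-level field access instead of a real record path, and every name other than John Smith is made-up sample data.

```python
from collections import Counter

# Five sample records: three Soccer, two Football (names are placeholders).
records = [
    {"sport": "Soccer", "name": "John Smith"},
    {"sport": "Soccer", "name": "Person B"},
    {"sport": "Soccer", "name": "Person C"},
    {"sport": "Football", "name": "Person D"},
    {"sport": "Football", "name": "Person E"},
]


def count_field(records, field):
    """Return (records with a value for field, per-value counts)."""
    values = [r[field] for r in records if r.get(field) is not None]
    return len(values), Counter(values)


record_count = len(records)                       # all records in the set
matched, by_value = count_field(records, "sport")  # the two count levels
```

Here `record_count` and `matched` are both 5 because every record has a sport value; they would differ if some records lacked the field.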