Description:
Publishes the same metrics as the Ambari Reporting task using the Site-to-Site protocol.
Tags:
status, metrics, site, site to site
Properties:
In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, whether a property supports the Expression Language (see the Syncfusion Expression Language Guide), and whether a property is considered “sensitive”, meaning that its value will be encrypted. Before entering a value in a sensitive property, ensure that the nifi.properties file has an entry for the property nifi.sensitive.props.key.
Name | Default Value | Allowable Values | Description |
Destination URL | | | The URL of the destination NiFi instance or, if clustered, a comma-separated list of addresses in the format of http(s)://host:port/nifi. This destination URL will only be used to initiate the Site-to-Site connection. The data sent by this reporting task will be load-balanced on all the nodes of the destination (if clustered). Supports Expression Language: true (will be evaluated using variable registry only) |
Input Port Name | | | The name of the Input Port to deliver data to. Supports Expression Language: true (will be evaluated using variable registry only) |
SSL Context Service | | Controller Service API: RestrictedSSLContextService. Implementation: StandardRestrictedSSLContextService | The SSL Context Service to use when communicating with the destination. If not specified, communications will not be secure. |
Instance URL | http://${hostname(true)}:8080/nifi | | The URL of this instance to use in the Content URI of each event. Supports Expression Language: true (will be evaluated using variable registry only) |
Compress Events | true | true, false | Indicates whether or not to compress the data being sent. |
Communications Timeout | 30 secs | | Specifies how long to wait for a response from the destination before deciding that an error has occurred and canceling the transaction. |
Transport Protocol | RAW | RAW, HTTP | Specifies which transport protocol to use for Site-to-Site communication. |
HTTP Proxy hostname | | | Specify the proxy server's hostname to use. If not specified, HTTP traffic is sent directly to the target NiFi instance. |
HTTP Proxy port | | | Specify the proxy server's port number, optional. If not specified, default port 80 will be used. |
HTTP Proxy username | | | Specify a user name to connect to the proxy server, optional. |
HTTP Proxy password | | | Specify a user password to connect to the proxy server, optional. Sensitive Property: true |
Record Writer | | Controller Service API: RecordSetWriterFactory. Implementations: JsonRecordSetWriter, CSVRecordSetWriter, ScriptedRecordSetWriter, FreeFormTextRecordSetWriter, ParquetRecordSetWriter, AvroRecordSetWriter, XMLRecordSetWriter | Specifies the Controller Service to use for writing out the records. |
Hostname | ${hostname(true)} | | The Hostname of this NiFi instance to be included in the metrics. Supports Expression Language: true (will be evaluated using variable registry only) |
Application ID | nifi | | The Application ID to be included in the metrics. Supports Expression Language: true (will be evaluated using variable registry only) |
Output Format | ambari-format | Ambari Format, Record Format | The output format that will be used for the metrics. If Record Format is selected, a Record Writer must be provided. If Ambari Format is selected, the Record Writer property should be empty. |
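The dependency between Output Format and Record Writer described in the table can be checked before a configuration is applied. The following is a minimal Python sketch and not part of NiFi: the function name `check_output_format` is hypothetical, and the identifier `record-format` is an assumption (only `ambari-format` appears above, as the default value).

```python
from typing import List, Optional

AMBARI_FORMAT = "ambari-format"  # default value listed in the table
RECORD_FORMAT = "record-format"  # assumed identifier for "Record Format"


def check_output_format(output_format: str, record_writer: Optional[str]) -> List[str]:
    """Return the problems found in an Output Format / Record Writer pair."""
    problems = []
    if output_format == RECORD_FORMAT and not record_writer:
        # Record Format needs a Record Writer to serialize the records.
        problems.append("Record Format requires a Record Writer")
    elif output_format == AMBARI_FORMAT and record_writer:
        # Ambari Format produces JSON itself, so no writer should be set.
        problems.append("Ambari Format expects the Record Writer to be empty")
    elif output_format not in (AMBARI_FORMAT, RECORD_FORMAT):
        problems.append("unknown output format: " + output_format)
    return problems


if __name__ == "__main__":
    print(check_output_format("record-format", None))
```

An empty list means the pair is consistent with the rule stated in the Output Format description.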
State management:
This component does not store state.
Restricted:
This component is not restricted.
System Resource Considerations:
None specified.
Summary:
The Site-to-Site Metrics Reporting Task allows the user to publish NiFi’s metrics (as in the Ambari reporting task) to the same NiFi instance or another NiFi instance. This provides a great deal of power because it allows the user to make use of all of the different Processors that are available in NiFi in order to process or distribute that data.
Ambari format
There are two available output formats. The first one is the Ambari format, as defined in the Ambari Metrics Collector API, which is a JSON document with dynamic keys. If you use this format, the following Jolt specification may be useful for transforming the data.
[
  {
    "operation": "shift",
    "spec": {
      "metrics": {
        "*": {
          "metrics": {
            "*": {
              "$": "metrics.[#4].metrics.time",
              "@": "metrics.[#4].metrics.value"
            }
          },
          "*": "metrics.[&1].&"
        }
      }
    }
  }
]
This would transform the below sample:
{
  "metrics": [
    {
      "metricname": "jvm.gc.time.G1OldGeneration",
      "appid": "nifi",
      "instanceid": "8927f4c0-0160-1000-597a-ea764ccd81a7",
      "hostname": "localhost",
      "timestamp": "1520456854361",
      "starttime": "1520456854361",
      "metrics": { "1520456854361": "0" }
    },
    {
      "metricname": "jvm.thread_states.terminated",
      "appid": "nifi",
      "instanceid": "8927f4c0-0160-1000-597a-ea764ccd81a7",
      "hostname": "localhost",
      "timestamp": "1520456854361",
      "starttime": "1520456854361",
      "metrics": { "1520456854361": "0" }
    }
  ]
}
into:
{
  "metrics": [
    {
      "metricname": "jvm.gc.time.G1OldGeneration",
      "appid": "nifi",
      "instanceid": "8927f4c0-0160-1000-597a-ea764ccd81a7",
      "hostname": "localhost",
      "timestamp": "1520456854361",
      "starttime": "1520456854361",
      "metrics": { "time": "1520456854361", "value": "0" }
    },
    {
      "metricname": "jvm.thread_states.terminated",
      "appid": "nifi",
      "instanceid": "8927f4c0-0160-1000-597a-ea764ccd81a7",
      "hostname": "localhost",
      "timestamp": "1520456854361",
      "starttime": "1520456854361",
      "metrics": { "time": "1520456854361", "value": "0" }
    }
  ]
}
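For readers who would rather post-process the Ambari payload outside of Jolt, the same reshaping can be sketched in plain Python. This is an illustrative equivalent, not the Jolt engine: the function name `flatten_metric_points` is hypothetical, and the sketch assumes each metric carries exactly one timestamp/value pair, as in the sample above.

```python
import json


def flatten_metric_points(payload: dict) -> dict:
    """Turn each {"<timestamp>": "<value>"} map into {"time": ..., "value": ...}."""
    result = {"metrics": []}
    for metric in payload["metrics"]:
        # Copy every field except the dynamic-key map unchanged.
        entry = {key: val for key, val in metric.items() if key != "metrics"}
        points = metric["metrics"]
        if len(points) != 1:
            # Assumption of this sketch: one data point per metric.
            raise ValueError("expected exactly one data point per metric")
        (time_key, value), = points.items()
        entry["metrics"] = {"time": time_key, "value": value}
        result["metrics"].append(entry)
    return result


sample = {
    "metrics": [
        {
            "metricname": "jvm.gc.time.G1OldGeneration",
            "appid": "nifi",
            "hostname": "localhost",
            "timestamp": "1520456854361",
            "starttime": "1520456854361",
            "metrics": {"1520456854361": "0"},
        }
    ]
}

if __name__ == "__main__":
    print(json.dumps(flatten_metric_points(sample), indent=2))
```

The dynamic timestamp key becomes the value of a fixed `time` field, which is what makes the output easier to query downstream.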
Record format
The second format leverages the record framework of NiFi, so that the user can define a Record Writer and directly specify the output format and data. The input schema is assumed to be the following:
{
  "type" : "record",
  "name" : "metrics",
  "namespace" : "metrics",
  "fields" : [
    { "name" : "appid", "type" : "string" },
    { "name" : "instanceid", "type" : "string" },
    { "name" : "hostname", "type" : "string" },
    { "name" : "timestamp", "type" : "long" },
    { "name" : "loadAverage1min", "type" : "double" },
    { "name" : "availableCores", "type" : "int" },
    { "name" : "FlowFilesReceivedLast5Minutes", "type" : "int" },
    { "name" : "BytesReceivedLast5Minutes", "type" : "long" },
    { "name" : "FlowFilesSentLast5Minutes", "type" : "int" },
    { "name" : "BytesSentLast5Minutes", "type" : "long" },
    { "name" : "FlowFilesQueued", "type" : "int" },
    { "name" : "BytesQueued", "type" : "long" },
    { "name" : "BytesReadLast5Minutes", "type" : "long" },
    { "name" : "BytesWrittenLast5Minutes", "type" : "long" },
    { "name" : "ActiveThreads", "type" : "int" },
    { "name" : "TotalTaskDurationSeconds", "type" : "long" },
    { "name" : "TotalTaskDurationNanoSeconds", "type" : "long" },
    { "name" : "jvmuptime", "type" : "long" },
    { "name" : "jvmheap_used", "type" : "double" },
    { "name" : "jvmheap_usage", "type" : "double" },
    { "name" : "jvmnon_heap_usage", "type" : "double" },
    { "name" : "jvmthread_statesrunnable", "type" : ["int", "null"] },
    { "name" : "jvmthread_statesblocked", "type" : ["int", "null"] },
    { "name" : "jvmthread_statestimed_waiting", "type" : ["int", "null"] },
    { "name" : "jvmthread_statesterminated", "type" : ["int", "null"] },
    { "name" : "jvmthread_count", "type" : "int" },
    { "name" : "jvmdaemon_thread_count", "type" : "int" },
    { "name" : "jvmfile_descriptor_usage", "type" : "double" },
    { "name" : "jvmgcruns", "type" : ["long", "null"] },
    { "name" : "jvmgctime", "type" : ["long", "null"] }
  ]
}
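To make the schema more concrete, the sketch below checks a record, represented as a Python dictionary, against a few of the fields listed above using only the standard library. The abbreviated schema, the helper names `matches` and `validate`, and the sample values are illustrative assumptions; in a real flow the configured Record Writer is what serializes the data.

```python
import json

# Abbreviated copy of the schema above (a handful of fields, for illustration).
SCHEMA = json.loads("""
{ "type": "record", "name": "metrics", "namespace": "metrics", "fields": [
  { "name": "appid", "type": "string" },
  { "name": "timestamp", "type": "long" },
  { "name": "loadAverage1min", "type": "double" },
  { "name": "ActiveThreads", "type": "int" },
  { "name": "jvmgcruns", "type": ["long", "null"] }
] }
""")

PYTHON_TYPES = {"string": str, "int": int, "long": int, "double": float}


def matches(value, avro_type) -> bool:
    """Return True if value is acceptable for the given primitive or union type."""
    if isinstance(avro_type, list):  # union: any branch may match
        return any(matches(value, branch) for branch in avro_type)
    if avro_type == "null":
        return value is None
    if isinstance(value, bool):  # bool is a subclass of int in Python
        return False
    return isinstance(value, PYTHON_TYPES[avro_type])


def validate(record: dict, schema: dict) -> list:
    """Return the names of fields whose value does not fit the schema.

    A missing key is treated as null, so only nullable fields may be absent.
    """
    return [
        field["name"]
        for field in schema["fields"]
        if not matches(record.get(field["name"]), field["type"])
    ]


if __name__ == "__main__":
    record = {"appid": "nifi", "timestamp": 1520456854361,
              "loadAverage1min": 1.5, "ActiveThreads": 3}
    print(validate(record, SCHEMA))
```

Fields declared as `["int", "null"]` or `["long", "null"]` in the schema are the only ones that may be absent or null, which is why `jvmgcruns` can be omitted from the sample record.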