Description:

Sends the contents of a FlowFile as a message to Apache Kafka using the Kafka 0.10.x Producer API. The messages to send may be individual FlowFiles or may be delimited, using a user-specified delimiter, such as a new-line. Please note that there are cases where the publisher can become stuck indefinitely. We are closely monitoring how this evolves in the Kafka community and will take advantage of those fixes as soon as we can. In the meantime, it is possible to enter states where the only resolution is to restart the JVM that NiFi runs on. The complementary NiFi processor for fetching messages is ConsumeKafka_0_10.

Tags:

Apache, Kafka, Put, Send, Message, PubSub, 0.10.x

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values and whether a property supports the Expression Language.

Name Default Value Allowable Values Description
Kafka Brokers localhost:9092 A comma-separated list of known Kafka Brokers in the format (host):(port)
Supports Expression Language: true
Security Protocol PLAINTEXT
  • PLAINTEXT
  • SSL
  • SASL_PLAINTEXT
  • SASL_SSL
Protocol used to communicate with brokers. Corresponds to Kafka's 'security.protocol' property.
Kerberos Service Name The Kerberos principal name that Kafka runs as. This can be defined either in Kafka's JAAS config or in Kafka's config. Corresponds to Kafka's 'sasl.kerberos.service.name' property. It is ignored unless one of the SASL options of the 'Security Protocol' property is selected.
Kerberos Principal The Kerberos principal that will be used to connect to brokers. If not set, a JAAS configuration file is expected to be set in the JVM properties defined in the bootstrap.conf file. This principal will be set into Kafka's 'sasl.jaas.config' property.
Kerberos Keytab The Kerberos keytab that will be used to connect to brokers. If not set, a JAAS configuration file is expected to be set in the JVM properties defined in the bootstrap.conf file. This keytab will be set into Kafka's 'sasl.jaas.config' property.
SSL Context Service Controller Service API:
SSLContextService
Implementation:
StandardSSLContextService
StandardRestrictedSSLContextService
Specifies the SSL Context Service to use for communicating with Kafka.
Topic Name The name of the Kafka Topic to publish to.
Supports Expression Language: true
Delivery Guarantee Best Effort
Guarantee Single Node Delivery
Guarantee Replicated Delivery
Specifies the requirement for guaranteeing that a message is sent to Kafka. Corresponds to Kafka's 'acks' property.
Kafka Key The Key to use for the Message. If not specified, the flow file attribute 'kafka.key' is used as the message key, if it is present and we're not demarcating. Supports Expression Language: true
Key Attribute Encoding utf-8
  • UTF-8 Encoded
  • Hex Encoded
FlowFiles that are emitted have an attribute named 'kafka.key'. This property dictates how the value of the attribute should be encoded.
Message Demarcator Specifies the string (interpreted as UTF-8) to use for demarcating multiple messages within a single FlowFile. If not specified, the entire content of the FlowFile will be used as a single message. If specified, the contents of the FlowFile will be split on this delimiter and each section sent as a separate Kafka message. To enter a special character such as 'new line', use CTRL+Enter or Shift+Enter, depending on your OS. Supports Expression Language: true
Max Request Size 1 MB The maximum size of a request in bytes. Corresponds to Kafka's 'max.request.size' property and defaults to 1 MB (1048576).
Acknowledgment Wait Time 5 secs After sending a message to Kafka, this indicates the amount of time that we are willing to wait for a response from Kafka. If Kafka does not acknowledge the message within this time period, the FlowFile will be routed to 'failure'.
Max Metadata Wait Time 5 sec The amount of time the publisher will wait to obtain metadata or wait for the buffer to flush during the 'send' call before failing the entire 'send' call. Corresponds to Kafka's 'max.block.ms' property. Supports Expression Language: true
Partitioner class org.apache.kafka.clients.producer.internals.DefaultPartitioner
  • RoundRobinPartitioner
  • DefaultPartitioner
Specifies which class to use to compute a partition id for a message. Corresponds to Kafka's 'partitioner.class' property.
Compression Type none
  • none
  • gzip
  • snappy
  • lz4
This parameter allows you to specify the compression codec for all data generated by this producer.
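As a rough illustration of how the properties above relate to Kafka producer settings, the following Python sketch builds a configuration dictionary from processor-style values. The function name `build_producer_config` and its parameters are hypothetical. The keys 'security.protocol', 'acks', 'max.request.size', 'max.block.ms' and 'partitioner.class' come from the table above; 'bootstrap.servers', 'compression.type' and the specific 'acks' values chosen for each Delivery Guarantee are assumptions based on Kafka's producer configuration, not something this page states.

```python
# Assumed mapping of Delivery Guarantee choices to Kafka 'acks' values.
ACKS_BY_GUARANTEE = {
    "Best Effort": "0",
    "Guarantee Single Node Delivery": "1",
    "Guarantee Replicated Delivery": "all",
}


def build_producer_config(brokers, security_protocol, delivery_guarantee,
                          max_request_size_bytes, max_block_ms,
                          partitioner_class, compression_type):
    """Translate processor-style property values into Kafka producer settings.

    Hypothetical helper for illustration only; it performs no validation
    beyond looking up the delivery guarantee.
    """
    return {
        "bootstrap.servers": brokers,             # Kafka Brokers (assumed key)
        "security.protocol": security_protocol,   # Security Protocol
        "acks": ACKS_BY_GUARANTEE[delivery_guarantee],  # Delivery Guarantee
        "max.request.size": str(max_request_size_bytes),  # Max Request Size
        "max.block.ms": str(max_block_ms),        # Max Metadata Wait Time
        "partitioner.class": partitioner_class,   # Partitioner class
        "compression.type": compression_type,     # Compression Type (assumed key)
    }


config = build_producer_config(
    brokers="localhost:9092",
    security_protocol="PLAINTEXT",
    delivery_guarantee="Guarantee Replicated Delivery",
    max_request_size_bytes=1048576,   # 1 MB, the documented default
    max_block_ms=5000,                # 5 sec, the documented default
    partitioner_class="org.apache.kafka.clients.producer.internals.DefaultPartitioner",
    compression_type="none",
)
```

The sketch takes sizes and durations as plain numbers; converting values such as "1 MB" or "5 sec" is left out on purpose.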

Dynamic Properties:

Dynamic Properties allow the user to specify both the name and value of a property.

Name Value Description
The name of a Kafka configuration property. The value of a given Kafka configuration property. These properties will be added to the Kafka configuration after loading any provided configuration properties. If a dynamic property represents a property that was already set, its value will be ignored and a WARN message will be logged. For the list of available Kafka properties, please refer to: http://kafka.apache.org/documentation.html#configuration.
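The precedence rule for dynamic properties described above can be sketched in Python as follows. This is a minimal illustration, not the processor's implementation: the function name `apply_dynamic_properties` and the logger name are hypothetical, and 'linger.ms' is used only as an example of a Kafka producer setting that the processor does not expose directly.

```python
import logging

logger = logging.getLogger("publish_kafka_sketch")  # hypothetical logger name


def apply_dynamic_properties(base_config, dynamic_props):
    """Add dynamic properties to a config without overriding existing entries."""
    merged = dict(base_config)
    for name, value in dynamic_props.items():
        if name in merged:
            # Already set by a processor property: ignore the value and warn.
            logger.warning(
                "Ignoring dynamic property '%s': already set to '%s'",
                name, merged[name],
            )
            continue
        merged[name] = value
    return merged


merged = apply_dynamic_properties(
    {"acks": "all"},
    {"acks": "0", "linger.ms": "5"},
)
```

Here the dynamic 'acks' value is dropped with a warning, while 'linger.ms' is added because nothing else set it.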

Relationships:

Name Description
success FlowFiles for which all content was sent to Kafka.
failure Any FlowFile that cannot be sent to Kafka will be routed to this Relationship.

Reads Attributes:

None specified.

Writes Attributes:

Name Description
msg.count The number of messages that were sent to Kafka for this FlowFile. This attribute is added only to FlowFiles that are routed to success. If the 'Message Demarcator' property is not set, this will always be 1, but if the property is set, it may be greater than 1.

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

Summary:

This Processor puts the contents of a FlowFile to a Topic in Apache Kafka using the KafkaProducer API available with Kafka 0.10.x. The content of a FlowFile becomes the contents of a Kafka message. This message is optionally assigned a key by using the 'Kafka Key' property.
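The key selection rules from the 'Kafka Key' and 'Key Attribute Encoding' properties can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `resolve_message_key` and the encoding labels "utf-8" and "hex" are hypothetical, and applying the chosen encoding to both the property value and the 'kafka.key' attribute is an assumption rather than documented behavior.

```python
def resolve_message_key(kafka_key_property, attributes, demarcating,
                        encoding="utf-8"):
    """Return the message key as bytes, or None if no key applies."""
    if kafka_key_property:
        # An explicit 'Kafka Key' property value takes precedence.
        key_text = kafka_key_property
    elif not demarcating and "kafka.key" in attributes:
        # Fall back to the 'kafka.key' attribute only when not demarcating.
        key_text = attributes["kafka.key"]
    else:
        return None

    if encoding == "hex":
        # Hex Encoded: the text is a hex string describing the raw key bytes.
        return bytes.fromhex(key_text)
    # UTF-8 Encoded: the text itself is the key.
    return key_text.encode("utf-8")


explicit_key = resolve_message_key("order-1", {}, demarcating=False)
hex_key = resolve_message_key(None, {"kafka.key": "cafe"},
                              demarcating=False, encoding="hex")
no_key = resolve_message_key(None, {"kafka.key": "k"}, demarcating=True)
```

In this sketch, a FlowFile that is being demarcated and has no explicit key is published without a key, even if it carries a 'kafka.key' attribute.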

The Processor allows the user to configure an optional Message Demarcator that can be used to send many messages per FlowFile. For example, a \n could be used to indicate that the contents of the FlowFile should be used to send one message per line of text. It also supports multi-char demarcators (e.g., ‘my custom demarcator’). If the property is not set, the entire contents of the FlowFile will be sent as a single message. When using the demarcator, if some messages are successfully sent but other messages fail to send, the resulting FlowFile will be considered a failed FlowFile and will have additional attributes to that effect. One such attribute is ‘failed.last.idx’, which indicates the index of the last message that was successfully ACKed by Kafka (if no demarcator is used, the value of this index will be -1). This allows PublishKafka to re-send only un-ACKed messages on the next retry.
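The demarcation and retry behavior described above can be sketched in Python as follows. This is a simplified, synchronous illustration and not the processor's implementation: `split_messages`, `publish` and the `send` callback are hypothetical names, dropping empty segments is an assumption, and reporting 'msg.count' as the total number of messages in the FlowFile on success is also an assumption.

```python
def split_messages(content, demarcator):
    """Split FlowFile content into messages on the demarcator, if one is set."""
    if not demarcator:
        return [content]
    parts = content.split(demarcator.encode("utf-8"))
    # Assumption: empty segments (e.g. from a trailing demarcator) are skipped.
    return [part for part in parts if part]


def publish(content, demarcator, send, failed_last_idx=-1):
    """Send messages after failed_last_idx; stop at the first failed send.

    `send` is a hypothetical callback returning True when Kafka ACKs a message.
    """
    messages = split_messages(content, demarcator)
    last_acked = failed_last_idx
    for idx in range(failed_last_idx + 1, len(messages)):
        if not send(messages[idx]):
            return {"relationship": "failure", "failed.last.idx": last_acked}
        last_acked = idx
    return {"relationship": "success", "msg.count": len(messages)}


# Usage: the third message fails on the first attempt and succeeds on retry.
sent = []
fail_on = {b"c"}


def send(message):
    if message in fail_on:
        return False
    sent.append(message)
    return True


first = publish(b"a\nb\nc\nd", "\n", send)
fail_on.clear()
second = publish(b"a\nb\nc\nd", "\n", send,
                 failed_last_idx=first["failed.last.idx"])
```

On the first attempt the result carries 'failed.last.idx' of 1, so the retry starts at index 2 and the messages already ACKed are not sent again.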