Description:
Sends the contents of a FlowFile as individual records to Apache Kafka using the Kafka 0.10.x Producer API. The contents of the FlowFile are expected to be record-oriented data that can be read by the configured Record Reader. Note that in some cases the publisher can become stuck indefinitely. We are closely monitoring how this evolves in the Kafka community and will take advantage of fixes as soon as we can. In the meantime, the only resolution for such a state may be to restart the JVM that NiFi runs on. The complementary NiFi processor for fetching messages is ConsumeKafkaRecord_0_10.
Tags:
Apache, Kafka, Record, CSV, JSON, avro, logs, Put, Send, Message, PubSub, 0.10.x
Properties:
In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.
Name | Default Value | Allowable Values | Description |
Kafka Brokers | localhost:9092 | | A comma-separated list of known Kafka Brokers in the format <host>:<port>. Supports Expression Language: true |
Topic Name | | | The name of the Kafka Topic to publish to. Supports Expression Language: true |
Record Reader | | Controller Service API: RecordReaderFactory. Implementations: CSVReader, JsonTreeReader, GrokReader, AvroReader, JsonPathReader, ScriptedReader | The Record Reader to use for incoming FlowFiles. |
Record Writer | | Controller Service API: RecordSetWriterFactory. Implementations: FreeFormTextRecordSetWriter, JsonRecordSetWriter, AvroRecordSetWriter, CSVRecordSetWriter, ScriptedRecordSetWriter | The Record Writer to use in order to serialize the data before sending to Kafka. |
Security Protocol | PLAINTEXT | * PLAINTEXT * SSL * SASL_PLAINTEXT * SASL_SSL | Protocol used to communicate with brokers. Corresponds to Kafka's 'security.protocol' property. |
Kerberos Service Name | | | The Kerberos principal name that Kafka runs as. This can be defined either in Kafka's JAAS config or in Kafka's config. Corresponds to Kafka's 'sasl.kerberos.service.name' property. It is ignored unless one of the SASL options of the <Security Protocol> is selected. |
Kerberos Principal | | | The Kerberos principal that will be used to connect to brokers. If not set, a JAAS configuration file is expected to be set in the JVM properties defined in the bootstrap.conf file. This principal will be set into Kafka's 'sasl.jaas.config' property. |
Kerberos Keytab | | | The Kerberos keytab that will be used to connect to brokers. If not set, a JAAS configuration file is expected to be set in the JVM properties defined in the bootstrap.conf file. This keytab will be set into Kafka's 'sasl.jaas.config' property. |
SSL Context Service | | Controller Service API: SSLContextService. Implementation: StandardSSLContextService | Specifies the SSL Context Service to use for communicating with Kafka. |
Delivery Guarantee | 0 | * Best Effort * Guarantee Single Node Delivery * Guarantee Replicated Delivery | Specifies the requirement for guaranteeing that a message is sent to Kafka. Corresponds to Kafka's 'acks' property. |
Message Key Field | | | The name of a field in the Input Records that should be used as the Key for the Kafka message. Supports Expression Language: true |
Max Request Size | 1 MB | | The maximum size of a request in bytes. Corresponds to Kafka's 'max.request.size' property and defaults to 1 MB (1048576). |
Acknowledgment Wait Time | 5 secs | | After sending a message to Kafka, this indicates the amount of time that we are willing to wait for a response from Kafka. If Kafka does not acknowledge the message within this time period, the FlowFile will be routed to 'failure'. |
Max Metadata Wait Time | 5 sec | | The amount of time the publisher will wait to obtain metadata or wait for the buffer to flush during the 'send' call before failing the entire 'send' call. Corresponds to Kafka's 'max.block.ms' property. Supports Expression Language: true |
Partitioner class | org.apache.kafka.clients.producer.internals.DefaultPartitioner | * RoundRobinPartitioner * DefaultPartitioner | Specifies which class to use to compute a partition id for a message. Corresponds to Kafka's 'partitioner.class' property. |
Compression Type | none | * none * gzip * snappy * lz4 | This parameter allows you to specify the compression codec for all data generated by this producer. |
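The "Corresponds to Kafka's ... property" notes in the table can be read as a mapping from processor properties to producer settings. The following is a minimal sketch of that mapping, not the processor's actual code. The function name, the input dictionary layout, and the two numeric input keys are hypothetical. The 'bootstrap.servers' and 'compression.type' keys, and the acks values for the two "Guarantee" options, are assumptions based on standard Kafka producer configuration rather than on the table itself.

```python
# Assumed mapping from Delivery Guarantee labels to Kafka 'acks' values.
# Only the default of 0 (Best Effort) is stated in the table.
DELIVERY_GUARANTEE_TO_ACKS = {
    "Best Effort": "0",
    "Guarantee Single Node Delivery": "1",
    "Guarantee Replicated Delivery": "all",
}

DEFAULT_PARTITIONER = "org.apache.kafka.clients.producer.internals.DefaultPartitioner"


def to_producer_config(props):
    """Build a dict of Kafka producer settings from processor-style properties.

    Missing properties fall back to the defaults listed in the table.
    """
    guarantee = props.get("Delivery Guarantee", "Best Effort")
    return {
        "bootstrap.servers": props.get("Kafka Brokers", "localhost:9092"),
        "security.protocol": props.get("Security Protocol", "PLAINTEXT"),
        "acks": DELIVERY_GUARANTEE_TO_ACKS[guarantee],
        # Hypothetical input keys: sizes and times already converted to numbers.
        "max.request.size": props.get("Max Request Size (bytes)", 1048576),
        "max.block.ms": props.get("Max Metadata Wait Time (ms)", 5000),
        "partitioner.class": props.get("Partitioner class", DEFAULT_PARTITIONER),
        "compression.type": props.get("Compression Type", "none"),
    }
```

Calling to_producer_config({}) yields the table defaults, which can be a convenient starting point when comparing a processor configuration with a standalone producer.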
Dynamic Properties:
Dynamic Properties allow the user to specify both the name and value of a property.
Name | Value | Description |
The name of a Kafka configuration property. | The value of a given Kafka configuration property. | These properties will be added to the Kafka configuration after loading any provided configuration properties. If a dynamic property represents a property that was already set, its value will be ignored and a WARN message will be logged. For the list of available Kafka properties, please refer to: http://kafka.apache.org/documentation.html#configuration. |
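The precedence rule for dynamic properties can be illustrated with a short sketch. This is an assumption-laden illustration of the described behavior, not the processor's implementation: the function name and logger name are hypothetical.

```python
import logging

logger = logging.getLogger("publish_kafka_sketch")  # hypothetical logger name


def apply_dynamic_properties(configured, dynamic):
    """Add dynamic properties without overriding properties that are already set."""
    merged = dict(configured)
    for name, value in dynamic.items():
        if name in merged:
            # A property that was already set wins; the dynamic value is ignored.
            logger.warning("Dynamic property '%s' is already set; ignoring its value", name)
            continue
        merged[name] = value
    return merged
```

For example, a dynamic 'acks' property would be ignored when Delivery Guarantee has already set 'acks', while an otherwise unset property such as 'linger.ms' would be passed through.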
Relationships:
Name | Description |
success | FlowFiles for which all content was sent to Kafka. |
failure | Any FlowFile that cannot be sent to Kafka will be routed to this Relationship. |
Reads Attributes:
None specified.
Writes Attributes:
Name | Description |
msg.count | The number of messages that were sent to Kafka for this FlowFile. This attribute is added only to FlowFiles that are routed to success. |
State management:
This component does not store state.
Restricted:
This component is not restricted.
Input requirement:
This component requires an incoming relationship.
See Also:
PublishKafka_0_10, ConsumeKafka_0_10, ConsumeKafkaRecord_0_10
Summary:
This Processor puts the contents of a FlowFile to a Topic in [Apache Kafka](http://kafka.apache.org/) using the KafkaProducer API available with Kafka 0.10.x. The contents of the incoming FlowFile will be read using the configured Record Reader. Each record will then be serialized using the configured Record Writer, and this serialized form will be the content of a Kafka message. This message is optionally assigned a key by using the <Message Key Field> Property.
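The read, serialize, and send flow described in this summary can be sketched as follows. This is a simplified illustration, not the processor's code: it assumes CSV input and JSON output in place of the configured Record Reader and Record Writer, and the function names and the send callback are hypothetical stand-ins for a real Kafka producer.

```python
import csv
import io
import json


def build_messages(flowfile_content, key_field=None):
    """Return one (key, value) pair of bytes per record in the CSV content."""
    reader = csv.DictReader(io.StringIO(flowfile_content))
    messages = []
    for record in reader:
        # The key is optional: it is taken from the named field when present.
        raw_key = record.get(key_field) if key_field else None
        key = raw_key.encode("utf-8") if raw_key is not None else None
        # Each record is serialized on its own and becomes one message body.
        value = json.dumps(record, sort_keys=True).encode("utf-8")
        messages.append((key, value))
    return messages


def publish(flowfile_content, send, key_field=None):
    """Send every record through send(key, value); return (relationship, attributes)."""
    try:
        messages = build_messages(flowfile_content, key_field)
        for key, value in messages:
            send(key, value)
    except Exception:
        # Any FlowFile that cannot be sent is routed to 'failure'.
        return "failure", {}
    # msg.count is written only for FlowFiles routed to 'success'.
    return "success", {"msg.count": str(len(messages))}
```

In this sketch a FlowFile with two CSV rows produces two messages, and the resulting msg.count attribute is "2".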
Security Configuration:
The Security Protocol property allows the user to specify the protocol for communicating with the Kafka broker. The following sections describe each of the protocols in further detail.
PLAINTEXT
This option provides an unsecured connection to the broker, with no client authentication and no encryption. In order to use this option the broker must be configured with a listener of the form:
PLAINTEXT://host.name:port
SSL
This option provides an encrypted connection to the broker, with optional client authentication. In order to use this option the broker must be configured with a listener of the form:
SSL://host.name:port
In addition, the processor must have an SSL Context Service selected.
If the broker specifies ssl.client.auth=none, or does not specify ssl.client.auth, then the client will not be required to present a certificate. In this case, the SSL Context Service selected may specify only a truststore containing the public key of the certificate authority used to sign the broker’s key.
If the broker specifies ssl.client.auth=required then the client will be required to present a certificate. In this case, the SSL Context Service must also specify a keystore containing a client key, in addition to a truststore as described above.
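The two ssl.client.auth cases above can be summarized in a small helper. This is a sketch only: the function name is hypothetical, and values of ssl.client.auth other than the two cases described in the text are deliberately rejected rather than guessed at.

```python
def required_ssl_stores(ssl_client_auth=None):
    """Return the stores the SSL Context Service needs for a broker's ssl.client.auth."""
    if ssl_client_auth in (None, "none"):
        # The client only needs to trust the CA that signed the broker's key.
        return {"truststore"}
    if ssl_client_auth == "required":
        # The client must also present its own certificate.
        return {"truststore", "keystore"}
    # Other values are not covered by the text, so this sketch does not handle them.
    raise ValueError("unhandled ssl.client.auth value: %r" % (ssl_client_auth,))
```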
SASL_PLAINTEXT
This option uses SASL with a PLAINTEXT transport layer to authenticate to the broker. In order to use this option the broker must be configured with a listener of the form:
SASL_PLAINTEXT://host.name:port
In addition, the Kerberos Service Name must be specified in the processor.
SASL_PLAINTEXT - GSSAPI
If the SASL mechanism is GSSAPI, then the client must provide a JAAS configuration to authenticate. The JAAS configuration can be provided by specifying the java.security.auth.login.config system property in NiFi’s bootstrap.conf, such as:
java.arg.16=-Djava.security.auth.login.config=/path/to/kafka_client_jaas.conf
An example of the JAAS config file would be the following:
KafkaClient {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    keyTab="/path/to/nifi.keytab"
    serviceName="kafka"
    principal="nifi@YOURREALM.COM";
};
NOTE: The serviceName in the JAAS file must match the Kerberos Service Name in the processor.
Alternatively, starting with Apache NiFi 1.2.0 which uses the Kafka 0.10.2 client, the JAAS configuration when using GSSAPI can be provided by specifying the Kerberos Principal and Kerberos Keytab directly in the processor properties. This will dynamically create a JAAS configuration like above, and will take precedence over the java.security.auth.login.config system property.
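To make the dynamically created configuration more concrete, the sketch below renders a single-line value of the kind Kafka's 'sasl.jaas.config' property accepts (login module class, control flag, then options, terminated by a semicolon). The function name is hypothetical, and the exact option set NiFi generates is an assumption; the options here simply mirror the example JAAS file.

```python
def build_gssapi_jaas_config(principal, keytab, service_name):
    """Render a one-line JAAS entry from the Kerberos processor properties."""
    options = [
        ("useKeyTab", "true"),
        ("storeKey", "true"),
        ("keyTab", '"%s"' % keytab),
        ("serviceName", '"%s"' % service_name),
        ("principal", '"%s"' % principal),
    ]
    rendered = " ".join("%s=%s" % (name, value) for name, value in options)
    # Format: <login module class> <control flag> <options>;
    return "com.sun.security.auth.module.Krb5LoginModule required %s;" % rendered
```

Passing the values from the example file ("nifi@YOURREALM.COM", "/path/to/nifi.keytab", "kafka") produces the same settings as that file, collapsed onto one line and without the KafkaClient wrapper.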
SASL_PLAINTEXT - PLAIN
If the SASL mechanism is PLAIN, then the client must provide a JAAS configuration to authenticate, but the JAAS configuration must use Kafka’s PlainLoginModule. An example of the JAAS config file would be the following:
KafkaClient {
    org.apache.kafka.common.security.plain.PlainLoginModule required
    username="nifi"
    password="nifi-password";
};
NOTE: It is not recommended to use a SASL mechanism of PLAIN with SASL_PLAINTEXT, as it would transmit the username and password unencrypted.
NOTE: Using the PlainLoginModule will cause it to be registered in the JVM’s static list of Providers, making it visible to components in other NARs that may access the providers. There is currently a known issue where Kafka processors using the PlainLoginModule will cause HDFS processors with Kerberos to no longer work.
SASL_SSL
This option uses SASL with an SSL/TLS transport layer to authenticate to the broker. In order to use this option the broker must be configured with a listener of the form:
SASL_SSL://host.name:port
See the SASL_PLAINTEXT section for a description of how to provide the proper JAAS configuration depending on the SASL mechanism (GSSAPI or PLAIN).
See the SSL section for a description of how to configure the SSL Context Service based on the ssl.client.auth property.
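The per-protocol requirements in this section can be collected into a simple pre-flight check. This is a sketch under stated assumptions, not validation logic taken from the processor: the function and parameter names are hypothetical, and requiring the Kerberos Service Name for SASL_SSL as well as SASL_PLAINTEXT is an assumption drawn from the cross-reference to the SASL_PLAINTEXT section.

```python
SSL_PROTOCOLS = {"SSL", "SASL_SSL"}
SASL_PROTOCOLS = {"SASL_PLAINTEXT", "SASL_SSL"}
ALL_PROTOCOLS = {"PLAINTEXT"} | SSL_PROTOCOLS | SASL_PROTOCOLS


def check_security_settings(protocol, ssl_context_service=None,
                            kerberos_service_name=None, sasl_mechanism=None):
    """Return (errors, warnings) for a combination of security-related settings."""
    errors, warnings = [], []
    if protocol not in ALL_PROTOCOLS:
        errors.append("Unknown Security Protocol: %s" % protocol)
        return errors, warnings
    if protocol in SSL_PROTOCOLS and not ssl_context_service:
        errors.append("%s requires an SSL Context Service" % protocol)
    if protocol in SASL_PROTOCOLS and not kerberos_service_name:
        errors.append("%s requires the Kerberos Service Name" % protocol)
    if protocol == "SASL_PLAINTEXT" and sasl_mechanism == "PLAIN":
        # Credentials would cross the network unencrypted.
        warnings.append("PLAIN over SASL_PLAINTEXT sends the username and password unencrypted")
    return errors, warnings
```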