Kafka

Kafka stream data sink

Properties

Properties supported in this sink are shown below ( * indicates required fields )
Property
Description
Name *
Name of the data sink
Description
Description of the data sink
Processing Mode
Select for batch and un-select for streaming. If 'Batch' is selected the value of the switch is set to true. If 'Streaming' is selected the value of the switch is set to false.Default: false
Connection *
Pre-defined Kafka connection
Topic Name / Pattern to Write*
The topic name / pattern to write toExample: employeeData
Include Kafka Headers
Include the Kafka headers in the rowDefault: false
Buffer Memmory
The total bytes of memory the producer can use to buffer records waiting to be sent to the server. If records are sent faster than they can be delivered to the server the producer will block for
after which it will throw an exception.Default: 33,554,432
Retries
Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. Note that this retry is no different than if the client resent the record upon receiving the errorDefault: 2,147,483,647
Batch Size
The producer will attempt to batch records together into fewer requests whenever multiple records are being sent to the same partition. This helps performance on both the client and the server. This configuration controls the default batch size in bytes.Default: 16,384
Maximum idle time for connection
Close idle connections after the number of milliseconds specified by this config.Default: 540,000
Delivery Timeout
An upper bound on the time to report success or failure after a call to send() returns. This limits the total time that a record will be delayed prior to sending, the time to await acknowledgement from the broker (if expected), and the time allowed for retriable send failures.Default: 120,000
Linger Time
The producer groups together any records that arrive in between request transmissions into a single batched request. Normally this occurs only under load when records arrive faster than they can be sent out. However in some circumstances the client may want to reduce the number of requests even under moderate load. This setting accomplishes this by adding a small amount of artificial delay—that is, rather than immediately sending out a record, the producer will wait for up to the given delay to allow other records to be sent so that the sends can be batched together.Default: 0
Maximum block size
The configuration controls how long the KafkaProducer's send(), partitionsFor(), initTransactions(), sendOffsetsToTransaction(), commitTransaction() and abortTransaction() methods will block. For send() this timeout bounds the total time waiting for both metadata fetch and buffer allocation (blocking in the user-supplied serializers or partitioner is not counted against this timeout).Default: 60,000
Maximum request size
The maximum size of a request in bytes. This setting will limit the number of record batches the producer will send in a single request to avoid sending huge requests.Default: 1,048,576
Request timeout
The configuration controls the maximum amount of time the client will wait for the response of a request. If the response is not received before the timeout elapses the client will resend the request if necessary or fail the request if retries are exhausted. This should be larger than
(a broker configuration) to reduce the possibility of message duplication due to unnecessary producer retries.Default: 30,000
Send buffer size
The size of the TCP send buffer (SO_SNDBUF) to use when sending data. If the value is -1, the OS default will be used.Default: 131,072
Acknowledgment
The number of acknowledgments the producer requires the leader to have received before considering a request complete.Default: all
Enable Idempotence
When set to 'true', the producer will ensure that exactly one copy of each message is written in the stream. If 'false', producer retries due to broker failures, etc., may write duplicates of the retried message in the stream.Default: true
Maximum Inflight request per connection
The maximum number of unacknowledged requests the client will send on a single connection before blocking. Note that if this config is set to be greater than 1 and enable.idempotence is set to false, there is a risk of message re-ordering after a failed send due to retries (i.e., if retries are enabled).Default: 5
Checkpoint Location
Path to checkpoint directory and fileExample: s3a://output_bucket/some_folder
Watermark Field Name
Field name to be used as watermark. If unspecified in streaming mode, the default field name is 'tempWatermark'.Example: myConsumerWatermarkDefault: tempWatermark
Watermark Value
Watermark value settingExample: 10 seconds,2 minutes
Partition By
Comma separated column names to partition byExample: year, month, day