Parallelism

Kafka Connect scales by splitting a connector's work across tasks. The AMPS Kafka sink can run multiple tasks, so Kafka Connect can distribute topic partitions across them, with each task using its own AMPS client.

AMPS Sink Parallelism

The following example allows up to three sink tasks:

{
  "name": "amps-kafka-sink",
  "config": {
    "connector.class": "com.crankuptheamps.kafka.AMPSKafkaSink",
    "topics": "Orders",
    "tasks.max": "3",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "header.converter": "org.apache.kafka.connect.storage.StringConverter",
    "clientFactoryClass": "com.crankuptheamps.kafka.AMPSBasicClientFunction",
    "clientName": "KafkaSink",
    "uri": "tcp://localhost:9007/amps/json",
    "ampsTopic": "AMPSKafkaSinkTest",
    "maxBatch": "100",
    "useTopicHeader": "false"
  }
}

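The sink appends an underscore and the zero-based task index to the configured clientName, so each task connects to AMPS under a distinct client name. For example, with three tasks, KafkaSink becomes KafkaSink_0, KafkaSink_1, and KafkaSink_2.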

Topic Regular Expressions

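Instead of listing topics explicitly with topics, the sink can consume from every Kafka topic whose name matches a Java regular expression. Kafka Connect accepts either topics or topics.regex for a sink connector, not both: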

{
  "topics.regex": "Test.*",
  "useTopicHeader": "true"
}

When useTopicHeader is true, the sink publishes each Kafka record to the AMPS topic that has the same name as the record's Kafka topic. If records from topics.regex, or from several Kafka topics, should all be published into a single AMPS topic, set useTopicHeader to false.
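
As an illustration of the single-topic case, the fragment below routes every Kafka topic matching Test.* into one AMPS topic. It is a sketch rather than a tested configuration: it assumes that, with useTopicHeader set to false, records are published to the topic named by ampsTopic, as the first example on this page suggests. The remaining connector settings are omitted.

```json
{
  "topics.regex": "Test.*",
  "useTopicHeader": "false",
  "ampsTopic": "AMPSKafkaSinkTest"
}
```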

Ordering

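When tasks.max is greater than 1, Kafka Connect can assign different Kafka partitions to different sink tasks, and those tasks publish to AMPS independently of one another. Kafka orders records only within a partition, so there is no global ordering guarantee across tasks. If exact ordering is required, running more than one sink task is not recommended.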
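
Where ordering matters more than throughput, one conservative option is to limit the connector to a single task. The fragment below shows only the setting that changes. The other properties are assumed to stay as in the first example, and a single task should not be read as a guarantee of ordering across different Kafka partitions.

```json
{
  "tasks.max": "1"
}
```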