Checkpointing
Flink's checkpointing provides both fault tolerance and delivery guarantees, and the AMPS Sink relies on it for both.
Delivery Guarantees
To provide at-least-once delivery, the AMPS Sink uses a Publish Store and coordinates with Flink's checkpoints to decide when to publish messages. The sink stores every message it receives during a checkpoint, and when that checkpoint completes, the connector flushes all of the stored messages to AMPS.
An AMPS Sink that flushes its stored messages on checkpoint completion can be constructed similar to the following:
AMPSSink<String> sink = AMPSSink.<String>builder()
    .setUri(uri)
    .setTopic(topic)
    .setSerializationSchema(new SimpleStringSchema())
    .setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
    .build();
Enable checkpointing whenever a delivery guarantee is set. Without checkpoints, the sink never reaches a point at which it flushes, so messages accumulate in the sink indefinitely and the JVM eventually runs out of memory.
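The buffering behavior described in this section can be pictured with a small, self-contained model. This is only a sketch under stated assumptions, not the connector's implementation: the class CheckpointBufferedSinkSketch and its methods write, snapshot, notifyCheckpointComplete, and storedCount are hypothetical names chosen for illustration, and the published list stands in for the AMPS server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model (hypothetical, not connector code) of a sink that publishes only on checkpoint completion. */
final class CheckpointBufferedSinkSketch {
    // Messages received since the most recent snapshot.
    private List<String> current = new ArrayList<>();
    // Messages sealed under the checkpoint that covers them, awaiting completion.
    private final TreeMap<Long, List<String>> sealed = new TreeMap<>();
    // Stand-in for the AMPS server: everything flushed so far.
    private final List<String> published = new ArrayList<>();

    /** A record arrives from Flink: store it, do not publish yet. */
    void write(String message) {
        current.add(message);
    }

    /** A checkpoint starts: seal the stored messages under its ID. */
    void snapshot(long checkpointId) {
        sealed.put(checkpointId, current);
        current = new ArrayList<>();
    }

    /** A checkpoint completes: flush every batch sealed at or before that ID. */
    void notifyCheckpointComplete(long checkpointId) {
        Map<Long, List<String>> done = sealed.headMap(checkpointId, true);
        for (List<String> batch : done.values()) {
            published.addAll(batch);
        }
        done.clear(); // clearing the view also removes the batches from 'sealed'
    }

    /** Number of messages still held in memory by the sink. */
    int storedCount() {
        int count = current.size();
        for (List<String> batch : sealed.values()) {
            count += batch.size();
        }
        return count;
    }

    List<String> published() {
        return published;
    }

    public static void main(String[] args) {
        CheckpointBufferedSinkSketch sink = new CheckpointBufferedSinkSketch();
        sink.write("a");
        sink.write("b");
        sink.snapshot(1L);
        sink.write("c"); // arrives after the snapshot, so it waits for the next checkpoint
        sink.notifyCheckpointComplete(1L);
        System.out.println(sink.published());   // [a, b]
        System.out.println(sink.storedCount()); // 1
    }
}
```

In this model, if snapshot and notifyCheckpointComplete are never called, storedCount only grows, which mirrors why a job without checkpointing would eventually exhaust memory.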
Publish Store
Alternatively, a Publish Store can be supplied to the AMPS Sink directly to enable at-least-once delivery. In this case, the sink publishes each message as soon as it receives it from Flink, so checkpointing is not required. However, the sink does not coordinate with Flink's checkpoints, which can result in some messages being replayed after a failure.
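One way to read the replay caveat is that records processed after the last completed checkpoint are delivered to the sink again when the job restarts, and a sink that publishes immediately sends them a second time. The following sketch models that reading with plain Java collections. The class ImmediatePublishReplaySketch, the run method, and the offsets are hypothetical and exist only for illustration; no AMPS or Flink API is used.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (hypothetical) of a publish-on-receipt sink across a job restart. */
final class ImmediatePublishReplaySketch {
    /** Delivers input[from, to) to the sink, which publishes each record at once. */
    static void run(List<String> input, int from, int to, List<String> published) {
        for (int i = from; i < to; i++) {
            published.add(input.get(i));
        }
    }

    public static void main(String[] args) {
        List<String> input = List.of("m0", "m1", "m2", "m3");
        List<String> published = new ArrayList<>(); // stand-in for the AMPS server

        // The last completed checkpoint covers m0 and m1.
        int lastCheckpointOffset = 2;

        // First attempt: m0..m2 are published, then the job fails.
        run(input, 0, 3, published);

        // Restart: the job resumes from the checkpoint, so m2 is delivered again.
        run(input, lastCheckpointOffset, input.size(), published);

        System.out.println(published); // [m0, m1, m2, m2, m3]
    }
}
```

Every message arrives at least once, but m2 arrives twice, so downstream consumers of the topic should tolerate duplicates when this mode is used.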
An AMPS Sink that only uses a Publish Store can be constructed similar to the following:
// publishStoreFunction is a placeholder for your own SerializableFunction<String, Store> implementation.
AMPSSink<String> sink = AMPSSink.<String>builder()
    .setUri(uri)
    .setTopic(topic)
    .setSerializationSchema(new SimpleStringSchema())
    .setPublishStoreFunction(publishStoreFunction)
    .build();
This requires an implementation of SerializableFunction from the AMPS Java client API with the following type parameters:
- String argument: the client name the sink will use
- Store return type: the Publish Store the sink will publish through
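The shape of such a function can be sketched without the AMPS client on the classpath. Everything below is an assumption made for illustration: the nested SerializableFunction and Store interfaces are simplified stand-ins for the real client types (whose actual methods may differ), and InMemoryStore, storeFactory, and roundTrip are hypothetical names. The round trip through Java serialization illustrates why a serializable function type is presumably required: Flink serializes the sink's configuration when it distributes the job.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

final class PublishStoreFunctionSketch {
    /** Stand-in for the client API's SerializableFunction; the real interface may differ. */
    interface SerializableFunction<T, R> extends Function<T, R>, Serializable {}

    /** Stand-in for the AMPS client's Store type; the real interface has its own methods. */
    interface Store {
        String owner();
    }

    /** Toy store that only remembers which client it was created for. */
    static final class InMemoryStore implements Store {
        private final String owner;

        InMemoryStore(String owner) {
            this.owner = owner;
        }

        @Override
        public String owner() {
            return owner;
        }
    }

    /** The function handed to the sink: client name in, store out. */
    static SerializableFunction<String, Store> storeFactory() {
        return clientName -> new InMemoryStore(clientName);
    }

    /** Serializes and deserializes a value, as a framework would before shipping it to a worker. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        SerializableFunction<String, Store> restored = roundTrip(storeFactory());
        // The store is created lazily, on the worker, with the sink's client name.
        System.out.println(restored.apply("flink-sink-1").owner()); // flink-sink-1
    }
}
```

Passing a factory rather than a store instance means the store itself never has to be serializable; only the small function that creates it does.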