Streams
Table of Contents
Introduction
Every KSML definition file contains a list of declared streams. There are three types of streams supported:
Type | Kafka Streams equivalent | Description |
---|---|---|
Stream | KStream |
KStream is an abstraction of a record stream of KeyValue pairs, i.e., each record is an independent entity/event in the real world. For example a user X might buy two items I1 and I2, and thus there might be two records A KStream is either defined from one or multiple Kafka topics that are consumed message by message or the result of a KStream transformation. A KTable can also be converted into a KStream. A KStream can be transformed record by record, joined with another KStream, KTable, GlobalKTable, or can be aggregated into a KTable. |
Table | KTable |
KTable is an abstraction of a changelog stream from a primary-keyed table. Each record in this changelog stream is an update on the primary-keyed table with the record key as the primary key. A KTable is either defined from a single Kafka topic that is consumed message by message or the result of a KTable transformation. An aggregation of a KStream also yields a KTable. A KTable can be transformed record by record, joined with another KTable or KStream, or can be re-partitioned and aggregated into a new KTable. |
GlobalTable | GlobalKTable |
GlobalKTable is an abstraction of a changelog stream from a primary-keyed table. Each record in this changelog stream is an update on the primary-keyed table with the record key as the primary key. GlobalKTable can only be used as right-hand side input for stream-table joins. In contrast to a KTable that is partitioned over all KafkaStreams instances, a GlobalKTable is fully replicated per KafkaStreams instance. Every partition of the underlying topic is consumed by each GlobalKTable, such that the full set of data is available in every KafkaStreams instance. This provides the ability to perform joins with KStream without having to repartition the input stream. All joins with the GlobalKTable require that a KeyValueMapper is provided that can map from the KeyValue of the left hand side KStream to the key of the right hand side GlobalKTable. |
The definitions of these stream types are done as described below.
Stream
Example:
streams:
my_stream_reference:
topic: some_kafka_topic
keyType: string
valueType: string
offsetResetPolicy: earliest
timestampExtractor: my_timestamp_extractor
Table
Example:
tables:
my_table_reference:
topic: some_kafka_topic
keyType: string
valueType: string
store: <keyValue state store reference or inline definition>
GlobalTable
Example:
globalTables:
my_global_table_reference:
topic: some_kafka_topic
keyType: string
valueType: string