Pipeline

Table of Contents

  1. Introduction
  2. Definition
  3. Elements

Introduction

Pipelines form the heart of KSML stream processing logic. They take one or more input streams and apply processing logic to them, with the output of each operation passed on as input to the next.

Definition

Pipelines are defined in the pipelines section. Each pipeline has a name, which is the YAML key under which it is defined.

As an example, the following defines a pipeline called copy_pipeline, which consumes messages from some_input_stream and outputs the same messages to some_output_stream.

pipelines:
  copy_pipeline:
    from: some_input_stream
    to: some_output_stream

Elements

All pipelines contain three elements:

Source

The source of a pipeline is marked with the from keyword. The value of the YAML node is the name of a defined Stream, Table or GlobalTable. See Streams for more information.
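
As a minimal sketch, the stream referenced by from could be declared in a streams section like the one below. The topic name and the string key and value types are assumptions chosen for illustration and do not come from this page; see Streams for the authoritative syntax.

```yaml
# Hypothetical stream definition; the topic name and types are placeholders.
streams:
  some_input_stream:
    topic: some_input_topic
    keyType: string
    valueType: string

pipelines:
  copy_pipeline:
    from: some_input_stream   # must match a name defined under streams
    to: some_output_stream    # assumed to be defined in the same way
```

The point of the sketch is only that from refers to a definition by name rather than to a Kafka topic directly.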

Operations

Once a source has been selected, a list of operations can be applied to the input. The list is introduced by the via keyword, under which the operations are listed in the order in which they are applied.

Example:

pipelines:
  copy_pipeline:
    from: some_input_stream
    via:
    - type: peek
      forEach: my_peek_function
    - type: transformKeyValue
      mapper: my_transform_function
    to: some_output_stream

Each operation must have a type field, which indicates the type of operation applied to the input. See Operations for a full list of operations that can be applied.
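
The functions named in the example above, my_peek_function and my_transform_function, are not defined on this page. A minimal sketch of how they might be declared in a functions section follows. The function types forEach and keyValueTransformer, the expression and resultType fields, and the function bodies are all assumptions for illustration and should be checked against the KSML function documentation.

```yaml
# Hypothetical function definitions; types, fields and bodies are assumptions.
functions:
  my_peek_function:
    type: forEach
    # Print every message as it passes, without changing it.
    code: |
      print("key=" + str(key) + ", value=" + str(value))

  my_transform_function:
    type: keyValueTransformer
    # Return the key and value unchanged, as a (key, value) pair.
    expression: (key, value)
    resultType: (string, string)
```

With definitions like these, the peek operation would only observe messages, while transformKeyValue would produce the key and value that are passed on to the next step.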

Sink

After all operations are applied, a pipeline can define a sink to which all messages are sent. There are four sink types in KSML: