Node Reference

TypeStream pipelines are built from nodes -- composable units that source, transform, enrich, aggregate, or sink data. This page documents 17 node types, organized by category.

Every node implements inferOutputSchema(), which lets TypeStream propagate schema information through the pipeline at compile time -- before any Kafka Streams resources are allocated. This is how the GUI shows field names on every edge and validates configurations before execution.
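To make the idea of compile-time schema propagation concrete, here is a minimal Python sketch. It is not TypeStream's implementation: the `PassThrough` and `AddField` classes, the `propagate` helper, and the dictionary-based schema model are all assumptions made for illustration. Only the idea that each node derives its output schema from its input schema comes from the text above.

```python
from dataclasses import dataclass
from typing import Dict, List

Schema = Dict[str, str]  # field name -> type name (simplified model)


@dataclass
class PassThrough:
    """Stands in for nodes such as Filter that leave the schema unchanged."""
    name: str

    def infer_output_schema(self, schema: Schema) -> Schema:
        return dict(schema)


@dataclass
class AddField:
    """Stands in for enrichment nodes that append one field."""
    input_field: str
    output_field: str
    output_type: str = "string"

    def infer_output_schema(self, schema: Schema) -> Schema:
        # Fail early, before any record is processed.
        if self.input_field not in schema:
            raise ValueError(f"unknown input field: {self.input_field}")
        return {**schema, self.output_field: self.output_type}


def propagate(source_schema: Schema, nodes: list) -> List[Schema]:
    """Return the schema on every edge, starting with the source edge."""
    edges = [source_schema]
    for node in nodes:
        edges.append(node.infer_output_schema(edges[-1]))
    return edges


edges = propagate(
    {"ip": "string", "country": "string"},
    [PassThrough("filter"), AddField("ip", "geo")],
)
```

Because `propagate` only touches schemas, a misconfigured node (for example, an `AddField` whose `input_field` is missing) raises an error without any stream being started.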

Source Nodes

Source nodes read data from Kafka topics and produce a typed data stream.

StreamSource

Reads from a Kafka topic and resolves the schema from Schema Registry.

| Field | Type | Description |
|---|---|---|
| dataStream | path | Kafka topic path (e.g. /dev/kafka/local/topics/web_visits) |
| encoding | AVRO \| JSON | Record encoding format |
| unwrapCdc | boolean | If true, extracts the after payload from a Debezium CDC envelope |

Schema behavior: Looks up the topic schema from Schema Registry via the catalog. When unwrapCdc=true, the schema is unwrapped to the inner after struct -- so downstream nodes see the table columns directly, not the CDC envelope.

Supported in: CLI DSL (cat/grep), config-as-code, GUI (Kafka Source / Postgres Source)
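The effect of unwrapCdc can be sketched in a few lines of Python. This is an illustration, not TypeStream code: `unwrap_cdc` is a hypothetical helper and the sample values are made up. The envelope field names (`before`, `after`, `op`, `ts_ms`) follow Debezium's change event format.

```python
from typing import Optional


def unwrap_cdc(envelope: dict) -> Optional[dict]:
    """Return the row state after the change, or None if there is none."""
    return envelope.get("after")


# A simplified Debezium-style change event for an insert ("c" = create).
event = {
    "before": None,
    "after": {"id": 1, "url": "/home"},
    "op": "c",
    "ts_ms": 1700000000000,
}

row = unwrap_cdc(event)
```

Downstream nodes then work with `row` (the table columns) rather than with the envelope.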

ShellSource

A virtual source used internally by the interactive shell. Not available in config-as-code or the GUI.

| Field | Type | Description |
|---|---|---|
| data | list of DataStreams | Data streams bound by the shell session |

Schema behavior: Uses the first data stream's schema. Always JSON encoding.

Supported in: CLI interactive shell only

Transform Nodes

Transform nodes modify, filter, or reshape records flowing through the pipeline.

Filter

Selects records that match a predicate expression.

| Field | Type | Description |
|---|---|---|
| predicate | expression | A predicate expression (e.g. .country == "US") |
| byKey | boolean | If true, filter matches against the record key instead of the value |

Schema behavior: Pass-through (schema unchanged).

DSL equivalent: grep [pattern] or grep [.field == "value"]

Supported in: CLI DSL, config-as-code, GUI
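The difference between filtering on the value and filtering on the key (byKey) can be shown with a small Python sketch. It models records as `(key, value)` tuples and uses a plain Python callable in place of TypeStream's predicate expression. Both choices, and the `filter_records` name, are assumptions for illustration.

```python
def filter_records(records, predicate, by_key=False):
    """Yield the records whose key (by_key=True) or value satisfies predicate."""
    for key, value in records:
        target = key if by_key else value
        if predicate(target):
            yield key, value


records = [
    ("user-1", {"country": "US"}),
    ("user-2", {"country": "DE"}),
    ("admin-1", {"country": "US"}),
]

# Equivalent in spirit to the predicate .country == "US"
us_only = list(filter_records(records, lambda v: v["country"] == "US"))

# Same records, but matched against the key instead of the value
admins = list(filter_records(records, lambda k: k.startswith("admin"), by_key=True))
```

In both cases the surviving records are emitted unchanged, which is why the schema passes through.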

Map

Transforms each record by applying a mapper expression.

| Field | Type | Description |
|---|---|---|
| mapper | expression | A transformation expression applied to each record |

Schema behavior: Pass-through. Field-level schema tracking is not yet implemented (TODO), so the inferred output schema may not match the fields the mapper actually emits.

DSL equivalent: cut .field1 .field2

Supported in: CLI DSL, config-as-code

Group

Re-keys records by a field expression, preparing them for downstream aggregation.

| Field | Type | Description |
|---|---|---|
| keyMapperExpr | expression | Expression to extract the new key (e.g. .country) |

Schema behavior: Pass-through (re-keys records but doesn't change the value schema).

Supported in: Config-as-code, GUI (as part of Materialized View)

Join

Joins two data streams by key.

| Field | Type | Description |
|---|---|---|
| with | path | The second stream to join with |
| joinType | byKey | Join strategy (currently key-based only) |

Schema behavior: Merges left and right schemas into a combined struct. The output contains all fields from both streams.

DSL equivalent: cat /dev/kafka/local/topics/topic1 | join /dev/kafka/local/topics/topic2

Supported in: CLI DSL, config-as-code
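The schema merge and the key-based match can be modeled roughly as follows. This Python sketch makes several assumptions that the text above does not confirm: it joins two static snapshots rather than live streams, it behaves like an inner join, and on a field name collision the right side wins. `merge_schemas` and `join_by_key` are hypothetical helper names.

```python
def merge_schemas(left: dict, right: dict) -> dict:
    """Combine two field->type mappings (assumption: right wins on collisions)."""
    return {**left, **right}


def join_by_key(left: dict, right: dict) -> dict:
    """Inner-join two key->value mappings, merging the matching values."""
    return {k: {**left[k], **right[k]} for k in left.keys() & right.keys()}


visits = {"u1": {"url": "/home"}, "u2": {"url": "/pricing"}}
users = {"u1": {"name": "Ada"}, "u3": {"name": "Lin"}}

joined = join_by_key(visits, users)
schema = merge_schemas({"url": "string"}, {"name": "string"})
```

Only `u1` appears in both inputs, so it is the only key in `joined`, and its value carries the fields from both sides.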

Each

Executes a side-effect expression for each record. Does not modify the stream.

| Field | Type | Description |
|---|---|---|
| fn | block expression | Code to execute per record |

Schema behavior: Pass-through (side-effect only).

DSL equivalent: each { record -> ... }

Supported in: CLI DSL

NoOp

A structural placeholder node. Used internally as the root node of a pipeline graph.

Schema behavior: Pass-through.

Aggregation Nodes

Aggregation nodes reduce a stream into a stateful result. They typically follow a Group node.

Count

Counts records per key into a KTable.

Schema behavior: Pass-through (the count result is stored as the value).

DSL equivalent: wc

Supported in: CLI DSL, config-as-code, GUI (as part of Materialized View)
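A Group followed by a Count amounts to re-keying each record and keeping a running tally per key. The Python sketch below illustrates that combination on an in-memory list. The `group_and_count` helper is hypothetical, and a plain dictionary stands in for the KTable.

```python
from collections import Counter


def group_and_count(records, key_field):
    """Re-key each record by value[key_field], then count records per new key."""
    counts = Counter()
    for _, value in records:
        counts[value[key_field]] += 1
    return dict(counts)


records = [
    ("visit-1", {"country": "US"}),
    ("visit-2", {"country": "DE"}),
    ("visit-3", {"country": "US"}),
]

per_country = group_and_count(records, "country")
```

In a running pipeline the counts would keep updating as records arrive. Here the input is finite, so the result is a single snapshot.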

WindowedCount

Counts records per key within a tumbling time window.

| Field | Type | Description |
|---|---|---|
| windowSizeSeconds | integer | Size of the tumbling window in seconds |

Schema behavior: Pass-through.

Supported in: Config-as-code, GUI (as part of Materialized View)
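Tumbling windows are fixed-size and non-overlapping, so each record falls into exactly one window. The following Python sketch shows one way to assign records to windows and count them. It assumes timestamps in whole seconds and windows aligned to multiples of the window size. The `windowed_count` function is illustrative, not part of TypeStream.

```python
from collections import Counter


def windowed_count(events, window_size_seconds):
    """Count events per (key, window_start); events are (key, timestamp_seconds)."""
    counts = Counter()
    for key, ts in events:
        # Round the timestamp down to the start of its window.
        window_start = ts - (ts % window_size_seconds)
        counts[(key, window_start)] += 1
    return dict(counts)


events = [("US", 0), ("US", 59), ("US", 60), ("DE", 61)]
counts = windowed_count(events, window_size_seconds=60)
```

With a 60-second window, the events at 0 and 59 share the window starting at 0, while the events at 60 and 61 fall into the window starting at 60.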

ReduceLatest

Keeps only the latest value per key. Useful for building lookup tables or materialized views from changelog streams.

Schema behavior: Pass-through.

Supported in: Config-as-code, GUI (as part of Materialized View)
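Conceptually, keeping the latest value per key is an upsert into a table. The short Python sketch below illustrates that behavior on an in-memory list. `reduce_latest` is a hypothetical helper, and arrival order is assumed to define which value is "latest".

```python
def reduce_latest(records):
    """Return a key->value table where later records overwrite earlier ones."""
    table = {}
    for key, value in records:
        table[key] = value
    return table


changelog = [
    ("u1", {"plan": "free"}),
    ("u2", {"plan": "pro"}),
    ("u1", {"plan": "pro"}),
]

latest = reduce_latest(changelog)
```

The earlier `free` entry for `u1` is replaced, which is the behavior a lookup table built from a changelog needs.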

Enrichment Nodes

Enrichment nodes add new fields to each record by calling external services or applying models.

GeoIp

Resolves an IP address field to a geographic location using the bundled MaxMind GeoLite2 database.

| Field | Type | Description |
|---|---|---|
| ipField | string | Name of the field containing the IP address |
| outputField | string | Name of the new field to add with the geo result |

Schema behavior: Adds outputField (type: string) to the schema. Validates that ipField exists in the input schema at compile time.

Supported in: CLI DSL (enrich), config-as-code, GUI
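At the record level, an enrichment node like this one copies the record and adds one field. The Python sketch below shows that shape. The `enrich_with_geo` helper, the `fake_db` lookup table, and the choice of a country code as the result string are all assumptions. The text above only states that the output is a string derived from the IP address.

```python
def enrich_with_geo(record, ip_field, output_field, lookup):
    """Return a copy of record with output_field set to lookup(record[ip_field])."""
    return {**record, output_field: lookup(record[ip_field])}


# Hypothetical stand-in for the bundled GeoLite2 database.
fake_db = {"203.0.113.7": "AU"}


def lookup(ip):
    return fake_db.get(ip, "unknown")


enriched = enrich_with_geo(
    {"ip": "203.0.113.7", "path": "/home"},
    ip_field="ip",
    output_field="geo",
    lookup=lookup,
)
```

The original fields are preserved and the new field is appended, matching the schema behavior described for this node.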

TextExtractor

Extracts text content from a file using Apache Tika.

| Field | Type | Description |
|---|---|---|
| filePathField | string | Name of the field containing the file path or URL |
| outputField | string | Name of the new field to add with the extracted text |

Schema behavior: Adds outputField (type: string) to the schema. Validates that filePathField exists.

Supported in: Config-as-code, GUI

EmbeddingGenerator

Generates vector embeddings from a text field using an embedding model.

| Field | Type | Description |
|---|---|---|
| textField | string | Name of the field containing the source text |
| outputField | string | Name of the new field for the embedding vector |
| model | string | Embedding model name |

Schema behavior: Adds outputField (type: list of floats) to the schema. Validates that textField exists.

Supported in: Config-as-code, GUI

OpenAiTransformer

Applies an OpenAI LLM prompt to each record.

| Field | Type | Description |
|---|---|---|
| prompt | string | The prompt template (can reference record fields) |
| outputField | string | Name of the new field for the LLM response |
| model | string | OpenAI model name |

Schema behavior: Adds outputField (type: string) to the schema.

Supported in: Config-as-code, GUI

Sink / Inspection Nodes

Sink nodes write pipeline output to a destination. Inspector nodes tap the stream without altering data flow.

Sink

Writes records to a Kafka topic. Can also trigger Kafka Connect connectors for database, Weaviate, or Elasticsearch sinks.

| Field | Type | Description |
|---|---|---|
| output | path | Destination topic path |
| encoding | AVRO \| JSON | Output encoding (defaults to source encoding) |

Schema behavior: Copies input schema to the output topic.

Supported in: CLI DSL (>), config-as-code, GUI (Kafka Sink, DB Sink, Weaviate Sink, Elasticsearch Sink)

Inspector

Taps the data stream for live preview and debugging. Does not modify or persist data.

| Field | Type | Description |
|---|---|---|
| label | string | A label for the inspection point |

Schema behavior: Pass-through.

Supported in: GUI