Set Up Postgres CDC
This guide shows how to capture changes from a PostgreSQL database and process them as a streaming pipeline in TypeStream.
Prerequisites
- TypeStream installed and running with
typestream local dev - The local stack includes a pre-configured PostgreSQL instance with logical replication enabled
How it works
TypeStream uses Debezium via Kafka Connect to capture row-level changes from PostgreSQL:
- PostgreSQL is configured with
wal_level = logicalfor change data capture - On startup, TypeStream auto-registers a Debezium connector for the local PostgreSQL instance
- Debezium streams INSERT/UPDATE/DELETE events into Kafka topics
- Topics appear in TypeStream's virtual filesystem as
/dev/kafka/local/topics/dbserver.public.<table>
Verify CDC topics
The demo data generators write to PostgreSQL automatically. After starting the local environment, check for CDC topics:
echo 'ls /dev/kafka/local/topics' | typestream
You should see topics like:
dbserver.public.orders
dbserver.public.users
dbserver.public.file_uploads
Read CDC records
CDC records arrive in Debezium's envelope format, which wraps each change in metadata (before/after values, operation type, source info). Use unwrapCdc to extract just the row data.
- CLI DSL
- Config-as-Code
- GUI
cat /dev/kafka/local/topics/dbserver.public.orders
{
"name": "orders-stream",
"version": "1",
"description": "Stream orders from Postgres CDC",
"graph": {
"nodes": [
{
"id": "source-1",
"postgresSource": {
"topicPath": "/dev/kafka/local/topics/dbserver.public.orders"
}
},
{
"id": "sink-1",
"kafkaSink": {
"topicName": "orders_clean"
}
}
],
"edges": [
{ "fromId": "source-1", "toId": "sink-1" }
]
}
}
- Drag a Postgres Source node onto the canvas
- Select the
dbserver.public.orderstopic -- CDC unwrapping is enabled automatically for Postgres sources - Add downstream nodes as needed
Join CDC streams
A common pattern is joining CDC data from related tables. For example, enriching orders with user information:
cat /dev/kafka/local/topics/dbserver.public.orders | join /dev/kafka/local/topics/dbserver.public.users > /dev/kafka/local/topics/orders_enriched
Joins are currently available in the CLI DSL only. Config-as-code support for joins is planned.
Topic naming convention
Debezium uses the format <server>.<schema>.<table> for topic names:
| PostgreSQL | Kafka Topic |
|---|---|
public.orders | dbserver.public.orders |
public.users | dbserver.public.users |
public.file_uploads | dbserver.public.file_uploads |
The dbserver prefix comes from the Debezium connector's topic.prefix configuration.
See also
- CDC and Debezium -- how CDC works under the hood
- Node Reference: StreamSource --
unwrapCdcand encoding options