TypeStream gRPC API Reference

Overview

TypeStream exposes its functionality through six gRPC services.

| Service | Description |
| --- | --- |
| ConnectionService | ConnectionService manages external data source connections for databases, Weaviate, and Elasticsearch. Register connections for ongoing health monitoring, test connectivity on demand, and create Kafka Connect sink connectors. Credentials are stored server-side and never exposed to clients. |
| FileSystemService | FileSystemService provides a virtual filesystem abstraction over Kafka topics and connected data sources. Use it to mount external data sources, browse available topics, and retrieve topic schemas. |
| InteractiveSessionService | InteractiveSessionService provides a REPL-like interface for executing TypeStream DSL programs. Start a session, run programs, stream their output, get tab completions, and stop sessions when done. |
| JobService | JobService manages streaming jobs, the core execution units of TypeStream. Create jobs from DSL source code or visual pipeline graphs, list running jobs, run preview jobs for live data inspection, infer schemas across a pipeline, and list available OpenAI models for AI-powered transforms. |
| PipelineService | PipelineService manages declarative pipeline definitions using a GitOps-style workflow. Validate pipeline graphs before deploying, apply them to create or update running jobs, list active pipelines, delete them, and plan changes (dry-run diff) before applying. |
| StateQueryService | StateQueryService provides interactive query capabilities for KTable state stores from running Kafka Streams jobs. State stores are only queryable when the underlying Kafka Streams application is in RUNNING state. Key Serialization: Keys returned by GetAllValues are JSON-serialized representations of the key schema. For example: a string key "hello" becomes the JSON string "\"hello\"", a struct becomes {"field": "value"}. Limitation: GetValue currently only supports simple string keys. For complex keys (structs, etc.), use GetAllValues and filter on the client side. |

Services

ConnectionService

ConnectionService manages external data source connections for databases, Weaviate, and Elasticsearch. Register connections for ongoing health monitoring, test connectivity on demand, and create Kafka Connect sink connectors. Credentials are stored server-side and never exposed to clients.

Methods

| Method | Request | Response | Description |
| --- | --- | --- | --- |
| RegisterConnection | RegisterConnectionRequest | RegisterConnectionResponse | Register a database connection for ongoing health monitoring. |
| UnregisterConnection | UnregisterConnectionRequest | UnregisterConnectionResponse | Unregister a database connection and stop monitoring it. |
| GetConnectionStatuses | GetConnectionStatusesRequest | GetConnectionStatusesResponse | Get the current health status of all registered database connections. |
| TestConnection | TestConnectionRequest | TestConnectionResponse | Test a database connection immediately without registering it. |
| CreateJdbcSinkConnector | CreateJdbcSinkConnectorRequest | CreateJdbcSinkConnectorResponse | Create a JDBC sink connector using a registered database connection. The server resolves credentials from the connection ID; credentials are never sent from the client. |
| RegisterWeaviateConnection | RegisterWeaviateConnectionRequest | RegisterWeaviateConnectionResponse | Register a Weaviate vector database connection for health monitoring. |
| GetWeaviateConnectionStatuses | GetWeaviateConnectionStatusesRequest | GetWeaviateConnectionStatusesResponse | Get the current health status of all registered Weaviate connections. |
| CreateWeaviateSinkConnector | CreateWeaviateSinkConnectorRequest | CreateWeaviateSinkConnectorResponse | Create a Weaviate sink connector using a registered connection. |
| RegisterElasticsearchConnection | RegisterElasticsearchConnectionRequest | RegisterElasticsearchConnectionResponse | Register an Elasticsearch connection for health monitoring. |
| GetElasticsearchConnectionStatuses | GetElasticsearchConnectionStatusesRequest | GetElasticsearchConnectionStatusesResponse | Get the current health status of all registered Elasticsearch connections. |
| CreateElasticsearchSinkConnector | CreateElasticsearchSinkConnectorRequest | CreateElasticsearchSinkConnectorResponse | Create an Elasticsearch sink connector using a registered connection. |

FileSystemService

FileSystemService provides a virtual filesystem abstraction over Kafka topics and connected data sources. Use it to mount external data sources, browse available topics, and retrieve topic schemas.

Methods

| Method | Request | Response | Description |
| --- | --- | --- | --- |
| Mount | MountRequest | MountResponse | Mount a data source (e.g., a Kafka cluster or database) into the virtual filesystem. Once mounted, its topics become browsable via Ls and their schemas via GetSchema. |
| Unmount | UnmountRequest | UnmountResponse | Remove a previously mounted data source from the virtual filesystem. |
| Ls | LsRequest | LsResponse | List topics and directories at the given path in the virtual filesystem. |
| GetSchema | GetSchemaRequest | GetSchemaResponse | Retrieve the schema (field names) for a topic at the given path. |

InteractiveSessionService

InteractiveSessionService provides a REPL-like interface for executing TypeStream DSL programs. Start a session, run programs, stream their output, get tab completions, and stop sessions when done.

Methods

| Method | Request | Response | Description |
| --- | --- | --- | --- |
| StartSession | StartSessionRequest | StartSessionResponse | Start a new interactive session for the given user. Returns a session ID used to identify the session in subsequent calls. |
| RunProgram | RunProgramRequest | RunProgramResponse | Execute a TypeStream DSL program within an existing session. Returns the initial output; use GetProgramOutput to stream additional results. |
| GetProgramOutput | GetProgramOutputRequest | GetProgramOutputResponse stream | Stream additional output from a running program. Use this after RunProgram when hasMoreOutput is true. |
| CompleteProgram | CompleteProgramRequest | CompleteProgramResponse | Get tab-completion suggestions for a partial program at the given cursor position. |
| StopSession | StopSessionRequest | StopSessionResponse | Stop an interactive session and release its resources. |
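
The session lifecycle described above (start, run, stream further output while hasMoreOutput is true, stop) could be driven roughly as follows. This is only a sketch under assumptions: `stub` is taken to be a Python client stub for InteractiveSessionService and `pb` the module holding its generated message classes; both names, the `run_in_session` helper, and the `max_chunks` cap are hypothetical. The in-memory stand-ins at the bottom exist only so the sketch runs without a server.

```python
from itertools import islice
from types import SimpleNamespace
from typing import Any, List


def run_in_session(stub: Any, pb: Any, user_id: str, source: str,
                   max_chunks: int = 100) -> List[str]:
    """Run one program in a fresh session and return its stdout chunks."""
    session = stub.StartSession(pb.StartSessionRequest(user_id=user_id))
    chunks: List[str] = []
    try:
        result = stub.RunProgram(
            pb.RunProgramRequest(session_id=session.session_id, source=source)
        )
        chunks.append(result.stdOut)
        if result.hasMoreOutput:
            request = pb.GetProgramOutputRequest(
                session_id=session.session_id, id=result.id
            )
            # Cap the read (assumption): a streaming program may not end on its own.
            for chunk in islice(stub.GetProgramOutput(request), max_chunks):
                chunks.append(chunk.stdOut)
    finally:
        # Always release the session, even if a call above raised.
        stub.StopSession(pb.StopSessionRequest(session_id=session.session_id))
    return chunks


# In-memory stand-ins (not the real generated code) so the sketch is runnable.
fake_pb = SimpleNamespace(
    StartSessionRequest=SimpleNamespace,
    RunProgramRequest=SimpleNamespace,
    GetProgramOutputRequest=SimpleNamespace,
    StopSessionRequest=SimpleNamespace,
)


class FakeStub:
    def __init__(self) -> None:
        self.calls: List[str] = []

    def StartSession(self, request: Any) -> Any:
        self.calls.append("StartSession")
        return SimpleNamespace(session_id="session-1")

    def RunProgram(self, request: Any) -> Any:
        self.calls.append("RunProgram")
        return SimpleNamespace(id="program-1", stdOut="first", stdErr="",
                               hasMoreOutput=True)

    def GetProgramOutput(self, request: Any) -> Any:
        self.calls.append("GetProgramOutput")
        return iter([SimpleNamespace(stdOut="second", stdErr="")])

    def StopSession(self, request: Any) -> Any:
        self.calls.append("StopSession")
        return SimpleNamespace(stdOut="", stdErr="")


fake_stub = FakeStub()
output = run_in_session(fake_stub, fake_pb, "user-1", "<program source>")
```

Placing StopSession in a `finally` block is a design choice of this sketch, so that a failed RunProgram call does not leave a session holding resources.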

JobService

JobService manages streaming jobs — the core execution units of TypeStream. Create jobs from DSL source code or visual pipeline graphs, list running jobs, run preview jobs for live data inspection, infer schemas across a pipeline, and list available OpenAI models for AI-powered transforms.

Methods

| Method | Request | Response | Description |
| --- | --- | --- | --- |
| CreateJob | CreateJobRequest | CreateJobResponse | Create a streaming job from TypeStream DSL source code. |
| CreateJobFromGraph | CreateJobFromGraphRequest | CreateJobResponse | Create a streaming job from a visual pipeline graph with optional sink configurations. |
| ListJobs | ListJobsRequest | ListJobsResponse | List all running and recently stopped jobs. |
| CreatePreviewJob | CreatePreviewJobRequest | CreatePreviewJobResponse | Create a short-lived preview job for inspecting live data flowing through a pipeline. |
| StopPreviewJob | StopPreviewJobRequest | StopPreviewJobResponse | Stop a running preview job. |
| StreamPreview | StreamPreviewRequest | StreamPreviewResponse stream | Stream live preview data from a running preview job. |
| InferGraphSchemas | InferGraphSchemasRequest | InferGraphSchemasResponse | Infer the output schema for each node in a pipeline graph. Useful for showing field types in the UI before running a job. |
| ListOpenAIModels | ListOpenAIModelsRequest | ListOpenAIModelsResponse | List available OpenAI models for use with AI-powered transform nodes. |

PipelineService

PipelineService manages declarative pipeline definitions using a GitOps-style workflow. Validate pipeline graphs before deploying, apply them to create or update running jobs, list active pipelines, delete them, and plan changes (dry-run diff) before applying.

Methods

| Method | Request | Response | Description |
| --- | --- | --- | --- |
| ValidatePipeline | ValidatePipelineRequest | ValidatePipelineResponse | Validate a pipeline definition without deploying it. Returns validation errors and warnings. |
| ApplyPipeline | ApplyPipelineRequest | ApplyPipelineResponse | Apply a pipeline definition: creates a new job or updates an existing one. Returns the job ID and whether the pipeline was created, updated, or unchanged. |
| ListPipelines | ListPipelinesRequest | ListPipelinesResponse | List all currently registered pipelines and their job status. |
| DeletePipeline | DeletePipelineRequest | DeletePipelineResponse | Delete a pipeline by name, stopping its underlying job. |
| PlanPipelines | PlanPipelinesRequest | PlanPipelinesResponse | Dry-run a set of pipeline definitions against the current state. Returns a plan showing which pipelines would be created, updated, deleted, or unchanged. |

StateQueryService

StateQueryService provides interactive query capabilities for KTable state stores from running Kafka Streams jobs. State stores are only queryable when the underlying Kafka Streams application is in RUNNING state.

Key Serialization: Keys returned by GetAllValues are JSON-serialized representations of the key schema. For example: a string key "hello" becomes the JSON string "\"hello\"", a struct becomes {"field": "value"}.

Limitation: GetValue currently only supports simple string keys. For complex keys (structs, etc.), use GetAllValues and filter on the client side.
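
Because keys arrive JSON-serialized and GetValue only handles simple string keys, a client working with struct keys has to decode and filter the GetAllValues results itself. The sketch below shows one way to do that with the standard `json` module. It operates on plain `(key, value)` tuples standing in for streamed KeyValuePair messages; the helper names `decode_key` and `filter_pairs` are hypothetical, not part of the API.

```python
import json
from typing import Any, Callable, Iterable, Iterator, Tuple


def decode_key(serialized_key: str) -> Any:
    """Turn a JSON-serialized key back into a Python value."""
    return json.loads(serialized_key)


def filter_pairs(
    pairs: Iterable[Tuple[str, str]],
    predicate: Callable[[Any], bool],
) -> Iterator[Tuple[Any, str]]:
    """Yield (decoded_key, value) for every pair whose decoded key matches."""
    for serialized_key, value in pairs:
        key = decode_key(serialized_key)
        if predicate(key):
            yield key, value


# Stand-in for (key, value) pairs read from a GetAllValues stream.
sample_pairs = [
    ('"hello"', "1"),
    ('{"field": "value"}', "2"),
    ('{"field": "other"}', "3"),
]

# Client-side lookup of a struct key, which GetValue cannot do.
matches = list(
    filter_pairs(
        sample_pairs,
        lambda key: isinstance(key, dict) and key.get("field") == "value",
    )
)
```

Since `filter_pairs` is a generator, it can consume the response stream lazily instead of buffering every pair first.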

Methods

| Method | Request | Response | Description |
| --- | --- | --- | --- |
| ListStores | ListStoresRequest | ListStoresResponse | Lists all queryable state stores from running Kafka Streams jobs. Only stores from jobs in RUNNING state are included. Stores that cannot be queried (e.g., during rebalancing) are omitted. |
| GetAllValues | GetAllValuesRequest | KeyValuePair stream | Streams all key-value pairs from a state store with optional limit. Keys are JSON-serialized and may be primitives or complex objects depending on the key schema. Returns NOT_FOUND if the store doesn't exist, UNAVAILABLE if the job isn't running. |
| GetValue | GetValueRequest | GetValueResponse | Retrieves a single value from a state store by key. Note: Currently only supports simple string keys. For complex keys, use GetAllValues. Returns NOT_FOUND if the store doesn't exist, UNAVAILABLE if the job isn't running. |

Messages

ConnectionStatus

Connection health status (excludes credentials — credentials stay server-side).

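| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | string |  |  |
| name | string |  |  |
| state | ConnectionState |  |  |
| error | string |  |  |
| last_checked | google.protobuf.Timestamp |  |  |
| config | DatabaseConnectionConfigPublic |  | Public config (no password). |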

CreateElasticsearchSinkConnectorRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| connection_id | string |  | ID of a registered Elasticsearch connection. |
| connector_name | string |  | Name for the Kafka Connect connector. |
| topics | string |  | Kafka topic(s) to consume from. |
| index_name | string |  | Target Elasticsearch index name. |
| document_id_strategy | string |  | Document ID strategy: "RECORD_KEY" or "TOPIC_PARTITION_OFFSET". |
| write_method | string |  | Write method: "INSERT" or "UPSERT". |
| behavior_on_null_values | string |  | Behavior on null values: "IGNORE", "DELETE", or "FAIL". |

CreateElasticsearchSinkConnectorResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| success | bool |  |  |
| error | string |  |  |
| connector_name | string |  |  |

CreateJdbcSinkConnectorRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| connection_id | string |  | ID of a registered connection (server resolves credentials). |
| connector_name | string |  | Name for the Kafka Connect connector. |
| topics | string |  | Kafka topic(s) to consume from. |
| table_name | string |  | Target database table name. |
| insert_mode | string |  | Write mode: "insert", "upsert", or "update". |
| primary_key_fields | string |  | Comma-separated primary key fields (required for upsert/update mode). |
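
The two constraints stated for this message (insert_mode takes one of three values, and primary_key_fields is required for upsert/update) can be checked on the client before the request is sent. The following is a minimal sketch that works on a plain dict of field values rather than the generated message class; `check_jdbc_sink_fields` is a hypothetical helper, and treating any other insert_mode (including an empty one) as an error is an assumption, since the server's behavior for such values is not documented here.

```python
from typing import Dict, List

VALID_INSERT_MODES = ("insert", "upsert", "update")
MODES_NEEDING_KEYS = ("upsert", "update")


def check_jdbc_sink_fields(fields: Dict[str, str]) -> List[str]:
    """Return a list of problems found in the request fields (empty if none)."""
    problems: List[str] = []
    mode = fields.get("insert_mode", "")
    if mode not in VALID_INSERT_MODES:
        problems.append(
            f"insert_mode must be one of {VALID_INSERT_MODES}, got {mode!r}"
        )
    elif mode in MODES_NEEDING_KEYS:
        # primary_key_fields is a comma-separated string; ignore blank entries.
        keys = [k.strip() for k in fields.get("primary_key_fields", "").split(",")]
        if not any(keys):
            problems.append(f"primary_key_fields is required for {mode} mode")
    return problems


ok = check_jdbc_sink_fields(
    {"insert_mode": "upsert", "primary_key_fields": "id, tenant_id"}
)
missing_keys = check_jdbc_sink_fields(
    {"insert_mode": "update", "primary_key_fields": " "}
)
bad_mode = check_jdbc_sink_fields({"insert_mode": "merge"})
```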

CreateJdbcSinkConnectorResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| success | bool |  |  |
| error | string |  |  |
| connector_name | string |  | Name of the created connector. |

CreateWeaviateSinkConnectorRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| connection_id | string |  | ID of a registered Weaviate connection. |
| connector_name | string |  | Name for the Kafka Connect connector. |
| topics | string |  | Kafka topic(s) to consume from. |
| collection_name | string |  |  |
| document_id_strategy | string |  | Document ID strategy: "NoIdStrategy", "KafkaIdStrategy", or "FieldIdStrategy". |
| document_id_field | string |  | Field to use as document ID (when using FieldIdStrategy). |
| vector_strategy | string |  | Vector strategy: "NoVectorStrategy" or "FieldVectorStrategy". |
| vector_field | string |  | Field containing the vector (when using FieldVectorStrategy). |
| timestamp_field | string |  | Optional: field name for timestamp conversion (empty = no transform). |

CreateWeaviateSinkConnectorResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| success | bool |  |  |
| error | string |  |  |
| connector_name | string |  |  |

DatabaseConnectionConfig

Database connection configuration including credentials.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | string |  |  |
| name | string |  |  |
| database_type | DatabaseType |  |  |
| hostname | string |  | Hostname for server health checks (e.g., "localhost"). |
| port | int32 |  |  |
| database | string |  |  |
| username | string |  |  |
| password | string |  |  |
| connector_hostname | string |  | Hostname for Kafka Connect connectors (e.g., "postgres" in Docker). |

DatabaseConnectionConfigPublic

Public database connection config (excludes sensitive fields like password).

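| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | string |  |  |
| name | string |  |  |
| database_type | DatabaseType |  |  |
| hostname | string |  |  |
| port | int32 |  |  |
| database | string |  |  |
| username | string |  |  |
| connector_hostname | string |  | Note: the password field is intentionally excluded from this message. |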

ElasticsearchConnectionConfig

Elasticsearch connection configuration.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | string |  |  |
| name | string |  |  |
| connection_url | string |  | Elasticsearch URL (e.g., "http://localhost:9200"). |
| username | string |  | Optional: username for authentication. |
| password | string |  | Optional: password for authentication. |
| connector_url | string |  | URL for Kafka Connect (e.g., "http://elasticsearch:9200"). |

ElasticsearchConnectionConfigPublic

Public Elasticsearch connection config (excludes password).

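| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | string |  |  |
| name | string |  |  |
| connection_url | string |  |  |
| username | string |  |  |
| connector_url | string |  | Note: the password field is intentionally excluded from this message. |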

ElasticsearchConnectionStatus

Health status of an Elasticsearch connection.

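| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | string |  |  |
| name | string |  |  |
| state | ConnectionState |  |  |
| error | string |  |  |
| last_checked | google.protobuf.Timestamp |  |  |
| config | ElasticsearchConnectionConfigPublic |  |  |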

GetConnectionStatusesRequest

GetConnectionStatusesResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| statuses | ConnectionStatus | repeated | Health statuses of all registered database connections. |

GetElasticsearchConnectionStatusesRequest

GetElasticsearchConnectionStatusesResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| statuses | ElasticsearchConnectionStatus | repeated |  |

GetWeaviateConnectionStatusesRequest

GetWeaviateConnectionStatusesResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| statuses | WeaviateConnectionStatus | repeated |  |

RegisterConnectionRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| connection | DatabaseConnectionConfig |  | Database connection configuration (including credentials for initial setup). |

RegisterConnectionResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| success | bool |  |  |
| error | string |  |  |
| status | ConnectionStatus |  | Health status of the newly registered connection. |

RegisterElasticsearchConnectionRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| connection | ElasticsearchConnectionConfig |  | Elasticsearch connection configuration (including password for initial setup). |

RegisterElasticsearchConnectionResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| success | bool |  |  |
| error | string |  |  |
| status | ElasticsearchConnectionStatus |  |  |

RegisterWeaviateConnectionRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| connection | WeaviateConnectionConfig |  | Weaviate connection configuration (including API key for initial setup). |

RegisterWeaviateConnectionResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| success | bool |  |  |
| error | string |  |  |
| status | WeaviateConnectionStatus |  |  |

TestConnectionRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| connection | DatabaseConnectionConfig |  | Database connection configuration to test (one-shot, not registered). |

TestConnectionResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| success | bool |  |  |
| error | string |  |  |
| latency_ms | int64 |  | Round-trip latency of the connection test in milliseconds. |

UnregisterConnectionRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| connection_id | string |  | ID of the connection to unregister. |

UnregisterConnectionResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| success | bool |  |  |
| error | string |  |  |

WeaviateConnectionConfig

Weaviate vector database connection configuration.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | string |  |  |
| name | string |  |  |
| rest_url | string |  | Weaviate REST API URL (e.g., "http://localhost:8090"). |
| grpc_url | string |  | Weaviate gRPC URL (e.g., "localhost:50051"). |
| grpc_secured | bool |  |  |
| auth_scheme | string |  | Authentication scheme: "NONE" or "API_KEY". |
| api_key | string |  | API key for Weaviate Cloud (stored server-side only). |
| connector_rest_url | string |  | REST URL for Kafka Connect (e.g., "http://weaviate:8080"). |
| connector_grpc_url | string |  | gRPC URL for Kafka Connect (e.g., "weaviate:50051"). |

WeaviateConnectionConfigPublic

Public Weaviate connection config (excludes API key).

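| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | string |  |  |
| name | string |  |  |
| rest_url | string |  |  |
| grpc_url | string |  |  |
| grpc_secured | bool |  |  |
| auth_scheme | string |  |  |
| connector_rest_url | string |  | Note: the api_key field is intentionally excluded from this message. |
| connector_grpc_url | string |  |  |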

WeaviateConnectionStatus

Health status of a Weaviate connection.

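| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | string |  |  |
| name | string |  |  |
| state | ConnectionState |  |  |
| error | string |  |  |
| last_checked | google.protobuf.Timestamp |  |  |
| config | WeaviateConnectionConfigPublic |  |  |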

FileInfo

Metadata about a single entry in the virtual filesystem.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| name | string |  | Name of the file or directory. |
| encoding | Encoding |  | Data encoding of the topic (only set for topic entries). |

GetSchemaRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| user_id | string |  | Identifier for the user session. |
| path | string |  | Virtual filesystem path to the topic (e.g., "/local/topics/my-topic"). |

GetSchemaResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| fields | string | repeated | Field names in the topic's schema. |
| error | string |  | Error message if schema retrieval failed. |

LsRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| user_id | string |  | Identifier for the user session. |
| path | string |  | Virtual filesystem path to list (e.g., "/local/topics"). |

LsResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| files | FileInfo | repeated | List of entries at the requested path. |
| error | string |  | Error message if the listing failed. |

MountRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| user_id | string |  | Identifier for the user session. |
| config | string |  | Configuration string for the data source to mount (e.g., connection details). |

MountResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| success | bool |  | Whether the mount operation succeeded. |
| error | string |  | Error message if the mount failed. |

UnmountRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| user_id | string |  | Identifier for the user session. |
| endpoint | string |  | The endpoint path of the mounted data source to remove. |

UnmountResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| success | bool |  | Whether the unmount operation succeeded. |
| error | string |  | Error message if the unmount failed. |

CompleteProgramRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| session_id | string |  | Session to get completions for. |
| source | string |  | Partial TypeStream DSL source code. |
| cursor | int32 |  | Cursor position (character offset) within the source for completion. |
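
Editors usually track the caret as a line and column, while `cursor` is a single character offset into `source`. A small conversion helper, sketched here under the assumption that lines and columns are zero-based and that lines are separated by "\n", could bridge the two. `cursor_offset` is a hypothetical name, and whether the server counts offsets exactly this way is not stated in this reference.

```python
def cursor_offset(source: str, line: int, column: int) -> int:
    """Convert a zero-based (line, column) position to a character offset."""
    lines = source.split("\n")
    if not 0 <= line < len(lines):
        raise ValueError(f"line {line} is outside the source")
    if not 0 <= column <= len(lines[line]):
        raise ValueError(f"column {column} is outside line {line}")
    # Each preceding line contributes its length plus one newline character.
    return sum(len(text) + 1 for text in lines[:line]) + column


two_line_source = "ab\ncde"
offset_at_end = cursor_offset(two_line_source, 1, 3)
```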

CompleteProgramResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| value | string | repeated | List of completion suggestions. |

GetProgramOutputRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| session_id | string |  | Session containing the program. |
| id | string |  | Identifier of the running program (from RunProgramResponse.id). |

GetProgramOutputResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| stdOut | string |  | Standard output chunk from the running program. |
| stdErr | string |  | Standard error chunk from the running program. |

RunProgramRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| session_id | string |  | Session to run the program in. |
| source | string |  | TypeStream DSL program source code to execute. |

RunProgramResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | string |  | Identifier for the running program instance. |
| env | RunProgramResponse.EnvEntry | repeated | Environment variables set by the program (e.g., topic assignments). |
| stdOut | string |  | Initial standard output from the program. |
| stdErr | string |  | Standard error output from the program. |
| hasMoreOutput | bool |  | If true, more output is available via GetProgramOutput. |

RunProgramResponse.EnvEntry

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| key | string |  |  |
| value | string |  |  |

StartSessionRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| user_id | string |  | Identifier for the user starting the session. |

StartSessionResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| session_id | string |  | Unique identifier for the created session. |

StopSessionRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| session_id | string |  | Session to stop. |

StopSessionResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| stdOut | string |  | Any remaining standard output from the session. |
| stdErr | string |  | Any remaining standard error from the session. |

CreateJobFromGraphRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| user_id | string |  | Identifier for the user creating the job. |
| graph | PipelineGraph |  | Pipeline graph defining the job topology. |
| db_sink_configs | DbSinkConfig | repeated | Database sink configurations for DB sink nodes. |
| weaviate_sink_configs | WeaviateSinkConfig | repeated | Weaviate sink configurations for Weaviate sink nodes. |
| elasticsearch_sink_configs | ElasticsearchSinkConfig | repeated | Elasticsearch sink configurations for Elasticsearch sink nodes. |

CreateJobRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| user_id | string |  | Identifier for the user creating the job. |
| source | string |  | TypeStream DSL source code defining the pipeline. |

CreateJobResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| success | bool |  | Whether the job was created successfully. |
| job_id | string |  | Unique identifier for the created job. |
| error | string |  | Error message if job creation failed. |
| created_connectors | string | repeated | Names of Kafka Connect connectors created for DB sinks. |

CreatePreviewJobRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| graph | PipelineGraph |  | Pipeline graph to run as a preview. |
| inspector_node_id | string |  | ID of the inspector node that triggered this preview. |

CreatePreviewJobResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| success | bool |  |  |
| job_id | string |  | ID of the created preview job. |
| inspect_topic | string |  | Kafka topic to consume for live preview data. |
| error | string |  |  |

DbSinkConfig

Configuration for a database sink connector. Credentials are resolved server-side from the connection ID.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| node_id | string |  | ID of the node this sink config belongs to. |
| connection_id | string |  | Registered connection ID (server resolves credentials). |
| table_name | string |  | Target database table name. |
| insert_mode | string |  | Write mode: "insert", "upsert", or "update". |
| primary_key_fields | string |  | Comma-separated primary key fields (for upsert/update mode). |
| intermediate_topic | string |  | Intermediate Kafka topic (auto-generated if empty). |

DbSinkNode

Database sink — writes records to a database table.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| connection_id | string |  | Registered database connection ID. |
| table_name | string |  | Target database table name. |
| insert_mode | string |  | Write mode: "insert", "upsert", or "update". |
| primary_key_fields | string |  | Comma-separated primary key fields (for upsert/update mode). |

ElasticsearchSinkConfig

Configuration for an Elasticsearch sink connector.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| node_id | string |  | ID of the node this sink config belongs to. |
| connection_id | string |  | Registered Elasticsearch connection ID. |
| intermediate_topic | string |  | Intermediate Kafka topic for the connector. |
| index_name | string |  | Target Elasticsearch index name. |
| document_id_strategy | string |  | Document ID strategy: "RECORD_KEY" or "TOPIC_PARTITION_OFFSET". |
| write_method | string |  | Write method: "INSERT" or "UPSERT". |
| behavior_on_null_values | string |  | Behavior on null values: "IGNORE", "DELETE", or "FAIL". |

ElasticsearchSinkNode

Elasticsearch sink — writes records to an Elasticsearch index.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| connection_id | string |  | Registered Elasticsearch connection ID. |
| index_name | string |  | Target Elasticsearch index name. |
| document_id_strategy | string |  | Document ID strategy: "RECORD_KEY" or "TOPIC_PARTITION_OFFSET". |
| write_method | string |  | Write method: "INSERT" or "UPSERT". |
| behavior_on_null_values | string |  | Behavior on null values: "IGNORE", "DELETE", or "FAIL". |

EmbeddingGeneratorNode

Embedding generator — generates vector embeddings from text using OpenAI.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| text_field | string |  | Field containing the text to embed. |
| output_field | string |  | Output field name for the embedding vector (default: "embedding"). |
| model | string |  | OpenAI model to use (default: "text-embedding-3-small"). |

GeoIpNode

GeoIP enrichment — resolves an IP address to a country code.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| ip_field | string |  | Field containing the IP address to look up. |
| output_field | string |  | Output field name for the country code (default: "country_code"). |

InferGraphSchemasRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| graph | PipelineGraph |  | Pipeline graph to infer schemas for. |

InferGraphSchemasResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| schemas | InferGraphSchemasResponse.SchemasEntry | repeated | Map of node ID to its inferred schema. |

InferGraphSchemasResponse.SchemasEntry

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| key | string |  |  |
| value | NodeSchemaResult |  |  |

InspectorNode

Inspector — taps into the data stream for live preview.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| label | string |  | Optional label for the inspector tap point. |

JobInfo

Information about a streaming job.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| job_id | string |  | Unique job identifier. |
| state | JobState |  | Current lifecycle state. |
| start_time | int64 |  | Unix timestamp in milliseconds when the job started. |
| graph | PipelineGraph |  | The job's pipeline graph. |
| throughput | JobThroughput |  | Throughput metrics. |
| weaviate_sinks | WeaviateSinkConfig | repeated | Weaviate sink configs for this job. |
| elasticsearch_sinks | ElasticsearchSinkConfig | repeated | Elasticsearch sink configs for this job. |

JobThroughput

Throughput metrics for a running job.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| messages_per_second | double |  | Current processing rate (messages/sec). |
| total_messages | int64 |  | Total messages processed since job start. |
| bytes_per_second | double |  | Current bandwidth consumption (bytes/sec). |
| total_bytes | int64 |  | Total bytes processed since job start. |

KafkaSinkNode

Kafka topic sink — writes output to a Kafka topic.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| topic_name | string |  | Name of the target Kafka topic. |

KafkaSourceNode

Kafka topic source — reads from a Kafka topic.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| topic_path | string |  | Virtual filesystem path to the topic (e.g., "/local/topics/my-topic"). |
| encoding | Encoding |  | Data encoding of the topic. |
| unwrap_cdc | bool |  | If true, extract the "after" payload from a CDC envelope. |

ListJobsRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| user_id | string |  | Identifier for the user listing jobs. |

ListJobsResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| jobs | JobInfo | repeated | All known jobs. |

ListOpenAIModelsRequest

ListOpenAIModelsResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| models | OpenAIModel | repeated | Available OpenAI models. |

MaterializedViewNode

Materialized view — aggregates data into a queryable state store.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| group_by_field | string |  | Field to group by for the aggregation. |
| aggregation_type | string |  | Aggregation type: "count" or "latest". |
| enable_windowing | bool |  | Whether to apply windowed aggregation. |
| window_size_seconds | int64 |  | Window size in seconds (only used when enable_windowing is true). |

NodeSchemaResult

Inferred schema for a single pipeline node.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| fields | string | repeated | Field names (simplified view). |
| typed_fields | SchemaField | repeated | Fields with type information. |
| encoding | string |  | Data encoding of the node's output. |
| error | string |  | Error if schema inference failed for this node. |

OpenAIModel

An available OpenAI model.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | string |  | Model ID (e.g., "gpt-4o-mini"). |
| name | string |  | Human-readable display name. |

OpenAiTransformerNode

OpenAI transformer — enriches records using an OpenAI language model.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| prompt | string |  | Instruction prompt describing the transformation. |
| output_field | string |  | Output field name for the AI response (default: "ai_response"). |
| model | string |  | OpenAI model ID to use (default: "gpt-4o-mini"). |

PostgresSourceNode

Postgres CDC source — reads change events from a Postgres table via Debezium.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| topic_path | string |  | Virtual filesystem path to the CDC topic. |

SchemaField

A single field in a schema.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| name | string |  | Field name. |
| type | string |  | Field type (e.g., "STRING", "INT64", "STRUCT"). |

StopPreviewJobRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| job_id | string |  | ID of the preview job to stop. |

StopPreviewJobResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| success | bool |  |  |
| error | string |  |  |

StreamPreviewRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| job_id | string |  | ID of the preview job to stream from. |

StreamPreviewResponse

A single record from the preview stream.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| key | string |  | Record key. |
| value | string |  | Record value (JSON string). |
| timestamp | int64 |  | Record timestamp (Unix milliseconds). |
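
Since `value` is a JSON string and `timestamp` is in Unix milliseconds, a client will typically decode both before displaying a preview record. A minimal sketch using only the standard library follows; `decode_preview_record` is a hypothetical helper that takes the three fields as plain arguments, and it assumes `value` always holds valid JSON (an empty or non-JSON value would raise).

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict


def decode_preview_record(key: str, value: str, timestamp: int) -> Dict[str, Any]:
    """Decode one preview record into Python-native values."""
    return {
        "key": key,
        # The record value arrives as a JSON string.
        "value": json.loads(value),
        # Unix milliseconds -> timezone-aware UTC datetime.
        "time": datetime.fromtimestamp(timestamp / 1000, tz=timezone.utc),
    }


record = decode_preview_record("user-1", '{"status": "active"}', 1_700_000_000_000)
```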

TextExtractorNode

Text extractor — extracts text content from files referenced in records.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| file_path_field | string |  | Field containing the file path to extract text from. |
| output_field | string |  | Output field name for the extracted text (default: "text"). |

UserFilterNode

Filter transform — keeps only records matching an expression.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| expression | string |  | Filter expression (e.g., ".status == 'active'"). |

UserPipelineGraph

A user-facing pipeline graph consisting of nodes and edges.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| nodes | UserPipelineNode | repeated | Nodes in the pipeline. |
| edges | PipelineEdge | repeated | Edges connecting nodes (directed, from source to sink). |

UserPipelineNode

A node in a user-facing pipeline graph.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | string |  | Unique node identifier within the graph. |
| kafka_source | KafkaSourceNode |  | Sources |
| postgres_source | PostgresSourceNode |  |  |
| filter | UserFilterNode |  | Transforms |
| geo_ip | GeoIpNode |  |  |
| text_extractor | TextExtractorNode |  |  |
| embedding_generator | EmbeddingGeneratorNode |  |  |
| open_ai_transformer | OpenAiTransformerNode |  |  |
| kafka_sink | KafkaSinkNode |  | Sinks |
| inspector | InspectorNode |  |  |
| materialized_view | MaterializedViewNode |  |  |
| db_sink | DbSinkNode |  |  |
| weaviate_sink | WeaviateSinkNode |  |  |
| elasticsearch_sink | ElasticsearchSinkNode |  |  |

WeaviateSinkConfig

Configuration for a Weaviate vector database sink connector.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| node_id | string |  | ID of the node this sink config belongs to. |
| connection_id | string |  | Registered Weaviate connection ID. |
| intermediate_topic | string |  | Intermediate Kafka topic for the connector. |
| collection_name | string |  | Target Weaviate collection name. |
| document_id_strategy | string |  | Document ID strategy: "NoIdStrategy", "KafkaIdStrategy", or "FieldIdStrategy". |
| document_id_field | string |  | Field to use as document ID (when using FieldIdStrategy). |
| vector_strategy | string |  | Vector strategy: "NoVectorStrategy" or "FieldVectorStrategy". |
| vector_field | string |  | Field containing the vector. |
| timestamp_field | string |  | Optional: field name for timestamp conversion (empty = no transform). |

WeaviateSinkNode

Weaviate sink — writes records to a Weaviate vector database collection.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| connection_id | string |  | Registered Weaviate connection ID. |
| collection_name | string |  | Target Weaviate collection name. |
| document_id_strategy | string |  | Document ID strategy: "NoIdStrategy", "KafkaIdStrategy", or "FieldIdStrategy". |
| document_id_field | string |  | Field to use as document ID (when using FieldIdStrategy). |
| vector_strategy | string |  | Vector strategy: "NoVectorStrategy" or "FieldVectorStrategy". |
| vector_field | string |  | Field containing the vector. |
| timestamp_field | string |  | Optional: field name for timestamp conversion. |

ApplyPipelineRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| metadata | PipelineMetadata |  | Pipeline metadata (name, version, description). |
| graph | UserPipelineGraph |  | The pipeline graph to deploy. |

ApplyPipelineResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| success | bool |  | Whether the apply operation succeeded. |
| job_id | string |  | ID of the created or updated job. |
| error | string |  | Error message if the apply failed. |
| state | PipelineState |  | Whether the pipeline was created, updated, or unchanged. |

DeletePipelineRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| name | string |  | Name of the pipeline to delete. |

DeletePipelineResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| success | bool |  | Whether the delete operation succeeded. |
| error | string |  | Error message if the delete failed. |

ListPipelinesRequest

ListPipelinesResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| pipelines | PipelineInfo | repeated | All registered pipelines. |

PipelineInfo

Information about a registered pipeline and its running job.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| name | string |  | Pipeline name. |
| version | string |  | Pipeline version. |
| description | string |  | Pipeline description. |
| job_id | string |  | ID of the underlying job. |
| job_state | JobState |  | Current state of the underlying job. |
| applied_at | int64 |  | Unix timestamp (milliseconds) when this pipeline was last applied. |
| graph | PipelineGraph |  | Internal pipeline graph (compiler representation). |
| user_graph | UserPipelineGraph |  | User-facing pipeline graph as originally submitted. |

PipelineMetadata

Metadata describing a pipeline definition.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| name | string |  | Unique name identifying this pipeline. |
| version | string |  | Version string for tracking changes (e.g., "v1", "2024-01-15"). |
| description | string |  | Human-readable description of what this pipeline does. |

PipelinePlan

A pipeline definition included in a plan request.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| metadata | PipelineMetadata |  | Pipeline metadata (name, version, description). |
| graph | UserPipelineGraph |  | The pipeline graph to plan. |

PipelinePlanResult

Result for a single pipeline in a plan response.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| name | string | | Pipeline name. |
| action | PipelineAction | | Action that would be taken. |
| current_version | string | | Current version on the server (empty if pipeline is new). |
| new_version | string | | New version from the plan (empty if pipeline would be deleted). |

PlanPipelinesRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| pipelines | PipelinePlan | repeated | Pipeline definitions to plan against current state. |

PlanPipelinesResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| results | PipelinePlanResult | repeated | Plan results for each pipeline. |
| errors | string | repeated | Errors encountered during planning. |
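
A plan response is easiest to review when its results are grouped by action. The sketch below does this over plain dictionaries shaped like PipelinePlanResult, so it runs without generated gRPC stubs. The enum numbers mirror the PipelineAction table in this reference; the function name summarize_plan, the sample pipeline names, and the choice to treat any entry in errors as fatal are assumptions made for illustration.

```python
from collections import defaultdict
from enum import IntEnum


class PipelineAction(IntEnum):
    """Local mirror of the PipelineAction enum values listed in this reference."""
    PIPELINE_ACTION_UNSPECIFIED = 0
    CREATE = 1
    UPDATE = 2
    NO_CHANGE = 3
    DELETE = 4


def summarize_plan(results, errors):
    """Group pipeline names by planned action.

    `results` is a list of dicts with "name" and "action" keys;
    `errors` is a list of strings. Both stand in for the response fields.
    """
    # Assumption: any planning error means the plan should not be acted on.
    if errors:
        raise ValueError("planning failed: " + "; ".join(errors))
    grouped = defaultdict(list)
    for result in results:
        grouped[PipelineAction(result["action"]).name].append(result["name"])
    return dict(grouped)


if __name__ == "__main__":
    # Hypothetical plan results, for illustration only.
    sample = [
        {"name": "orders-enrich", "action": 1},
        {"name": "clicks-count", "action": 3},
        {"name": "legacy-export", "action": 4},
    ]
    for action, names in summarize_plan(sample, []).items():
        print(action, names)
```

Surfacing DELETE entries separately is useful in practice, since a pipeline missing from the plan would be removed on the server.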

ValidatePipelineRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| metadata | PipelineMetadata | | Pipeline metadata (name, version, description). |
| graph | UserPipelineGraph | | The pipeline graph to validate. |

ValidatePipelineResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| valid | bool | | Whether the pipeline is valid. |
| errors | string | repeated | Validation errors (pipeline cannot be applied). |
| warnings | string | repeated | Validation warnings (pipeline can be applied but may have issues). |
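
The distinction between errors and warnings suggests a simple client-side gate before calling apply. The sketch below is one possible policy, not behaviour defined by the service: the function name should_apply and the strict flag are hypothetical, and the arguments stand in for the fields of ValidatePipelineResponse.

```python
from typing import Sequence


def should_apply(valid: bool, errors: Sequence[str], warnings: Sequence[str],
                 strict: bool = False) -> bool:
    """Decide whether to proceed from validation to apply.

    `strict` is a hypothetical client-side policy switch, not an API field.
    """
    # Errors mean the pipeline cannot be applied at all.
    if not valid or errors:
        return False
    # Warnings do not block an apply unless the caller opts into strictness.
    if strict and warnings:
        return False
    return True


if __name__ == "__main__":
    print(should_apply(True, [], ["topic has no schema"]))
    print(should_apply(True, [], ["topic has no schema"], strict=True))
```

A strict mode of this kind may suit CI pipelines, while interactive use can tolerate warnings.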

GetAllValuesRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| store_name | string | | Name of the state store to query. |
| limit | int32 | | Maximum number of entries to return (default: 100). |
| from_key | string | | Reserved for future pagination support; not currently implemented. |

GetValueRequest

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| store_name | string | | Name of the state store to query. |
| key | string | | The key to look up. This is used directly as a string key; complex key types (structs, etc.) are not supported. For stores with complex keys, use GetAllValues and filter client-side. |

GetValueResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| found | bool | | True if the key was found in the store. |
| value | string | | String representation of the value; empty if not found. |

KeyValuePair

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| key | string | | JSON-serialized representation of the key. The format depends on the key's schema type. String: the JSON string "value", with the quotes included in the returned text. Int/Long: the JSON number 123. Struct: the JSON object {"field": "value"}. |
| value | string | | String representation of the value (for count operations, this is the count as a string). |
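
Since keys arrive as JSON text, client-side filtering for complex keys amounts to parsing each key and matching on its fields. The sketch below works on plain (key, value) string tuples shaped like KeyValuePair, so it runs without generated stubs. The helper names decode_pairs and filter_by_field and the sample data are illustrative assumptions.

```python
import json


def decode_pairs(pairs):
    """Yield (decoded_key, value) from (key, value) string tuples."""
    for key, value in pairs:
        # json.loads turns '"hello"' into 'hello', '123' into 123,
        # and '{"field": "value"}' into a dict.
        yield json.loads(key), value


def filter_by_field(pairs, field, expected):
    """Keep entries whose struct key has `field` equal to `expected`."""
    return [
        (key, value)
        for key, value in decode_pairs(pairs)
        if isinstance(key, dict) and key.get(field) == expected
    ]


if __name__ == "__main__":
    # Hypothetical entries, as they might come back from GetAllValues.
    sample = [
        ('{"region": "eu", "user": "a"}', "3"),
        ('{"region": "us", "user": "b"}', "7"),
    ]
    print(filter_by_field(sample, "region", "eu"))
```

Because GetAllValues caps the number of entries via limit and pagination is not yet implemented, a filter like this only sees the entries that were actually returned.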

ListStoresRequest

ListStoresResponse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| stores | StoreInfo | repeated | |

StoreInfo

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| name | string | | The name of the state store (e.g., "job-id-count-store-0"). |
| job_id | string | | The ID of the job that owns this store. |
| approximate_count | int64 | | Approximate number of entries in the store (from RocksDB metadata). |

Enums

ConnectionState

Health state of a monitored connection.

| Name | Number | Description |
| --- | --- | --- |
| CONNECTION_STATE_UNSPECIFIED | 0 | |
| CONNECTED | 1 | |
| DISCONNECTED | 2 | |
| ERROR | 3 | |
| CONNECTING | 4 | |

DatabaseType

Supported database types for connections.

| Name | Number | Description |
| --- | --- | --- |
| DATABASE_TYPE_UNSPECIFIED | 0 | |
| POSTGRES | 1 | |
| MYSQL | 2 | |

Encoding

Data encoding format for Kafka topics.

| Name | Number | Description |
| --- | --- | --- |
| STRING | 0 | |
| NUMBER | 1 | |
| JSON | 2 | |
| AVRO | 3 | |
| PROTOBUF | 4 | |

JobState

Lifecycle state of a streaming job.

| Name | Number | Description |
| --- | --- | --- |
| JOB_STATE_UNSPECIFIED | 0 | |
| STARTING | 1 | |
| RUNNING | 2 | |
| STOPPING | 3 | |
| STOPPED | 4 | |
| FAILED | 5 | |
| UNKNOWN | 6 | |

PipelineAction

Action that would be taken for a pipeline in a plan.

| Name | Number | Description |
| --- | --- | --- |
| PIPELINE_ACTION_UNSPECIFIED | 0 | |
| CREATE | 1 | Pipeline does not exist and would be created. |
| UPDATE | 2 | Pipeline exists and would be updated. |
| NO_CHANGE | 3 | Pipeline exists and has not changed. |
| DELETE | 4 | Pipeline exists on server but is not in the plan, so it would be deleted. |

PipelineState

Result state after applying a pipeline.

| Name | Number | Description |
| --- | --- | --- |
| PIPELINE_STATE_UNSPECIFIED | 0 | |
| CREATED | 1 | Pipeline was newly created. |
| UPDATED | 2 | Pipeline was updated with new configuration. |
| UNCHANGED | 3 | Pipeline configuration has not changed. |

Scalar Value Types

| .proto Type | Notes | C++ | Java | Python | Go | C# | PHP | Ruby |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| double | | double | double | float | float64 | double | float | Float |
| float | | float | float | float | float32 | float | float | Float |
| int32 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint32 instead. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required) |
| int64 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint64 instead. | int64 | long | int/long | int64 | long | integer/string | Bignum |
| uint32 | Uses variable-length encoding. | uint32 | int | int/long | uint32 | uint | integer | Bignum or Fixnum (as required) |
| uint64 | Uses variable-length encoding. | uint64 | long | int/long | uint64 | ulong | integer/string | Bignum or Fixnum (as required) |
| sint32 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required) |
| sint64 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. | int64 | long | int/long | int64 | long | integer/string | Bignum |
| fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. | uint32 | int | int | uint32 | uint | integer | Bignum or Fixnum (as required) |
| fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. | uint64 | long | int/long | uint64 | ulong | integer/string | Bignum |
| sfixed32 | Always four bytes. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required) |
| sfixed64 | Always eight bytes. | int64 | long | int/long | int64 | long | integer/string | Bignum |
| bool | | bool | boolean | boolean | bool | bool | boolean | TrueClass/FalseClass |
| string | A string must always contain UTF-8 encoded or 7-bit ASCII text. | string | String | str/unicode | string | string | string | String (UTF-8) |
| bytes | May contain any arbitrary sequence of bytes. | string | ByteString | str | []byte | ByteString | string | String (ASCII-8BIT) |
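
The note that int32 is inefficient for negative numbers follows from how varints work: a negative int32 is sign-extended to 64 bits before encoding, whereas sint32 first applies ZigZag mapping so that small magnitudes stay small. The sketch below reimplements both encodings in plain Python to make the size difference visible. It is a standalone illustration of the wire format, not code taken from TypeStream or from a protobuf library, and the function names are chosen here for clarity.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("varint input must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int32(value: int) -> bytes:
    """int32 on the wire: negatives are sign-extended to 64 bits."""
    return encode_varint(value & 0xFFFFFFFFFFFFFFFF)


def zigzag32(value: int) -> int:
    """Map signed to unsigned so small magnitudes get small codes."""
    return ((value << 1) ^ (value >> 31)) & 0xFFFFFFFF


def encode_sint32(value: int) -> bytes:
    """sint32 on the wire: ZigZag mapping followed by a varint."""
    return encode_varint(zigzag32(value))


if __name__ == "__main__":
    for n in (1, -1, -64):
        print(n, len(encode_int32(n)), len(encode_sint32(n)))
```

For -1, the int32 encoding occupies ten bytes while the sint32 encoding occupies one, which is the trade-off the Notes column describes.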