TypeStream gRPC API Reference

Overview

TypeStream exposes its functionality through six gRPC services.

Service	Description
ConnectionService	ConnectionService manages external data source connections for databases, Weaviate, and Elasticsearch. Register connections for ongoing health monitoring, test connectivity on demand, and create Kafka Connect sink connectors. Credentials are stored server-side and never exposed to clients.
FileSystemService	FileSystemService provides a virtual filesystem abstraction over Kafka topics and connected data sources. Use it to mount external data sources, browse available topics, and retrieve topic schemas.
InteractiveSessionService	InteractiveSessionService provides a REPL-like interface for executing TypeStream DSL programs. Start a session, run programs, stream their output, get tab completions, and stop sessions when done.
JobService	JobService manages streaming jobs — the core execution units of TypeStream. Create jobs from DSL source code or visual pipeline graphs, list running jobs, run preview jobs for live data inspection, infer schemas across a pipeline, and list available OpenAI models for AI-powered transforms.
PipelineService	PipelineService manages declarative pipeline definitions using a GitOps-style workflow. Validate pipeline graphs before deploying, apply them to create or update running jobs, list active pipelines, delete them, and plan changes (dry-run diff) before applying.
StateQueryService	StateQueryService provides interactive query capabilities for KTable state stores from running Kafka Streams jobs. State stores are only queryable when the underlying Kafka Streams application is in RUNNING state.

Key Serialization: Keys returned by GetAllValues are JSON-serialized representations of the key schema. For example: a string key "hello" becomes the JSON string ""hello"", a struct becomes {"field": "value"}.

Limitation: GetValue currently only supports simple string keys. For complex keys (structs, etc.), use GetAllValues and filter on the client side. |

Services

ConnectionService

ConnectionService manages external data source connections for databases, Weaviate, and Elasticsearch. Register connections for ongoing health monitoring, test connectivity on demand, and create Kafka Connect sink connectors. Credentials are stored server-side and never exposed to clients.

Methods

Method	Request	Response	Description
RegisterConnection	RegisterConnectionRequest	RegisterConnectionResponse	Register a database connection for ongoing health monitoring.
UnregisterConnection	UnregisterConnectionRequest	UnregisterConnectionResponse	Unregister a database connection and stop monitoring it.
GetConnectionStatuses	GetConnectionStatusesRequest	GetConnectionStatusesResponse	Get the current health status of all registered database connections.
TestConnection	TestConnectionRequest	TestConnectionResponse	Test a database connection immediately without registering it.
CreateJdbcSinkConnector	CreateJdbcSinkConnectorRequest	CreateJdbcSinkConnectorResponse	Create a JDBC sink connector using a registered database connection. The server resolves credentials from the connection ID — credentials are never sent from the client.
RegisterWeaviateConnection	RegisterWeaviateConnectionRequest	RegisterWeaviateConnectionResponse	Register a Weaviate vector database connection for health monitoring.
GetWeaviateConnectionStatuses	GetWeaviateConnectionStatusesRequest	GetWeaviateConnectionStatusesResponse	Get the current health status of all registered Weaviate connections.
CreateWeaviateSinkConnector	CreateWeaviateSinkConnectorRequest	CreateWeaviateSinkConnectorResponse	Create a Weaviate sink connector using a registered connection.
RegisterElasticsearchConnection	RegisterElasticsearchConnectionRequest	RegisterElasticsearchConnectionResponse	Register an Elasticsearch connection for health monitoring.
GetElasticsearchConnectionStatuses	GetElasticsearchConnectionStatusesRequest	GetElasticsearchConnectionStatusesResponse	Get the current health status of all registered Elasticsearch connections.
CreateElasticsearchSinkConnector	CreateElasticsearchSinkConnectorRequest	CreateElasticsearchSinkConnectorResponse	Create an Elasticsearch sink connector using a registered connection.

FileSystemService

FileSystemService provides a virtual filesystem abstraction over Kafka topics and connected data sources. Use it to mount external data sources, browse available topics, and retrieve topic schemas.

Methods

Method	Request	Response	Description
Mount	MountRequest	MountResponse	Mount a data source (e.g., a Kafka cluster or database) into the virtual filesystem. Once mounted, its topics become browsable via Ls and their schemas via GetSchema.
Unmount	UnmountRequest	UnmountResponse	Remove a previously mounted data source from the virtual filesystem.
Ls	LsRequest	LsResponse	List topics and directories at the given path in the virtual filesystem.
GetSchema	GetSchemaRequest	GetSchemaResponse	Retrieve the schema (field names) for a topic at the given path.

InteractiveSessionService

InteractiveSessionService provides a REPL-like interface for executing TypeStream DSL programs. Start a session, run programs, stream their output, get tab completions, and stop sessions when done.

Methods

Method	Request	Response	Description
StartSession	StartSessionRequest	StartSessionResponse	Start a new interactive session for the given user. Returns a session ID used to identify the session in subsequent calls.
RunProgram	RunProgramRequest	RunProgramResponse	Execute a TypeStream DSL program within an existing session. Returns the initial output; use GetProgramOutput to stream additional results.
GetProgramOutput	GetProgramOutputRequest	GetProgramOutputResponse stream	Stream additional output from a running program. Use this after RunProgram when hasMoreOutput is true.
CompleteProgram	CompleteProgramRequest	CompleteProgramResponse	Get tab-completion suggestions for a partial program at the given cursor position.
StopSession	StopSessionRequest	StopSessionResponse	Stop an interactive session and release its resources.

JobService

JobService manages streaming jobs — the core execution units of TypeStream. Create jobs from DSL source code or visual pipeline graphs, list running jobs, run preview jobs for live data inspection, infer schemas across a pipeline, and list available OpenAI models for AI-powered transforms.

Methods

Method	Request	Response	Description
CreateJob	CreateJobRequest	CreateJobResponse	Create a streaming job from TypeStream DSL source code.
CreateJobFromGraph	CreateJobFromGraphRequest	CreateJobResponse	Create a streaming job from a visual pipeline graph with optional sink configurations.
ListJobs	ListJobsRequest	ListJobsResponse	List all running and recently stopped jobs.
CreatePreviewJob	CreatePreviewJobRequest	CreatePreviewJobResponse	Create a short-lived preview job for inspecting live data flowing through a pipeline.
StopPreviewJob	StopPreviewJobRequest	StopPreviewJobResponse	Stop a running preview job.
StreamPreview	StreamPreviewRequest	StreamPreviewResponse stream	Stream live preview data from a running preview job.
InferGraphSchemas	InferGraphSchemasRequest	InferGraphSchemasResponse	Infer the output schema for each node in a pipeline graph. Useful for showing field types in the UI before running a job.
ListOpenAIModels	ListOpenAIModelsRequest	ListOpenAIModelsResponse	List available OpenAI models for use with AI-powered transform nodes.

PipelineService

PipelineService manages declarative pipeline definitions using a GitOps-style workflow. Validate pipeline graphs before deploying, apply them to create or update running jobs, list active pipelines, delete them, and plan changes (dry-run diff) before applying.

Methods

Method	Request	Response	Description
ValidatePipeline	ValidatePipelineRequest	ValidatePipelineResponse	Validate a pipeline definition without deploying it. Returns validation errors and warnings.
ApplyPipeline	ApplyPipelineRequest	ApplyPipelineResponse	Apply a pipeline definition — creates a new job or updates an existing one. Returns the job ID and whether the pipeline was created, updated, or unchanged.
ListPipelines	ListPipelinesRequest	ListPipelinesResponse	List all currently registered pipelines and their job status.
DeletePipeline	DeletePipelineRequest	DeletePipelineResponse	Delete a pipeline by name, stopping its underlying job.
PlanPipelines	PlanPipelinesRequest	PlanPipelinesResponse	Dry-run a set of pipeline definitions against the current state. Returns a plan showing which pipelines would be created, updated, deleted, or unchanged.

StateQueryService

StateQueryService provides interactive query capabilities for KTable state stores from running Kafka Streams jobs. State stores are only queryable when the underlying Kafka Streams application is in RUNNING state.

Limitation: GetValue currently only supports simple string keys. For complex keys (structs, etc.), use GetAllValues and filter on the client side.

Methods

Method	Request	Response	Description
ListStores	ListStoresRequest	ListStoresResponse	Lists all queryable state stores from running Kafka Streams jobs. Only stores from jobs in RUNNING state are included. Stores that cannot be queried (e.g., during rebalancing) are omitted.
GetAllValues	GetAllValuesRequest	KeyValuePair stream	Streams all key-value pairs from a state store with optional limit. Keys are JSON-serialized and may be primitives or complex objects depending on the key schema. Returns NOT_FOUND if the store doesn't exist, UNAVAILABLE if the job isn't running.
GetValue	GetValueRequest	GetValueResponse	Retrieves a single value from a state store by key. Note: Currently only supports simple string keys. For complex keys, use GetAllValues. Returns NOT_FOUND if the store doesn't exist, UNAVAILABLE if the job isn't running.

Messages

ConnectionStatus

Connection health status (excludes credentials — credentials stay server-side).

Field	Type	Description
id	string
name	string
state	ConnectionState
error	string
last_checked	google.protobuf.Timestamp
config	DatabaseConnectionConfigPublic	Public config (no password).

CreateElasticsearchSinkConnectorRequest

Field	Type	Description
connection_id	string	ID of a registered Elasticsearch connection.
connector_name	string	Name for the Kafka Connect connector.
topics	string	Kafka topic(s) to consume from.
index_name	string	Target Elasticsearch index name.
document_id_strategy	string	Document ID strategy: "RECORD_KEY" or "TOPIC_PARTITION_OFFSET".
write_method	string	Write method: "INSERT" or "UPSERT".
behavior_on_null_values	string	Behavior on null values: "IGNORE", "DELETE", or "FAIL".

CreateElasticsearchSinkConnectorResponse

Field	Type	Label	Description
success	bool
error	string
connector_name	string

CreateJdbcSinkConnectorRequest

Field	Type	Description
connection_id	string	ID of a registered connection (server resolves credentials).
connector_name	string	Name for the Kafka Connect connector.
topics	string	Kafka topic(s) to consume from.
table_name	string	Target database table name.
insert_mode	string	Write mode: "insert", "upsert", or "update".
primary_key_fields	string	Comma-separated primary key fields (required for upsert/update mode).

CreateJdbcSinkConnectorResponse

Field	Type	Description
success	bool
error	string
connector_name	string	Name of the created connector.

CreateWeaviateSinkConnectorRequest

Field	Type	Description
connection_id	string	ID of a registered Weaviate connection.
connector_name	string	Name for the Kafka Connect connector.
topics	string	Kafka topic(s) to consume from.
collection_name	string
document_id_strategy	string	Document ID strategy: "NoIdStrategy", "KafkaIdStrategy", or "FieldIdStrategy".
document_id_field	string	Field to use as document ID (when using FieldIdStrategy).
vector_strategy	string	Vector strategy: "NoVectorStrategy" or "FieldVectorStrategy".
vector_field	string	Field containing the vector (when using FieldVectorStrategy).
timestamp_field	string	Optional: field name for timestamp conversion (empty = no transform).

CreateWeaviateSinkConnectorResponse

Field	Type	Label	Description
success	bool
error	string
connector_name	string

DatabaseConnectionConfig

Database connection configuration including credentials.

Field	Type	Description
id	string
name	string
database_type	DatabaseType
hostname	string	Hostname for server health checks (e.g., "localhost").
port	int32
database	string
username	string
password	string
connector_hostname	string	Hostname for Kafka Connect connectors (e.g., "postgres" in Docker).

DatabaseConnectionConfigPublic

Public database connection config (excludes sensitive fields like password).

Field	Type	Description
id	string
name	string
database_type	DatabaseType
hostname	string
port	int32
database	string
username	string
connector_hostname	string	password intentionally excluded

ElasticsearchConnectionConfig

Elasticsearch connection configuration.

Field	Type	Description
id	string
name	string
connection_url	string	Elasticsearch URL (e.g., "http://localhost:9200").
username	string	Optional: username for authentication.
password	string	Optional: password for authentication.
connector_url	string	URL for Kafka Connect (e.g., "http://elasticsearch:9200").

ElasticsearchConnectionConfigPublic

Public Elasticsearch connection config (excludes password).

Field	Type	Description
id	string
name	string
connection_url	string
username	string
connector_url	string	password intentionally excluded

ElasticsearchConnectionStatus

Health status of an Elasticsearch connection.

Field	Type	Label	Description
id	string
name	string
state	ConnectionState
error	string
last_checked	google.protobuf.Timestamp
config	ElasticsearchConnectionConfigPublic

GetConnectionStatusesRequest

GetConnectionStatusesResponse

Field	Type	Label	Description
statuses	ConnectionStatus	repeated	Health statuses of all registered database connections.

GetElasticsearchConnectionStatusesRequest

GetElasticsearchConnectionStatusesResponse

Field	Type	Label	Description
statuses	ElasticsearchConnectionStatus	repeated

GetWeaviateConnectionStatusesRequest

GetWeaviateConnectionStatusesResponse

Field	Type	Label	Description
statuses	WeaviateConnectionStatus	repeated

RegisterConnectionRequest

Field	Type	Label	Description
connection	DatabaseConnectionConfig		Database connection configuration (including credentials for initial setup).

RegisterConnectionResponse

Field	Type	Description
success	bool
error	string
status	ConnectionStatus	Health status of the newly registered connection.

RegisterElasticsearchConnectionRequest

Field	Type	Label	Description
connection	ElasticsearchConnectionConfig		Elasticsearch connection configuration (including password for initial setup).

RegisterElasticsearchConnectionResponse

Field	Type	Label	Description
success	bool
error	string
status	ElasticsearchConnectionStatus

RegisterWeaviateConnectionRequest

Field	Type	Label	Description
connection	WeaviateConnectionConfig		Weaviate connection configuration (including API key for initial setup).

RegisterWeaviateConnectionResponse

Field	Type	Label	Description
success	bool
error	string
status	WeaviateConnectionStatus

TestConnectionRequest

Field	Type	Label	Description
connection	DatabaseConnectionConfig		Database connection configuration to test (one-shot, not registered).

TestConnectionResponse

Field	Type	Description
success	bool
error	string
latency_ms	int64	Round-trip latency of the connection test in milliseconds.

UnregisterConnectionRequest

Field	Type	Label	Description
connection_id	string		ID of the connection to unregister.

UnregisterConnectionResponse

Field	Type	Label	Description
success	bool
error	string

WeaviateConnectionConfig

Weaviate vector database connection configuration.

Field	Type	Description
id	string
name	string
rest_url	string	Weaviate REST API URL (e.g., "http://localhost:8090").
grpc_url	string	Weaviate gRPC URL (e.g., "localhost:50051").
grpc_secured	bool
auth_scheme	string	Authentication scheme: "NONE" or "API_KEY".
api_key	string	API key for Weaviate Cloud (stored server-side only).
connector_rest_url	string	REST URL for Kafka Connect (e.g., "http://weaviate:8080").
connector_grpc_url	string	gRPC URL for Kafka Connect (e.g., "weaviate:50051").

WeaviateConnectionConfigPublic

Public Weaviate connection config (excludes API key).

Field	Type	Description
id	string
name	string
rest_url	string
grpc_url	string
grpc_secured	bool
auth_scheme	string
connector_rest_url	string	api_key intentionally excluded
connector_grpc_url	string

WeaviateConnectionStatus

Health status of a Weaviate connection.

Field	Type	Label	Description
id	string
name	string
state	ConnectionState
error	string
last_checked	google.protobuf.Timestamp
config	WeaviateConnectionConfigPublic

FileInfo

Metadata about a single entry in the virtual filesystem.

Field	Type	Label	Description
name	string		Name of the file or directory.
encoding	Encoding		Data encoding of the topic (only set for topic entries).

GetSchemaRequest

Field	Type	Label	Description
user_id	string		Identifier for the user session.
path	string		Virtual filesystem path to the topic (e.g., "/local/topics/my-topic").

GetSchemaResponse

Field	Type	Label	Description
fields	string	repeated	Field names in the topic's schema.
error	string		Error message if schema retrieval failed.

LsRequest

Field	Type	Label	Description
user_id	string		Identifier for the user session.
path	string		Virtual filesystem path to list (e.g., "/local/topics").

LsResponse

Field	Type	Label	Description
files	FileInfo	repeated	List of entries at the requested path.
error	string		Error message if the listing failed.

MountRequest

Field	Type	Label	Description
user_id	string		Identifier for the user session.
config	string		Configuration string for the data source to mount (e.g., connection details).

MountResponse

Field	Type	Label	Description
success	bool		Whether the mount operation succeeded.
error	string		Error message if the mount failed.

UnmountRequest

Field	Type	Label	Description
user_id	string		Identifier for the user session.
endpoint	string		The endpoint path of the mounted data source to remove.

UnmountResponse

Field	Type	Label	Description
success	bool		Whether the unmount operation succeeded.
error	string		Error message if the unmount failed.

CompleteProgramRequest

Field	Type	Description
session_id	string	Session to get completions for.
source	string	Partial TypeStream DSL source code.
cursor	int32	Cursor position (character offset) within the source for completion.

CompleteProgramResponse

Field	Type	Label	Description
value	string	repeated	List of completion suggestions.

GetProgramOutputRequest

Field	Type	Label	Description
session_id	string		Session containing the program.
id	string		Identifier of the running program (from RunProgramResponse.id).

GetProgramOutputResponse

Field	Type	Label	Description
stdOut	string		Standard output chunk from the running program.
stdErr	string		Standard error chunk from the running program.

RunProgramRequest

Field	Type	Label	Description
session_id	string		Session to run the program in.
source	string		TypeStream DSL program source code to execute.

RunProgramResponse

Field	Type	Label	Description
id	string		Identifier for the running program instance.
env	RunProgramResponse.EnvEntry	repeated	Environment variables set by the program (e.g., topic assignments).
stdOut	string		Initial standard output from the program.
stdErr	string		Standard error output from the program.
hasMoreOutput	bool		If true, more output is available via GetProgramOutput.

RunProgramResponse.EnvEntry

Field	Type	Label	Description
key	string
value	string

StartSessionRequest

Field	Type	Label	Description
user_id	string		Identifier for the user starting the session.

StartSessionResponse

Field	Type	Label	Description
session_id	string		Unique identifier for the created session.

StopSessionRequest

Field	Type	Label	Description
session_id	string		Session to stop.

StopSessionResponse

Field	Type	Label	Description
stdOut	string		Any remaining standard output from the session.
stdErr	string		Any remaining standard error from the session.

CreateJobFromGraphRequest

Field	Type	Label	Description
user_id	string		Identifier for the user creating the job.
graph	PipelineGraph		Pipeline graph defining the job topology.
db_sink_configs	DbSinkConfig	repeated	Database sink configurations for DB sink nodes.
weaviate_sink_configs	WeaviateSinkConfig	repeated	Weaviate sink configurations for Weaviate sink nodes.
elasticsearch_sink_configs	ElasticsearchSinkConfig	repeated	Elasticsearch sink configurations for Elasticsearch sink nodes.

CreateJobRequest

Field	Type	Label	Description
user_id	string		Identifier for the user creating the job.
source	string		TypeStream DSL source code defining the pipeline.

CreateJobResponse

Field	Type	Label	Description
success	bool		Whether the job was created successfully.
job_id	string		Unique identifier for the created job.
error	string		Error message if job creation failed.
created_connectors	string	repeated	Names of Kafka Connect connectors created for DB sinks.

CreatePreviewJobRequest

Field	Type	Label	Description
graph	PipelineGraph		Pipeline graph to run as a preview.
inspector_node_id	string		ID of the inspector node that triggered this preview.

CreatePreviewJobResponse

Field	Type	Description
success	bool
job_id	string	ID of the created preview job.
inspect_topic	string	Kafka topic to consume for live preview data.
error	string

DbSinkConfig

Configuration for a database sink connector. Credentials are resolved server-side from the connection ID.

Field	Type	Description
node_id	string	ID of the node this sink config belongs to.
connection_id	string	Registered connection ID (server resolves credentials).
table_name	string	Target database table name.
insert_mode	string	Write mode: "insert", "upsert", or "update".
primary_key_fields	string	Comma-separated primary key fields (for upsert/update mode).
intermediate_topic	string	Intermediate Kafka topic (auto-generated if empty).

DbSinkNode

Database sink — writes records to a database table.

Field	Type	Description
connection_id	string	Registered database connection ID.
table_name	string	Target database table name.
insert_mode	string	Write mode: "insert", "upsert", or "update".
primary_key_fields	string	Comma-separated primary key fields (for upsert/update mode).

ElasticsearchSinkConfig

Configuration for an Elasticsearch sink connector.

Field	Type	Description
node_id	string	ID of the node this sink config belongs to.
connection_id	string	Registered Elasticsearch connection ID.
intermediate_topic	string	Intermediate Kafka topic for the connector.
index_name	string	Target Elasticsearch index name.
document_id_strategy	string	Document ID strategy: "RECORD_KEY" or "TOPIC_PARTITION_OFFSET".
write_method	string	Write method: "INSERT" or "UPSERT".
behavior_on_null_values	string	Behavior on null values: "IGNORE", "DELETE", or "FAIL".

ElasticsearchSinkNode

Elasticsearch sink — writes records to an Elasticsearch index.

Field	Type	Description
connection_id	string	Registered Elasticsearch connection ID.
index_name	string	Target Elasticsearch index name.
document_id_strategy	string	Document ID strategy: "RECORD_KEY" or "TOPIC_PARTITION_OFFSET".
write_method	string	Write method: "INSERT" or "UPSERT".
behavior_on_null_values	string	Behavior on null values: "IGNORE", "DELETE", or "FAIL".

EmbeddingGeneratorNode

Embedding generator — generates vector embeddings from text using OpenAI.

Field	Type	Description
text_field	string	Field containing the text to embed.
output_field	string	Output field name for the embedding vector (default: "embedding").
model	string	OpenAI model to use (default: "text-embedding-3-small").

GeoIpNode

GeoIP enrichment — resolves an IP address to a country code.

Field	Type	Label	Description
ip_field	string		Field containing the IP address to look up.
output_field	string		Output field name for the country code (default: "country_code").

InferGraphSchemasRequest

Field	Type	Label	Description
graph	PipelineGraph		Pipeline graph to infer schemas for.

InferGraphSchemasResponse

Field	Type	Label	Description
schemas	InferGraphSchemasResponse.SchemasEntry	repeated	Map of node ID to its inferred schema.

InferGraphSchemasResponse.SchemasEntry

Field	Type	Label	Description
key	string
value	NodeSchemaResult

InspectorNode

Inspector — taps into the data stream for live preview.

Field	Type	Label	Description
label	string		Optional label for the inspector tap point.

JobInfo

Information about a streaming job.

Field	Type	Label	Description
job_id	string		Unique job identifier.
state	JobState		Current lifecycle state.
start_time	int64		Unix timestamp in milliseconds when the job started.
graph	PipelineGraph		The job's pipeline graph.
throughput	JobThroughput		Throughput metrics.
weaviate_sinks	WeaviateSinkConfig	repeated	Weaviate sink configs for this job.
elasticsearch_sinks	ElasticsearchSinkConfig	repeated	Elasticsearch sink configs for this job.

JobThroughput

Throughput metrics for a running job.

Field	Type	Description
messages_per_second	double	Current processing rate (messages/sec).
total_messages	int64	Total messages processed since job start.
bytes_per_second	double	Current bandwidth consumption (bytes/sec).
total_bytes	int64	Total bytes processed since job start.

KafkaSinkNode

Kafka topic sink — writes output to a Kafka topic.

Field	Type	Label	Description
topic_name	string		Name of the target Kafka topic.

KafkaSourceNode

Kafka topic source — reads from a Kafka topic.

Field	Type	Description
topic_path	string	Virtual filesystem path to the topic (e.g., "/local/topics/my-topic").
encoding	Encoding	Data encoding of the topic.
unwrap_cdc	bool	If true, extract the "after" payload from a CDC envelope.

ListJobsRequest

Field	Type	Label	Description
user_id	string		Identifier for the user listing jobs.

ListJobsResponse

Field	Type	Label	Description
jobs	JobInfo	repeated	All known jobs.

ListOpenAIModelsRequest

ListOpenAIModelsResponse

Field	Type	Label	Description
models	OpenAIModel	repeated	Available OpenAI models.

MaterializedViewNode

Materialized view — aggregates data into a queryable state store.

Field	Type	Description
group_by_field	string	Field to group by for the aggregation.
aggregation_type	string	Aggregation type: "count" or "latest".
enable_windowing	bool	Whether to apply windowed aggregation.
window_size_seconds	int64	Window size in seconds (only used when enable_windowing is true).

NodeSchemaResult

Inferred schema for a single pipeline node.

Field	Type	Label	Description
fields	string	repeated	Field names (simplified view).
typed_fields	SchemaField	repeated	Fields with type information.
encoding	string		Data encoding of the node's output.
error	string		Error if schema inference failed for this node.

OpenAIModel

An available OpenAI model.

Field	Type	Label	Description
id	string		Model ID (e.g., "gpt-4o-mini").
name	string		Human-readable display name.

OpenAiTransformerNode

OpenAI transformer — enriches records using an OpenAI language model.

Field	Type	Description
prompt	string	Instruction prompt describing the transformation.
output_field	string	Output field name for the AI response (default: "ai_response").
model	string	OpenAI model ID to use (default: "gpt-4o-mini").

PostgresSourceNode

Postgres CDC source — reads change events from a Postgres table via Debezium.

Field	Type	Label	Description
topic_path	string		Virtual filesystem path to the CDC topic.

SchemaField

A single field in a schema.

Field	Type	Label	Description
name	string		Field name.
type	string		Field type (e.g., "STRING", "INT64", "STRUCT").

StopPreviewJobRequest

Field	Type	Label	Description
job_id	string		ID of the preview job to stop.

StopPreviewJobResponse

Field	Type	Label	Description
success	bool
error	string

StreamPreviewRequest

Field	Type	Label	Description
job_id	string		ID of the preview job to stream from.

StreamPreviewResponse

A single record from the preview stream.

Field	Type	Description
key	string	Record key.
value	string	Record value (JSON string).
timestamp	int64	Record timestamp (Unix milliseconds).

TextExtractorNode

Text extractor — extracts text content from files referenced in records.

Field	Type	Label	Description
file_path_field	string		Field containing the file path to extract text from.
output_field	string		Output field name for the extracted text (default: "text").

UserFilterNode

Filter transform — keeps only records matching an expression.

Field	Type	Label	Description
expression	string		Filter expression (e.g., ".status == 'active'").

UserPipelineGraph

A user-facing pipeline graph consisting of nodes and edges.

Field	Type	Label	Description
nodes	UserPipelineNode	repeated	Nodes in the pipeline.
edges	PipelineEdge	repeated	Edges connecting nodes (directed, from source to sink).

UserPipelineNode

A node in a user-facing pipeline graph.

Field	Type	Description
id	string	Unique node identifier within the graph.
kafka_source	KafkaSourceNode	Sources
postgres_source	PostgresSourceNode
filter	UserFilterNode	Transforms
geo_ip	GeoIpNode
text_extractor	TextExtractorNode
embedding_generator	EmbeddingGeneratorNode
open_ai_transformer	OpenAiTransformerNode
kafka_sink	KafkaSinkNode	Sinks
inspector	InspectorNode
materialized_view	MaterializedViewNode
db_sink	DbSinkNode
weaviate_sink	WeaviateSinkNode
elasticsearch_sink	ElasticsearchSinkNode

WeaviateSinkConfig

Configuration for a Weaviate vector database sink connector.

Field	Type	Description
node_id	string	ID of the node this sink config belongs to.
connection_id	string	Registered Weaviate connection ID.
intermediate_topic	string	Intermediate Kafka topic for the connector.
collection_name	string	Target Weaviate collection name.
document_id_strategy	string	Document ID strategy: "NoIdStrategy", "KafkaIdStrategy", or "FieldIdStrategy".
document_id_field	string	Field to use as document ID (when using FieldIdStrategy).
vector_strategy	string	Vector strategy: "NoVectorStrategy" or "FieldVectorStrategy".
vector_field	string	Field containing the vector.
timestamp_field	string	Optional: field name for timestamp conversion (empty = no transform).

WeaviateSinkNode

Weaviate sink — writes records to a Weaviate vector database collection.

Field	Type	Description
connection_id	string	Registered Weaviate connection ID.
collection_name	string	Target Weaviate collection name.
document_id_strategy	string	Document ID strategy: "NoIdStrategy", "KafkaIdStrategy", or "FieldIdStrategy".
document_id_field	string	Field to use as document ID (when using FieldIdStrategy).
vector_strategy	string	Vector strategy: "NoVectorStrategy" or "FieldVectorStrategy".
vector_field	string	Field containing the vector.
timestamp_field	string	Optional: field name for timestamp conversion.

ApplyPipelineRequest

Field	Type	Label	Description
metadata	PipelineMetadata		Pipeline metadata (name, version, description).
graph	UserPipelineGraph		The pipeline graph to deploy.

ApplyPipelineResponse

Field	Type	Description
success	bool	Whether the apply operation succeeded.
job_id	string	ID of the created or updated job.
error	string	Error message if the apply failed.
state	PipelineState	Whether the pipeline was created, updated, or unchanged.

DeletePipelineRequest

Field	Type	Label	Description
name	string		Name of the pipeline to delete.

DeletePipelineResponse

Field	Type	Label	Description
success	bool		Whether the delete operation succeeded.
error	string		Error message if the delete failed.

ListPipelinesRequest

ListPipelinesResponse

Field	Type	Label	Description
pipelines	PipelineInfo	repeated	All registered pipelines.

PipelineInfo

Information about a registered pipeline and its running job.

Field	Type	Description
name	string	Pipeline name.
version	string	Pipeline version.
description	string	Pipeline description.
job_id	string	ID of the underlying job.
job_state	JobState	Current state of the underlying job.
applied_at	int64	Unix timestamp (milliseconds) when this pipeline was last applied.
graph	PipelineGraph	Internal pipeline graph (compiler representation).
user_graph	UserPipelineGraph	User-facing pipeline graph as originally submitted.

PipelineMetadata

Metadata describing a pipeline definition.

Field	Type	Description
name	string	Unique name identifying this pipeline.
version	string	Version string for tracking changes (e.g., "v1", "2024-01-15").
description	string	Human-readable description of what this pipeline does.

PipelinePlan

A pipeline definition included in a plan request.

Field	Type	Label	Description
metadata	PipelineMetadata		Pipeline metadata (name, version, description).
graph	UserPipelineGraph		The pipeline graph to plan.

PipelinePlanResult

Result for a single pipeline in a plan response.

Field	Type	Description
name	string	Pipeline name.
action	PipelineAction	Action that would be taken.
current_version	string	Current version on the server (empty if pipeline is new).
new_version	string	New version from the plan (empty if pipeline would be deleted).

PlanPipelinesRequest

Field	Type	Label	Description
pipelines	PipelinePlan	repeated	Pipeline definitions to plan against current state.

PlanPipelinesResponse

Field	Type	Label	Description
results	PipelinePlanResult	repeated	Plan results for each pipeline.
errors	string	repeated	Errors encountered during planning.

ValidatePipelineRequest

Field	Type	Label	Description
metadata	PipelineMetadata		Pipeline metadata (name, version, description).
graph	UserPipelineGraph		The pipeline graph to validate.

ValidatePipelineResponse

Field	Type	Label	Description
valid	bool		Whether the pipeline is valid.
errors	string	repeated	Validation errors (pipeline cannot be applied).
warnings	string	repeated	Validation warnings (pipeline can be applied but may have issues).

GetAllValuesRequest

Field	Type	Description
store_name	string	Name of the state store to query
limit	int32	Maximum number of entries to return (default: 100)
from_key	string	Reserved for future pagination support - not currently implemented

GetValueRequest

Field	Type	Label	Description
store_name	string		Name of the state store to query
key	string		The key to look up. Note: This is used directly as a string key. Complex key types (structs, etc.) are not supported. For stores with complex keys, use GetAllValues and filter client-side.

GetValueResponse

Field	Type	Label	Description
found	bool		True if the key was found in the store
value	string		String representation of the value, empty if not found

KeyValuePair

Field	Type	Label	Description
key	string		JSON-serialized representation of the key. Format depends on the key's schema type: - String: ""value"" (JSON string) - Int/Long: "123" (JSON number) - Struct: "{"field": "value"}" (JSON object)
value	string		String representation of the value (for count operations, this is the count as a string)

ListStoresRequest

ListStoresResponse

Field	Type	Label	Description
stores	StoreInfo	repeated

StoreInfo

Field	Type	Description
name	string	The name of the state store (e.g., "job-id-count-store-0")
job_id	string	The ID of the job that owns this store
approximate_count	int64	Approximate number of entries in the store (from RocksDB metadata)

Enums

ConnectionState

Health state of a monitored connection.

Name	Number	Description
CONNECTION_STATE_UNSPECIFIED	0
CONNECTED	1
DISCONNECTED	2
ERROR	3
CONNECTING	4

DatabaseType

Supported database types for connections.

Name	Number	Description
DATABASE_TYPE_UNSPECIFIED	0
POSTGRES	1
MYSQL	2

Encoding

Data encoding format for Kafka topics.

Name	Number	Description
STRING	0
NUMBER	1
JSON	2
AVRO	3
PROTOBUF	4

JobState

Lifecycle state of a streaming job.

Name	Number	Description
JOB_STATE_UNSPECIFIED	0
STARTING	1
RUNNING	2
STOPPING	3
STOPPED	4
FAILED	5
UNKNOWN	6

PipelineAction

Action that would be taken for a pipeline in a plan.

Name	Number	Description
PIPELINE_ACTION_UNSPECIFIED	0
CREATE	1	Pipeline does not exist and would be created.
UPDATE	2	Pipeline exists and would be updated.
NO_CHANGE	3	Pipeline exists and has not changed.
DELETE	4	Pipeline exists on server but is not in the plan, so it would be deleted.

PipelineState

Result state after applying a pipeline.

Name	Number	Description
PIPELINE_STATE_UNSPECIFIED	0
CREATED	1	Pipeline was newly created.
UPDATED	2	Pipeline was updated with new configuration.
UNCHANGED	3	Pipeline configuration has not changed.

Scalar Value Types

.proto Type	Notes	C++	Java	Python	Go	C#	PHP	Ruby
double		double	double	float	float64	double	float	Float
float		float	float	float	float32	float	float	Float
int32	Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint32 instead.	int32	int	int	int32	int	integer	Bignum or Fixnum (as required)
int64	Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint64 instead.	int64	long	int/long	int64	long	integer/string	Bignum
uint32	Uses variable-length encoding.	uint32	int	int/long	uint32	uint	integer	Bignum or Fixnum (as required)
uint64	Uses variable-length encoding.	uint64	long	int/long	uint64	ulong	integer/string	Bignum or Fixnum (as required)
sint32	Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s.	int32	int	int	int32	int	integer	Bignum or Fixnum (as required)
sint64	Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s.	int64	long	int/long	int64	long	integer/string	Bignum
fixed32	Always four bytes. More efficient than uint32 if values are often greater than 2^28.	uint32	int	int	uint32	uint	integer	Bignum or Fixnum (as required)
fixed64	Always eight bytes. More efficient than uint64 if values are often greater than 2^56.	uint64	long	int/long	uint64	ulong	integer/string	Bignum
sfixed32	Always four bytes.	int32	int	int	int32	int	integer	Bignum or Fixnum (as required)
sfixed64	Always eight bytes.	int64	long	int/long	int64	long	integer/string	Bignum
bool		bool	boolean	boolean	bool	bool	boolean	TrueClass/FalseClass
string	A string must always contain UTF-8 encoded or 7-bit ASCII text.	string	String	str/unicode	string	string	string	String (UTF-8)
bytes	May contain any arbitrary sequence of bytes.	string	ByteString	str	[]byte	ByteString	string	String (ASCII-8BIT)

Overview​

Services​

ConnectionService​

Methods​

FileSystemService​

Methods​

InteractiveSessionService​

Methods​

JobService​

Methods​

PipelineService​

Methods​

StateQueryService​

Methods​

Messages​

ConnectionStatus​

CreateElasticsearchSinkConnectorRequest​

CreateElasticsearchSinkConnectorResponse​

CreateJdbcSinkConnectorRequest​

CreateJdbcSinkConnectorResponse​

CreateWeaviateSinkConnectorRequest​

CreateWeaviateSinkConnectorResponse​

DatabaseConnectionConfig​

DatabaseConnectionConfigPublic​

ElasticsearchConnectionConfig​

ElasticsearchConnectionConfigPublic​

ElasticsearchConnectionStatus​

GetConnectionStatusesRequest​

GetConnectionStatusesResponse​

GetElasticsearchConnectionStatusesRequest​

GetElasticsearchConnectionStatusesResponse​

GetWeaviateConnectionStatusesRequest​

GetWeaviateConnectionStatusesResponse​

RegisterConnectionRequest​

RegisterConnectionResponse​

RegisterElasticsearchConnectionRequest​

RegisterElasticsearchConnectionResponse​

RegisterWeaviateConnectionRequest​

RegisterWeaviateConnectionResponse​

TestConnectionRequest​

TestConnectionResponse​

UnregisterConnectionRequest​

UnregisterConnectionResponse​

WeaviateConnectionConfig​

WeaviateConnectionConfigPublic​

WeaviateConnectionStatus​

FileInfo​

GetSchemaRequest​

GetSchemaResponse​

LsRequest​

LsResponse​

MountRequest​

MountResponse​

UnmountRequest​

UnmountResponse​

CompleteProgramRequest​

CompleteProgramResponse​

GetProgramOutputRequest​

GetProgramOutputResponse​

RunProgramRequest​

RunProgramResponse​

RunProgramResponse.EnvEntry​

StartSessionRequest​

StartSessionResponse​

StopSessionRequest​

StopSessionResponse​

CreateJobFromGraphRequest​

CreateJobRequest​

CreateJobResponse​

CreatePreviewJobRequest​

CreatePreviewJobResponse​

DbSinkConfig​

DbSinkNode​

ElasticsearchSinkConfig​

ElasticsearchSinkNode​

EmbeddingGeneratorNode​

GeoIpNode​

InferGraphSchemasRequest​

InferGraphSchemasResponse​

InferGraphSchemasResponse.SchemasEntry​

Overview

Services

ConnectionService

Methods

FileSystemService

Methods

InteractiveSessionService

Methods

JobService

Methods

PipelineService

Methods

StateQueryService

Methods

Messages

ConnectionStatus

CreateElasticsearchSinkConnectorRequest

CreateElasticsearchSinkConnectorResponse

CreateJdbcSinkConnectorRequest

CreateJdbcSinkConnectorResponse

CreateWeaviateSinkConnectorRequest

CreateWeaviateSinkConnectorResponse

DatabaseConnectionConfig

DatabaseConnectionConfigPublic

ElasticsearchConnectionConfig

ElasticsearchConnectionConfigPublic

ElasticsearchConnectionStatus

GetConnectionStatusesRequest

GetConnectionStatusesResponse

GetElasticsearchConnectionStatusesRequest

GetElasticsearchConnectionStatusesResponse

GetWeaviateConnectionStatusesRequest

GetWeaviateConnectionStatusesResponse

RegisterConnectionRequest

RegisterConnectionResponse

RegisterElasticsearchConnectionRequest

RegisterElasticsearchConnectionResponse

RegisterWeaviateConnectionRequest

RegisterWeaviateConnectionResponse

TestConnectionRequest

TestConnectionResponse

UnregisterConnectionRequest

UnregisterConnectionResponse

WeaviateConnectionConfig

WeaviateConnectionConfigPublic

WeaviateConnectionStatus

FileInfo

GetSchemaRequest

GetSchemaResponse

LsRequest

LsResponse

MountRequest

MountResponse

UnmountRequest

UnmountResponse

CompleteProgramRequest

CompleteProgramResponse

GetProgramOutputRequest

GetProgramOutputResponse

RunProgramRequest

RunProgramResponse

RunProgramResponse.EnvEntry

StartSessionRequest

StartSessionResponse

StopSessionRequest

StopSessionResponse

CreateJobFromGraphRequest

CreateJobRequest

CreateJobResponse

CreatePreviewJobRequest

CreatePreviewJobResponse

DbSinkConfig

DbSinkNode

ElasticsearchSinkConfig

ElasticsearchSinkNode

EmbeddingGeneratorNode

GeoIpNode

InferGraphSchemasRequest

InferGraphSchemasResponse

InferGraphSchemasResponse.SchemasEntry