k0rdent AI Docs
WIP

Workflows

Durable workflow orchestration for long-running operations

Workflows API

Workflow orchestration for long-running operations.

Design Principle: All infrastructure mutations flow through workflows. This provides durable execution, automatic retries, audit trails, and observability. Workflows double as the event log for the platform.

Workflow Runs

Query and manage workflow executions.

Query Parameters:

| Parameter | Type | Description |
|---|---|---|
| type | string | Filter by workflow type |
| status | string | pending, running, completed, failed, cancelled |
| resourceType | string | server, cluster, vm |
| resourceId | string | Specific resource ID |
| triggeredBy | string | User ID who triggered |
| since | ISO date | Runs started after this time |
| until | ISO date | Runs started before this time |
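
As a rough illustration of how these filters combine, the sketch below builds a query URL from the parameters in the table. It assumes the list endpoint is `/v1/workflows/runs` (inferred from the status endpoint path, not confirmed here). The function name `build_runs_query` and the base URL used with it are hypothetical.

```python
from datetime import datetime
from typing import Optional
from urllib.parse import urlencode

# Allowed values taken from the query-parameter table.
VALID_STATUSES = {"pending", "running", "completed", "failed", "cancelled"}
VALID_RESOURCE_TYPES = {"server", "cluster", "vm"}

# Assumption: the list endpoint is the status endpoint path without the run ID.
RUNS_PATH = "/v1/workflows/runs"


def build_runs_query(
    base_url: str,
    *,
    workflow_type: Optional[str] = None,
    status: Optional[str] = None,
    resource_type: Optional[str] = None,
    resource_id: Optional[str] = None,
    triggered_by: Optional[str] = None,
    since: Optional[datetime] = None,
    until: Optional[datetime] = None,
) -> str:
    """Return a URL that lists workflow runs matching the given filters."""
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    if resource_type is not None and resource_type not in VALID_RESOURCE_TYPES:
        raise ValueError(f"unknown resourceType: {resource_type!r}")

    params = {
        "type": workflow_type,
        "status": status,
        "resourceType": resource_type,
        "resourceId": resource_id,
        "triggeredBy": triggered_by,
        # since/until are sent as ISO 8601 strings.
        "since": since.isoformat() if since else None,
        "until": until.isoformat() if until else None,
    }
    # Drop filters that were not supplied, then percent-encode the rest.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = base_url.rstrip("/") + RUNS_PATH
    return f"{url}?{query}" if query else url
```

Validating `status` and `resourceType` on the client side is a design choice of this sketch; the server may apply its own rules.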

Response:

Design Decision: Retries create new run records linked to the original. This preserves audit history. fromStep allows resuming from a specific step when earlier steps succeeded.
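
The retry behaviour described above can be modelled in memory as follows. This is a minimal sketch, not the platform's data model: the `WorkflowRun` class, the `retry_of` field, and the step names used with it are hypothetical. Only `fromStep` comes from the text.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional

_ids = count(1)  # stand-in ID generator for this sketch


@dataclass
class WorkflowRun:
    type: str
    steps: List[str]                                     # ordered step names
    completed: List[str] = field(default_factory=list)   # steps that succeeded
    status: str = "pending"
    retry_of: Optional[str] = None    # hypothetical link back to the original run
    from_step: Optional[str] = None   # mirrors the fromStep option
    id: str = field(default_factory=lambda: f"run-{next(_ids)}")


def retry_run(original: WorkflowRun, from_step: Optional[str] = None) -> WorkflowRun:
    """Create a new run linked to `original`; the original is left untouched."""
    start = 0
    if from_step is not None:
        start = original.steps.index(from_step)  # ValueError if the step is unknown
        # Resuming is only valid when every earlier step already succeeded.
        missing = [s for s in original.steps[:start] if s not in original.completed]
        if missing:
            raise ValueError(f"earlier steps did not succeed: {missing}")
    return WorkflowRun(
        type=original.type,
        steps=original.steps[start:],
        retry_of=original.id,
        from_step=from_step,
    )
```

Because the retry is a separate record, the failed run keeps its own status and history, which is what preserves the audit trail.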

Workflow Types

Available workflow types and their purposes:

| Type | Trigger Source | Description |
|---|---|---|
| server.register | Atlas API | Register new bare metal server |
| server.inspect | Atlas API | Hardware inspection |
| server.provision | Atlas API | OS provisioning |
| server.lifecycle | Atlas/Arc API | Lifecycle actions (power, provision, etc.) |
| server.decommission | Atlas API | Remove from inventory |
| cluster.create | Arc API | Create Kubernetes cluster |
| cluster.scale | Arc API | Scale cluster nodes |
| cluster.upgrade | Arc API | Upgrade Kubernetes version |
| cluster.delete | Arc API | Delete cluster |
| vm.create | Arc API | Create virtual machine |
| vm.power | Arc API | VM power actions |
| vm.delete | Arc API | Delete VM |
| sync.server-state | Projector | Sync K8s state to cache |
| sync.cluster-state | Projector | Sync cluster state to cache |

Workflow Metrics

Aggregate statistics for workflow performance.
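
To make "aggregate statistics" concrete, the sketch below computes per-type totals, success rate, and mean duration from a list of run records. The record keys (`type`, `status`, `durationSeconds`) and the output field names are assumptions for illustration; the actual metrics schema is not shown here.

```python
from collections import defaultdict
from statistics import mean


def summarize_runs(runs):
    """Aggregate per-type counts, success rate, and mean duration.

    `runs` is an iterable of dicts with hypothetical keys
    `type`, `status`, and `durationSeconds`.
    """
    by_type = defaultdict(list)
    for run in runs:
        by_type[run["type"]].append(run)

    summary = {}
    for workflow_type, group in by_type.items():
        # Only runs that reached completed/failed count toward the success rate.
        finished = [r for r in group if r["status"] in ("completed", "failed")]
        completed = [r for r in finished if r["status"] == "completed"]
        durations = [r["durationSeconds"] for r in completed]
        summary[workflow_type] = {
            "total": len(group),
            "successRate": len(completed) / len(finished) if finished else None,
            "avgDurationSeconds": mean(durations) if durations else None,
        }
    return summary
```

Runs that are still pending or running are counted in `total` but excluded from the success rate, so an in-flight run does not look like a failure.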

Example:

Response:


Workflow Operations

Long-running operations that interact with infrastructure (BMC, Kubernetes) return 202 Accepted immediately with a workflow ID for tracking. All infrastructure mutations flow through durable workflows.

Design Principles

  1. Immediate Response: Return 202 in under 1 second; don't wait for completion
  2. Workflow ID: Provide workflow run ID for polling or webhook correlation
  3. Estimated Duration: Give clients a hint for progress UI
  4. Status Endpoint: Query workflow status via /v1/workflows/runs/:id
  5. Webhook Integration: Support webhooks for completion notifications
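
The first four principles can be sketched as a handler that hands the work off and replies immediately. This is an illustration under stated assumptions: the body field names (`workflowRunId`, `estimatedDurationSeconds`, `statusUrl`) and the `enqueue` callback are hypothetical. Only the 202 status and the `/v1/workflows/runs/:id` path come from the text.

```python
import uuid


def accept_operation(workflow_type, estimated_seconds, enqueue):
    """Build the immediate reply for a long-running operation.

    `enqueue(run_id, workflow_type)` is a caller-supplied hook (hypothetical)
    that hands the work to the workflow engine without waiting for it.
    Returns an (HTTP status, JSON-serializable body) pair.
    """
    run_id = str(uuid.uuid4())
    enqueue(run_id, workflow_type)  # must return quickly; no infrastructure calls here
    body = {
        "workflowRunId": run_id,                        # for polling or webhook correlation
        "workflowType": workflow_type,
        "status": "pending",
        "estimatedDurationSeconds": estimated_seconds,  # hint for progress UI
        "statusUrl": f"/v1/workflows/runs/{run_id}",    # status endpoint
    }
    return 202, body
```

Keeping all infrastructure calls out of the handler is what makes the sub-second response achievable.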

Workflow Orchestration

Use a durable workflow engine for retryable task execution:

Pattern: Compensating Actions. Use onFailure to clean up partial state: release allocated resources, update status to error, and notify via webhook.
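
A minimal in-memory sketch of retryable steps with a compensating handler follows. It is not the API of any particular workflow engine: `run_workflow`, its parameters, and the `state` keys are hypothetical, and a durable engine would also persist progress between steps so that a crashed process can resume.

```python
import time


def run_workflow(steps, on_failure, max_attempts=3, backoff_seconds=0.0):
    """Run `steps` in order, retrying each; call `on_failure` if one gives up.

    `steps` is a list of (name, callable) pairs. Each callable receives the
    shared `state` dict. `on_failure(state, exc)` plays the role of the
    onFailure hook and performs the compensating actions.
    """
    state = {"status": "running", "completed": []}
    for name, step in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                step(state)
                state["completed"].append(name)
                break  # step succeeded, move on to the next one
            except Exception as exc:
                if attempt == max_attempts:
                    # Retries exhausted: mark the run failed and compensate.
                    state["status"] = "failed"
                    state["failedStep"] = name
                    on_failure(state, exc)
                    return state
                time.sleep(backoff_seconds * attempt)  # linear backoff between attempts
    state["status"] = "completed"
    return state
```

The handler passed as `on_failure` is where allocated resources would be released, the resource status set to error, and the webhook notification sent.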

Workflow Status Endpoint
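
A client might poll the status endpoint along the following lines. This is a sketch under assumptions: `fetch_status` is a caller-supplied function (hypothetical) that performs `GET /v1/workflows/runs/:id` and returns the run's status string. The terminal statuses are taken from the status filter values listed for workflow runs.

```python
import time

# Statuses after which a run will not change again.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_run(fetch_status, run_id, interval_seconds=5.0, timeout_seconds=1800.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the run reaches a terminal status or the timeout elapses.

    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_seconds
    while True:
        status = fetch_status(run_id)
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"run {run_id} still {status!r} after {timeout_seconds}s")
        sleep(interval_seconds)
```

The estimated duration returned with the 202 response is a reasonable basis for choosing `timeout_seconds`; webhooks avoid polling altogether where they are available.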

Event Sourcing Pattern

  • The Workflows API still needs to be updated to use the event sourcing pattern.
  • Still to be decided: whether to use K8s Informers, Watchers, or a Controller-based approach.
