sched

Workflows that don’t lose state.

A workflow engine in Go that runs your functions reliably across restarts, retries, and worker failure. Postgres-backed history, Redis-stream dispatch, replay on yield, standby HA. Operate it yourself.

Apache 2.0 · Go 1.25+ · Postgres + Redis

    Three ideas the engine takes seriously.

    Durable state

    Every state transition persists to Postgres before the RPC returns. Workflows are append-only event logs. A restart loses nothing.

    Replay on yield

    Workflow functions are deterministic over their event history. If a worker dies, the engine re-dispatches the workflow against the same history, and commands that were already recorded become no-ops on the second run.
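The replay idea can be illustrated with a toy model. This is a minimal sketch and not the sched SDK: `replayCtx`, `Event`, and this `QueueActivity` signature are hypothetical names invented for the example. It assumes each durable call is matched against history in order, returns the recorded result when one exists, and executes for real only when history runs out.

```go
package main

import "fmt"

// Event is one recorded command result in a workflow's history.
// Hypothetical shape, for illustration only.
type Event struct {
	Kind   string // e.g. "activity"
	Name   string
	Result string
}

// replayCtx is a toy stand-in for a workflow context. It is not the sched SDK.
type replayCtx struct {
	history  []Event // events recorded by earlier runs
	cursor   int     // next history index to match
	executed int     // commands actually executed during this run
}

// QueueActivity returns the recorded result when history already covers this
// step. Otherwise it runs the activity and appends a new event.
func (c *replayCtx) QueueActivity(name string, run func() string) (string, error) {
	if c.cursor < len(c.history) {
		ev := c.history[c.cursor]
		if ev.Kind != "activity" || ev.Name != name {
			return "", fmt.Errorf("non-deterministic workflow: history has %s/%s, code asked for activity/%s",
				ev.Kind, ev.Name, name)
		}
		c.cursor++
		return ev.Result, nil // replayed: no side effect
	}
	res := run()
	c.history = append(c.history, Event{Kind: "activity", Name: name, Result: res})
	c.cursor++
	c.executed++
	return res, nil
}

// workflow calls two activities in a fixed order and counts real executions.
func workflow(c *replayCtx, sideEffects *int) error {
	for _, name := range []string{"Charge", "SendEmail"} {
		_, err := c.QueueActivity(name, func() string {
			*sideEffects++
			return name + ":ok"
		})
		if err != nil {
			return err
		}
	}
	return nil
}

func main() {
	sideEffects := 0

	first := &replayCtx{}
	if err := workflow(first, &sideEffects); err != nil {
		panic(err)
	}

	// Simulate a worker crash: a fresh context starts from the same history.
	second := &replayCtx{history: first.history}
	if err := workflow(second, &sideEffects); err != nil {
		panic(err)
	}

	fmt.Println(first.executed, second.executed, sideEffects)
}
```

The second run walks the same code path but executes nothing, which is why the workflow function has to issue its durable calls in the same order every time.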

    Standby HA

    Engine replicas compete for a Postgres advisory lock. The replica that holds it is the leader and serves traffic; the standbys idle. Lose the leader and one standby promotes within a single retry interval.

    Write workflows as plain Go.

    A workflow is a function. Inside it, the SDK exposes a few durable primitives: schedule an activity, sleep, wait for a signal. Each one is recorded in history; the engine replays the function against that history on re-dispatch.

    No DSL, no YAML, no state machine generator. Just Go with an honest set of constraints.

    client.RegisterWorkflow("MonthlyReport", func(ctx sdk.WorkflowContext, input any) (any, error) {
        for i := range 3 {
            ctx.QueueActivity("SendEmail", fmt.Sprintf("user%d@example.com", i))
            ctx.Sleep(2 * time.Second)
        }
        return "report complete", nil
    })

    Architecture

    Three processes, two data stores, one binary.

    Workers run your code and poll the engine over bidi gRPC streams. The engine owns the durable side: it writes history to Postgres, queues tasks on Redis Streams, and emits metrics and traces. No sidecars, no coordinator, no leader election service beyond a Postgres advisory lock.

    [Diagram: three workers (Go SDK) poll the engine over streaming gRPC. The engine persists state and history to Postgres, dispatches tasks on Redis Streams, and exports Prometheus metrics and OTel traces.]

    What the engine does for you.

    Durable workflow execution

    Workflows survive engine restart, worker crash, and network blips. Every transition is in the event log.

    Retries with exponential backoff

    Failing activities are retried per the registered RetryPolicy. Backoff lives in a real timer, not a sleep loop.

    Signals and waits

    WaitForSignal yields the workflow function. When the signal arrives the engine re-dispatches; replay returns the recorded value.

    Bidi-streamed dispatch

    Workers open a stream once and ack credits per task. Sub-millisecond delivery latency. Graceful shutdown is instant.

    Activity heartbeats

    Long-running activities heartbeat to extend visibility. Cancellation propagates via the same channel.

    Full observability

    Structured slog logs, Prometheus metrics on /metrics, OpenTelemetry tracing across engine/worker/dashboard.

    Three steps to a running engine.

    01

    Clone and bring up the stack.

    git clone https://github.com/edaywalid/sched.git
    cd sched
    make up

    02

    Build the dashboard bundle.

    make web-build
    docker compose build dashboard
    docker compose up -d dashboard

    03

    Open the dashboard and start a workflow.

    # Visit http://localhost:8080
    # Or queue one via the API:
    curl -X POST http://localhost:8080/api/workflows \
      -H "Content-Type: application/json" \
      -d '{"workflowName":"HelloWorld","input":"world"}'
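The same request can be issued from Go. This sketch reuses only the URL, header, and JSON body from the curl example above; the `startRequest` type and `newStartRequest` helper are hypothetical names, and the response format is not assumed, so only the HTTP status is printed.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// startRequest is the JSON body used in the curl example.
type startRequest struct {
	WorkflowName string `json:"workflowName"`
	Input        any    `json:"input"`
}

// newStartRequest builds the POST request without sending it.
func newStartRequest(baseURL, name string, input any) (*http.Request, error) {
	body, err := json.Marshal(startRequest{WorkflowName: name, Input: input})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/workflows", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newStartRequest("http://localhost:8080", "HelloWorld", "world")
	if err != nil {
		panic(err)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		// Expected when the stack from step 01 is not running.
		fmt.Println("engine not reachable:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```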

    Frequently asked

    Honest answers.