# Singletons and cron jobs

An [entity](https://effect.plants.sh/cluster/entities/) gives you **one owner per id** — every message for
`"counter-123"` lands on a single live instance. A **singleton** gives you **one owner
across the whole cluster**: a single named effect that runs on exactly one runner at a
time, no matter how many runners are up.

The mechanism is the same sharding you already use for entities. A singleton's `name`
(plus an optional shard group) hashes to a shard; whichever runner currently owns that
shard runs the effect, and everyone else stays idle. When the owning runner leaves, the
shard moves and another runner resumes the singleton. You get a fleet-wide background
worker that **fails over automatically** without writing any leader-election code.

Reach for a singleton when you want exactly one of something for the entire deployment:
a poller, a queue consumer, a scheduler, a metrics aggregator, a maintenance loop.

## A cluster singleton

`Singleton.make(name, run, options)` returns a `Layer` that registers `run` with
sharding under `name`. You merge it into your cluster layer next to your entity layers.

```ts
import { NodeClusterSocket, NodeRuntime } from "@effect/platform-node"
import { Effect, Layer, Schedule } from "effect"
import { Singleton } from "effect/unstable/cluster"
import type { SqlClient } from "effect/unstable/sql"

// A background poller that should run on exactly one runner in the cluster.
// `run` is a long-lived effect: it loops forever until interrupted.
const PollInbox = Singleton.make(
  "poll-inbox",
  Effect.gen(function*() {
    yield* Effect.log("starting inbox poller")

    const pollOnce = Effect.gen(function*() {
      // ... fetch new items, process them ...
      yield* Effect.log("polled inbox")
    }).pipe(
      // Handle expected failures HERE. Failures that escape `run` are surfaced
      // as defects by sharding, which tears the singleton down.
      Effect.catch((error) => Effect.logWarning("poll failed, will retry", error))
    )

    // A simple polling loop. It is interruptible at every `sleep`, so when
    // ownership moves the fiber stops cleanly between iterations.
    yield* pollOnce.pipe(Effect.repeat(Schedule.spaced("10 seconds")))
  })
)

// Wiring: a singleton layer needs `Sharding`, which the cluster transport
// provides. Merge it alongside your entity layers, then provide the cluster.
declare const SqlClientLayer: Layer.Layer<SqlClient.SqlClient>

const ClusterLayer = NodeClusterSocket.layer().pipe(
  Layer.provide(SqlClientLayer)
)

const AppLayer = Layer.mergeAll(
  PollInbox
  // ...CounterLayer, OtherEntityLayer, etc.
).pipe(Layer.provide(ClusterLayer))

// `Layer.launch` keeps the runner alive so it can hold shards and run work.
Layer.launch(AppLayer).pipe(NodeRuntime.runMain)
```

### How ownership works

1. **Placement.** The singleton's `name` and shard group hash to a `ShardId`, exactly
   like an entity id does. The singleton is "located" on that shard. See
   [Sharding & runners](https://effect.plants.sh/cluster/sharding-and-runners/) for how shards map to runners.

2. **Activation.** When the local runner acquires the shard, sharding forks `run`.
   While `run` is alive, no other runner runs this singleton.

3. **Failover.** When the owning runner leaves (crash, deploy, rebalance), its shards
   move to surviving runners. The new owner starts a fresh copy of `run`; the old
   fiber is interrupted.

4. **Teardown.** Closing the layer scope removes the registration and interrupts the
   fiber if it is running.
**Design `run` to be interruptible and idempotent:** Ownership can move **while work is in flight**. Two consequences:

- **Make `run` interruptible.** Long blocking sections should yield (`Effect.sleep`,
  loop boundaries) so the fiber can be interrupted promptly when the shard moves.
- **Make the work idempotent.** A handoff can replay the same step on the new owner.
  Anything with side effects (writing a row, sending a message) should tolerate being
  run more than once.
**Failures become defects:** A failure that escapes `run` is converted to a **defect** by sharding — it does not
quietly restart. If the singleton should keep running, **handle expected errors inside
`run`** with [`Effect.catch`](https://effect.plants.sh/error-management/) and retry, as the example does. An
effect that completes *successfully* is kept alive until ownership moves or the scope
closes; it is not re-run on its own.
**Names must be unique per shard group:** Registering the **same name in the same shard group twice dies during registration**.
Give each singleton a distinct name (the same name in two *different* shard groups is
fine — they are different addresses).

## Reference

### Singleton.make(name, run, options)

The only public export of the `Singleton` module. Registers `run` as a cluster-wide
singleton under `name` and returns a `Layer`.

- `name: string` — the singleton's identity; participates in shard placement and must
  be unique within its shard group.
- `run: Effect<void, E, R>` — the effect to run on the owning runner. Typically a loop
  that runs until interrupted.
- `options.shardGroup?: string` — pin the singleton to a specific shard group so its
  ownership is coordinated with that group's work (defaults to the standard group).

The returned layer's type is `Layer<never, never, Sharding | Exclude<R, Scope>>`: it
requires `Sharding` (supplied by the cluster transport) plus whatever services `run`
needs, **except** `Scope` (the registration provides its own).

```ts
import { Effect, Layer } from "effect"
import { Singleton } from "effect/unstable/cluster"

const ReconcileBalances = Singleton.make(
  "reconcile-balances",
  Effect.log("reconciling..."),
  { shardGroup: "billing" }
)
// ReconcileBalances: Layer<never, never, Sharding>
// => one runner in the "billing" shard group runs the effect

// Merge several singletons together like any other layers.
const Workers = Layer.mergeAll(ReconcileBalances /*, ...other singletons */)
```

`Singleton.make` is a thin wrapper over the lower-level
`Sharding.registerSingleton(name, run, options)`, which does the same thing as an
effect inside an existing scope rather than as a layer. Use the layer form unless you
are already composing inside a `Sharding`-aware effect; see
[Sharding & runners](https://effect.plants.sh/cluster/sharding-and-runners/).

### SingletonAddress

The runtime address assigned to a registered singleton. It is a `Schema.Class` pairing
the singleton `name` with the `ShardId` chosen from that name and shard group. You
encounter it in sharding registration events, metrics, and singleton-ownership
diagnostics — you rarely construct it yourself.

```ts
import { ShardId, SingletonAddress } from "effect/unstable/cluster"

const address = new SingletonAddress.SingletonAddress({
  name: "poll-inbox",
  shardId: ShardId.make("default", 7)
})

address.name // => "poll-inbox"
address.shardId // => ShardId for group "default", id 7
// Equality and hashing are by (name, shardId), so the same singleton in a
// different shard group is a different address.
```
**Note:** A `SingletonAddress` identifies the **shard responsible** for a singleton, not the
runner currently executing it. Which runner owns that shard can change at any time.

## Scheduled jobs across the cluster (ClusterCron)

`ClusterCron` runs a [cron schedule](https://effect.plants.sh/scheduling/cron/) as a clustered job: the effect
fires on its schedule, on **exactly one runner**, with failover.

It is built from two cluster pieces you have already seen:

1. A [**singleton**](#a-cluster-singleton) performs the *initial scheduling step* —
   computing the first run time when the cluster starts.
2. Each run is delivered as a **persisted entity message** at its scheduled time. After
   a run exits, the handler schedules the *next* occurrence the same way.

Because the schedule lives in persisted messages rather than in memory, the job
survives restarts and migrates with the shard. That durability is why `ClusterCron`
needs a real [message storage](https://effect.plants.sh/cluster/message-storage/) backend (e.g. the SQL-backed
layer); the no-op in-memory store will not retain scheduled runs across a crash.

```ts
import { NodeClusterSocket, NodeRuntime } from "@effect/platform-node"
import { Cron, Effect, Layer } from "effect"
import { ClusterCron } from "effect/unstable/cluster"
import type { SqlClient } from "effect/unstable/sql"

// 1. Build the schedule. Either parse a cron expression...
const dailyAt4am = Cron.parseUnsafe("0 0 4 * * *") // sec min hour day month weekday
// ...or construct one from explicit fields with Cron.make.

// 2. The work to run on each tick. Keep it idempotent (see gotchas).
const cleanup = Effect.gen(function*() {
  yield* Effect.log("running daily cleanup")
  // ... delete expired rows, rotate files, etc ...
})

// 3. Register the clustered cron job as a layer.
const DailyCleanup = ClusterCron.make({
  name: "daily-cleanup",
  cron: dailyAt4am,
  execute: cleanup,
  // skip a scheduled run if it is being delivered more than 2 hours late
  skipIfOlderThan: "2 hours"
})

// 4. Wire it into the cluster. ClusterCron needs persisted message storage,
//    which the SQL-backed NodeClusterSocket transport provides.
declare const SqlClientLayer: Layer.Layer<SqlClient.SqlClient>

const ClusterLayer = NodeClusterSocket.layer().pipe(
  Layer.provide(SqlClientLayer)
)

const AppLayer = Layer.mergeAll(
  DailyCleanup
  // ...entity and singleton layers...
).pipe(Layer.provide(ClusterLayer))

Layer.launch(AppLayer).pipe(NodeRuntime.runMain)
```
**Jobs must be idempotent:** A clustered cron run is a persisted message that can be **retried** and can be
**replayed after failover**. Treat `execute` as something that may run more than once
for the same scheduled time, and make its side effects safe to repeat.

### ClusterCron.make(options)

The only public export of the `ClusterCron` module. Returns a
`Layer<never, never, Sharding | Exclude<R, Scope>>` — like a singleton, it requires
`Sharding` plus the services `execute` needs (minus `Scope`).

| Option | Type | Default | Meaning |
| --- | --- | --- | --- |
| `name` | `string` | — | Unique job name; backs the internal singleton and entity (`ClusterCron/<name>`). |
| `cron` | `Cron.Cron` | — | The schedule, from [`Cron.make`](https://effect.plants.sh/scheduling/cron/) or `Cron.parse`/`parseUnsafe`. |
| `execute` | `Effect<void, E, R>` | — | The work to run on each tick. |
| `shardGroup` | `string` | `"default"` | Shard group to run the job on. |
| `calculateNextRunFromPrevious` | `boolean` | `false` | How the next run time is computed (see below). |
| `skipIfOlderThan` | `Duration.Input` | `"1 day"` | Skip a scheduled run delivered later than this. |

```ts
import { Cron, Effect } from "effect"
import { ClusterCron } from "effect/unstable/cluster"

const job = ClusterCron.make({
  name: "hourly-report",
  cron: Cron.parseUnsafe("0 0 * * * *"), // top of every hour
  execute: Effect.log("generating report")
})
// job: Layer<never, never, Sharding>
// => one runner generates the report each hour, with failover
```

**`calculateNextRunFromPrevious` — catch up vs. preserve cadence.** After a run
finishes, the next occurrence must be computed:

- `false` (default): compute the next run from **now** (when the handler exited). If a
  run was delayed, the schedule effectively *catches up from the current time* and you
  never get a burst of backlogged runs.
- `true`: compute the next run from the **previous scheduled time**, preserving the
  original cadence even when a run was late. Use this when the schedule's rhythm
  matters more than catching up (e.g. "every hour on the hour, no drift").

**`skipIfOlderThan` — drop stale runs.** A long outage can leave a scheduled message
sitting in storage. When it finally delivers, if its scheduled time is older than
`skipIfOlderThan`, `execute` is **skipped** (the job still reschedules the next run).
The default of `"1 day"` means runs more than a day late are dropped. Align this with
your job's semantics: a nightly cleanup that missed yesterday probably *should* skip
straight to tonight; a billing run might need a larger window or none of this behavior.

```ts
import { Cron, Effect } from "effect"
import { ClusterCron } from "effect/unstable/cluster"

// Preserve the exact cadence and only skip runs more than 6 hours stale.
const billing = ClusterCron.make({
  name: "nightly-billing",
  cron: Cron.parseUnsafe("0 0 2 * * *"), // 02:00 every day
  execute: Effect.log("billing run"),
  calculateNextRunFromPrevious: true,
  skipIfOlderThan: "6 hours"
})
```
**ClusterCron requires persisted storage:** Scheduled runs are **persisted entity messages**. With the no-op in-memory store they
are lost on restart and the schedule will not survive a crash. Run `ClusterCron` on a
transport backed by durable storage — see [Message storage](https://effect.plants.sh/cluster/message-storage/).

## Next steps

- [Sharding & runners](https://effect.plants.sh/cluster/sharding-and-runners/) — how names hash to shards, how
  shards are placed on runners, and what shard groups mean for singleton placement.
- [Message storage](https://effect.plants.sh/cluster/message-storage/) — the persistence layer backing
  `ClusterCron`'s scheduled runs and persisted entity messages.
- [Cron](https://effect.plants.sh/scheduling/cron/) — building `Cron.Cron` schedules with `Cron.make` and
  `Cron.parse`, which `ClusterCron` consumes.