Supervisors (3/3)

Mar 26, 2024   Kero van Gelder
We will look at three approaches to work with a single crashing process, i.e. prevent it from taking down our entire Gleam program, namely:

Using Supervisors

A supervisor starts children, and monitors them; these children are supposed to be available all the time, the supervisor will restart children when they terminate.

The supervisor can only set the initial state of any child, but never gets updates – normally the child should not modify its state. Example: A web server receives a TCP socket to use, and accepts connections, for which it spawns other processes to deal with. If the web server ever dies, the supervisor will start a new one and pass it the same TCP socket.

Just looking at the supervisor docs, we can launch a supervisor with one child like this:

import gleam/erlang/process.{type Subject}
import gleam/otp/actor.{type StartError}
import gleam/otp/supervisor.{type Children} as sv

pub fn main() {
  let worker =
    sv.worker(fn(_: Nil) -> Result(Subject(Bool), StartError) {
      actor.start(42, loop)
    })

  let assert Ok(_) =
    fn(children: Children(Nil)) {
      children
      |> sv.add(worker)
    }
    |> sv.start

  process.sleep_forever()
}

fn loop(msg: Bool, state: Int) {
  let assert True = msg
  actor.continue(state)
}
Which sleeps forever; our actor just sits there and never crashes.

Communicate with the actor

We want to crash our actor, and thus need to send it a message. We must pass it a subject, over which the start function can pass us the actor subject. If other gleam processes need to communicate with the child, you need the same technique; contrary to the case of a web server, where the outside world can communicate via tcp server socket passed to the child.
import gleam/erlang/process.{type Subject}
import gleam/otp/actor
import gleam/otp/supervisor.{type Children} as sv

type Created {
  Created(Subject(Bool))
}

pub fn main() {
  let subject: Subject(Created) = process.new_subject()
  let worker =
    sv.worker(fn(_) {
      let assert Ok(s) = actor.start(42, loop)
      process.send(subject, Created(s))
      Ok(s)
    })

  let assert Ok(_) =
    fn(children: Children(Nil)) {
      children
      |> sv.add(worker)
    }
    |> sv.start

  let assert Ok(Created(actor)) = process.receive(subject, 1234)
  process.send(actor, False)

  process.sleep_forever()
}

fn loop(msg: Bool, state: Int) {
  let assert True = msg
  actor.continue(state)
}
When you run this program, you can see the actor crash, whereas the program sleeps forever. The actor is restarted by the supervisor (even if you do not see it).

Repeat!

Note that when the actor is restarted, the same start function is called, and the same subject is passed as the first time. You can therefore receive a new Created message, if you want, and communicate with the new actor. To know when to do that, you probably want to use process.selecting() to wrap the actor subject, and then loop with select_forever().

! However, when other processes need access to the newly created actor, passing the subject around feels unwieldy. This reinforces that a supervisor is more suited to monitor and restart processes that are independent once launched, such as in the web server example.