Suture

import "github.com/thejerf/suture/v4"

Suture provides Erlang-ish supervisor trees for Go. "Supervisor trees" -> "sutree" -> "suture" -> holds your code together when it's trying to die.

Gopher graphic of a surgeon holding suture tools

Because AI means no project needs to go without a silly logo. Gopher derived from Renee French's CC work.

If you are reading this on pkg.go.dev, you should visit the v4 docs.

Suture is intended to deal gracefully with the real failure cases that can occur with supervision trees (such as burning all your CPU time endlessly restarting dead services), while making no unnecessary demands on the "service" code and providing hooks for adequate logging in a production environment.

A blog post describing the design decisions is available.

This module is fairly fully covered with godoc including an example, usage, and everything else you might expect from a README.md on GitHub. (DRY.)

Documentation for v3 and earlier (which predate Go module support) is also available.

A default slog-based logger is provided in github.com/thejerf/sutureslog. This is a separate Go module in order to avoid "infecting" the main suture/v4 with a new requirement to be on at least Go 1.21. Using this will require an additional go get github.com/thejerf/sutureslog.

Special Thanks

Special thanks to the Syncthing team, who have been fantastic about working with me to push fixes upstream of them.

synctest in Go 1.25

suture's supervisors are safe to use in synctest bubbles, but it turns out synctest doubles as a great way to discover that your services don't shut down correctly when your supervisor does. You'll need to create a context that will be cancelled within the synctest Test function (the context for the test itself gets cancelled too late), and use that to start the supervisor(s).

If you get panic: deadlock: main bubble goroutine has exited but blocked goroutines remain, you'll see a few suture-related goroutines in the resulting panic block. One will look something like:

github.com/thejerf/suture/v4.(*Supervisor).runService.func1()
        /home/jbowers/go/pkg/mod/github.com/thejerf/suture/v4@v4.0.6/supervisor.go:541 +0x2e
github.com/thejerf/suture/v4.(*Supervisor).stopSupervisor.func1(0x0)
        /home/jbowers/go/pkg/mod/github.com/thejerf/suture/v4@v4.0.6/supervisor.go:618 +0x27

at the top. This is a goroutine that suture is using to try to shut your errant service down. It is blocked because your service is not shutting down.

You should see another that has a top like this:

github.com/thejerf/suture/v4.(*Supervisor).stopSupervisor(0xc00016c8c0)
        /home/jbowers/go/pkg/mod/github.com/thejerf/suture/v4@v4.0.6/supervisor.go:627 +0x373
github.com/thejerf/suture/v4.(*Supervisor).Serve(0xc00016c8c0, {0xa4d4f8?, 0xc0001505a0?})
        /home/jbowers/go/pkg/mod/github.com/thejerf/suture/v4@v4.0.6/supervisor.go:356 +0xd3b

This is the core supervisor's goroutine. Below this part of the stack trace, you'll see where you called .Serve(ctx) on the supervisor. This goroutine is also waiting for your service to shut down.

You'll see another for the test itself, pointing at the synctest.Test call.

There should be at least one more stack trace that points to one of your services and the line it is currently blocked on. That line is where the service got "stuck", and where you need to examine or use the context value passed in to the service so that it shuts down when the context is cancelled. Once you do that for every blocked service, synctest should shut your supervision tree down normally and you should no longer get synctest errors.

Major Versions

v4 is a rewrite to make Suture function with contexts. If you are using suture for the first time, I recommend it. It also changes how logging works: it takes a single function from the user, which is called with a defined set of structs, rather than requiring a number of closures from the consumer.

suture v3 is the latest version that does not feature contexts. It is still supported and getting backported fixes as of now.

Code Signing

Starting with the commit after ac7cf8591b, I will be signing this repository with the "jerf" keybase account. If you are viewing this repository through GitHub, you should see the commits as showing as "verified" in the commit view.

November 2024: My GPG key, as expected, expired. I have added a new subkey with a later expiration date, but GitHub now views all the previous commits as unsigned. Again, the nature of commit signing is that each signature is technically a signature on the entire repo, so the commit that adds this update is also a handover signature signing the repo.

Aspiration

One of the big wins the Erlang community has with their pervasive OTP support is that it makes it easy for them to distribute libraries that fit cleanly into the OTP paradigm. It ought to someday be considered a good idea to distribute libraries that provide some sort of supervisor tree functionality out of the box. It is possible to provide this functionality without explicitly depending on the Suture library.

A nice thing about the v4 interface that suture offers is that Serve(ctx context.Context) isn't even a bad interface for a library to offer; it creates no dependency on suture, it just integrates with it cleanly. But anyone can just go yourthing.Serve(ctx) perfectly reasonably themselves too.

Changelog

suture uses semantic versioning and Go modules.
