Alternatives to sum types in Go
Many statically typed programming languages, such as Haskell or Swift, have a feature called “sum types”. Sum types (also known as tagged unions or variant types) allow a new type to be defined as the “union” of a set of other types and values, and allow users to “pattern match” on values to find out the underlying type.
But Go, the predominant language used at Pusher, does not support sum types. As former Haskell users, we miss them so we wanted to describe some workarounds and alternative approaches, including the approach we recommend here at Pusher.
An example type definition in Haskell would be:
data Event = EventPublish PublishData | EventSubscribe SubscribeData
The Event type is the union of types PublishData and SubscribeData.
If ("public-donuts", "2 for $5") is a PublishData, then EventPublish ("public-donuts", "2 for $5") is an Event.
A user of an Event value can use a “pattern match” to find out whether the value is a publish or a subscribe, and to do different things in each case. For example:
eventToString :: Event -> String
eventToString e = case e of
  EventPublish p -> publishDataToString p
  EventSubscribe s -> subscribeDataToString s
Sum types are useful in statically typed programming languages because they allow a function to accept a fixed set of types and handle each one differently. Typical uses include handlers for the different types of event arriving on a subscription or channel, traversals of recursive data structures, and expression evaluators.
Let’s see whether we can achieve something similar in Go…
Alternative 0: interface-and-switch
A common alternative, which we can call “interface-and-switch”, is to use an interface{} type for the sum type, and a type switch for the pattern match.
Let’s see an example of this for a pub/sub message bus. The bus runs an event loop in its own goroutine, and clients interact with the event loop by writing different events onto its eventChan. Ideally the type of this eventChan would be a sum type over all types of events it can handle, but here we use an interface{}:
type subscribeEvent struct {
	messageChan chan<- string
}
type publishEvent struct {
	message string
}
type pubsubBus struct {
	subs      []chan<- string
	eventChan chan interface{} // Note the interface{} type for events
}
func (p *pubsubBus) Run() {
	for event := range p.eventChan {
		// This is not type safe because we might remove a handler, but
		// forget to remove the function which sends the now unhandled
		// event on the channel.
		switch e := event.(type) {
		case subscribeEvent:
			p.handleSubscribe(e)
		case publishEvent:
			p.handlePublish(e)
		default:
			panic(fmt.Sprint("Unknown event type"))
		}
	}
}
func (p *pubsubBus) handleSubscribe(subscribeEvent subscribeEvent) {
	p.subs = append(p.subs, subscribeEvent.messageChan)
}
func (p *pubsubBus) handlePublish(publishEvent publishEvent) {
	for _, sub := range p.subs {
		sub <- publishEvent.message
	}
}
A significant downside to the interface-and-switch approach is that it is not “type safe”. The handler might not handle all types that are passed in, leading to runtime errors. The consumer of the eventChan can only handle subscribeEvent and publishEvent values, but Go’s type system will allow the producer to pass in other values (e.g. 5 or true). As a result, the type switch has a default action, which is to panic at runtime.
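To make the problem concrete, here is a minimal sketch of the same type switch in isolation. The describe function is a hypothetical stand-in for the event loop, and it returns an error rather than panicking so the failure is easy to observe. The point is that the calls with 5 and true compile without complaint and only fail when they run.

```go
package main

import "fmt"

type subscribeEvent struct {
	messageChan chan<- string
}

type publishEvent struct {
	message string
}

// describe is a hypothetical stand-in for the bus's event loop. It uses the
// same type switch, but reports unknown types as an error instead of panicking.
func describe(event interface{}) (string, error) {
	switch e := event.(type) {
	case subscribeEvent:
		return "subscribe", nil
	case publishEvent:
		return "publish: " + e.message, nil
	default:
		return "", fmt.Errorf("unknown event type %T", e)
	}
}

func main() {
	events := []interface{}{
		publishEvent{message: "2 for $5"},
		5,    // compiles, although nothing handles an int
		true, // likewise for a bool
	}
	for _, ev := range events {
		s, err := describe(ev)
		fmt.Println(s, err)
	}
}
```

Nothing in the signature of describe tells the caller which values are acceptable; that knowledge lives only in the body of the switch.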
By contrast, notice that our Haskell pattern match needs no default case, because the type system verifies that no invalid values will enter the pattern match. Can we achieve similar type safety in Go?
Alternative 1: a “sum type” interface
To improve the situation, we can replace the interface{} with an interface that has a single, dummy, unexported method. This means that there will be a compile-time type error if an unexpected type is used where our interface is expected:
+type event interface {
+	isEvent()
+}
 type subscribeEvent struct {
 	messageChan chan<- string
 }
+func (subscribeEvent) isEvent() {}
 type publishEvent struct {
 	message string
 }
+func (publishEvent) isEvent() {}
 type pubsubBus struct {
 	subs []chan<- string
-	eventChan chan interface{} // Note the interface{} type for events
+	eventChan chan event // Now only types which implement `event` can be sent
 }
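Runtime errors are still possible, however. For example, during a refactor a handler (a case in the type switch) might be removed while the type that implements the interface is left in place, so values of that type still reach the switch and fall through to the default case.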
This approach is described in more detail by Jeremy Bowers.
Alternative 2: the visitor pattern
A fully type-safe way of solving this is to attach the handlers as “visit” methods to the types themselves. Then an interface is defined with a matching visit method, so we can call this in our event loop. This technique is known as the visitor pattern in OO languages.
 type event interface {
-	isEvent()
+	// Instances now implement the handler in this method
+	visit(*pubsubBus)
 }
 type subscribeEvent struct {
 	messageChan chan<- string
 }
-func (subscribeEvent) isEvent() {}
+func (sE subscribeEvent) visit(p *pubsubBus) {
+	p.handleSubscribe(sE)
+}
 type publishEvent struct {
 	message string
 }
-func (publishEvent) isEvent() {}
+func (pE publishEvent) visit(p *pubsubBus) {
+	p.handlePublish(pE)
+}
 type pubsubBus struct {
 	subs []chan<- string
-	eventChan chan event // Now only types which implement `event` can be sent
+	eventChan chan event
 }
 func (p *pubsubBus) Run() {
 	for event := range p.eventChan {
-		// This is not type safe because we might remove a handler, but
-		// forget to remove the function which sends the now unhandled
-		// event on the channel.
-		switch e := event.(type) {
-		case subscribeEvent:
-			p.handleSubscribe(e)
-		case publishEvent:
-			p.handlePublish(e)
-		default:
-			panic(fmt.Sprint("Unknown event type"))
-		}
+		// Type switch is not required, so it's type-safe
+		event.visit(p)
 	}
 }
The disadvantage here is that the handlers are now coupled to the types. Ideally we want to define types independently of any particular way in which they are handled.
Alternative 3: decoupled visitor
We can decouple the handlers by having the visit() function take a struct of handler implementations. Each visit() implementation will call the corresponding handler in the struct.
 type event interface {
-	// Instances now implement the handler in this method
-	visit(*pubsubBus)
+	visit(v eventVisitor)
 }
+// The handlers for each event type are defined in instances of this struct
+type eventVisitor struct {
+	// Notice these can have different function signatures
+	visitSubscribe func(subscribeEvent)
+	visitPublish   func(publishEvent)
+}
 type subscribeEvent struct {
 	messageChan chan<- string
 }
-func (sE subscribeEvent) visit(p *pubsubBus) {
-	p.handleSubscribe(sE)
+// This is just boilerplate now; we do not need to provide a specific implementation
+func (sE subscribeEvent) visit(v eventVisitor) {
+	v.visitSubscribe(sE)
 }
 type publishEvent struct {
 	message string
 }
-func (pE publishEvent) visit(p *pubsubBus) {
-	p.handlePublish(pE)
+func (pE publishEvent) visit(v eventVisitor) {
+	v.visitPublish(pE)
 }
 type pubsubBus struct {
 	subs []chan<- string
 	eventChan chan event
 }
 func (p *pubsubBus) Run() {
 	for event := range p.eventChan {
-		// Type switch is not required, so it's type-safe
-		event.visit(p)
+		// Handler implementations are passed in here.
+		// Alternative handler implementations could be defined by creating an
+		// alternative version of this method.
+		event.visit(eventVisitor{
+			visitSubscribe: p.handleSubscribe,
+			visitPublish:   p.handlePublish,
+		})
 	}
 }
The fact that the handlers are decoupled from the types makes it easy to change the handler implementations. It also makes it possible for the handlers to have different type signatures. For example, here is an interpreter of arithmetic operations using this visitor pattern.
The disadvantage of this is there is now more boilerplate, particularly the visit method implementations.
Conclusion
It can be frustrating not having sum types if you are used to using them in other programming languages. Fortunately, with a slight change in the way of thinking, we can find decent alternatives. Unfortunately there isn’t one technique that is objectively better than the others. It’s a tradeoff between type-safety, complexity, and verbosity, and the “best” approach will depend on your use case and personal preferences.
At least, that’s until Go 2 is released…
Thanks to Jim Fisher and James Lees for proofreading.