Flatbuffers are a Google message format in the same vein as Protocol Buffers or JSON. They were designed for game programmers in C++ who want to avoid heap allocations at all costs.

They aren't new tech, but I recently started seeing articles saying people should use them. I was skeptical, as all the example code for Go looked painful.

I recently had a very niche use case: a proxy service that needs to inspect incoming messages before passing them along. I did not want to do a Marshal/Unmarshal just to determine whether a message was a data passthrough or a control plane message.

I could have solved this in other ways, but this seemed like an ideal use case for Flatbuffers, so I thought I'd give Flatbuffers a Go (pun intended).

In short, I never want to use them again.

Flatbuffers really satisfy a niche. I'm skeptical that using them in game programming is truly valuable in anything but the most demanding titles.

Google has been able to create low latency services using protocol buffers for a decade (JSON/REST... is not a real contender for speed/space against schema'd binary encoded messages).

Protocol buffers are great in Go, EXCEPT that protoc and the Go generators are just stupidly hard to use outside Google. Getting gRPC code to generate is ridiculous, and go.mod made everything worse when I didn't think that was possible.

Flatbuffers is probably a Stadia tech. In that case, maybe they are worthwhile (large display frames being sent that you don't want to copy around).

Except for the most intense games, I've got to imagine the time savings on the client probably are not a real factor. In the server, maybe. But is there nothing else in the game loop that could save you this amount of memory/time to avoid this complexity (note: I'm not a game programmer)?

If I was doing a game server that needed something faster than Go's protocol buffer, I'd probably look at how much faster. 2x? I'd probably just re-write the proto generators to be more efficient. The current official ones are based on a legacy implementation that could lose a little convenience to get some real speed gains. That would be simpler than using Flatbuffers in the long run.

So why are Flatbuffers hard?

Flatbuffers are similar to writing disk file formats. You are always writing to the tail of an array. To find your data, you must serialize a table at the end to tell you the offset in the buffer where your data lives.

By doing this, you only have to make a single allocation when you read the buffer into memory. After that, you simply access the location of your data and convert it to the specific type. That conversion happens on the stack.
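
Here is a rough sketch of that idea using only the standard library. The layout (a 4-byte Fd followed by a 2-byte Dest) and the View type are made up for illustration and are not the real Flatbuffers wire format. The point is only that each field is decoded straight out of the byte slice at a known offset, with no intermediate struct being filled in.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Hypothetical fixed layout, NOT the real Flatbuffers wire format:
// bytes 0-3 hold Fd (little-endian uint32), bytes 4-5 hold Dest (uint16).
const (
	fdOffset   = 0
	destOffset = 4
)

// View wraps the received bytes without copying them.
type View struct{ buf []byte }

// Fd decodes the field in place every time it is called.
func (v View) Fd() uint32 { return binary.LittleEndian.Uint32(v.buf[fdOffset:]) }

// Dest decodes the field in place every time it is called.
func (v View) Dest() uint16 { return binary.LittleEndian.Uint16(v.buf[destOffset:]) }

func main() {
	wire := []byte{10, 0, 0, 0, 2, 0}
	v := View{buf: wire}
	fmt.Println(v.Fd(), v.Dest())
}
```

Note that a slice that is too short makes these accessors panic, which is the same failure mode discussed later in this post.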

With other formats, you read in the bytes, allocate a representation in memory (like a struct), then copy in each field.

With flatbuffers, to allow for a struct to include another struct, you have to create the contained struct first. This means you have to write everything backwards.

Normally, if you have struct B inside struct A, you write struct A and add struct B. With flatbuffers, you write struct B, then struct A and add struct B to A.
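
To make the ordering concrete, here is a toy sketch. The toyBuilder type and its methods are hypothetical and much simpler than the real Flatbuffers builder. It only shows why the child has to be written first: the parent stores the child's offset, and that offset does not exist until the child is in the buffer.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// toyBuilder is a hypothetical append-only buffer, not the Flatbuffers builder.
type toyBuilder struct{ buf []byte }

// appendU32 writes v at the tail and returns the offset it was written at.
func (t *toyBuilder) appendU32(v uint32) uint32 {
	off := uint32(len(t.buf))
	var tmp [4]byte
	binary.LittleEndian.PutUint32(tmp[:], v)
	t.buf = append(t.buf, tmp[:]...)
	return off
}

// writeB appends struct B, which holds a single uint32 field.
func (t *toyBuilder) writeB(field1 uint32) uint32 { return t.appendU32(field1) }

// writeA appends struct A, which stores only the offset of an existing B.
func (t *toyBuilder) writeA(bOffset uint32) uint32 { return t.appendU32(bOffset) }

// readField1 follows A -> B -> Field1 directly in the byte slice.
func readField1(buf []byte, aOffset uint32) uint32 {
	bOffset := binary.LittleEndian.Uint32(buf[aOffset:])
	return binary.LittleEndian.Uint32(buf[bOffset:])
}

func main() {
	var t toyBuilder
	b := t.writeB(42) // The child has to exist first...
	a := t.writeA(b)  // ...so the parent can record where it lives.
	fmt.Println(readField1(t.buf, a))
}
```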

Not so bad, right?

Well, you need to do that for slices of data as well, writing them in reverse order. Basically anything that uses a vector (slices) is painful.
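
As far as I understand it, the reverse loops come from the builder prepending each element in front of what is already written. The backBuffer type below is a hypothetical stand-in for that behavior, not the real builder. It shows that you have to walk your slice backwards if you want the finished vector to read in the original order.

```go
package main

import "fmt"

// backBuffer is a hypothetical buffer that fills from the end toward the
// front, loosely modelled on a builder that can only prepend.
type backBuffer struct {
	buf  []byte
	head int // index of the first used byte
}

func newBackBuffer(size int) *backBuffer {
	return &backBuffer{buf: make([]byte, size), head: size}
}

// prepend writes one byte in front of everything written so far.
func (b *backBuffer) prepend(v byte) {
	b.head--
	b.buf[b.head] = v
}

// bytes returns the used portion of the buffer.
func (b *backBuffer) bytes() []byte { return b.buf[b.head:] }

func main() {
	items := []byte{1, 2, 3}
	bb := newBackBuffer(len(items))
	// Walk the slice backwards so the finished vector reads 1, 2, 3.
	for i := len(items) - 1; i >= 0; i-- {
		bb.prepend(items[i])
	}
	fmt.Println(bb.bytes())
}
```

Looping forward instead would silently produce 3, 2, 1, which is the kind of mistake that only shows up at runtime.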

The specific ordering that must be done with vectors gets more complex the more levels of hierarchy you have. Those errors occur at runtime and can be difficult to trace once you build in any level of abstraction.

Then there are oddities like how to do vectors of enums. If vectors of other types are not straightforward, these are worse.

One more note: if you want a field that represents []byte, in Flatbuffers you need to use [ubyte] instead of [byte]. Otherwise you can't access the entire slice at once and have to read it byte by byte.

That doesn't sound too bad

After you write something four levels deep, you'll change your mind.

This means you need to write a constructor for every message type and a test for every constructor (because it's easy to cause a runtime error if you do things out of order).

To give you a sense of complexity, here is a very simple two level message in both protocol buffer and flatbuffer.

Here is a protocol buffer that holds some data:

msg := &pb.Message{
    Type: pb.MessageType_MTDataType,
    Data: &pb.Data{Dest: pb.MyDest, Fd: 10, Data: b},
}

Here is the equivalent constructor you must write in flatbuffers:

// MarshalData creates a flatbuffer []byte representing a Message that contains data.
func MarshalData(dest fb.DestType, fd uint32, data []byte) []byte {
	size := len(data) + 4
	builder := flatbuffers.NewBuilder(size)

	// This is the only simple Vector operation, everything else is
	// reverse for loops.
	content := builder.CreateByteVector(data)

	// Builds a Data message.
	fb.DataStart(builder)
	fb.DataAddDest(builder, dest)
	fb.DataAddFd(builder, fd)
	fb.DataAddContent(builder, content)
	dataMsg := fb.DataEnd(builder)

	// Builds our outer message.
	fb.MessageStart(builder)
	fb.MessageAddType(builder, fb.MessageTypeMTDataType)
	fb.MessageAddData(builder, dataMsg)
	msg := fb.MessageEnd(builder)
	builder.Finish(msg)

	return builder.FinishedBytes()
}

Extracting:

Protocol Buffer:

    msg := &pb.Message{}
    if err := proto.Unmarshal(b, msg); err != nil {
        // Do something
    }

Flatbuffer:

// ReadMessage reads a fb.Message represented by b.
// Flatbuffers in Go panic if they are not correct,
// so we do a recover and return the error.
func ReadMessage(b []byte) (msg *fb.Message, err error) {
	defer func() {
		if r := recover(); r != nil {
			msg = nil
			// Report the recovered value; err is still nil at this point.
			err = fmt.Errorf("malformed message: %v", r)
		}
	}()

	return fb.GetRootAsMessage(b, 0), nil
}

// ExtractData extracts the Data message from a Message type. If .Data is not
// set, *fb.Data will be nil.
func ExtractData(b []byte) *fb.Data {
	defer func() {
		if r := recover(); r != nil {
			// Do nothing
		}
	}()

	m, err := ReadMessage(b)
	if err != nil {
		return nil
	}

	if m.Type() != fb.MessageTypeMTDataType {
		return nil
	}

	return m.Data(nil)
}

func main() {
    ...
    data := ExtractData(b)
    if data == nil {
        // Do something
    }
}

I'm not sure you can trust any of the fields you are using here either. I think at this point we are only sure that the flatbuffers data is within the bounds of the slice, but not that any data inside is what is expected.

Accessing a field tries to interpret a range of bytes in the []byte as the specified type.

Each function that accesses a field may still need the following to prevent a panic from escaping:

defer func() {
    if r := recover(); r != nil {
        // Do something
    }
}()
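
If you end up repeating that defer everywhere, one option is to fold it into a single helper. This is only a sketch: safely is a name I made up, it needs Go 1.18+ for generics, and the out-of-range index in main stands in for a bad offset inside a message.

```go
package main

import "fmt"

// safely is a hypothetical helper: it runs get and turns any panic into an error.
func safely[T any](get func() T) (v T, err error) {
	defer func() {
		if r := recover(); r != nil {
			var zero T
			v = zero
			err = fmt.Errorf("malformed message: %v", r)
		}
	}()
	return get(), nil
}

func main() {
	buf := []byte{1, 2}

	// An index past the end stands in for a corrupt offset in a message.
	bad := 5
	_, err := safely(func() byte { return buf[bad] })
	fmt.Println(err != nil)

	// A valid access passes through untouched.
	v, err := safely(func() byte { return buf[1] })
	fmt.Println(v, err)
}
```

The closure and the deferred function are not free, so this trades some of the speed you came to Flatbuffers for in exchange for not crashing.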

But I want the speed, so it still sounds worth it

Maybe your project really needs it (I bet most uses are over optimizing the wrong portion of their code).

But let's move on to the next problem:

All bad data causes a panic!

Flatbuffers Go assumes that all messages come through without a problem. In C++ there is some type of verifier, but they did not port that to Go.

For speed reasons, they did not want to deal with returning an error and instead let panics happen.

I find it hard to believe that this is truly valid or worth the risk/reward/complexity.

At least for my use case, I have ZERO trust in packets sent to me. If you can't trust what is coming in, you must deal with a kind of buffer attack (bad data crashing your program) every time you read a message. I don't trust myself sending packets, much less a client not under my control.
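
One cheap thing you can do before touching a message is a sanity check on the front of the buffer. The sketch below assumes, as I understand the format, that a buffer starts with a 4-byte little-endian offset to the root table. plausibleRoot is a hypothetical name, and this is nowhere near a real verifier: it rejects only the most obvious garbage and says nothing about the fields inside.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// plausibleRoot is a hypothetical pre-check, not a real verifier. It only
// confirms that the 4-byte root offset at the front of the buffer points
// at a position that leaves room for at least 4 more bytes.
func plausibleRoot(b []byte) bool {
	if len(b) < 4 {
		return false
	}
	root := binary.LittleEndian.Uint32(b)
	return uint64(root)+4 <= uint64(len(b))
}

func main() {
	tooShort := []byte{1, 2}
	wildOffset := []byte{0xff, 0xff, 0xff, 0x7f, 0, 0, 0, 0}
	inBounds := []byte{4, 0, 0, 0, 0, 0, 0, 0}
	fmt.Println(plausibleRoot(tooShort), plausibleRoot(wildOffset), plausibleRoot(inBounds))
}
```

You still need the recover around every read, because a buffer can pass this check and be corrupt further in.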

Getters are panicky too

Flatbuffers let you reach into nested structs with chained accessors such as:

field := data.StructA(nil).StructB(nil).Field1()

Note: The "nil" you see allows you to reuse a pointer for that type instead of allocating a new one. Flatbuffers are all about zero allocations.

However, if StructA or StructB is nil, this is going to panic. The generated code could really use GetStructA()-style methods so you don't have to write this kind of thing:

structA := data.StructA(nil)
if structA == nil {
    // Do something
}
structB := structA.StructB(nil)
if structB == nil {
    // Do something
}
field := structB.Field1()

Way better to have:

field := data.GetStructA(nil).GetStructB(nil).Field1()

If the Get* is too slow in your use case, you can always do it the other way. It probably isn't, especially if the code generator always passes back a global static version of that type which has no mutable fields.
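
A minimal sketch of what I mean, using hand-written stand-ins rather than real generated code. The types, the shared empty values, and the Get* methods are all hypothetical, and I dropped the reuse-pointer argument to keep it short. It relies on Go allowing method calls on nil pointers.

```go
package main

import "fmt"

// Hypothetical stand-ins for generated types; real generated code differs.
type StructB struct{ field1 int32 }
type StructA struct{ b *StructB }
type Data struct{ a *StructA }

// Shared empty values returned when a sub-message is missing.
// Nothing ever mutates them, so handing out the same pointer is safe.
var (
	emptyA = &StructA{}
	emptyB = &StructB{}
)

// GetStructA never returns nil, so calls can be chained.
func (d *Data) GetStructA() *StructA {
	if d == nil || d.a == nil {
		return emptyA
	}
	return d.a
}

// GetStructB never returns nil, so calls can be chained.
func (a *StructA) GetStructB() *StructB {
	if a == nil || a.b == nil {
		return emptyB
	}
	return a.b
}

// Field1 returns the field's value, or the zero value on an empty StructB.
func (b *StructB) Field1() int32 { return b.field1 }

func main() {
	full := &Data{a: &StructA{b: &StructB{field1: 7}}}
	empty := &Data{}
	fmt.Println(full.GetStructA().GetStructB().Field1())
	fmt.Println(empty.GetStructA().GetStructB().Field1())
}
```

The trade-off is that a missing struct and a struct whose field is zero look the same to the caller, which is the same compromise the protocol buffer getters make.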

Are there other good alternatives?

You really need that speed, so what are your choices?

There were really only two that I considered:

  • Flatbuffers
  • Cap'n Proto

I really wanted to use Cap'n Proto, but I just couldn't make myself. Cap'n Proto is written by the Proto2 author, a super smart guy who I believe only programs for fun nowadays (jealous of that!).

The only implementations of his that I am aware of are the C++ one and a simple one for JS. Other languages are implemented by other maintainers. The RPC spec defines levels of support that an implementation can claim (0 through 4, if I am reading it correctly). The higher the level, the more advanced features you get.

The Go package supports level 1. Many of the others also stop at level 1, which gives you an idea of how complex this is. From reading, I don't think any of the other languages (except Python, which is a wrapper around the C++ lib) actually supports more than level 1. Several say that they are alpha level code.

Go's version is considered beta and is looking for a new maintainer. So Cap'n Proto is for people who do C++ only, unless you want to use beta code that may end up without a maintainer.

Flatbuffers has people actively working on it vs. a pet project for someone who is only investing in the C++ (as far as I can tell).

Okay, so when should I use Flatbuffers?

With all this, you might think I believe that Flatbuffers are a bad messaging format. I don't. I do think it is niche.

The Go implementation is probably not benefiting from panicking on bad data and having getters that can panic.

Flatbuffers seem to be useful when:

  • You need more speed and you've turned all the other knobs
  • You have very few types of messages that are not complex
  • You are core infrastructure

For Go I keep thinking that in all likelihood if I'm down to this dial, I should at least consider using Rust to have more dials and more efficient protocol buffers implementations before committing to Flatbuffers in Go. While Go is wonderful and I do love the language, maybe in that case I need to have all the tools that a non-GC'd language can give me.