.TH FLATBUFFERS 7 "APRIL 2018" Linux "User Manuals"
.SH NAME
.PP
flatbuffers \- memory efficient serialization library
.SH DESCRIPTION
.TP
\fBBefore you get started\fP
Before diving into the FlatBuffers usage in C++, it should be noted that the
Tutorial \[la]http://google.github.io/flatbuffers/flatbuffers_guide_tutorial.html\[ra] page has a complete guide to general
FlatBuffers usage in all of the supported languages (including C++). This page is designed to cover the nuances of
FlatBuffers usage, specific to C++.
.IP
This page assumes you have written a FlatBuffers schema and compiled it with the Schema Compiler. If you have not,
please see Using the schema
compiler \[la]http://google.github.io/flatbuffers/flatbuffers_guide_using_schema_compiler.html\[ra] and Writing a
schema \[la]http://google.github.io/flatbuffers/flatbuffers_guide_writing_schema.html\[ra]\&.
.IP
Assuming you wrote a schema, say \fB\fCmygame.fbs\fR (though the extension doesn't matter), and generated a C++ header
called \fB\fCmygame_generated.h\fR using the compiler (e.g. \fB\fCflatc \-c mygame.fbs\fR), you can now start using it in your
program by including the header. This header relies on \fB\fCflatbuffers/flatbuffers.h\fR, which should be in your
include path.
.TP
\fBFlatBuffers C++ library code location\fP
The code for the FlatBuffers C++ library can be found at \fB\fCflatbuffers/include/flatbuffers\fR\&. You can browse the library
code on the FlatBuffers GitHub page \[la]https://github.com/google/flatbuffers/tree/master/include/flatbuffers\[ra]\&.
.TP
\fBTesting the FlatBuffers C++ library\fP
The code to test the C++ library can be found at \fB\fCflatbuffers/tests\fR\&. The test code itself is located in
test.cpp \[la]https://github.com/google/flatbuffers/blob/master/tests/test.cpp\[ra]\&.
.IP
This test file is built alongside \fB\fCflatc\fR\&. To review how to build the project, please read the
Building \[la]http://google.github.io/flatbuffers/flatbuffers_guide_building.html\[ra] documentation.
.IP
To run the tests, execute \fB\fCflattests\fR from the root \fB\fCflatbuffers/\fR directory.  For example, on
Linux \[la]https://en.wikipedia.org/wiki/Linux\[ra], you would simply run: \fB\fC\&./flattests\fR\&.
.TP
\fBUsing the FlatBuffers C++ library\fP
Note: See Tutorial \[la]http://google.github.io/flatbuffers/flatbuffers_guide_tutorial.html\[ra] for a more in\-depth example
of how to use FlatBuffers in C++.
.IP
FlatBuffers supports both reading and writing FlatBuffers in C++.
.IP
To use FlatBuffers in your code, first generate the C++ classes from your schema with the \fB\fC\-\-cpp\fR option to
\fB\fCflatc\fR\&. Then you can include both FlatBuffers and the generated code to read or write FlatBuffers.
.IP
For example, here is how you would read a FlatBuffer binary file in C++: First, include the library and generated
code. Then read the file into a \fB\fCchar *\fR array, which you pass to \fB\fCGetMonster()\fR\&.
.PP
.RS
.nf
  #include "flatbuffers/flatbuffers.h"
  #include "monster_test_generated.h"
  #include <cstdio> // For printing and file access.

  // Read the whole file into memory (error handling omitted for brevity).
  FILE* file = fopen("monsterdata_test.mon", "rb");
  fseek(file, 0L, SEEK_END);
  long length = ftell(file);
  fseek(file, 0L, SEEK_SET);
  char *data = new char[length];
  fread(data, sizeof(char), length, file);
  fclose(file);

  // Get a pointer to the root object inside the buffer.
  auto monster = GetMonster(data);
.fi
.RE
.IP
\fB\fCmonster\fR is of type \fB\fCMonster *\fR, and points to somewhere \fIinside\fP your buffer (root object pointers are not the same
as \fB\fCbuffer_pointer\fR!).  If you look in your generated header, you'll see it has convenient accessors for all fields,
e.g. \fB\fChp()\fR, \fB\fCmana()\fR, etc:
.PP
.RS
.nf
  printf("%d\\n", monster\->hp());            // `80`
  printf("%d\\n", monster\->mana());          // default value of `150`
  printf("%s\\n", monster\->name()\->c_str()); // "MyMonster"
.fi
.RE
.IP
Note that we never stored a \fB\fCmana\fR value, so it will return the default.
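.IP
The file\-loading step above omits error handling. A minimal sketch of a more defensive loader, using only the C++
standard library, might look like the following. \fB\fCReadFileBytes\fR is a hypothetical helper name chosen for
illustration; it is not part of FlatBuffers.
.PP
.RS
.nf
```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of FlatBuffers: reads a whole file into out.
// Returns false if the file cannot be opened or cannot be read completely.
inline bool ReadFileBytes(const std::string &path, std::vector<char> &out) {
  // Open positioned at the end so that tellg() reports the file size.
  std::ifstream in(path.c_str(), std::ios::binary | std::ios::ate);
  if (!in) return false;
  const std::streamoff size = in.tellg();
  if (size < 0) return false;
  out.resize(static_cast<std::size_t>(size));
  in.seekg(0, std::ios::beg);
  if (size > 0 && !in.read(out.data(), static_cast<std::streamsize>(size))) {
    return false;
  }
  return true;
}
```
.fi
.RE
.IP
On success, the bytes could be handed to the generated accessor as before, e.g. \fB\fCGetMonster(bytes.data())\fR\&.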
.TP
\fBObject based API\fP
FlatBuffers is all about memory efficiency, which is why its base API is written around using as little memory as
possible. This does make the API clumsier (requiring pre\-order construction of all data, and making mutation harder).
.IP
For times when efficiency is less important, a more convenient object\-based API can be used (through
\fB\fC\-\-gen\-object\-api\fR) that is able to unpack & pack a FlatBuffer into objects and standard STL containers, allowing for
convenient construction, access and mutation.
.IP
To use:
.PP
.RS
.nf
  // Autogenerated class from table Monster.
  MonsterT monsterobj;

  // Deserialize from buffer into object.
  GetMonster(flatbuffer)\->UnPackTo(&monsterobj);

  // Update object directly like a C++ class instance.
  cout << monsterobj.name;  // This is now a std::string!
  monsterobj.name = "Bob";  // Change the name.

  // Serialize into new flatbuffer.
  FlatBufferBuilder fbb;
  fbb.Finish(Monster::Pack(fbb, &monsterobj));
.fi
.RE
.IP
The following attributes are specific to the object\-based API code generation:
.RS
.IP \(bu 2
\fB\fCnative_inline\fR (on a field): Because FlatBuffer tables and structs are optionally present in a given buffer, they
are best represented as pointers (specifically \fB\fCstd::unique_ptr\fRs) in the native class since they can be null.  This
attribute changes the member declaration to use the type directly rather than wrapped in a \fB\fCunique_ptr\fR\&.
.IP \(bu 2
\fB\fCnative_default\fR: "value" (on a field): For members that are declared "native_inline", the value specified with this
attribute will be included verbatim in the class constructor initializer list for this member.
.IP \(bu 2
\fB\fCnative_custom_alloc\fR: "custom_allocator" (on a table or struct): When using the
object\-based API, all generated NativeTables that are allocated when unpacking
your flatbuffer will use the custom allocator. The allocator is also used by
any std::vector that appears in a table defined with \fB\fCnative_custom_alloc\fR\&.
This can be used to provide allocation from a pool, for example, for faster
unpacking when using the object\-based API.
.PP
Minimal Example:
.PP
schema:
.PP
.RS
.nf
table mytable(native_custom_alloc:"custom_allocator") {
  ...
}
.fi
.RE
.PP
with custom_allocator defined before flatbuffers.h is included, as:
.PP
.RS
.nf
template <typename T> struct custom_allocator : public std::allocator<T> {

  typedef T *pointer;

  template <class U>
  struct rebind {
    typedef custom_allocator<U> other;
  };

  pointer allocate(const std::size_t n) {
    return std::allocator<T>::allocate(n);
  }

  void deallocate(T* ptr, std::size_t n) {
    return std::allocator<T>::deallocate(ptr,n);
  }

  custom_allocator() throw() {}
  template <class U>
  custom_allocator(const custom_allocator<U>&) throw() {}
};
.fi
.RE
.IP \(bu 2
\fB\fCnative_type\fR: "type" (on a struct): In some cases, a more optimal C++ data type exists for a given struct. For
example, the following schema:
.PP
.RS
.nf
struct Vec2 {
  x: float;
  y: float;
}
.fi
.RE
.IP
generates the following Object\-Based API class:
.PP
.RS
.nf
struct Vec2T : flatbuffers::NativeTable {
  float x;
  float y;
};
.fi
.RE
.IP
However, it can be useful to instead use a user\-defined C++ type since it can provide more functionality, e.g.:
.PP
.RS
.nf
struct vector2 {
  float x = 0, y = 0;
  vector2 operator+(vector2 rhs) const { ... }
  vector2 operator\-(vector2 rhs) const { ... }
  float length() const { ... }
  // etc.
};
.fi
.RE
.IP
The \fB\fCnative_type\fR attribute will replace the usage of the generated class with the given type. So, continuing with
the example, the generated code would use \fB\fCvector2\fR in place of \fB\fCVec2T\fR throughout.
.IP
However, because the native_type is unknown to flatbuffers, the user must provide the following functions to aid
in the serialization process:
.PP
.RS
.nf
namespace flatbuffers {
  FlatbufferStruct Pack(const native_type& obj);
  native_type UnPack(const FlatbufferStruct& obj);
}
.fi
.RE
.IP \(bu 2
\fB\fCnative_include\fR: "path" (at file level): Because the \fB\fCnative_type\fR attribute can be used to introduce types that
are unknown to flatbuffers, it may be necessary to include "external" header files in the generated code. This
attribute can be used to directly add an #include directive to the top of the generated code that includes the
specified path directly.
.RE
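.IP
As an illustration of the \fB\fCnative_type\fR conversion functions described above, here is a self\-contained sketch.
It does not include any FlatBuffers headers: \fB\fCVec2\fR below is a hand\-written stand\-in for the struct the
compiler would generate, assumed to offer a constructor and \fB\fCx()\fR / \fB\fCy()\fR accessors, and \fB\fCvector2\fR is the
user\-defined type from the example.
.PP
.RS
.nf
```cpp
#include <cmath>

// Stand-in for the generated struct (assumption: a constructor plus accessors).
class Vec2 {
 public:
  Vec2(float x, float y) : x_(x), y_(y) {}
  float x() const { return x_; }
  float y() const { return y_; }

 private:
  float x_;
  float y_;
};

// The user-defined type that native_type would name in the schema.
struct vector2 {
  float x = 0, y = 0;
  float length() const { return std::sqrt(x * x + y * y); }
};

namespace flatbuffers {
// Convert the native type into the serialized struct representation.
inline Vec2 Pack(const vector2 &obj) { return Vec2(obj.x, obj.y); }

// Convert the serialized struct back into the native type.
inline vector2 UnPack(const Vec2 &obj) {
  vector2 v;
  v.x = obj.x();
  v.y = obj.y();
  return v;
}
}  // namespace flatbuffers
```
.fi
.RE
.IP
In a real project the stand\-in class would be dropped, and the header declaring \fB\fCvector2\fR and these two functions
would typically be pulled into the generated code through \fB\fCnative_include\fR\&.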
.TP
\fBExternal references\fP
An additional feature of the object API is the ability to load multiple independent FlatBuffers and have
them refer to each other's objects using hashes, which are then represented as typed pointers in the object API.
.IP
To make this work, give the objects you want to refer to a field that uses the string hashing feature (see the
\fB\fChash\fR attribute in the schema \[la]http://google.github.io/flatbuffers/flatbuffers_guide_writing_schema.html\[ra]
documentation). Then put a similar hash in the field referring to it, along with a \fB\fCcpp_type\fR attribute
specifying the C++ type this will refer to (this can be any C++ type, and will get a \fB\fC*\fR added).
.IP
Then, in JSON or however you create these buffers, make sure they use the same string (or hash).
.IP
When you call \fB\fCUnPack\fR (or \fB\fCCreate\fR), you'll need a function that maps from hash to the object (see
\fB\fCresolver_function_t\fR for details).
.TP
\fBUsing different pointer types\fP
By default the object tree is built out of \fB\fCstd::unique_ptr\fR, but you can influence this either globally (using the
\fB\fC\-\-cpp\-ptr\-type\fR argument to \fB\fCflatc\fR) or per field (using the \fB\fCcpp_ptr_type\fR attribute) to be any smart pointer type
(\fB\fCmy_ptr<T>\fR), or by specifying \fB\fCnaked\fR as the type to get \fB\fCT *\fR pointers. Unlike the smart pointers, naked pointers
do not manage memory for you, so you'll have to manage their lifecycles manually.
.TP
\fBUsing different string type\fP
By default the object tree is built out of \fB\fCstd::string\fR, but you can influence this either globally (using the
\fB\fC\-\-cpp\-str\-type\fR argument to \fB\fCflatc\fR) or per field using the \fB\fCcpp_str_type\fR attribute.
.IP
The type must support \fB\fCT::c_str()\fR and \fB\fCT::length()\fR as member functions.
.TP
\fBReflection (& Resizing)\fP
There is experimental support for reflection in FlatBuffers, allowing you to read and write data even if you don't
know the exact format of a buffer, and even to change the sizes of strings and vectors in\-place.
.IP
The way this works is very elegant; there is actually a FlatBuffer schema that describes schemas (!) which you can
find in \fB\fCreflection/reflection.fbs\fR\&.  The compiler, \fB\fCflatc\fR, can write out any schemas it has just parsed as a binary
FlatBuffer, corresponding to this meta\-schema.
.IP
Loading in one of these binary schemas at runtime allows you to traverse any FlatBuffer data that corresponds to it
without knowing the exact format. You can query what fields are present, and then read/write them after.
.IP
For convenient field manipulation, you can include the header \fB\fCflatbuffers/reflection.h\fR which includes both the
generated code from the meta schema, as well as a lot of helper functions.
.IP
An example of usage, for the time being, can be found in \fB\fCtest.cpp/ReflectionTest()\fR\&.
.TP
\fBMini Reflection\fP
A more limited form of reflection is available for direct inclusion in generated code, which doesn't need any (binary)
schema access at all. It was designed to keep the overhead of reflection as low as possible (on the order of 2\-6
bytes per field added to your executable), but doesn't contain all the information the (binary) schema contains.
.IP
You add this information to your generated code by specifying \fB\fC\-\-reflect\-types\fR (or instead \fB\fC\-\-reflect\-names\fR if you
also want field / enum names).
.IP
You can now use this information, for example to print a FlatBuffer to text:
.PP
.RS
.nf
  auto s = flatbuffers::FlatBufferToString(flatbuf, MonsterTypeTable());
.fi
.RE
.IP
\fB\fCMonsterTypeTable()\fR is declared in the generated code for each type. The string produced is very similar to the JSON
produced by the \fB\fCParser\fR based text generator.
.IP
You'll need \fB\fCflatbuffers/minireflect.h\fR for this functionality. In there is also a convenient visitor/iterator so you
can write your own output / functionality based on the mini reflection tables without having to know the FlatBuffers
or reflection encoding.
.TP
\fBStoring maps / dictionaries in a FlatBuffer\fP
FlatBuffers doesn't support maps natively, but there is support to emulate their behavior with vectors and binary
search, which means you can have fast lookups directly from a FlatBuffer without having to unpack your data into a
\fB\fCstd::map\fR or similar.
.IP
To use it:
.RS
.IP \(bu 2
Designate one of the fields in a table as the "key" field. You do this by setting the \fB\fCkey\fR attribute on this
field, e.g.  \fB\fCname:string (key)\fR\&.
.IP
You may only have one key field, and it must be of string or scalar type.
.IP \(bu 2
Write out tables of this type as usual, and collect their offsets in an array or vector.
.IP \(bu 2
Instead of \fB\fCCreateVector\fR, call \fB\fCCreateVectorOfSortedTables\fR, which will first sort all offsets such that the tables
they refer to are sorted by the key field, then serialize it.
.IP \(bu 2
Now when you're accessing the FlatBuffer, you can use \fB\fCVector::LookupByKey\fR instead of just \fB\fCVector::Get\fR to access
elements of the vector, e.g.: \fB\fCmyvector\->LookupByKey("Fred")\fR, which returns a pointer to the corresponding table
type, or \fB\fCnullptr\fR if not found.  \fB\fCLookupByKey\fR performs a binary search, so should have a similar speed to
\fB\fCstd::map\fR, though may be faster because of better caching. \fB\fCLookupByKey\fR only works if the vector has been sorted;
it will likely not find elements if it hasn't been.
.RE
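.IP
The sort\-then\-binary\-search idea behind these steps can be pictured with a small standard\-library sketch. It does
not use FlatBuffers at all: \fB\fCEntry\fR, \fB\fCSortByKey\fR and \fB\fCFindByKey\fR are hypothetical names that only mimic the
roles of a keyed table, \fB\fCCreateVectorOfSortedTables\fR and \fB\fCLookupByKey\fR\&.
.PP
.RS
.nf
```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a table whose name field carries the key attribute.
struct Entry {
  std::string name;
  int hp;
};

// Sort once by key, which is what must happen before the vector is written out.
inline void SortByKey(std::vector<Entry> &entries) {
  std::sort(entries.begin(), entries.end(),
            [](const Entry &a, const Entry &b) { return a.name < b.name; });
}

// Binary search by key. Returns nullptr when the key is absent.
// The result is only meaningful if entries is already sorted by name.
inline const Entry *FindByKey(const std::vector<Entry> &entries,
                              const std::string &key) {
  auto it = std::lower_bound(
      entries.begin(), entries.end(), key,
      [](const Entry &e, const std::string &k) { return e.name < k; });
  if (it == entries.end() || (*it).name != key) return nullptr;
  return &*it;
}
```
.fi
.RE
.IP
As with \fB\fCLookupByKey\fR, skipping the sort step does not produce an error; lookups simply become unreliable.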
.TP
\fBDirect memory access\fP
As you can see from the above examples, all elements in a buffer are accessed through generated accessors. This is
because everything is stored in little endian format on all platforms (the accessor performs a swap operation on big
endian machines), and also because the layout of things is generally not known to the user.
.IP
For structs, layout is deterministic and guaranteed to be the same across platforms (scalars are aligned to their own
size, and structs themselves to their largest member), and you are allowed to access this memory directly by using
\fB\fCsizeof()\fR and \fB\fCmemcpy\fR on the pointer to a struct, or even an array of structs.
.IP
To compute offsets to sub\-elements of a struct, make sure they are structs themselves, as then you can use the
pointers to figure out the offset without having to hardcode it. This is handy for use of arrays of structs with calls
like \fB\fCglVertexAttribPointer\fR in OpenGL or similar APIs.
.IP
It is important to note that structs are still little endian on all machines, so only use tricks like this if you
can guarantee you're not shipping on a big endian machine (an \fB\fCassert(FLATBUFFERS_LITTLEENDIAN)\fR would be wise).
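.IP
A rough sketch of such a direct copy, guarded by an endianness check, is shown below. It is standard C++ only:
\fB\fCVertex\fR is a hypothetical stand\-in for a generated struct of two floats, and \fB\fCIsLittleEndian\fR is a runtime
probe used here in place of the \fB\fCFLATBUFFERS_LITTLEENDIAN\fR macro so that the example needs no FlatBuffers headers.
.PP
.RS
.nf
```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a generated struct holding two floats.
struct Vertex {
  float x;
  float y;
};
static_assert(sizeof(Vertex) == 2 * sizeof(float), "unexpected padding");

// Runtime probe: true when the lowest address holds the least significant byte.
inline bool IsLittleEndian() {
  const std::uint32_t probe = 1;
  unsigned char first = 0;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}

// Copies count structs straight out of buffer memory.
// Refuses on a big endian machine, where each scalar would need a byte swap.
inline bool CopyStructs(const void *src, Vertex *dst, std::size_t count) {
  if (!IsLittleEndian()) return false;
  std::memcpy(dst, src, count * sizeof(Vertex));
  return true;
}
```
.fi
.RE
.IP
Returning \fB\fCfalse\fR rather than asserting is a design choice for this sketch; it lets a caller fall back to the
generated accessors on big endian targets.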
.TP
\fBAccess of untrusted buffers\fP
The generated accessor functions access fields over offsets, which is very quick. These offsets are not verified at
run\-time, so a malformed buffer could cause a program to crash by accessing random memory.
.IP
When you're processing large amounts of data from a source you know (e.g.  your own generated data on disk), this is
acceptable, but when reading data from the network that can potentially have been modified by an attacker, this is
undesirable.
.IP
For this reason, you can optionally use a buffer verifier before you access the data. This verifier will check all
offsets, all sizes of fields, and null termination of strings to ensure that when a buffer is accessed, all reads will
end up inside the buffer.
.IP
Each root type will have a verification function generated for it, e.g. for \fB\fCMonster\fR, you can call:
.PP
.RS
.nf
  Verifier verifier(buf, len);
  bool ok = VerifyMonsterBuffer(verifier);
.fi
.RE
.IP
If \fB\fCok\fR is true, the buffer is safe to read.
.IP
Besides untrusted data, this function may be useful to call in debug mode, as extra insurance against data being
corrupted somewhere along the way.
.IP
While verifying a buffer isn't "free", it is typically faster than a full traversal (since any scalar data is not
actually touched), and since it may cause the buffer to be brought into cache before reading, the actual overhead may
be even lower than expected.
.IP
In specialized cases where a denial of service attack is possible, the verifier has two additional constructor
arguments that allow you to limit the nesting depth and total number of tables the verifier may encounter before
declaring the buffer malformed. The default is \fB\fCVerifier(buf, len, 64 /* max depth */, 1000000 /* max tables */)\fR,
which should be sufficient for most uses.
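.IP
To give a feel for the kind of check a verifier performs, here is a deliberately simplified, standard\-library\-only
sketch of a bounds\-checked read. It is not the FlatBuffers verifier and does not follow its internals;
\fB\fCInBounds\fR and \fB\fCReadU32Checked\fR are hypothetical names.
.PP
.RS
.nf
```cpp
#include <cstddef>
#include <cstdint>

// True if the byte range [offset, offset + size) lies inside len bytes.
// Written so that the arithmetic cannot overflow.
inline bool InBounds(std::size_t len, std::size_t offset, std::size_t size) {
  return offset <= len && size <= len - offset;
}

// Reads a little endian 32 bit value at offset, but only after a bounds check.
// Returns false and leaves out untouched if the read would leave the buffer.
inline bool ReadU32Checked(const unsigned char *buf, std::size_t len,
                           std::size_t offset, std::uint32_t &out) {
  if (!InBounds(len, offset, 4)) return false;
  out = static_cast<std::uint32_t>(buf[offset]) |
        (static_cast<std::uint32_t>(buf[offset + 1]) << 8) |
        (static_cast<std::uint32_t>(buf[offset + 2]) << 16) |
        (static_cast<std::uint32_t>(buf[offset + 3]) << 24);
  return true;
}
```
.fi
.RE
.IP
The generated accessors skip this kind of test on every read, which is why running the verifier once up front matters
for untrusted input.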
.TP
\fBText & schema parsing\fP
Using binary buffers with the generated header provides a super low overhead use of FlatBuffer data. There are,
however, times when you want to use text formats, for example because it interacts better with source control, or you
want to give your users easy access to data.
.IP
Another reason might be that you already have a lot of data in JSON format, or a tool that generates JSON, and if you
can write a schema for it, this will provide you an easy way to use that data directly.
.IP
See the schema documentation for some specifics on the JSON format accepted.
.IP
There are two ways to use text formats:
.RS
.IP \(bu 2
Using the compiler as a conversion tool.
.PP
 This is the preferred path, as it doesn't require you to add any new code to your program, and is maximally
 efficient since you can ship with binary data. The disadvantage is that it is an extra step for your
 users/developers to perform, though you might be able to automate it.
.PP
.RS
.nf
flatc \-b myschema.fbs mydata.json
.fi
.RE
.PP
 This will generate the binary file \fB\fCmydata_wire.bin\fR which can be loaded as before.
.IP \(bu 2
Making your program capable of loading text directly.
.PP
 This gives you maximum flexibility. You could even opt to support both, i.e. check for both files, and regenerate
 the binary from text when required, otherwise just load the binary. This option is currently only available for
 C++, or Java through JNI.
.PP
 As mentioned in the Building documentation, this technique requires you to link a few more files into your
 program, and you'll want to include \fB\fCflatbuffers/idl.h\fR\&.
.PP
 Load text (either a schema or json) into an in\-memory buffer (there is a convenient \fB\fCLoadFile()\fR utility function
 in \fB\fCflatbuffers/util.h\fR if you wish).  Construct a parser:
.PP
.RS
.nf
flatbuffers::Parser parser;
.fi
.RE
.PP
 Now you can parse any number of text files in sequence:
.PP
.RS
.nf
parser.Parse(text_file.c_str());
.fi
.RE
.PP
 This works similarly to how the command\-line compiler works: a sequence of files parsed by the same \fB\fCParser\fR object
 allows later files to reference definitions in earlier files. Typically this means you first load a schema file
 (which populates \fB\fCParser\fR with definitions), followed by one or more JSON files.
.PP
 As an optional argument to \fB\fCParse\fR, you may specify a null\-terminated list of include paths. If not specified, any
 include statements try to resolve from the current directory.
.PP
 If there were any parsing errors, \fB\fCParse\fR will return \fB\fCfalse\fR, and \fB\fCParser::error_\fR contains a human readable error
 string with a line number etc, which you should present to the creator of that file.
.PP
 After each JSON file, the \fB\fCParser::builder_\fR member variable is the \fB\fCFlatBufferBuilder\fR that contains the binary buffer
 version of that file, which you can access as described above. \fB\fCsamples/sample_text.cpp\fR is a code sample showing
 the above operations.
.RE
.TP
\fBThreading\fP
Reading a FlatBuffer does not touch any memory outside the original buffer, and is entirely read\-only (all const), so
is safe to access from multiple threads even without synchronisation primitives.
.IP
Creating a FlatBuffer is not thread safe. All state related to building a FlatBuffer is contained in a
FlatBufferBuilder instance, and no memory outside of it is touched. To make this thread safe, either do not share
instances of FlatBufferBuilder between threads (recommended), or manually wrap it in synchronisation primitives. There's
no automatic way to accomplish this, by design, as we feel multithreaded construction of a single buffer will be rare,
and synchronisation overhead would be costly.
.TP
\fBAdvanced union features\fP
The C++ implementation currently supports vectors of unions (i.e. you can declare a field as \fB\fC[T]\fR where \fB\fCT\fR is a
union type instead of a table type). It also supports structs and strings in unions, besides tables.
.IP
For an example of these features, see \fB\fCtests/union_vector\fR, and \fB\fCUnionVectorTest\fR in \fB\fCtest.cpp\fR\&.
.IP
Since these features haven't been ported to other languages yet, if you choose to use them, you won't be able to use
these buffers in other languages (\fB\fCflatc\fR will refuse to compile a schema that uses these features).
.IP
These features reduce the amount of "table wrapping" that was previously needed to use unions.
.IP
To use scalars, simply wrap them in a struct.
.SH SEE ALSO
.PP
.BR flatc (1),
Official documentation \[la]http://google.github.io/flatbuffers\[ra]