GitHunt
NI

nitnelave/ProtEnc

A C++ typestate library

ProtEnc

ProtocolEncoder is a C++ typestate library, making it easy to enforce
a protocol on the usage of an object.

What are typestates?

Typestate analysis is a way
to specify at compile-time what operations are valid on an object in the
current state, and enforce these restrictions.

In other words, it allows to define a protocol of how to use an object, and
make sure that the program cannot breach this protocol.

Example: HTTP connection builder

HTTP connections have headers and a body (among other things). We want to
create a builder for one such connection, making sure that:

  • There is at least one header.
  • There is a body.
  • The headers are specified before the body.

Only then will the user be able to create a connection.

(Bear in mind, this is a simplistic example, which could be better solved with
a constructor taking the parameters.)

Let's create a class to build this connection:

class HTTPConnectionBuilder {
 public:
  void add_header(std::string header) { ... }
  void add_body(std::string body) { ... }
  HTTPConnection build() { ... }
 private:
  ...
};

Now, we need to enforce the protocol. We can represent it with the following
graph:

HTTP connection builder protocol graph

We start in the "START" state, then we have to take the transition add_header
at least once, but as many more times as we want, then the add_body
transition, then we can build the connection.

After using ProtEnc, we get a wrapper object that we can use like this:

  HTTPConnection connection = GetConnectionBuilder()
                                .add_header("First header")
                                .add_header("Second header")
                                .add_body("Body")
                                .build();

Any digression from the protocol will result in a compilation error.

Defining the protocol

To define the protocol, we need to build the graph shown above. Let's start by
declaring the possible states:

enum class HTTPBuilderState {
  START,
  HEADERS,
  BODY
};

Now we can declare from which states it is valid to start; in our case, only
START:

using MyInitialStates = InitialStates<HTTPBuilderState::START>;

Similarly, we can declare that we can finish only from the BODY state, by
calling the build function:

using MyFinalTransitions = FinalTransitions<
    FinalTransition<HTTPBuilderState::BODY, &HTTPConnectionBuilder::build>
  >;

Let's declare the transitions: we can go from START to HEADERS by calling
add_header. It looks like this:

Transition<HTTPBuilderState::START, HTTPBuilderState::Header,
           &HTTPConnectionBuilder::add_header>

We can add the other transitions in the same way, giving us:

using MyTransitions = Transitions<
      // START -> HEADERS with add_header
      Transition<HTTPBuilderState::START, HTTPBuilderState::HEADERS,
                 &HTTPConnectionBuilder::add_header>,
      // HEADERS -> HEADERS with add_header
      Transition<HTTPBuilderState::HEADERS, HTTPBuilderState::HEADERS,
                 &HTTPConnectionBuilder::add_header>,
      // HEADERS -> BODY with add_body
      Transition<HTTPBuilderState::HEADERS, HTTPBuilderState::BODY,
                 &HTTPConnectionBuilder::add_body>
  >;

Then we just have to declare our wrapper class with all this info:

PROTENC_START_WRAPPER(
    // Wrapper name
    HTTPConnectionBuilderWrapper,
    // Implementation class
    HTTPConnectionBuilder,
    // Name of the enum
    HTTPBuilderState,
    // Initial states
    MyInitialStates,
    // Transitions
    MyTransitions,
    // Final transitions
    MyFinalTransitions,
    // List of valid query methods (const methods on the implementation)
    MyValidQueries);

  // Declare the list of functions that we are wrapping:
  // Transitions
  PROTENC_DECLARE_TRANSITION(add_header);
  PROTENC_DECLARE_TRANSITION(add_body);

  // End functions.
  PROTENC_DECLARE_FINAL_TRANSITION(build);

PROTENC_END_WRAPPER;

Now we can create an instance of our wrapper with any of the starting states:

HTTPConnection connection =
  HTTPConnectionBuilderWrapper<HTTPBuilderState::Start>{}
      .add_header("Never")
      .add_header("Gonna")
      .add_body("Give you up")
      .build();

The full code of this example is in
example/http_connection.cc.

What can this library be used for?

Any protocol that can be represented by a FSM can be encoded using this
library. This provides compile-time guarantee that the code respects the
protocol.

This is a step towards formal verification of code, while being more
lightweight and easy to integrate into existing codebases.

The definition of the wrapper could be outsourced to a code generator that
checks some properties on the FSM, or simply takes an existing protocol
specification and turn the FSM into a wrapper.

But how does it work?

Oh, you want to get into the details? TL;DR: lots of template magic :)

The wrapper object

The wrapper object is templated by the enum representing the states. The value
of the template parameter is the state in which the wrapper is. Every
transition method (including the final transition methods) consume the object
(taking it by r-value reference), and return a new wrapper in the new state.
The underlying object is never copied (provided it has a good move
constructor).

The wrapper mainly just forwards the calls (declared by DECLARE_TRANSITION
and friends) to the GenericWrapper.

The GenericWrapper object

GenericWrapper does the heavy lifting:

  • It contains an instance of the wrapped class (by value, there is no
    indirection).
  • It is templated by all the parameters describing the FSM: initial states,
    transitions and so on.
  • It has a call_transition method (and friends) that takes a function
    pointer, checks that it's a valid transition, and returns the new state.
  • It has a bunch of checks to make sure that we have "pretty" error messages
    for common errors (not using the correct types to define the FSM, not
    consuming the object for a final transition, etc).

call_transition

call_transition is called with a pointer to the function you want to call
(and arguments). It has a static_assert that will go over the list of
transitions, looking for one that starts from the current state, and has the
function pointer as label. If the transition is not found, it's a compile
error. If the transition is found, we can extract from it the target state. We
then call the function on the wrapped object, and return a new wrapper
(constructed by moving the current wrapper) in the new state.

The other functions (call_final_transition, call_valid_query) work in a
similar fashion, deducing the return type from the function pointer and the
arguments.

The GenericWrapper constructor checks that it was built with one of the initial
states (with a static_assert).

The nitty-gritty of the magic

If you want to get into the template details, I invite you to have a look at
the code itself.

Caveats

  • It requires C++20. This is mainly due to the auto template parameters.
    The omitted return types could be fixed with some redundancy (decltype), but
    dropping the auto template parameters would mean dropping support for enum
    classes. It could still be acceptable, and regular enums mapped to ints would
    also work (it would just be slightly less elegant). The code could be adapted
    to work with C++14 if needed (maybe even C++11, I'm not sure), but it would
    be more verbose.

Potential improvements

  • We could have more checks.
    • We currently only have basic checks on the
      template parameters, but we could check that there is no 2 transitions with
      the same origin and label, or that there are no unreachable states. I feel
      that this is more the responsability of the user: they need to make sure the
      FSM is the right one for their protocol.
    • We check the top-level list parameters (Transitions, FinalTransitions,
      ...) but we could go down and check that the function pointers are from the
      wrapped class, that the types are Transition/FinalTransition/..., that
      the states are from the right enum, and so on.
  • Better error messages. When the transition is not found, we have static
    information about the state and the function pointer, so it should
    theoretically be possible to craft an error message with more information
    than "Transition not found".
nitnelave/ProtEnc | GitHunt