Implement the Envelope Wrapper Pattern in C++ using Google Protocol Buffers


The Envelope Wrapper pattern

This pattern, described in “Enterprise Integration Patterns” (Ch. 8, “The Envelope Wrapper”), is typically used in messaging systems, where a number of metadata fields (for routing, security, etc.) are kept in a header, while the data used by service endpoints is serialized in a payload.

For the system to be generically applicable, usually the type of the payload is opaque to the message routing system, and can be any one of a number of actual object types used by the service endpoints (some or most of which may not even be necessarily aware of the whole type system).

When used in strongly (statically) typed languages such as C++ or Java, this poses a challenge, as the obvious option (to use either bytes or a string) leads to inflexible, verbose and error-prone code, which is hard to maintain and upgrade.
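
To make the problem concrete, here is a minimal sketch of the naive approach, using only the standard library. The RawEnvelope and ServerInfo types, the kServerType tag and the "server|location" encoding are all hypothetical, invented purely for illustration; they are not part of the example code discussed in this post.

```cpp
#include <cstdint>
#include <string>

// Hypothetical hand-rolled envelope: the payload is an opaque string and the
// only hint about its contents is a numeric tag agreed upon out of band.
struct RawEnvelope {
  std::string sender;
  std::uint32_t payload_type_id;
  std::string payload;
};

// Hypothetical payload type, encoded by convention as "server|location".
struct ServerInfo {
  std::string server;
  std::string location;
};

constexpr std::uint32_t kServerType = 1;  // Assumed tag value.

// Every consumer has to repeat this kind of manual tag check and parsing;
// nothing ties the tag to the actual encoding of the payload.
bool DecodeServer(const RawEnvelope &envelope, ServerInfo *out) {
  if (envelope.payload_type_id != kServerType) {
    return false;
  }
  const std::string::size_type sep = envelope.payload.find('|');
  if (sep == std::string::npos) {
    return false;
  }
  out->server = envelope.payload.substr(0, sep);
  out->location = envelope.payload.substr(sep + 1);
  return true;
}
```

Each new payload type would need its own decoder of this kind, and a change to the encoding would have to be coordinated by hand across every service that reads it.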

Luckily, using Google Protocol Buffers and the proto3 syntax, it is possible to achieve both goals of keeping the format of the Envelope “generic”, while strongly typing the Payload, using the Any type.

Additionally, proto3 has added the ability to convert to/from JSON, and using the nlohmann::json library it is also possible to natively integrate with REST APIs that use JSON as the serialization format.

In this post, I will provide an example of how it is possible to do either (or both).

Any Protocol Buffers

The assumed scenario here is that we are developing code that will be used by separate teams/services and that only the Envelope type is shared across all of them, while the various payload proto types (e.g., Server and Sources in the example) are only used “locally” by different services.

Additionally, while creating “infrastructure” services (e.g., the routing components) the payload types may not even be known or exist; to accomplish this, we use the Any type:

syntax = "proto3";

import "google/protobuf/any.proto";

message Envelope {
    string sender = 1;
    string message = 2;

    // Opaque payload type identifier, could be anything that helps the
    // client to decode the payload correctly.
    uint32 payload_type_id = 99;

    // This is the actual payload, another Protocol Buffer.
    // See the description for the Any type.
    google.protobuf.Any payload = 3;
}

For this to work, we need to tell protoc (when compiling the .proto file) where to find google/protobuf/any.proto among the include folders; with CMake, this can be done as follows:

    find_package(Protobuf REQUIRED)
    ...
    set(Protobuf_IMPORT_DIRS ${Protobuf_INCLUDE_DIR})
    PROTOBUF_GENERATE_CPP(GEN_SRCS GEN_HDRS ${${PROTO_SOURCES}})

See the build_protobuf function for the full listing.

As an example of possible payloads, two separate teams may define the following PBs:

// Assume that this one is not known at time of compilation of Envelope
message Server {
  string server = 1;
  string location = 2;
  uint32 server_id = 3;
}

// This could also be carried by the Envelope, and be in a
// completely different .proto file
message Sources {
  string package = 1;
  repeated string sources = 2;
}

Implementing the Envelope pattern in C++

A “client service” that needs to send information about Server objects across the messaging system would need access to both the envelope.pb.h/cc and server.pb.h/cc, but be blissfully unaware of anything to do with sources.proto:

  Envelope envelope;
  envelope.set_sender("demo-prog");
  envelope.set_message("This is a demo");

  Server server;
  server.set_server("host.kapsules.io");
  server.set_server_id(9999);
  server.set_location("us-west-2a");

  // Use the Any::PackFrom to encode an arbitrary PB payload.
  envelope.mutable_payload()->PackFrom(server);
  envelope.set_payload_type_id(PayloadType::SERVER_TYPE);

We could then serialize the resulting PB and send it on its merry way to whatever destination we please; any intervening services that need access to the metadata from the Envelope could do so, while ignoring (and passing over) anything to do with the payload.

A note about using Enums for the payload_type_id

Note that we are using an enum value (PayloadType) as the type ID: in proto3, enums are wire-compatible with uint32 and are thus interchangeable (and can be opaque to the Envelope). The drawback is that some common management of the ID values must be enforced across services/teams, to avoid collisions. Bearing in mind that the payload carries its own TypeUrl (@type, see below), carrying a Type ID may be entirely unnecessary.
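
Since Any::PackFrom sets the type URL to a string of the form type.googleapis.com/ followed by the fully qualified message name, a receiver could check that string instead of a numeric ID. The following is a minimal sketch using only the standard library; TypeNameFromUrl and IsType are hypothetical helper names, not part of the Protobuf API.

```cpp
#include <string>

// Returns the fully qualified message name carried by an Any type URL,
// i.e. everything after the last '/'. If there is no '/', the input is
// returned unchanged.
std::string TypeNameFromUrl(const std::string &type_url) {
  const std::string::size_type pos = type_url.rfind('/');
  return pos == std::string::npos ? type_url : type_url.substr(pos + 1);
}

// Exact comparison of the message name; unlike a substring search, this
// does not match other types whose name merely starts with `full_name`.
bool IsType(const std::string &type_url, const std::string &full_name) {
  return TypeNameFromUrl(type_url) == full_name;
}
```

With the generated classes available, Any::Is<T>() performs an equivalent check; a string-based helper like this one is mostly useful in components that never link the payload types.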

At the destination, we would deserialize the Envelope PB and then inspect the payload to ensure that it carries an object that we understand:

#include <iostream>
#include <string>

using namespace std::string_literals;

std::ostream &operator<<(std::ostream &out, const Server &server) {
  out << server.server() << " ("s << server.server_id() << "): "s << server.location();
  return out;
}

// Prints the Envelope metadata and, if the payload is of type P, its contents.
// P must be a generated message type with an operator<< overload.
template<typename P>
inline std::ostream &Print(
    const io::kapsules::Envelope &envelope,
    std::ostream &out = std::cout
) {
  out << "From: " << envelope.sender() << std::endl;
  out << "Subject: " << envelope.message() << std::endl;

  if (envelope.has_payload()) {
    out << "Type: " << envelope.payload().type_url()
        << " (" << envelope.payload_type_id() << ")" << std::endl;
    out << "---- Payload ----" << std::endl;
    const auto &payload = envelope.payload();
    P content;
    // Is<P>() checks the type URL; UnpackTo() returns false if parsing fails.
    if (payload.Is<P>() && payload.UnpackTo(&content)) {
      // Write to the stream we were given, not unconditionally to std::cout.
      out << content << std::endl;
    }
    out << "---- Payload Ends ----" << std::endl;
  }
  return out;
}

// ....

  // After "receiving" the Envelope message, check its payload type:
  if (envelope.payload_type_id() == PayloadType::SERVER_TYPE) {
    Print<Server>(envelope);
  } else {
    LOG(ERROR) << "Unknown payload type (" << envelope.payload_type_id() << ")";
    return EXIT_FAILURE;
  }

Obviously, at the “point of use” (i.e., when we call Any::UnpackTo) we need to know the type of the PB that we are going to deserialize (unpack); having done that, we have access to the fields of the message, which can then be used for whatever purpose (here, trivially, to emit it to std::cout).

Exactly the same process would be used by clients needing to exchange Sources messages, or even by the very same clients, if they need to multiplex different types of data on the same streaming channel (e.g., a Kafka topic):

    Sources sources;
    sources.set_package("com.example.data.container");
    sources.add_sources("one.java");
    sources.add_sources("two.java");
    sources.add_sources("three.java");

    // Use the Any PackFrom to encode an arbitrary PB payload.
    envelope.mutable_payload()->PackFrom(sources);
    envelope.set_payload_type_id(PayloadType::SOURCES_TYPE);

The Print<P> function would be exactly the same, just invoked with Print<Sources>(envelope) (provided that an operator<< overload is also defined for Sources).
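
When several payload types share one channel, the receiver needs some way to route each message to the right handler. One possible shape for this is a small registry keyed by type URL. The sketch below uses only the standard library; Incoming and Dispatcher are hypothetical names, standing in for the received Envelope and for the service's own routing code.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a received message: the payload's type URL and
// its serialized bytes.
struct Incoming {
  std::string type_url;
  std::string body;
};

// Routes each message to the handler registered for its type URL.
class Dispatcher {
 public:
  using Handler = std::function<void(const std::string &body)>;

  void Register(const std::string &type_url, Handler handler) {
    handlers_[type_url] = std::move(handler);
  }

  // Returns false if no handler is registered for the message's type, so
  // the caller can decide whether to log, drop or forward it.
  bool Dispatch(const Incoming &message) const {
    const auto it = handlers_.find(message.type_url);
    if (it == handlers_.end()) {
      return false;
    }
    it->second(message.body);
    return true;
  }

 private:
  std::map<std::string, Handler> handlers_;
};
```

In a real service, each handler would presumably unpack the Any payload into its concrete type (as Print<P> does) before processing it; a service that only cares about Server messages would simply never register a handler for Sources.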

(De)Serializing using JSON

Protocol Buffers v. 3 added support for JSON, which makes emitting a Protobuf as a JSON string a fairly trivial matter:

#include <google/protobuf/util/json_util.h>

  std::string jsonOut;
  google::protobuf::util::JsonPrintOptions options;
  options.add_whitespace = true;
  options.always_print_primitive_fields = true;

  auto status = google::protobuf::util::MessageToJsonString(envelope, &jsonOut, options);
  if (status.ok()) {
    // do something with the string

  } else {
    LOG(ERROR) << "Could not convert Envelope Protobuf to JSON: " << status.error_message();
  }

This could be all that is needed if one wants to send an Envelope in a REST call to an API that only accepts JSON as its Content-Type (application/json).

On the other hand, being able to manipulate the JSON natively as a semi-structured object, or even to convert it back to Protobuf, may be more desirable: in this case, using nlohmann/json could be preferable.

This requires downloading the json.hpp header file from the GitHub repository, and then making it available to your C++ code:

#include <json.hpp>

using nlohmann::json;

  // Extract to JSON string and parse into a `json` object.
  auto status = google::protobuf::util::MessageToJsonString(envelope, &jsonOut, options);
  if (status.ok()) {
    auto data = json::parse(jsonOut);

    // The entire envelope, including payload, can be serialized as JSON:
    cout << data.dump(2) << endl;

    // Or you can extract specific fields:
    if (data["payload"].contains("server")) {
      cout << "Server: " << data["payload"]["server"] << endl;
    }
  }
  // ...

Converting the JSON back to PBs (e.g., when receiving a response from the REST API) is slightly more complicated, depending on whether one handles Envelope objects (which carry enough type information to handle the Any field), or just the payload types (which don’t):

    // Finally, from the JSON, one can just reconstruct the Payload
    auto payload = data["payload"];

    // Here we validate the type to be what we expect, or drop it.
    cout << "Received a JSON payload of type: " << payload["@type"] << endl;
    if (payload["@type"].is_string()) {
      auto type = payload["@type"].get<std::string>();
      if (type.find("io.kapsules.clients.Server") != std::string::npos) {
        Server fromJson;

        // We need to remove the @type annotation, or Google Protobuf parser will complain
        // The Server PB has no knowledge of the Any configured in Envelope.
        payload.erase("@type");
        status = google::protobuf::util::JsonStringToMessage(payload.dump(), &fromJson);
        if (status.ok()) {
          cout << "Protobuf extracted from JSON payload:" << endl;
          cout << fromJson;
        } else {
          LOG(ERROR) << "Could not convert server from JSON: " << status.error_message();
        }
      }
    }

For the Envelope, it is more straightforward:

  Envelope fromJson;
  // `data` is the nlohmann::json object parsed earlier.
  google::protobuf::util::Status status =
      google::protobuf::util::JsonStringToMessage(data.dump(), &fromJson);
  if (status.ok()) {
    if (fromJson.has_payload() && fromJson.payload().Is<Server>()) {
      Server server;
      // Unpack from the Envelope's Any field; returns false if parsing fails.
      if (fromJson.payload().UnpackTo(&server)) {
        // Do something with `server`
        // ...
      }
    }
  } else {
    LOG(ERROR) << "Could not convert Envelope from JSON: " << status.error_message();
  }

Conclusion

The Envelope Wrapper pattern is widely used in messaging and streaming architectures which enable highly scalable, maintainable systems by adopting the “separation of concerns” approach.

Distributed systems require one to design abstractions that strike the right balance between generality and usability, so that bugs are kept to a minimum and development agility is not disrupted (bearing in mind that code is “written once, read many, many times”).

Statically typed languages and strongly-typed APIs allow the latter, but can make adopting the former cumbersome and overly verbose: Google Protocol Buffers’ Any type (and associated facilities) helps in this respect and, with some help from third-party libraries, can be made even more flexible and useful.
