The Envelope Wrapper pattern
This pattern, described in “Enterprise Integration Patterns” (Ch. 8, “The Envelope Wrapper”), is typically used in messaging systems, where a number of metadata fields (for routing, security, etc.) are kept in a header, while the data used by service endpoints is serialized in a payload.
For the system to be generically applicable, the type of the payload is usually opaque to the message routing system, and can be any one of a number of actual object types used by the service endpoints (some or most of which may not even be aware of the whole type system).
When used in strongly (statically) typed languages such as C++ or Java, this poses a challenge, as the obvious option (to use either bytes or a string) leads to inflexible, verbose and error-prone code, which is hard to maintain and upgrade.
Luckily, using Google Protocol Buffers and the proto3 syntax, it is possible to achieve both goals: keeping the format of the Envelope “generic” while strongly typing the Payload, using the Any type.
Additionally, proto3 added the ability to convert to/from JSON, and using the nlohmann::json library it is also possible to integrate natively with REST APIs that use JSON as the serialization format.
In this post, I will provide an example of how it is possible to do either (or both).
Any Protocol Buffers
The assumed scenario here is that we are developing code that will be used by separate teams/services and that only the Envelope type is shared across all of them, while the various payload proto types (e.g., Server and Sources in the example) are only used “locally” by different services.
Additionally, while creating “infrastructure” services (e.g., the routing components) the payload types may not even be known or exist; to accommodate this, we use the Any type:
import "google/protobuf/any.proto";
message Envelope {
  string sender = 1;
  string message = 2;

  // Opaque payload type identifier, could be anything that helps the
  // client to decode the payload correctly.
  uint32 payload_type_id = 99;

  // This is the actual payload, another Protocol Buffer.
  // See the description for the Any type.
  google.protobuf.Any payload = 3;
}
In order for this to work, we need (during compilation of the .proto file) to tell protoc where to find the google/protobuf/any.proto file in the include folders; using CMake, this is done with:
find_package(Protobuf REQUIRED)
...
set(Protobuf_IMPORT_DIRS ${Protobuf_INCLUDE_DIR})
PROTOBUF_GENERATE_CPP(GEN_SRCS GEN_HDRS ${${PROTO_SOURCES}})
See the build_protobuf function for the full listing.
As an example of possible payloads, two separate teams may define the following PBs:
// Assume that this one is not known at time of compilation of Envelope
message Server {
  string server = 1;
  string location = 2;
  uint32 server_id = 3;
}

// This could also be carried by the Envelope, and be in a
// completely different .proto file
message Sources {
  string package = 1;
  repeated string sources = 2;
}
Implementing the Envelope pattern in C++
A “client service” that needs to send information about Server objects across the messaging system would need access to both the envelope.pb.h/cc and server.pb.h/cc, but be blissfully unaware of anything to do with sources.proto:
Envelope envelope;
envelope.set_sender("demo-prog");
envelope.set_message("This is a demo");
Server server;
server.set_server("host.kapsules.io");
server.set_server_id(9999);
server.set_location("us-west-2a");
// Use the Any::PackFrom to encode an arbitrary PB payload.
envelope.mutable_payload()->PackFrom(server);
envelope.set_payload_type_id(PayloadType::SERVER_TYPE);
We could then serialize the resulting PB and send it on its merry way to whatever destination we please; any intervening services that need access to the metadata from the Envelope could do so, while ignoring (and passing over) anything to do with the payload.
A note about using Enums for the payload_type_id
Note that we are using the values of an enum (PayloadType) as the type ID, while the payload_type_id field itself is declared as uint32. In proto3, enum values are encoded on the wire as 32-bit varints, so non-negative enum values and uint32 are interchangeable (and the ID can remain opaque to the Envelope). The drawback is that some common management of the IDs must be enforced across services/teams to avoid collisions. Bearing in mind that the payload carries its own type URL (the @type field in JSON, see below), carrying a type ID may be entirely unnecessary.
At the destination, we would deserialize the Envelope PB and then inspect the payload to ensure that it carries an object that we understand:
using namespace std::string_literals;
std::ostream &operator<<(std::ostream &out, const Server &server) {
  out << server.server() << " ("s << server.server_id() << "): "s << server.location();
  return out;
}

template<typename P>
inline std::ostream &Print(
    const io::kapsules::Envelope &envelope,
    std::ostream &out = std::cout
) {
  out << "From: " << envelope.sender() << endl;
  out << "Subject: " << envelope.message() << endl;
  if (envelope.has_payload()) {
    out << "Type: " << envelope.payload().type_url()
        << " (" << envelope.payload_type_id() << ")" << endl;
    out << "---- Payload ----" << endl;
    const auto &payload = envelope.payload();
    if (payload.Is<P>()) {
      P content;
      // UnpackTo returns false if the payload cannot be parsed as a P.
      if (payload.UnpackTo(&content)) {
        // Write to the stream we were given, not unconditionally to cout.
        out << content << endl;
      }
    }
    out << "---- Payload Ends ----" << endl;
  }
  return out;
}
// ....
// After "receiving" the Envelope message, check its payload type:
if (envelope.payload_type_id() == PayloadType::SERVER_TYPE) {
  Print<Server>(envelope);
} else {
  LOG(ERROR) << "Unknown payload type (" << envelope.payload_type_id() << ")";
  return EXIT_FAILURE;
}
Obviously, at the “point of use” (i.e., when we call Any::UnpackTo) we need to know the type of the PB that we are going to deserialize (unpack); having done that, we have access to the fields of the message, which can then be used for whatever purpose (here, trivially, to emit it to std::cout).
Exactly the same process would be used by clients needing to exchange Sources messages, or even by the very same clients if they need to multiplex different types of data on the same streaming channel (e.g., a Kafka topic):
Sources sources;
sources.set_package("com.example.data.container");
sources.add_sources("one.java");
sources.add_sources("two.java");
sources.add_sources("three.java");
// Use the Any PackFrom to encode an arbitrary PB payload.
envelope.mutable_payload()->PackFrom(sources);
envelope.set_payload_type_id(PayloadType::SOURCES_TYPE);
The Print<P> function would be exactly the same, just invoked with Print<Sources>(envelope) (provided a matching operator<< is defined for Sources).
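When several payload types share one channel, the if/else chain on the type ID can grow unwieldy. One possible alternative, sketched below using only the standard library, is a small table that maps each payload type ID to the handler that knows how to decode it. ReceivedEnvelope is a hypothetical stand-in for the generated Envelope class (only the two fields the dispatcher needs), and PayloadDispatcher is an illustrative name, not something provided by Protobuf.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a received Envelope: only the two fields the
// dispatcher needs. In real code this would be the generated Envelope class.
struct ReceivedEnvelope {
  std::uint32_t payload_type_id;
  std::string payload_bytes;  // the still-serialized payload
};

// Maps each payload type ID to the handler that knows how to decode it.
class PayloadDispatcher {
 public:
  using Handler = std::function<void(const std::string &payload_bytes)>;

  // Registers (or replaces) the handler for `type_id`.
  void Register(std::uint32_t type_id, Handler handler) {
    assert(handler && "handler must be callable");
    handlers_[type_id] = std::move(handler);
  }

  // Invokes the matching handler; returns false for unknown type IDs so
  // the caller can log the problem and drop the message.
  bool Dispatch(const ReceivedEnvelope &envelope) const {
    const auto it = handlers_.find(envelope.payload_type_id);
    if (it == handlers_.end()) {
      return false;
    }
    it->second(envelope.payload_bytes);
    return true;
  }

 private:
  std::unordered_map<std::uint32_t, Handler> handlers_;
};
```

In the setting of this post, each registered handler would typically wrap a call such as Print<Server>(envelope) or Print<Sources>(envelope), and a false return from Dispatch corresponds to the "Unknown payload type" branch shown earlier.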
(De)Serializing using JSON
Protocol Buffers version 3 added support for JSON, which makes emitting a Protobuf as a JSON string a fairly trivial matter:
#include <google/protobuf/util/json_util.h>
std::string jsonOut;
google::protobuf::util::JsonPrintOptions options;
options.add_whitespace = true;
options.always_print_primitive_fields = true;
auto status = google::protobuf::util::MessageToJsonString(envelope, &jsonOut, options);
if (status.ok()) {
  // do something with the string
} else {
  LOG(ERROR) << "Could not convert Envelope Protobuf to JSON: " << status.error_message();
}
This could be all that is needed if one wants to serialize an Envelope over a REST call to an API that only accepts JSON as the request Content-Type (application/json).
On the other hand, being able to manipulate the JSON natively as a semi-structured object, or even converting it back to Protobuf may be more desirable: in this case, using nlohmann/json could be preferable.
This requires downloading the json.hpp header file from the GitHub repository, and then making it available to your C++ code:
#include <json.hpp>
using nlohmann::json;
// Extract to JSON string and parse into a `json` object.
auto status = google::protobuf::util::MessageToJsonString(envelope, &jsonOut, options);
if (status.ok()) {
  auto data = json::parse(jsonOut);
  // The entire envelope, including payload, can be serialized as JSON:
  cout << data.dump(2) << endl;
  // Or you can extract specific fields:
  if (data["payload"].contains("server")) {
    cout << "Server: " << data["payload"]["server"] << endl;
  }
}
// ...
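One detail worth keeping in mind when picking individual fields out of the json object: by default, the proto3 JSON mapping emits field names in lowerCamelCase, so server_id appears as serverId and payload_type_id as payloadTypeId (setting preserve_proto_field_names in JsonPrintOptions keeps the original names). The sketch below mimics that conversion for plain ASCII snake_case names; ToJsonFieldName and JsonFieldNameExample are hypothetical helpers for illustration, not Protobuf functions.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>

// Converts a snake_case proto field name to lowerCamelCase: each underscore
// is dropped and the character that follows it is upper-cased.
inline std::string ToJsonFieldName(std::string_view proto_name) {
  std::string json_name;
  json_name.reserve(proto_name.size());
  bool capitalize_next = false;
  for (const char c : proto_name) {
    if (c == '_') {
      capitalize_next = true;
    } else if (capitalize_next) {
      json_name += static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
      capitalize_next = false;
    } else {
      json_name += c;
    }
  }
  return json_name;
}

// Example usage with field names from the Envelope and Server messages.
inline void JsonFieldNameExample() {
  assert(ToJsonFieldName("server_id") == "serverId");
  assert(ToJsonFieldName("payload_type_id") == "payloadTypeId");
  assert(ToJsonFieldName("location") == "location");
}
```

With the default options, then, the server ID would be looked up under the key "serverId" inside the payload object rather than under "server_id".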
Converting the JSON back to PBs (e.g., when receiving a response from the REST API) is slightly more complicated, depending on whether one handles Envelope objects (which carry enough type information to handle the Any field), or just the payload types (which don’t):
// Finally, from the JSON, one can just reconstruct the Payload
auto payload = data["payload"];
// Here we validate the type to be what we expect, or drop it.
cout << "Received a JSON payload of type: " << payload["@type"] << endl;
if (payload["@type"].is_string()) {
  auto type = payload["@type"].get<std::string>();
  if (type.find("io.kapsules.clients.Server") != std::string::npos) {
    Server fromJson;
    // We need to remove the @type annotation, or Google Protobuf parser will complain
    // The Server PB has no knowledge of the Any configured in Envelope.
    payload.erase("@type");
    status = google::protobuf::util::JsonStringToMessage(payload.dump(), &fromJson);
    if (status.ok()) {
      cout << "Protobuf extracted from JSON payload:" << endl;
      cout << fromJson;
    } else {
      LOG(ERROR) << "Could not convert server from JSON: " << status.error_message();
    }
  }
}
For the Envelope it is more straightforward:
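Envelope fromJson;
// `jsonData` is a nlohmann::json object holding the whole envelope (such as
// `data` above); its payload must still carry the "@type" annotation, which
// the parser needs in order to resolve the Any field.
google::protobuf::util::Status status =
    google::protobuf::util::JsonStringToMessage(jsonData.dump(), &fromJson);
if (status.ok()) {
  if (fromJson.has_payload() && fromJson.payload().Is<Server>()) {
    Server server;
    // Unpack from the parsed Envelope's payload, not from the JSON object.
    if (fromJson.payload().UnpackTo(&server)) {
      // Do something with `server`
      // ...
    }
  }
} else {
  LOG(ERROR) << "Could not convert envelope from JSON: " << status.error_message();
}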
Conclusion
The Envelope Wrapper pattern is widely used in messaging and streaming architectures, which enable highly scalable, maintainable systems by adopting the “separation of concerns” approach.
Distributed systems require one to design abstractions that strike the right balance between generality and usability, in particular to ensure that bugs are kept to a minimum and development agility is not disrupted (bearing in mind that code is “written once, read many, many times”).
Statically typed languages and strongly-typed APIs allow the latter, but can make adopting the former cumbersome and overly verbose: Google Protocol Buffers’ Any type (and associated facilities) helps in this respect and, with some help from third-party libraries, can be made even more flexible and useful.