In my previous post, we learned about the current and future state of reflection in C++. But I left a few questions unanswered. Indeed, you may still be wondering why I care so much about reflection and whether it has any useful applications for the average programmer. In this post, I’ll try to answer those questions with real code examples using the two reference implementations of C++ reflection. I’ll explore the strengths of the two implementations, as well as the major limitations. These examples make heavy use of metaprogramming and C++17 features, so if you find yourself in unfamiliar territory while reading the code, I suggest supplementing this article with other resources.

When I refer to the reflexpr implementation, I’m talking about Matúš Chochlík’s fork of Clang which implements P0194, by Chochlík, Axel Naumann, and David Sankel.

When I refer to cpp3k, I’m talking about Andrew Sutton’s fork of Clang which implements P0590R0, by Sutton and Herb Sutter.

Disclaimer: these reference implementations do not represent the final state of the reflection proposals. They are simply prototypes and the API that may end up in the language is highly subject to change!

Generating comparison operators

Have you ever written a tedious equality operator that checked for the equality of every member of a regular type? Have you ever written a code generation tool to generate such a function? You’re not the only one. In fact, this is such a problem that someone wrote a standards proposal for adding default comparison operators for regular types to the language (N3950).

It turns out that with reflection, you can write a generic equality operator for any type composed of equality-comparable members!

reflexpr

This implementation uses the detection idiom to check if the type T has a valid equality operator. If it does, we return the result of that equality comparison for the two input objects. Otherwise, we recursively call “equal” on each member of T. If the type is neither equality comparable nor a record (something with members), then we can’t compare T for equality, and the static_assert stops compilation.

template<typename T>
bool equal(const T& a, const T& b);

template<typename ...MetaMembers>
struct compare_fold {
  template<typename T>
  static constexpr auto apply(const T& a, const T& b) {
    return (equal(
      a.*meta::get_pointer<MetaMembers>::value,
      b.*meta::get_pointer<MetaMembers>::value) && ...);
  }
};

template<typename T>
bool equal(const T& a, const T& b) {
  if constexpr (metap::is_detected<metap::equality_comparable, T>{}) {
    return a == b;
  } else {
    using MetaT = reflexpr(T);
    static_assert(meta::Record<MetaT>,
      "Type contained a member which has no comparison operator defined.");
    return meta::unpack_sequence_t<
      meta::get_data_members_m<MetaT>, compare_fold>::apply(a, b);
  }
}

Note that metap is simply my own namespace that provides some metaprogramming utilities.
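For readers who haven’t met the detection idiom, here is a rough sketch of what utilities like is_detected and equality_comparable could look like in standard C++17. This is not the actual metap code: the sketch namespace, the detector helper and the two test structs are assumptions made for illustration.

```cpp
#include <type_traits>
#include <utility>

namespace sketch {

// Primary template: chosen when Op<Args...> is ill-formed.
template<typename, template<typename...> class Op, typename... Args>
struct detector : std::false_type {};

// Specialization: chosen when Op<Args...> names a valid type.
template<template<typename...> class Op, typename... Args>
struct detector<std::void_t<Op<Args...>>, Op, Args...> : std::true_type {};

template<template<typename...> class Op, typename... Args>
using is_detected = detector<void, Op, Args...>;

// Archetype: well-formed only if `a == b` compiles for two const T&.
template<typename T>
using equality_comparable =
    decltype(std::declval<const T&>() == std::declval<const T&>());

// Two hypothetical test types, one with and one without operator==.
struct with_eq { int x; };
inline bool operator==(const with_eq& a, const with_eq& b) { return a.x == b.x; }

struct without_eq { int x; };

}  // namespace sketch
```

With these definitions, is_detected<equality_comparable, T>{} converts to true for int and with_eq and to false for without_eq, which is exactly the switch that if constexpr needs.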

The trickiest part of this example is the use of meta::unpack_sequence_t and fold expressions. To get metainfo for each member of T, we apply get_data_members_m to reflexpr(T), the metainfo for T. This yields what the proposal currently calls a meta-object sequence, which must be unpacked using unpack_sequence_t. unpack_sequence_t takes two arguments: the meta-object sequence, which is represented as a type, and a struct template parameterized on a pack of types. It returns that template instantiated with the elements of the sequence as its parameter pack.

This may seem weird to you, but it’s also the only way the authors of this proposal could see to transfer a compile-time sequence of types to variadic template arguments. Once we have the metainfo for each member, we can compute equality for each member elegantly using a fold expression on the boolean “and” operator. Within that fold expression, we access members by using the member pointer provided by that metainfo (a.*p where p = meta::get_pointer<MetaMembers>::value).
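If you don’t have the reflexpr fork at hand, the unpacking trick can be modelled in plain C++17. The sketch below is only an approximation: type_list stands in for the meta-object sequence, member<Ptr> stands in for a member’s metainfo, and the member list for point is written by hand where reflection would generate it. All of these names are assumptions, and this version compares members with == directly instead of recursing.

```cpp
namespace sketch {

// Stand-in for a meta-object sequence: a plain list of types.
template<typename... Ts>
struct type_list {};

// Stand-in for one member's metainfo: a member pointer wrapped in a type.
template<auto Ptr>
struct member {
  static constexpr auto value = Ptr;
};

// Instantiates the variadic template F with the elements of a type_list.
template<typename List, template<typename...> class F>
struct unpack_sequence;

template<typename... Ts, template<typename...> class F>
struct unpack_sequence<type_list<Ts...>, F> {
  using type = F<Ts...>;
};

template<typename List, template<typename...> class F>
using unpack_sequence_t = typename unpack_sequence<List, F>::type;

// Compares every listed member using a fold over &&.
template<typename... Members>
struct compare_fold {
  template<typename T>
  static constexpr bool apply(const T& a, const T& b) {
    return ((a.*Members::value == b.*Members::value) && ...);
  }
};

struct point { int x; int y; };

// Written by hand here; reflection would produce this list for us.
using point_members = type_list<member<&point::x>, member<&point::y>>;

constexpr bool equal(const point& a, const point& b) {
  return unpack_sequence_t<point_members, compare_fold>::apply(a, b);
}

}  // namespace sketch
```

Note how Members::value has to appear once per operand inside the fold so that the pack can be expanded.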

One thing that’s a bit odd about this example is that we have to forward-declare the equal function so that we can call it within the apply function which applies the fold expression, because equal also refers to compare_fold::apply. Another syntactic nitpick is that we have to write get_pointer<MetaMembers>::value twice within the fold expression so that MetaMembers can be expanded; if we stored the result in a local variable, the compiler would complain about an unexpanded parameter pack.

cpp3k

The basic idea of this example is the same as the previous one.

template<typename T>
bool equal(const T& a, const T& b) {
  if constexpr (metap::is_detected<metap::equality_comparable, T>{}) {
    return a == b;
  } else {
    bool result = true;
    meta::for_each($T.member_variables(),
      [&a, &b, &result](auto&& member){
        result &= equal(a.*member.pointer(), b.*member.pointer());
      }
    );
    return result;
  }
}

You may find it shorter and more elegant due to the use of value semantics instead of type semantics for accessing metainformation. The most important difference is the use of meta::for_each instead of unpack_sequence_t. meta::for_each implements a for loop over heterogeneous types. It allows us to write the equality comparison as a lambda function. This has the advantage that it requires less syntactic overhead than defining a struct, but it requires us to capture our inputs into the lambda, which could be annoying if there’s a lot of state that needs to be shared. More importantly, it requires us to initialize the result and capture it. In this example, it’s trivially known what the initial state of the comparison should be, but there could be cases where the initial state is not known. unpack_sequence_t allows us to directly access the result of the operation we wrote over the members.

The reflexpr implementation also offers an implementation of for_each with a similar interface: it accepts an object sequence like the result of get_data_members_m as a type parameter and a function object and applies the function over the sequence. Thus you can write something very similar to what I showed here with reflexpr. I just wanted to show the two different styles of iterating over members.
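To get a feel for the for_each style without either fork, a tuple of member pointers plus std::apply gives a rough approximation in standard C++17. Everything in the sketch namespace is an assumption made for illustration: for_each_element is not the meta::for_each of either implementation, and person_members is written by hand where reflection would supply the list.

```cpp
#include <string>
#include <tuple>
#include <utility>

namespace sketch {

// Calls f on each tuple element in order; the elements may differ in type.
template<typename Tuple, typename F>
void for_each_element(Tuple&& t, F&& f) {
  std::apply([&f](auto&&... elems) { (f(elems), ...); },
             std::forward<Tuple>(t));
}

struct person {
  std::string name;
  int age;
};

// Hand-written stand-in for the reflected list of member variables.
constexpr auto person_members = std::make_tuple(&person::name, &person::age);

inline bool equal(const person& a, const person& b) {
  bool result = true;  // the initial state has to be chosen up front
  for_each_element(person_members, [&](auto member) {
    result = result && (a.*member == b.*member);
  });
  return result;
}

}  // namespace sketch
```

The lambda captures both inputs and the accumulator, which is the trade-off described above: less ceremony than a struct, but the initial value of result has to be known before the loop starts.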

Serialization and deserialization

If you want to save data to hard disk or share data across different processes or machines, you need serialization and deserialization. Converting information between a structured representation that can be understood by a program and an unstructured storage format is a fundamentally important problem for practical applications.

In the C++ world, libraries like Boost Serialization, Cereal, yaml-cpp, and Niels Lohmann’s JSON parser are indispensable for this problem. These libraries often include adaptors for standard library containers like vector and map. However, they do not have true generic power: for your custom data types, you have to specify the data layout for serialization, e.g. which data fields in the serialization format go into which member fields of the type. Reflection makes it possible to infer that data layout automatically from the definition of the C++ struct.

For certain data types, you might want more control over which fields are a part of the serialization format. In the case of the standard library containers, you probably don’t want to serialize members used for bookkeeping or metadata that exposes implementation details. But for the case of POD types which represent configuration maps or protocol messages, generic serialization saves us the pain of writing and maintaining a lot of repetitive boilerplate code.

In this example we’ll write a simple JSON serializer. If you’re closely following reflection and metaprogramming, you may recall that Louis Dionne’s keynote at Meeting C++ 2016 showed an example of JSON serialization using reflection and value semantic metaprogramming. But he did not show JSON deserialization, which requires some extra thought. Since I’m not using Boost Spirit here, the task requires a lot of token parsing boilerplate that I’ll omit for the sake of brevity, but you can always see the full example on GitHub.

reflexpr

The basic idea behind serialization is straightforward: like the equality operator example, we’ll apply a serialization operation over the members of a struct if they are serializable primitives (numeric types or strings) and recursively call the serialize function on the member if it is a Record.

We’ll use if constexpr and a mix of type traits and the detection idiom for the “base cases”. stringable detects if std::to_string can be called with the type. iterable detects, roughly, if a type can be used in a range-based for loop, like a vector or array (although right now it’s not a bulletproof implementation). The if constexpr block conditioned on this type trait will map the type to a JSON array of its values.
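As a rough idea of what such traits can look like, here is a sketch in standard C++17 using std::void_t. These are not the actual metap definitions; the names is_stringable and is_iterable are assumptions, and like the original they are far from bulletproof.

```cpp
#include <iterator>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

namespace sketch {

// True when std::to_string can be called with a T.
template<typename T, typename = void>
struct is_stringable : std::false_type {};

template<typename T>
struct is_stringable<
    T, std::void_t<decltype(std::to_string(std::declval<T>()))>>
    : std::true_type {};

// True when std::begin and std::end both accept a const T&.
template<typename T, typename = void>
struct is_iterable : std::false_type {};

template<typename T>
struct is_iterable<
    T, std::void_t<decltype(std::begin(std::declval<const T&>())),
                   decltype(std::end(std::declval<const T&>()))>>
    : std::true_type {};

}  // namespace sketch
```

Two details explain the order of the branches in a serializer built on traits like these: bool promotes to int and therefore counts as stringable, and std::string counts as iterable, so both have to be handled before the generic cases are reached.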

template<typename T>
auto serialize(const T& src, std::string& dst) {
  if constexpr (std::is_same<T, std::string>{}) {
    dst += "\"" + src + "\"";
    return serialize_result::success;
  } else if constexpr (std::is_same<T, bool>{}) {
    dst += src ? "true" : "false";
    return serialize_result::success;
  } else if constexpr (metap::is_detected<metap::stringable, T>{}) {
    dst += std::to_string(src);
    return serialize_result::success;
  } else if constexpr (metap::is_detected<metap::iterable, T>{}) {
    // This structure has an array-like layout.
    dst += "[ ";
    for (auto it = src.begin(); it != src.end(); ++it) {
      const auto& entry = *it;
      auto result = serialize(entry, dst);
      if (result != serialize_result::success) {
        return result;
      }
      // std::next (from <iterator>) also works for non-random-access iterators.
      if (std::next(it) != src.end()) {
        dst += ", ";
      }
    }
    }
    dst += " ]";
    return serialize_result::success;

To handle the case where T is a POD type, we’ll recursively apply the serialize function over the members of T using reflection. get_base_name_v gets the name of the member from the metainfo. We’ll use this as the key name in the JSON object.

    using MetaT = reflexpr(T);
    meta::for_each<meta::get_data_members_m<MetaT>>(
      [&src, &dst, &result](auto&& member_info){
        using MetaInfo = std::decay_t<decltype(member_info)>;
        dst += std::string("\"") + meta::get_base_name_v<MetaInfo> + "\"" + " : ";
        if (result = serialize(src.*meta::get_pointer<MetaInfo>::value, dst);
            result != serialize_result::success) {
          return;
        }
        dst += ", ";
      });
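If you’d like to run the member loop without a reflection-enabled compiler, the sketch below imitates it in standard C++17. It is only an approximation under stated assumptions: the field struct stands in for member metainfo (a name plus a member pointer), config_fields is written by hand where reflection would generate it, and serialize_value only covers three primitive types. Unlike the snippet above, it writes the separator before every member except the first, so no trailing comma needs to be cleaned up.

```cpp
#include <string>
#include <tuple>

namespace sketch {

// A name paired with a member pointer: roughly what member metainfo offers.
template<typename Class, typename Member>
struct field {
  const char* name;
  Member Class::* pointer;
};

struct config {
  int port;
  bool verbose;
  std::string host;
};

// Hand-written stand-in for the reflected member list of config.
constexpr auto config_fields = std::make_tuple(
    field<config, int>{"port", &config::port},
    field<config, bool>{"verbose", &config::verbose},
    field<config, std::string>{"host", &config::host});

inline void serialize_value(const std::string& s, std::string& dst) {
  dst += "\"" + s + "\"";
}
inline void serialize_value(bool b, std::string& dst) {
  dst += b ? "true" : "false";
}
inline void serialize_value(int i, std::string& dst) {
  dst += std::to_string(i);
}

inline std::string serialize(const config& src) {
  std::string dst = "{ ";
  bool first = true;
  // Writes one "name" : value pair, preceded by a comma if needed.
  auto write_field = [&](const auto& f) {
    if (!first) dst += ", ";
    first = false;
    dst += std::string("\"") + f.name + "\" : ";
    serialize_value(src.*f.pointer, dst);
  };
  std::apply([&](const auto&... f) { (write_field(f), ...); }, config_fields);
  dst += " }";
  return dst;
}

}  // namespace sketch
```

For a config holding 8080, true and "localhost", this produces { "port" : 8080, "verbose" : true, "host" : "localhost" }.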

Deserialization is where it gets more interesting. I’ll skip the part of the code that deals with primitive types as well as the parser boilerplate, and show the parts related to reflection.

First, we count the colons and commas in the outermost scope of the JSON object that we are mapping to our member, and return an error if the number of colons does not match the number of data members (since each colon represents one key-value mapping):

    if (n_colons != meta::get_size<meta::get_data_members_m<MetaT>>{}) {
      return deserialize_result::mismatched_type;
    }

For every key, value pair in the JSON object, we’ll find the string representing the key and the string representing the value. Then, we need to match the key string in the set of possible member names for the struct we are deserializing JSON into. Because the key string is not known at compile time, we will have to pay some runtime cost to do this lookup. For now, we’ll simply loop over the members of the struct and compare the runtime string key to the name of each member.

      meta::for_each<meta::get_data_members_m<MetaT>>(
        [&dst, &key, &value_token, &result](auto&& metainfo) {
          using MetaInfo = std::decay_t<decltype(metainfo)>;
          constexpr auto name = meta::get_base_name_v<MetaInfo>;
          if (key == name) {
            constexpr auto p = refl::get_member_pointer<T, name>();
            if (result = deserialize(value_token, dst.*p);
                result != deserialize_result::success) {
              return;
            }
          }
        }
      );

As you can see here, if the key matches the name of the member, we’ll grab the type of the member from the metainfo, and retrieve the member pointer corresponding to that member.
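The runtime key lookup can be tried out in standard C++17 with the same kind of hand-written member table. This is a minimal sketch, assuming the value has already been tokenized: field, config_fields, parse_value and assign_by_name are hypothetical names, and std::stoi stands in for the real token parsing.

```cpp
#include <string>
#include <string_view>
#include <tuple>

namespace sketch {

// A name paired with a member pointer, standing in for member metainfo.
template<typename Class, typename Member>
struct field {
  std::string_view name;
  Member Class::* pointer;
};

struct config {
  int port = 0;
  std::string host;
};

// Hand-written stand-in for the reflected member list of config.
constexpr auto config_fields = std::make_tuple(
    field<config, int>{"port", &config::port},
    field<config, std::string>{"host", &config::host});

// Converts an already-tokenized value into the member's type.
inline void parse_value(const std::string& token, int& out) {
  out = std::stoi(token);
}
inline void parse_value(const std::string& token, std::string& out) {
  out = token;
}

// Returns true if some member is named `key` and has been assigned.
inline bool assign_by_name(config& dst, std::string_view key,
                           const std::string& token) {
  bool matched = false;
  auto try_field = [&](const auto& f) {
    if (key == f.name) {  // runtime comparison against each member name
      parse_value(token, dst.*f.pointer);
      matched = true;
    }
  };
  std::apply([&](const auto&... f) { (try_field(f), ...); }, config_fields);
  return matched;
}

}  // namespace sketch
```

Every member name is compared against the key, so the cost is linear in the number of members, just like the loop above.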

I’ve added a couple of utilities here to make this code more readable and brief.

get_member_pointer is a utility that maps the constexpr string name of a member to the member index, and then retrieves the member pointer corresponding to that member.

template<typename T, auto Str>
constexpr auto get_member_pointer() {
  return meta::get_pointer<
    meta::get_element_m<
      meta::get_data_members_m<reflexpr(T)>,
      index_of_member<T, Str>()
    >
  >::value;
}

The implementation of index_of_member is also a bit funny. We compute a fold expression over each member of the struct again, comparing the constexpr string name to the name of the member. If the name matches, we add the index of that member to the result, otherwise we add zero.

template<typename T, auto Str, std::size_t ...I>
static constexpr auto index_helper(std::index_sequence<I...>) {
  return ((Str ==
            meta::get_base_name_v<
              meta::get_element_m<
                meta::get_data_members_m<reflexpr(T)>, I>
              > ? I : 0
          ) + ...);
}

template<typename T, auto Str>
static constexpr auto index_of_member() {
  return index_helper<T, Str>(std::make_index_sequence<n_fields<T>{}>{});
}
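The fold trick is easier to see in isolation. Here is a small sketch in standard C++17, with a hand-written array of names standing in for the reflected member names (member_names and both function signatures are assumptions for illustration):

```cpp
#include <array>
#include <cstddef>
#include <string_view>
#include <utility>

namespace sketch {

// Hand-written stand-in for the reflected member names of some struct.
constexpr std::array<std::string_view, 3> member_names = {"x", "y", "label"};

template<std::size_t... I>
constexpr std::size_t index_helper(std::string_view name,
                                   std::index_sequence<I...>) {
  // Each term is I when the name matches member I, and 0 otherwise.
  return ((name == member_names[I] ? I : 0) + ...);
}

constexpr std::size_t index_of_member(std::string_view name) {
  return index_helper(name, std::make_index_sequence<member_names.size()>{});
}

}  // namespace sketch
```

One caveat of summing indices this way: a name that matches nothing also yields 0, which is indistinguishable from a match on the first member. That is harmless as long as the function is only called with names known to exist, but a more defensive version would need a separate "found" flag.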

In this post, I’m following the “implement now, benchmark later” philosophy. If you’re obsessed with performance and the rather naive runtime-determined member lookup presented here bothered you, don’t worry. You might be able to imagine how we can improve O(n) runtime string comparisons and O(n) compile-time string comparisons, where n is the number of members of the struct. We’ll analyze the performance and see how we can do better… in the next blog post in my reflection series!

cpp3k

The cpp3k version of the same code has a similar structure, but is overall cleaner and more terse–to reiterate the point Louis made in his aforementioned keynote. This is how we loop over members to serialize them:

    meta::for_each($T.member_variables(),
      [&src, &dst, &result](auto&& member) {
        dst += std::string("\"") + member.name() + "\"" + " : ";
        if (result = serialize(src.*member.pointer(), dst);
            result != serialize_result::success) {
          return;
        }
        dst += ", ";
      }
    );

One notable issue with the current state of this implementation is that I couldn’t find a good “type trait” equivalent to the Record<T> concept, which simply returns true if T is a type that contains members. I don’t think this is an intentional omission from the cpp3k implementation, since this kind of introspectability is key for the kind of generic programming that reflection allows, and I have hope that Herb and Andrew understand that.

Anyway, I went ahead and implemented a type trait using the detection idiom so that I could switch on this concept using if constexpr. This is not a great implementation since it could easily be faked by another interface, but it gets the job done for this example:

template<typename T>
static constexpr bool is_member_type() {
  return metap::is_detected<has_member_variables, T>{};
}

The deserialization code is much cleaner and requires fewer helper functions because of the value semantics of this API: we can simply access the member pointer directly from the metainfo. (We are still matching the runtime string to a member metainfo by looping over each member.)

      meta::for_each($T.member_variables(),
        [&dst, &key, &value_token, &result](auto&& member) {
          if (key == member.name()) {
            if (result = deserialize(value_token, dst.*member.pointer());
                result != deserialize_result::success) {
              return;
            }
          }
        }
      );

Program options and member annotation

Let’s start with a common problem in C++: you want to map int argc, char** argv from an incredibly primitive C-style array to a set of program configuration options, which you’ve encapsulated as a struct that gets passed around to initialize your application. You could write an “if” statement for each flag you want to recognize and manually stuff the options struct with the parsed values. Or, you could write a generic parse function that changes its behavior based on the layout of the options struct and some compile-time configuration options.

This example may remind you of a commonly used solution for this problem: Boost Program Options. Since this is just an example and not a fully-fledged library, it offers a much more limited interface, but my thought is that static reflection and metaprogramming can be used to implement a similar library with less runtime cost.

Of course, we will need to add some annotations to our program configuration, such as which flags specify which options.

Recall that C++11 added attributes to the language: annotations using [[double bracket]] syntax that may change how the compiler treats a function, a declaration, an expression, or pretty much anything. Ideally, we could simply add our own attributes and reflect on them (this is known as “user-defined attributes” in standard proposal-land).

Unfortunately the reflection proposals and their implementations don’t have user-defined attributes baked in. However, I’m going to show an implementation of annotated members for program options using reflection, some Boost Hana utilities and–you guessed it–macros.

The basic idea is that, within a struct, we call a macro REFLOPT_OPTION to define a member given its type, name, the flags which map to the member on the command line, and an optional help string describing what the option does.

struct ProgramOptions {
  REFLOPT_OPTION(std::string, filename, "--filename")
  REFLOPT_OPTION(int, iterations, "--iterations", "-i",
    "Number of times to run the algorithm.")
  REFLOPT_OPTION(bool, help, "--help", "-h", "Print help and exit.")
};

The REFLOPT_OPTION macro defines a member with the given type and name. It also defines a member of templated type Option that holds constexpr strings with the option-related metadata.

(Many thanks to Vittorio Romeo for inspiring me to write this example and for feedback on the syntax for annotating structs.)

We’ll define a parse function that detects if the input struct contains members which specialize Option and generates parser code based on the option metadata.

The parse function instantiates a helper OptionsMap type. At compile time, OptionsMap generates two runtime functions, contains and set, based on the layout of the struct which holds our program options. contains will check if a given argument flag is valid based on the Option metadata. set will set the field in the program options struct corresponding to the given flag to the value specified on the command line. parse’s return type is std::optional; it returns the filled program options struct if the given argument vector was valid or nullopt if it couldn’t figure out how to parse the arguments. (This should probably be an expected instead of an optional, but since the proposed interface of optional is simpler we’ll use that for the sake of demonstration.)

  template<typename OptionsStruct>
  optional_t<OptionsStruct> parse(int argc, char** const argv) {
    OptionsStruct options;
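    for (int i = 1; i < argc; i += 2) {
      // Every flag must be followed by a value; argv[argc] is a null pointer.
      if (i + 1 < argc && OptionsMap<OptionsStruct>::contains(argv[i])) {
        OptionsMap<OptionsStruct>::set(options, argv[i], argv[i + 1]);
      } else {
        // unknown prefix found, or a flag without a value
        return std::experimental::nullopt;
      }
    }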
    return options;
  }

OptionsMap<T> scans over all the members of T. For every member which is a specialization of Option, we look at the flags mapping to that option and build a compile-time map which associates the flag strings with the reflection metainfo of the member representing the program option. In the implementation of contains, we’ll take the runtime string input and compare it against the compile-time string keys of this map for a match. In the implementation of set, we’ll take the metainfo value associated with the compile-time key that matches the given flag, and retrieve the type of the member, so that we can convert the string value to the right type using boost::lexical_cast, and the member pointer, so that we can set the value in the program options struct.
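Before diving into the reflection-based versions, it may help to see the overall shape of contains, set and parse in standard C++17. The sketch below is an approximation under stated assumptions: option_table is a hand-written tuple standing in for the compile-time map that OptionsMap derives from the struct, the names in the sketch namespace are hypothetical, and std::istringstream replaces boost::lexical_cast to keep the example dependency-free.

```cpp
#include <optional>
#include <sstream>
#include <string>
#include <string_view>
#include <tuple>

namespace sketch {

struct options {
  std::string filename;
  int iterations = 0;
};

// One table entry: a flag and the member it sets.
template<typename Member>
struct option {
  std::string_view flag;
  Member options::* pointer;
};

// Hand-written stand-in for the compile-time flag map.
constexpr auto option_table = std::make_tuple(
    option<std::string>{"--filename", &options::filename},
    option<int>{"--iterations", &options::iterations});

// True if `flag` appears in the table.
inline bool contains(std::string_view flag) {
  return std::apply(
      [&](const auto&... o) { return ((flag == o.flag) || ...); },
      option_table);
}

// Converts `value` to the member's type and stores it; false on failure.
inline bool set(options& opts, std::string_view flag, const char* value) {
  bool ok = false;
  auto try_option = [&](const auto& o) {
    if (flag != o.flag) return;
    std::istringstream in(value);
    ok = static_cast<bool>(in >> opts.*o.pointer);
  };
  std::apply([&](const auto&... o) { (try_option(o), ...); }, option_table);
  return ok;
}

inline std::optional<options> parse(int argc, const char* const* argv) {
  options opts;
  for (int i = 1; i < argc; i += 2) {
    // Reject a trailing flag without a value, unknown flags and bad values.
    if (i + 1 >= argc || !contains(argv[i]) ||
        !set(opts, argv[i], argv[i + 1])) {
      return std::nullopt;
    }
  }
  return opts;
}

}  // namespace sketch
```

The stream extraction is deliberately lenient (it stops at the first whitespace and accepts trailing junk after a number), so a real implementation would want stricter validation.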

reflexpr

One key helper function we need for this example is get_metainfo_for, which retrieves the metainfo for a member given a compile-time string representing its name. This requires some boilerplate since associative access of members based on the name of the identifier is not a part of the proposal, and because the constexpr string representation chosen by the proposal cannot be used as a key in a Hana compile-time map.

  template<typename MetaT>
  struct index_metainfo_helper {
    template<typename Id, size_t I, size_t ...J>
    static constexpr bool equals_member(std::index_sequence<J...>&&) {
      return ((Id{}[hana::size_c<J>] == meta::get_base_name_v<
                  meta::get_element_m<
                    meta::get_data_members_m<MetaT>,
                    I
                  >
                >[J]) && ...);
    }

    template<typename Id, size_t ...Index>
    static constexpr auto apply(Id&&, std::index_sequence<Index...>) {
      return ((equals_member<Id, Index>(
               std::make_index_sequence<hana::length(Id{})>{}) ? Index : 0) + ...);
    }
  };

  template<typename T, typename Id>
  static constexpr auto get_metainfo_for(Id&&) {
    using MetaT = reflexpr(T);
    constexpr auto index = index_metainfo_helper<MetaT>::apply(Id{},
        std::make_index_sequence<refl::n_fields<T>{}>{});
    return meta::get_element_m<meta::get_data_members_m<MetaT>, index>{};
  }
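The character-by-character comparison inside equals_member is the core of this helper, and it can be sketched in standard C++17 with std::string_view standing in for both the Hana string and the reflected name. The function names are assumptions. The explicit length check is an addition in this sketch: it prevents an identifier from matching a longer name that it is merely a prefix of. Whether the original needs such a check depends on details of the string representations that aren’t shown here.

```cpp
#include <cstddef>
#include <string_view>
#include <utility>

namespace sketch {

// Compares the first sizeof...(J) characters of a and b with a fold.
template<std::size_t... J>
constexpr bool equal_chars(std::string_view a, std::string_view b,
                           std::index_sequence<J...>) {
  return ((a[J] == b[J]) && ...);
}

template<std::size_t N>
constexpr bool same_name(const char (&id)[N], std::string_view member) {
  // N counts the terminating '\0', so the identifier has N - 1 characters.
  return member.size() == N - 1 &&
         equal_chars(id, member, std::make_index_sequence<N - 1>{});
}

}  // namespace sketch
```

Because && short-circuits, the characters are only indexed once both strings are known to have the same length.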

(If you have thoughts on how to clean up this section of the code and/or the below cpp3k implementation, pull requests or comments are welcome! :])

In terms of syntactic overhead and code aesthetics, the one place where the raw reflexpr API has an advantage over cpp3k is when you want to directly grab a type and use it in a template (angle-bracket) context. You can see this in the implementation of set:

    static auto set(OptionsStruct& options, const char* prefix, const char* value) {
      hana::for_each(hana::keys(prefix_map),
        [&options, &prefix, &value](auto&& key) {
          if (runtime_string_compare(key, prefix)) {
            constexpr auto info = hana::at_key(prefix_map, std::decay_t<decltype(key)>{});
            using MetaInfo = std::decay_t<decltype(info)>;
            constexpr auto member_pointer = meta::get_pointer<MetaInfo>::value;
            using MemberType = meta::get_reflected_type_t<meta::get_type_m<MetaInfo>>;
            options.*member_pointer = boost::lexical_cast<MemberType>(
              value, strnlen(value, max_value_length));
          }
        }
      );
    }

As we’ll see, the cpp3k implementation will require a little more work to unwrap a type from a value to be used in the same way.

cpp3k

The implementation of get_metainfo_for is slightly nicer than above, but not by much.

  template<typename T, typename Id, size_t I, size_t ...J>
  static constexpr bool equals_member(std::index_sequence<J...>&&) {
    return ((Id{}[hana::size_c<J>] ==
             cpp3k::meta::cget<I>($T.member_variables()).name()[J]) && ...);
  }

  template<typename T>
  struct get_matching_index {
    template<typename Id, std::size_t ...I>
    static constexpr std::size_t apply(Id&& id, std::index_sequence<I...>) {
      constexpr auto N = hana::length(Id{});
      return ((equals_member<T, Id, I>(
               std::make_index_sequence<N>{}) ? I : 0) + ...);
    }
  };

  template<typename T, typename Id>
  static constexpr auto get_metainfo_for(Id&& id) {
    constexpr auto N = $T.member_variables().size();
    constexpr auto index = get_matching_index<T>::apply(
      Id{}, std::make_index_sequence<N>{});
    return cpp3k::meta::cget<index>($T.member_variables());
  }

Notice that after getting the index corresponding to the identifier we use a new utility from cpp3k: cget, the constexpr free function that accesses the heterogeneous sequence container which results from $T.member_variables().

In the implementation of set, we need to retrieve the member type from the metainfo.

Now, this may be another bug in the cpp3k implementation, but I couldn’t find a way to gracefully retrieve the type of the member from the metainfo value. Ideally we could do something like this:

auto use_metainfo = [](auto&& metainfo, const char* str) {
  using MemberT = typename std::decay_t<decltype(metainfo)>::type;
  return boost::lexical_cast<MemberT>(str);
};

But trying to retrieve the type like this didn’t compile, so I had to write an unreflect_member_t helper to do this.

    static auto set(OptionsStruct& options, const char* prefix, const char* value) {
      hana::for_each(hana::keys(prefix_map),
        [&options, &prefix, &value](auto&& key) {
          if (runtime_string_compare(key, prefix)) {
            constexpr auto info = hana::at_key(prefix_map, std::decay_t<decltype(key)>{});
            constexpr auto member_pointer = info.pointer();
            using MemberType = refl::unreflect_member_t<OptionsStruct, decltype(info)>;
            options.*member_pointer = boost::lexical_cast<MemberType>(
              value, strnlen(value, max_value_length));
          }
        }
      );
    }

The implementation of unreflect_member_t is not pretty, which makes me think the lack of type retrieval is an unintentional omission:

template<typename S, typename Member>
using unreflect_member_t = typename std::decay_t<
    decltype(std::declval<S>().*(std::decay_t<Member>::pointer()))>;
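As a point of comparison, when a member pointer is available as a plain value, standard C++17 can recover the member type with a partial specialization instead of declval gymnastics. This is a general technique rather than anything taken from cpp3k; member_type, member_type_t and the options struct below are hypothetical names for illustration.

```cpp
#include <string>
#include <type_traits>

namespace sketch {

template<typename MemberPointer>
struct member_type;

// Splits a pointer-to-member type into its class and member type.
template<typename Class, typename Member>
struct member_type<Member Class::*> {
  using type = Member;
};

// Convenience alias that takes the pointer itself as a template argument.
template<auto Ptr>
using member_type_t = typename member_type<decltype(Ptr)>::type;

struct options {
  std::string filename;
  int iterations;
};

}  // namespace sketch
```

With this, member_type_t<&options::iterations> is int and member_type_t<&options::filename> is std::string, which is the type a conversion like lexical_cast needs.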

And that’s about it! If you’re feeling brave, you can check out the complete implementation on GitHub, clone one of the reference implementations and play around with these examples–have fun!

In conclusion: what’s missing?

Reading this blog post, you might feel as if there’s something missing. Maybe it was the code smell coming from the macro used for member annotations in the program options example. Or maybe you just aren’t impressed by the examples so far.

The reflection proposals we’ve looked at do not extend the code generation features of the language. Unfortunately, while brainstorming what I could do with reflection, I had a lot of ideas for utilities that required the ability to add new type members based on the metainformation resulting from reflection, such as:

  • Auto-generation of mock classes. Frameworks like the impressive trompeloeil library require code duplication of the mocked interface. Reflection as it is currently proposed doesn’t solve this problem.
  • Function instrumentation: this requires essentially the same features needed for mocking. The idea is to automatically generate a class with the same interface as the input class that counts the number of calls to a function or other statistics for each function in a class.
  • Generating code for “virtual concepts”/“interfaces”/“Rust-style traits”. See also dyno.

Actually, I didn’t include any function reflection examples, even though it is implemented in the cpp3k fork, because the most interesting and useful applications of function reflection require this missing feature! I didn’t see a compelling and immediately obvious application of function introspection for the “everyday programmer”.

But if you’re still reading, you’re probably not an everyday programmer! Function reflection could make dealing with overload sets a lot simpler in generic libraries. The user of a library could add a custom callback by overloading a function, and the library could simply reflect over all overloads of a function to find the customization point rather than using tricky ADL rules. I haven’t made a demo of this idea, but I’m looking forward to the upcoming talk at C++ Now by Michał Dominiak about “Customization Points that Suck Less”.

At this point, you might be saying “Jackie, if you have so many opinions about this feature, why don’t you write a proposal about it?” I’m just one person who has thought about the issue of type synthesis or identifier modification–and some of the others have many, many more years of C++ and software architecture development than I do! Herb Sutter’s upcoming proposal on metaclasses is a promising way forward to make reflection and more generic programming as powerful as it needs to be. Out of respect for Herb I’ll refrain from saying more until he’s officially released this work.

(For the curious reader: the very first thing I tried to do with the reflexpr fork was making a “type synthesis” example which included metafunctions for adding and removing members using constexpr strings. However, I haven’t polished this due to my realization that without language-level support for using reflection info in identifiers and standardization of more metaprogramming utilities like constexpr strings and heterogeneous data structures and algorithms, the interface for such a library is, in my opinion, unusable.)

In the third and final part of this series, we’ll take a look at the performance implications of reflection as it’s currently proposed–specifically, the metaprogramming techniques needed to make reflection useful.