C++ Boost

Serialization

Archive Class Reference


Trivial Archive
More Useful Archive Classes
Usage
Testing
Polymorphic Archives

Trivial Archive

The Archive concept specifies the functions that a class must implement in order to be used to serialize Serializable types. Our discussion will focus on archives used for saving, as the hierarchy is exactly analogous for archives used for loading data.

Minimum Requirements

The simplest class which will model the Archive concept will look like:

#include <cstddef> // std::size_t
#include <boost/mpl/bool.hpp> // boost::mpl::bool_
//////////////////////////////////////////////////////////////
// class trivial_oarchive
class trivial_oarchive {
public:
    //////////////////////////////////////////////////////////
    // public interface used by programs that use the
    // serialization library
    typedef boost::mpl::bool_<true> is_saving; 
    typedef boost::mpl::bool_<false> is_loading;
    template<class T> void register_type(){}
    template<class T> trivial_oarchive & operator<<(const T & t){
        return *this;
    }
    template<class T> trivial_oarchive & operator&(const T & t){
        return *this << t;
    }
    void save_binary(void *address, std::size_t count){}
};
The simplest possible input archive class is analogous to the above. In the following discussion, only output archives will be addressed. Input archives are exactly symmetrical to output archives.

This archive will compile and execute with any types which implement the Serializable concept. For an example see demo_trivial_archive.cpp. Of course this program won't produce any output as it is. But it provides the starting point for a simple class which can be used to log formatted output. See the implementation of a simple log archive to see how this has been done.
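As a rough illustration of how the trivial archive could grow into a logging archive, the following stand-alone sketch writes each primitive value to a stream. It does not use the library: the names sketch_log_oarchive, gps_position and log_position, and the direct call to a member serialize function, are assumptions made for this example. The real library dispatches through its own serialization machinery rather than calling serialize directly.

```cpp
#include <cassert>
#include <cstddef>      // std::size_t
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>  // std::is_arithmetic, std::true_type

// Hypothetical stand-alone log archive. Same public interface as
// trivial_oarchive, but arithmetic values are written to a stream.
class sketch_log_oarchive {
    std::ostream & os_;

    // arithmetic types are written directly, followed by a space
    template<class T>
    void save(const T & t, std::true_type) { os_ << t << ' '; }

    // class types describe themselves through their serialize member
    // (assumption: the real library uses a free-function dispatch)
    template<class T>
    void save(const T & t, std::false_type) {
        const_cast<T &>(t).serialize(*this, 0u);
    }
public:
    typedef std::true_type is_saving;
    typedef std::false_type is_loading;

    explicit sketch_log_oarchive(std::ostream & os) : os_(os) {}

    template<class T> void register_type() {}

    template<class T> sketch_log_oarchive & operator<<(const T & t) {
        save(t, typename std::is_arithmetic<T>::type());
        return *this;
    }
    template<class T> sketch_log_oarchive & operator&(const T & t) {
        return *this << t;
    }
    void save_binary(void *, std::size_t) {}
};

// Illustrative user type with the usual serialize member template.
struct gps_position {
    int degrees;
    int minutes;
    float seconds;

    template<class Archive>
    void serialize(Archive & ar, const unsigned int /* version */) {
        ar & degrees;
        ar & minutes;
        ar & seconds;
    }
};

// Helper that logs one object and returns the text produced.
inline std::string log_position(const gps_position & p) {
    std::ostringstream os;
    sketch_log_oarchive oa(os);
    oa << p;
    return os.str();
}
```

Because operator& simply forwards to operator<<, the same serialize function would serve an input archive that defines operator& in terms of operator>>.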

More Useful Archive Classes

The above example is fine as far as it goes. But it doesn't implement useful features such as serialization of pointers, class versioning and others. This library implements a family of full-featured archive classes appropriate for a variety of purposes.

Our archives have been factored into a tree of classes in order to minimize repetition of code. This is shown in the accompanying class diagram. Any class which fulfills the following requirements will fit into this hierarchy and implement all the features we require. Deriving from the base class detail::common_oarchive (found in common_oarchive.hpp) provides all the features we desire which are missing from trivial_oarchive above.



#include <cstddef> // std::size_t
#include <boost/archive/detail/common_oarchive.hpp>

/////////////////////////////////////////////////////////////////////////
// class complete_oarchive
class complete_oarchive : 
    public boost::archive::detail::common_oarchive<complete_oarchive>
{
    // permit serialization system privileged access to permit
    // implementation of inline templates for maximum speed.
    friend class boost::archive::save_access;

    // member template for saving primitive types.
    // Specialize for any types/templates that require special treatment
    template<class T>
    void save(T & t);

public:
    //////////////////////////////////////////////////////////
    // public interface used by programs that use the
    // serialization library

    // archives are expected to support this function
    void save_binary(void *address, std::size_t count);
};
Given suitable definitions of save and save_binary, any program using serialization with a conforming C++ compiler should compile and run with this archive class.

Optional Overrides

The detail::common_oarchive class contains a number of functions that are used by various parts of the serialization library to help render the archive in a particular form.

void save_start(char const *)

Default: Does nothing.
Purpose: To inject/retrieve an object name into the archive. Used by XML archive to inject "<name " before data.

void save_end(char const *)

Default: Does nothing.
Purpose: To inject/retrieve an object name into the archive. Used by XML archive to inject "</name>" after data.

void end_preamble()

Default: Does nothing.
Purpose: Called each time user data is saved. It's not called when archive bookkeeping data is saved. This is used by XML archives to determine when to inject a ">" character at the end of an XML header. XML output archives keep their own internal flag indicating that the data being written is header data. This internal flag is set when an object start tag is written. When void end_preamble() is invoked and this internal flag is set, a ">" character is appended to the output and the internal flag is reset. The default implementation of void end_preamble() is a no-op, thereby permitting it to be optimised away for archive classes that don't use it.

template<class T> void save_override(T & t, int);

Default: Invokes archive::save(Archive & ar, t)
This is the main entry into the serialization library.
Purpose: This can be specialized in cases where the data is to be written to the archive in some special way. For example, XML archives implement special handling for name-value pairs by overriding this function template for name-value pairs. This replaces the default name-value pair handling, which is just to throw away the name, with one appropriate for XML which writes out the start of an XML tag with the correct object name.

The second argument must be part of the function signature even though it is not used. Its purpose is to ensure that code is portable to compilers which fail to correctly implement partial function template ordering. For more information see this.
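The override mechanism described above can be sketched without the library. In this minimal, stand-alone approximation a CRTP base supplies the default save_override, which discards the name of a name-value pair, and an XML-like archive replaces it with one that writes tags. All names beginning with sketch_ are hypothetical stand-ins, not library classes, and the unused int argument is kept only to mirror the signature discussed above.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical name-value pair, standing in for the library's wrapper.
template<class T>
struct sketch_nvp {
    const char * name;
    const T & value;
};

template<class T>
sketch_nvp<T> make_sketch_nvp(const char * name, const T & value) {
    sketch_nvp<T> p = { name, value };
    return p;
}

// CRTP base playing the role of detail::common_oarchive in this sketch.
template<class Archive>
class sketch_common_oarchive {
protected:
    Archive * This() { return static_cast<Archive *>(this); }
public:
    // default: hand the value to the derived archive's save
    template<class T>
    void save_override(const T & t, int) { This()->save(t); }

    // default name-value pair handling: throw away the name
    template<class T>
    void save_override(const sketch_nvp<T> & p, int) {
        This()->save_override(p.value, 0);
    }

    template<class T>
    Archive & operator<<(const T & t) {
        This()->save_override(t, 0);
        return *This();
    }
    template<class T>
    Archive & operator&(const T & t) { return *This() << t; }
};

// Text-like archive: relies entirely on the defaults.
class sketch_text_oarchive :
    public sketch_common_oarchive<sketch_text_oarchive>
{
    friend class sketch_common_oarchive<sketch_text_oarchive>;
    std::ostream & os_;
    template<class T> void save(const T & t) { os_ << t << ' '; }
public:
    explicit sketch_text_oarchive(std::ostream & os) : os_(os) {}
};

// XML-like archive: overrides the handling of name-value pairs.
class sketch_xml_oarchive :
    public sketch_common_oarchive<sketch_xml_oarchive>
{
    typedef sketch_common_oarchive<sketch_xml_oarchive> base;
    friend class sketch_common_oarchive<sketch_xml_oarchive>;
    std::ostream & os_;
    template<class T> void save(const T & t) { os_ << t; }
public:
    explicit sketch_xml_oarchive(std::ostream & os) : os_(os) {}

    // everything else falls through to the base implementation
    template<class T>
    void save_override(const T & t, int) { base::save_override(t, 0); }

    // name-value pairs are wrapped in start and end tags
    template<class T>
    void save_override(const sketch_nvp<T> & p, int) {
        os_ << '<' << p.name << '>';
        this->save_override(p.value, 0);
        os_ << "</" << p.name << '>';
    }
};
```

Choosing between the two save_override templates relies on partial ordering of function templates, which is the language feature the paragraph above says some compilers implement incorrectly.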

Types used by the serialization library

The serialization library injects bookkeeping data into the serialization archive. This data includes things like object ids, version numbers, class names etc. Each of these objects is included in a wrapper so that the archive class can override the implementation of void save_override(T & t, int);. For example, in the XML archive, the override for this type renders an object_id equal to 23 as "object_id=_23". The following table lists the types defined in the boost::archive namespace used internally by the serialization library:

type                        default serialized as
version_type                unsigned int
object_id_type              unsigned int
object_id_reference_type    unsigned int
class_id_type               int
class_id_optional_type      nothing
class_id_reference_type     int
tracking_type               bool
classname_type              string

All of these are associated with a default serialization defined in terms of primitive types so it isn't a requirement to define save_override for these types.

These are defined in basic_archive.hpp. All of these types have been assigned an implementation level of primitive and are convertible to types such as int, unsigned int, etc. so that they have default implementations. This is illustrated by basic_text_iarchive.hpp, which relies upon the default. However, in some cases, overrides will have to be explicitly provided for these types. For an example see basic_xml_iarchive.hpp.
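The idea behind these wrapper types can be illustrated with a small stand-alone sketch. The class sketch_object_id_type and the two render functions are hypothetical and only mimic the pattern: a distinct type that converts to a primitive, so that a default rendering works unchanged while an archive may still overload on the wrapper itself.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical wrapper in the spirit of object_id_type: a distinct
// type that is convertible to the primitive it carries.
class sketch_object_id_type {
    unsigned int t_;
public:
    explicit sketch_object_id_type(unsigned int t) : t_(t) {}
    operator unsigned int() const { return t_; }
};

// Default rendering: accepts the wrapper through its conversion
// to unsigned int, so no dedicated overload is required.
inline std::string render_default(unsigned int value) {
    std::ostringstream os;
    os << value;
    return os.str();
}

// Override for an XML-like archive: because the wrapper is a type of
// its own, it can be given a rendering different from a plain integer.
inline std::string render_xml_attribute(const sketch_object_id_type & id) {
    std::ostringstream os;
    os << "object_id=_" << static_cast<unsigned int>(id);
    return os.str();
}
```

With an id of 23, the default path produces the bare number, while the override produces the attribute form quoted in the text above.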

In real practice, we probably won't be quite done. One or more of the following issues may need to be addressed:

The attached class diagram shows the relationships between classes used to implement the serialization library.

A close examination of the archives included with the library illustrates what it takes to make a portable archive that covers all data types.

Usage

The newly created archive will usually be stored in its own header module. All that is necessary is to include the header and construct an instance of the new archive, except for one special case. To make that case work, the following should be included after the archive class definition.

BOOST_SERIALIZATION_REGISTER_ARCHIVE(Archive)
Failure to do this will not inhibit the program from compiling, linking and executing properly - except in one case. If an instance of a derived class is serialized through a pointer to its base class, the program will throw an unregistered_class exception.

Testing

Exhaustive testing of the library requires testing the different aspects of object serialization with each archive. There are 46 different tests that can run with any archive. There are 5 "standard archives" included with the system (3 on systems that don't support wide character I/O).

In addition, there are 28 other tests which aren't related to any particular archive class.

The default bjam testing setup will run all the above described tests. This will result in as many as 46 archive tests * 5 standard archives + 28 general tests = 258 tests. Note that a complete test of the library would include DLL vs static library, release vs debug so the actual total would be closer to 1032 tests.
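The arithmetic behind these totals can be spelled out as follows. The constant names are illustrative, and the factor of four is an assumption taken from the two pairs of variants named above (DLL vs static library, release vs debug).

```cpp
#include <cassert>

// Figures quoted in the text above.
constexpr int archive_tests     = 46;  // tests that run with any archive
constexpr int standard_archives = 5;   // archives included with the system
constexpr int general_tests     = 28;  // tests not tied to an archive class

// Default bjam run: every archive test with every standard archive,
// plus the general tests.
constexpr int default_run =
    archive_tests * standard_archives + general_tests;

// Assumed: 2 link variants (DLL, static) x 2 build variants
// (release, debug).
constexpr int variants = 2 * 2;
constexpr int complete_run = default_run * variants;

static_assert(default_run == 258, "46 * 5 + 28");
static_assert(complete_run == 1032, "258 * 4");
```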

For each archive there is a header file in the test directory similar to the one below. The name of this archive is passed to the test program by setting the environment variable BOOST_ARCHIVE_TEST to the name of the header. Here is the header file text_archive.hpp. Test header files for other archives are similar.


// text_archive test header
#include <fstream> // std::ofstream, std::ifstream
// include output archive header
#include <boost/archive/text_oarchive.hpp>
// set name of test output archive
typedef boost::archive::text_oarchive test_oarchive;
// set name of test output stream
typedef std::ofstream test_ostream;

// repeat the above for input archive
#include <boost/archive/text_iarchive.hpp>
typedef boost::archive::text_iarchive test_iarchive;
typedef std::ifstream test_istream;

// define open mode for streams
//   binary archives should use std::ios_base::binary
#define TEST_STREAM_FLAGS (std::ios_base::openmode)0
To test a new archive, for example, portable binary archives, with the gcc compiler, make a header file portable_binary_archive.hpp and invoke bjam with
 
-sBOOST_ARCHIVE_LIST=portable_binary_archive.hpp
This process is encapsulated in the shell or cmd script library_test whose command line is

library_test --toolset=gcc -sBOOST_ARCHIVE_LIST=portable_binary_archive.hpp

Polymorphic Archives

Motivation

All archives described so far are implemented as templates. Code to save and load data to archives is regenerated for each combination of archive class and data type. Under these circumstances, a good optimizing compiler that can expand inline functions to enough depth will generate fast code. However:

Implementation

The solution is the pair polymorphic_oarchive and polymorphic_iarchive. They present a common interface of virtual functions - no templates - that is equivalent to the standard templated one. This is shown in the accompanying class diagram.

The accompanying demo program in files demo_polymorphic.cpp, demo_polymorphic_A.hpp, and demo_polymorphic_A show how polymorphic archives are to be used. Note the following:

As can be seen in the class diagram and the header files, this implementation is just a composition of the polymorphic interface and the standard template driven implementation. This composition is accomplished by the templates polymorphic_iarchive_route.hpp and polymorphic_oarchive_route.hpp which redirect calls to the polymorphic archives to the specific archive. As these contain no code specific to the particular implementation archive, they can be used to create a polymorphic archive implementation from any functioning templated archive implementation.

As a convenience, small header files have been included which contain a typedef for the polymorphic implementation of each corresponding templated one. For example, the headers polymorphic_text_iarchive.hpp and polymorphic_text_oarchive.hpp contain the typedefs for the polymorphic implementations of the standard text archive classes in text_iarchive.hpp and text_oarchive.hpp respectively. All included polymorphic archives use the same naming scheme.
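The composition described in this section can be approximated in a few lines. The sketch below is stand-alone and does not use the library headers: sketch_polymorphic_oarchive plays the role of the virtual interface, sketch_text_impl stands for any templated archive, and sketch_oarchive_route is a hypothetical counterpart of the routing template. Only three primitive types are covered, purely to keep the example short.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical polymorphic interface: virtual functions, no templates.
class sketch_polymorphic_oarchive {
public:
    virtual ~sketch_polymorphic_oarchive() {}
    virtual void save(int t) = 0;
    virtual void save(double t) = 0;
    virtual void save(const std::string & t) = 0;
};

// Stand-in for a template driven archive implementation.
class sketch_text_impl {
    std::ostream & os_;
public:
    explicit sketch_text_impl(std::ostream & os) : os_(os) {}
    template<class T> void save(const T & t) { os_ << t << ' '; }
};

// Routing template: redirects each virtual call to the templated
// implementation. It contains nothing specific to that implementation,
// so it can wrap any class offering a compatible save template.
template<class ArchiveImplementation>
class sketch_oarchive_route :
    public sketch_polymorphic_oarchive,
    private ArchiveImplementation
{
public:
    explicit sketch_oarchive_route(std::ostream & os) :
        ArchiveImplementation(os)
    {}
    virtual void save(int t) { ArchiveImplementation::save(t); }
    virtual void save(double t) { ArchiveImplementation::save(t); }
    virtual void save(const std::string & t) {
        ArchiveImplementation::save(t);
    }
};

// Convenience typedef, following the naming scheme described above.
typedef sketch_oarchive_route<sketch_text_impl>
    sketch_polymorphic_text_oarchive;

// Code written against the interface is compiled once and works with
// any archive that is routed through it.
inline void save_point(sketch_polymorphic_oarchive & ar, int x, int y) {
    ar.save(x);
    ar.save(y);
}
```

Since save_point sees only the abstract interface, it could live in a separately compiled library and still serve archives created later, at the cost of a virtual call per saved value.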

Usage

Polymorphic archives address the issues raised above regarding templated implementation. That is, there is no replicated code, and no recompilation for new archives. This will result in smaller executables for programs which use more than one type of archive, and smaller DLLs. There is a penalty for calling archive functions through a virtual function dispatch table and there is no possibility for a compiler to inline archive functions. This will result in a detectable degradation in performance for saving and loading archives.

Note that the concept of polymorphic archives is fundamentally incompatible with the serialization of new types that are marked "primitive" by the user with:

 
BOOST_CLASS_IMPLEMENTATION(my_primitive_type, boost::serialization::primitive_type)
Code to implement serialization for these types is instantiated "on the fly" in the user's program. But this conflicts with the whole purpose of the polymorphic archive. An attempt to serialize such a primitive type will result in a compilation error since the common polymorphic interface is static and cannot instantiate code for a new type.

The main utility of polymorphic archives will be to permit the building of class DLLs that will include serialization code for all present and future archives with no redundant code.


© Copyright Robert Ramey 2002-2004. Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)