C++ Streams & Typedefs: Be Charful

The C++ typedef keyword is indispensable in many situations, especially for writing portable low-level code. However, in some circumstances it can cause trouble, particularly when it comes to function overloading. Consider the following C++ template class:
template <typename T>
struct foobar
{
    foobar( const T foo ) : foo_( foo ) {}
    T foo_;
};
One might want to write a simple stream output operator to format the template class’ member values, e.g. for debugging purposes:
template <typename T>
ostream& operator<<( ostream& s, const foobar<T>& fb )
{
    return s << "foo: " << fb.foo_;
}
This seems reasonable. Now, let’s assume that this template is going to be used in a context where T will be one of several fixed-width integer types. These are usually typedefs from a header like stdint.h (for those that don’t mind including a C header) or boost/cstdint.hpp (to be a C++ purist). They are commonly named int64_t, int32_t, int16_t, and int8_t, where the X in intX_t specifies the number of bits used to represent the integer. There are also unsigned variants, but we’ll ignore those for this discussion.

Let’s now explore what happens when we initialize a foobar<intX_t> instance with its foo_ member set to a small integer and print it to standard output via our custom stream output operator:
cout << foobar<int64_t>( 42 ) << endl;
cout << foobar<int32_t>( 42 ) << endl;
cout << foobar<int16_t>( 42 ) << endl;
Each of these statements prints “foo: 42″, as expected. Great, everything works! But wait, there was one type that we didn’t test:
cout << foobar<int8_t>( 42 ) << endl; 
 This prints “foo: *” instead of “foo: 42″. This is probably not the expected result of printing the value of an int8_t. After all, it looks and feels just like all of the other intX_t types! What causes it to be printed differently from the other types? Let’s look at how the integer types might be defined for an x86 machine:
typedef long int int64_t;
typedef int int32_t;
typedef short int16_t;
typedef char int8_t;
The problem is that the only way to represent an integer with exactly 8 bits (and no more) is with a char (at least on the x86 architecture). While a char is an integer, it is also a… character. So, this trouble is caused by the fact that the char type is trying to be two things at once. A simple (but incorrect) approach to work around this is to overload1 the stream output operator for the int8_t type, and force it to be printed as a number:
// This is incorrect:
ostream& operator<<( ostream& s, const int8_t i )
{
return s << static_cast<int>( i );
}

The problem with this approach is that the int8_t typedef does not represent a unique type. The typedef keyword is named poorly; it does not introduce new types. Rather, it creates aliases for existing types. By overloading the stream output operator for the int8_t type, the char type’s operator is being overloaded as well. Since the standard library already defines a stream output operator for the char type, the above definition would violate the One Definition Rule and result in a compiler error. Even if it did compile, the results of redefining the way characters are printed would probably not be desirable.

An alternative (working) solution to the problem is to overload the output stream operator for the foobar<int8_t> type:

ostream& operator<<( ostream& s, const foobar<int8_t>& fb )
{
    return s << "foo: " << static_cast<int>( fb.foo_ );
}
This definition does not clash with any existing overloads from the standard library, and it effectively causes the int8_t to be printed as an integer. The downside is that it will cause unexpected behavior when a foobar<char> is printed, if the programmer intends char to represent a character. The only way to avoid this would be to define int8_t as a class instead of making it a typedef, and providing a well-behaved stream output operator for that class. The class’ arithmetic operators could be overloaded to make it look almost exactly like a POD integer, and it wouldn’t necessarily take up any extra memory. However, this solution is still not ideal, because classes behave differently than POD types in subtle ways (e.g. POD types are not initialized by default, but classes are).

If there’s anything to take away from this, it’s that the C++ char type is an odd beast to watch out for. Also, the name of the typedef operator could use some improvement…

To subscribe to the "Guy WhoSteals" feed, click here.
You can add yourself to the GuyWhoSteals fanpage on Facebook or follow GuyWhoSteals on Twitter.

0 comments:

Post a Comment

Related Posts Plugin for WordPress, Blogger...
top
Share