Bad pimpl vs. const, beware!

Note: This entry has been restored from old archives.

Here’s an interesting little bug that good old pimpl can easily lead you to, with nary a whistle from the compiler. It actually all boils down to being damn lazy and not doing pimpl properly, but I’ll get to that later. First, let me introduce some lazy
code:

///////////////////////////////////////////////////////////////////////////////
// LazyThing.h
#include <boost/shared_ptr.hpp>
#include <string>

class LazyThing {
public:
    LazyThing();
    void setThing(const std::string & thing, bool value);
    bool getThing(const std::string & thing) const;
    unsigned thingCount() const;
private:
    class LazyThingImpl;
    // here's our pimpl
    boost::shared_ptr<LazyThingImpl> m_impl;
};

///////////////////////////////////////////////////////////////////////////////
// LazyThing.cpp
// #include <LazyThing.h>
#include <map>

// Declare the impl
class LazyThing::LazyThingImpl {
public:
    std::map<std::string,bool> m_thingMap;
};

// Define LazyThing
LazyThing::LazyThing() : m_impl(new LazyThingImpl()) {}

void LazyThing::setThing(const std::string & thing, bool value) {
    m_impl->m_thingMap[thing] = value;
}

bool LazyThing::getThing(const std::string & thing) const {
    return m_impl->m_thingMap[thing];
}

unsigned LazyThing::thingCount() const {
    return m_impl->m_thingMap.size();
}

///////////////////////////////////////////////////////////////////////////////
// main.cpp
// #include <LazyThing.h>
#include <iostream>

int main() {
    LazyThing t;
    t.setThing("foo", false);
    t.setThing("bar", true);
    std::cout << t.thingCount() << std::endl;
    std::cout << (t.getThing("foo") ? "true" : "false") << std::endl;
    std::cout << (t.getThing("bar") ? "true" : "false") << std::endl;
    std::cout << (t.getThing("baz") ? "true" : "false") << std::endl;
    std::cout << t.thingCount() << std::endl;
    return 0;
}

Both g++ and icc will compile this code with no errors or warnings (ignoring some spurious unrelated ones from icc when -Wall is set.) When we run the resultant binary we see this:

2
false
true
false
3

Fair enough? Well no, the code has a bug. Though you don’t know for sure since I haven’t given a specification for getThing, let it be this: returns the value set for thing; returns false if thing has not been set. So, that final 3 is a bit disturbing. Hey, isn’t getThing supposed to be const?! What’s the root of this? If you’ve much experience with C++/STL then you’ll know that operator[] is a non-const member that inserts a default-constructed element into the std::map if no element previously exists. You’ll know this either from properly reading the documentation, or from having this sort of situation bite you in the arse before (prompting you to properly read the documentation;) Quite simply, the code should not be using operator[] in getThing, that is completely retarded. But, you say, if it is non-const then why did the compiler let us call it?! The understanding is that a const method is not permitted to make non-const calls on data members. Right? Alas, this breaks down with pointers!

If we take the above code and remove the pimpl, embed the std::map into LazyThing, and then try to implement getThing with return m_map[thing] the code won’t compile. As we’d hope! If we take the above and change the boost::shared_ptr pimpl container to a plain LazyThingImpl* then the code will compile and exhibit the disturbing behaviour above. (In other words: don’t blame the shared_ptr.)

So, what gives? Compiler getting lost or does the C++ standard let this happen? In fact, the behaviour here is logically correct, even though it feels wrong. If we refer to 9.3.2 of the spec, namely “the this pointer“, and look at paragraph 2 we see what seems to be the extent of the power of applying const to a member function.

const member function, the object for which the function is called is
accessed through a const access path; therefore, a const member function
shall not modify the object and its non-static data members.

So a const member can’t modify non-static data members, if the std::map is a member then operator[] clearly modifies it, verboten! If a LazyThingImpl* is a member then calling a non-const method on the object the pointer points to clearly does not modify the pointer. Sure, it modifies the destination instance but I think that is beside the point… that instance is not the member in question, the pointer is. (And yeah, you can modify static “members” in const member functions too.)

So, we knew all this already right? In general it isn’t new to me, I’ve hit the const calling non-const on pointer “problem” before. However the pattern in the code above is simplified (a lot) from something I wrote last week, I committed this crime. At the root of the crime is a sin: laziness. Thus the name of the class! You see, you’re not really supposed to implement pimpl as “pointer to data struct”, as named the pattern is “pointer to implementation” and we can avoid the whole problem above, the foolish mistake of using operator[], by doing pimpl the right way.

Brace yourself for another code dump!

///////////////////////////////////////////////////////////////////////////////
// Thing.h
#include <boost/shared_ptr.hpp>
#include <string>

class Thing {
public:
    Thing();
    void setThing(const std::string & thing, bool value);
    bool getThing(const std::string & thing) const;
    unsigned thingCount() const;
private:
    class ThingImpl;
    boost::shared_ptr<ThingImpl> m_impl;
};  

///////////////////////////////////////////////////////////////////////////////
// Thing.cpp
// #include <Thing.h>
#include <map>

// Declare the impl - mirroring the Thing public interface!
class Thing::ThingImpl {
public:
    void setThing(const std::string & thing, bool value);
    bool getThing(const std::string & thing) const;
    unsigned thingCount() const;
private:
    std::map<std::string,bool> m_thingMap;
};

// Define ThingImpl methods
void Thing::ThingImpl::setThing(const std::string & thing, bool value) {
    m_thingMap[thing] = value;
}

bool Thing::ThingImpl::getThing(const std::string & thing) const {
    std::map<std::string,bool>::const_iterator res = m_thingMap.find(thing);
    if (res == m_thingMap.end()) {
        return false;
    }
    return res->second;
}

unsigned Thing::ThingImpl::thingCount() const {
    return m_thingMap.size();
}

// Define Thing methods
Thing::Thing() : m_impl(new ThingImpl()) {}
    
void Thing::setThing(const std::string & thing, bool value) {
    m_impl->setThing(thing, value);
}   
    
bool Thing::getThing(const std::string & thing) const { 
    return m_impl->getThing(thing);
}   

unsigned Thing::thingCount() const {
    return m_impl->thingCount();
}

I’ve left out main this time, aside from the altered class name it is identical to the previous main. So, the difference is that Thing is now no more than a proxy, every call to Thing just makes the same call to m_impl. It’s important that it is the same call too, otherwise you might get yourself lost. The public interface to ThingImpl should mirror that of Thing including constness! If you made ThingImpl‘s getThing non-const you’d still be permitted to make the operator[] mistake. As the code is above, if you tried using operator[] on m_map in Thing::ThingImpl::getThing the compiler would tell you to get buggered:

:; g++ -Wall -Wextra -pedantic Thing.cpp -o Thing
Thing.cpp: In member function 'bool Thing::ThingImpl::getThing(const std::string
&) const':
Thing.cpp:38: error: no match for 'operator[]' in '((const Thing::ThingImpl*)thi
s)->Thing::ThingImpl::m_thingMap[thing]'
/usr/include/c++/4.1.3/bits/stl_map.h:340: note: candidates are: _Tp& std::map<_
Key, _Tp, _Compare, _Alloc>::operator[](const _Key&) [with _Key = std::basic_str
ing<char, std::char_traits<char>, std::allocator<char> >, _Tp = bool, _Compare =
 std::less<std::basic_string<char, std::char_traits<char>, std::allocator<char> 
> >, _Alloc = std::allocator<std::pair<const std::basic_string<char, std::char_t
raits<char>, std::allocator<char> >, bool> >] <near match>

Love those g++ template messages! This is what I like to see though, the compiler letting me know when I’ve done something stupid. So, remember:

pimpl is “pointer to implementation

This goes far beyond this silly example using std::map::operator[]. If you only move data members to your impl and access them all via the pimpl from logic in the interface class then you’ve entirely lost the enforcement of const. You could modify a std::string, increment an int, or whathaveyou. Know better, mirror the interface in the impl and make the data members private.

(A quick Google shows up a few pimpl examples out there on the web that do what I’ve called “pointer to data struct.” Then again, I don’t have 100% confidence that my interpretation of “implementation” is that of the originator(s) of the concept anyway – protecting the const enforcement as strongly as possible seems wise to me though.)

2 thoughts on “Bad pimpl vs. const, beware!”

  1. Pointer? It’s “private implementation”, though of course the term morphed – I know I called it “pointer to implementation” when I taught it (and I must have stolen^Wresearched it from somewhere).

    Herb Sutter’s “Pimples – Beauty Marks You Can Depend Upon” which I think is basically the seminal article on these things used as the implementation part (will this make it through the html escaper… with no preview who knows):

    struct XImpl {
        std::list<C> clist_;
        D d_;
    };

    Which is going pretty literal on the “pointer to data struct” πŸ™‚

  2. References

    private? Dang! I must have picked it up off you, your fault! πŸ˜‰ Maybe we can blame Meyers’s Effective C++? He use the “pointer to implementation” definition, in fact Sutter, in GotW#24, also implies that it is pointer saying in a footnote: “The eponymous pimpl_ was actually coined by friend and colleague Jeff Sumner, who shares my penchant for Hungarian-style “p” prefixes for pointer variables and who has an occasional taste for horrid puns.” I’m not ashamed to be following the words of either guru. πŸ™‚

    My uncertainty at the end stemmed mainly from Herb Sutter’s GotW#24 (http://www.gotw.ca/gotw/024.htm), “Compilation Firewalls” (another name for pimpl), in this GotW one proposed take on pimpl is:

    make XImpl entirely the class that X would have been, and write X as only the public interface made up entirely of simple forwarding functions (a handle/body variant).

    And Sutter’s answer to that is:

    This is useful in a few restricted cases, and has the benefit of avoiding a back pointer since all services are available in the pimpl class. The chief drawback is that it normally makes the visible class useless for any inheritance, as either a base or a derived class.

    In other words, he says: “do pimpl as pointer to data struct”… I’d expect that classes that are not involved in inheritance are hierarchies are more common, making “the chief drawback” not too concerning really (and I’m not really sure it is all that at odds with inheritance really.) Anyway, GotW#24 is very old. Also, it makes no mention of maintaining const at all.

    Beyond that I also had a look at “43. Pimpl judiciously” of the much more recent “C++ Coding Standards: 101 Rules, Guidelines, and Best Practices” the point “Pimpl should be used to store all private members, both member data and private member functions” is reiterated (still no mention of const.) So I guess Sutter’s PoV is still at odds with my interpretation, and I’m not about to go against /the/ guru. πŸ™‚

    The main reference I pulled out at the time of writing was the old favourite: Effective C++, third ed. In item 31 (“Minimise compilation dependencies…“) he addresses pimpl and tends to prefer the name “Handle classes”. It is from this book that my idea that methods in classes with pimpls all call through to pimpl equivalents probably comes from. There’s even an example:

    std::string Person::name() const {
        return pImpl->name();
    }

    Meyers says “one way is to forward all their function calls to the corresponding implementation classes and have those classes do the real work.” He certainly doesn’t put this forward as a definitive statement though, merely “one way.”
    Meyers also makes no mention of constness.

    So, in the end I think my PoV comes from Meyers but the method has had some added concrete in my own head. Given the ‘const’ situation I think I’ll keep the concrete for now. I was going to be more detailed in a coverage of stuff I’d read but it was kind of late so I left it at “don’t have 100% confidence”… no confidence left at all now. πŸ˜‰

Comments are closed.