Bad pimpl vs. const, beware!
Sun 2008-03-02 03:12
Here's an interesting little bug that good old pimpl can easily lead you to, with nary a whistle from the compiler. It actually all boils down to being damn lazy and not doing pimpl properly, but I'll get to that later. First, let me introduce some lazy code:
/////////////////////////////////////////////////////////////////////////////// // LazyThing.h #include <boost/shared_ptr.hpp> #include <string> class LazyThing { public: LazyThing(); void setThing(const std::string & thing, bool value); bool getThing(const std::string & thing) const; unsigned thingCount() const; private: class LazyThingImpl; // here's our pimpl boost::shared_ptr<LazyThingImpl> m_impl; }; /////////////////////////////////////////////////////////////////////////////// // LazyThing.cpp // #include <LazyThing.h> #include <map> // Declare the impl class LazyThing::LazyThingImpl { public: std::map<std::string,bool> m_thingMap; }; // Define LazyThing LazyThing::LazyThing() : m_impl(new LazyThingImpl()) {} void LazyThing::setThing(const std::string & thing, bool value) { m_impl->m_thingMap[thing] = value; } bool LazyThing::getThing(const std::string & thing) const { return m_impl->m_thingMap[thing]; } unsigned LazyThing::thingCount() const { return m_impl->m_thingMap.size(); } /////////////////////////////////////////////////////////////////////////////// // main.cpp // #include <LazyThing.h> #include <iostream> int main() { LazyThing t; t.setThing("foo", false); t.setThing("bar", true); std::cout << t.thingCount() << std::endl; std::cout << (t.getThing("foo") ? "true" : "false") << std::endl; std::cout << (t.getThing("bar") ? "true" : "false") << std::endl; std::cout << (t.getThing("baz") ? "true" : "false") << std::endl; std::cout << t.thingCount() << std::endl; return 0; }
Both g++ and icc will compile this code with no errors or warnings
(ignoring some spurious unrelated ones from icc when -Wall is set.)
When we run the resultant binary we see this:
2 false true false 3
Fair enough? Well no, the code has a bug. Though you don't know for sure since
I haven't given a specification for getThing, let it be this: returns the
value set for thing; returns false if thing has not been set. So,
that final 3 is a bit disturbing. Hey, isn't getThing supposed to be
const?! What's the root of this? If you've much experience with C++/STL then
you'll know that operator[] is a non-const member that inserts
a default-constructed element into the std::map if no element previously exists.
You'll know this either from properly reading the documentation, or from having
this sort of situation bite you in the arse before (prompting you to properly
read the documentation;) Quite simply, the code should not be using
operator[] in getThing, that is completely retarded. But,
you say,
if it is non-const then why did the compiler let us call it?!
The understanding is that a const method is not permitted to make
non-const calls on data members. Right? Alas, this breaks down with pointers!
If we take the above code and remove the pimpl, embed the std::map into
LazyThing, and then try to implement getThing with
return m_map[thing] the code won't compile. As we'd hope! If we
take the above and change the boost::shared_ptr pimpl container to a plain
LazyThingImpl* then the code will compile and exhibit the disturbing
behaviour above. (In other words: don't blame the shared_ptr.)
So, what gives? Compiler getting lost or does the C++ standard let this
happen? In fact, the behaviour here is logically correct, even though it
feels wrong. If we refer to 9.3.2 of the spec, namely
"the this pointer", and look at paragraph 2 we see what seems to be the
extent of the power of applying const to a member function.
In a
constmember function, the object for which the function is called is accessed through aconstaccess path; therefore, aconstmember function shall not modify the object and its non-static data members.
So a const member can't modify non-static data members, if the std::map is a
member then operator[] clearly modifies it, verboten! If a
LazyThingImpl* is a member then calling a non-const method on the object the
pointer points to clearly does not modify the pointer. Sure, it modifies
the destination instance but I think that is beside the point… that
instance is not the member in question, the pointer is. (And yeah, you can
modify static "members" in const member functions too.)
So, we knew all this already right? In general it isn't new to me, I've hit
the const calling non-const on pointer "problem" before. However the
pattern in the code above is simplified (a lot) from something I wrote last
week, I committed this crime. At the root of the crime is a sin: laziness.
Thus the name of the class! You see, you're not really supposed to implement
pimpl as "pointer to data struct", as named the pattern is
"pointer to implementation"
and we can avoid the whole problem above, the foolish mistake of using
operator[], by doing pimpl the right way.
Brace yourself for another code dump!
/////////////////////////////////////////////////////////////////////////////// // Thing.h #include <boost/shared_ptr.hpp> #include <string> class Thing { public: Thing(); void setThing(const std::string & thing, bool value); bool getThing(const std::string & thing) const; unsigned thingCount() const; private: class ThingImpl; boost::shared_ptr<ThingImpl> m_impl; }; /////////////////////////////////////////////////////////////////////////////// // Thing.cpp // #include <Thing.h> #include <map> // Declare the impl - mirroring the Thing public interface! class Thing::ThingImpl { public: void setThing(const std::string & thing, bool value); bool getThing(const std::string & thing) const; unsigned thingCount() const; private: std::map<std::string,bool> m_thingMap; }; // Define ThingImpl methods void Thing::ThingImpl::setThing(const std::string & thing, bool value) { m_thingMap[thing] = value; } bool Thing::ThingImpl::getThing(const std::string & thing) const { std::map<std::string,bool>::const_iterator res = m_thingMap.find(thing); if (res == m_thingMap.end()) { return false; } return res->second; } unsigned Thing::ThingImpl::thingCount() const { return m_thingMap.size(); } // Define Thing methods Thing::Thing() : m_impl(new ThingImpl()) {} void Thing::setThing(const std::string & thing, bool value) { m_impl->setThing(thing, value); } bool Thing::getThing(const std::string & thing) const { return m_impl->getThing(thing); } unsigned Thing::thingCount() const { return m_impl->thingCount(); }
I've left out main this time, aside from the altered class name it is
identical to the previous main. So, the difference is that Thing is now no
more than a proxy, every call to Thing just makes the same call to m_impl.
It's important that it is the same call too, otherwise you might get yourself
lost. The public interface to ThingImpl should mirror that of Thing
including constness! If you made ThingImpl's getThing non-const
you'd still be permitted to make the operator[] mistake. As the
code is above, if you tried using operator[] on m_map in
Thing::ThingImpl::getThing the compiler would tell you to get buggered:
:; g++ -Wall -Wextra -pedantic Thing.cpp -o Thing
Thing.cpp: In member function 'bool Thing::ThingImpl::getThing(const std::string
&) const':
Thing.cpp:38: error: no match for 'operator[]' in '((const Thing::ThingImpl*)thi
s)->Thing::ThingImpl::m_thingMap[thing]'
/usr/include/c++/4.1.3/bits/stl_map.h:340: note: candidates are: _Tp& std::map<_
Key, _Tp, _Compare, _Alloc>::operator[](const _Key&) [with _Key = std::basic_str
ing<char, std::char_traits<char>, std::allocator<char> >, _Tp = bool, _Compare =
std::less<std::basic_string<char, std::char_traits<char>, std::allocator<char>
> >, _Alloc = std::allocator<std::pair<const std::basic_string<char, std::char_t
raits<char>, std::allocator<char> >, bool> >] <near match>
Love those g++ template messages! This is what I like to see though, the
compiler letting me know when I've done something stupid. So, remember:
pimpl is "pointer to implementation"
This goes far beyond this silly example using
std::map::operator[]. If you only move data members to your impl
and access them all via the pimpl from logic in the interface class then
you've entirely lost the enforcement of const. You could modify a
std::string, increment an int, or whathaveyou. Know better, mirror
the interface in the impl and make the data members private.
(A quick Google shows up a few pimpl examples out there on the web that do
what I've called "pointer to data struct." Then again, I don't have 100%
confidence that my interpretation of "implementation" is that of the
originator(s) of the concept anyway – protecting the const enforcement
as strongly as possible seems wise to me though.)
Recent Entries
Categories
- Entries - 260
- Beer - 1
- Cycling - 2
- Food - 53
- Cooking - 26
- Hare - 5
- Soup - 1
- Eating - 5
- England - 5
- London - 4
- Rickmansworth - 1
- Produce - 14
- Ristretto - 8
- Health - 3
- Money - 2
- Random - 74
- Technology - 93
- Code - 22
- General - 46
- Security - 23
- Work - 2
- Wanderings - 32
- Australia - 2
- Barcelona - 2
- Belgium - 2
- England - 15
- Cambridge_Easter - 3
- Lakes - 9
- Finland - 4
- France - 1
- Germany - 1
- Ramble - 3
- Wales - 1




2 Responses
Herb Sutter's "Pimples - Beauty Marks You Can Depend Upon" which I think is basically the seminal article on these things used as the implementation part (will this make it through the html escaper... with no preview who knows):
struct XImpl {
std::list<C> clist_;
D d_;
};
Which is going pretty literal on the "pointer to data struct" :)
My uncertainty at the end stemmed mainly from Herb Sutter's GotW#24 (http://www.gotw.ca/gotw/024.htm), "Compilation Firewalls" (another name for pimpl), in this GotW one proposed take on pimpl is:
"make XImpl entirely the class that X would have been, and write X as only the public interface made up entirely of simple forwarding functions (a handle/body variant)."
And Sutter's answer to that is:
"This is useful in a few restricted cases, and has the benefit of avoiding a back pointer since all services are available in the pimpl class. The chief drawback is that it normally makes the visible class useless for any inheritance, as either a base or a derived class."
In other words, he says: "do pimpl as pointer to data struct"... I'd expect that classes that are not involved in inheritance are hierarchies are more common, making "the chief drawback" not too concerning really (and I'm not really sure it is all that at odds with inheritance really.) Anyway, GotW#24 is very old. Also, it makes no mention of maintaining const at all.
Beyond that I also had a look at "43. Pimpl judiciously" of the much more recent "C++ Coding Standards: 101 Rules, Guidelines, and Best Practices" the point "Pimpl should be used to store all private members, both member data and private member functions" is reiterated (still no mention of const.) So I guess Sutter's PoV is still at odds with my interpretation, and I'm not about to go against /the/ guru. :)
The main reference I pulled out at the time of writing was the old favourite: Effective C++, third ed. In item 31 ("Minimise compilation dependencies...") he addresses pimpl and tends to prefer the name "Handle classes". It is from this book that my idea that methods in classes with pimpls all call through to pimpl equivalents probably comes from. There's even an example:
std::string Person::name() const {
return pImpl->name();
}
Meyers says "one way is to forward all their function calls to the corresponding implementation classes and have those classes do the real work." He certainly doesn't put this forward as a definitive statement though, merely "one way."
Meyers also makes no mention of constness.
So, in the end I think my PoV comes from Meyers but the method has had some added concrete in my own head. Given the 'const' situation I think I'll keep the concrete for now. I was going to be more detailed in a coverage of stuff I'd read but it was kind of late so I left it at "don't have 100% confidence"... no confidence left at all now. ;)