Friday, 21 September 2012

One Definition to Rule them all

Generally I am a little bit suspicious of people who can quote the C++ standard. Knowing the exact definitions and all the obscure intricacies should only be necessary for compiler writers. The rest of us should get by with a broad understanding of how things work, and the compiler should catch us when we stray.

In fact, and as an aside, I would actually consider too much knowledge of operator precedence to be positively harmful. People who know things can't help but use that knowledge, and end up leaving out brackets in compound expressions because it's obvious to them that they are unnecessary. Some idiot who hasn't memorised the rules (like me) then inherits the code, makes an incorrect assumption about what it's trying to do, and screws it up.

Bytes are cheap, just use brackets.

At least that was what I thought until I got bitten firmly in the arse by this little beauty:
http://stackoverflow.com/questions/6379422/c-multiple-classes-with-same-name
Which, had I been more familiar with the standard, I would have spotted immediately.

In my case I had a little interface I needed to mock for a couple of tests.

class ThingResponder
{
public: 
   virtual void respond(int) = 0;
};

In the first test I didn't care about it.

test1.cpp:

class MyThingResponder : public ThingResponder
{
public: 
   virtual void respond(int) {}
};

void test1::someTest()
{
    MyThingResponder responder;
    foo(responder);
    ...

But in the second test I wanted to verify it was called, so did something a little different

test2.cpp:

class MyThingResponder : public ThingResponder
{
public: 
    std::vector<int> responces;
    virtual void respond(int responce)
    {
        responces.push_back(responce);
    }
};

void test2::someTest()
{
    MyThingResponder responder;
    foo(responder);
    ...

Can you guess what happened next?

I had broken the One Definition Rule. This did not result in any build errors or warnings. Instead, when the calling frame wanted to allocate a MyThingResponder object it used the local definition, but when it called the constructor it used the one from test2.cpp. This meant that after I wrote test2, test1 started crashing in weird ways, because the constructor for MyThingResponder was initialising a std::vector that wasn't there and screwing up a load of other local variables.

Had it been the other way around, the responces vector would never have been initialised, and the error would have been even harder to track down.

These are the things that scared me.

  1. As we move to using more functors and the like, small locally defined classes with generic names will become more common.
  2. There is no way to guard against this. Anyone else's new class could break your code.

And these are the things I realised:

  1. I am going to put everything in a namespace from now on. Everything!
  2. I suppose I do kind-of have to know the obscure intricacies of the C++ standard after all.
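
For small test-only classes like these, an unnamed namespace is one way to get that protection: names declared inside it are local to the translation unit, so each .cpp file gets its own MyThingResponder. What follows is only a minimal sketch of how test2.cpp might look along those lines. The body of foo() and the value 42 are invented for illustration, since the real foo() is not shown, and recordedResponces() and the free function someTest() are hypothetical stand-ins for the test fixture.

```cpp
#include <cassert>
#include <vector>

class ThingResponder
{
public:
    virtual void respond(int) = 0;
};

// Hypothetical stand-in for the real foo(), which is not shown in the post.
// It simply calls the responder once with a made-up value.
void foo(ThingResponder& responder)
{
    responder.respond(42);
}

// Unnamed namespace: this MyThingResponder is private to this .cpp file,
// so it cannot clash with a MyThingResponder defined in another file.
namespace
{
    class MyThingResponder : public ThingResponder
    {
    public:
        std::vector<int> responces;
        virtual void respond(int responce)
        {
            responces.push_back(responce);
        }
    };
}

// Runs foo() against the local mock and returns what it recorded.
std::vector<int> recordedResponces()
{
    MyThingResponder responder;
    foo(responder);
    return responder.responces;
}

// Free-function version of the test, for illustration only.
void someTest()
{
    std::vector<int> seen = recordedResponces();
    assert(seen.size() == 1);
    assert(seen[0] == 42);
}
```

The same wrapper around the do-nothing MyThingResponder in test1.cpp would make the two classes genuinely separate, rather than two conflicting definitions of one name.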

Tuesday, 18 September 2012

Another pointless C vs C++ musing

I really wanted to agree with this blog post. I largely agreed with part 1 and, having once tried to integrate ZeroMQ with an application that was tied to an old version of gcc, I really wished he'd written it in C too.

Ultimately I think it's a bit wrong-headed though. He seems to want to break encapsulation for a performance benefit (which is fine, so long as you appreciate the trade-off and genuinely do need the speed) and for some reason thinks that's fine in C but not in C++. Personally I think ugly hard-to-maintain code is ugly and hard to maintain whatever language you write it in.

I have a pet theory that C will out-live C++. Managed languages will slowly intrude on the application space because the pain of developing in them is so much less, and the performance penalty will continue to get smaller until the one outweighs the other for all but a tiny subset of problems. Even the resource-scarce embedded space, where you'd have thought C++ would have trounced, say, Java every time, seems to have been largely ceded, Android being prepared to take the hit of a virtual machine in return for stability and safety (how stable and safe is arguable, but certainly more stable and safe than running native code from unknown third parties).

Meanwhile C++, with horrors such as exceptions and (yuck!) template meta-programming, is never going to make much inroad into the system space. Operating systems, while they may employ more and more C++ components, are for the foreseeable future going to be written mostly in C. In fact it's clear from noises coming out of Microsoft that they'd like to do as much as possible in C# and are only stymied by performance issues; performance issues that will surely be resolved sooner or later, if only by faster hardware.

So to come full circle, one of the pain points of C++ that really irritates me, and does not seem to get talked about much, is people who insist on writing APIs in C++. In theory it's all nice: you simply compile everything up and off you go, and if they've used some later language feature then you update your compiler. In the real world it doesn't work that way. If I need to update my compiler I need to recompile about a dozen other open-source C++ libraries and go and find updated versions of the two or three third-party proprietary ones. All that functionality now needs to be re-tested, most of it by human beings, all at great cost.

For these reasons we rarely update compilers, and so the chances of a given C++ API not compiling on one of the (several) compilers we use are greatly increased, and with them the pain of using it.

On top of all that I can no longer dynamically load it at run time (or at least, not without jumping through more hoops than is worthwhile), which means it becomes an absolute dependency whether or not the user needs the functionality, and I either have to link it statically, bloating the binary, or leave myself vulnerable to version mismatches.
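
For comparison, here is a rough sketch of the kind of interface I would rather be handed: a C++ implementation hidden behind a plain C API with an opaque handle. Every name here (thing_recorder and its functions) is made up for illustration and does not come from any real library. Because the functions are declared extern "C", they have unmangled symbol names and no C++ types appear in their signatures, which is what makes looking them up at run time practical.

```cpp
#include <cassert>
#include <vector>

// --- What would live in the public header: nothing but C ---
extern "C"
{
    typedef struct thing_recorder thing_recorder;   // opaque handle

    thing_recorder* thing_recorder_create(void);
    int  thing_recorder_respond(thing_recorder* r, int value);  // 0 on success, -1 on failure
    int  thing_recorder_count(const thing_recorder* r);
    void thing_recorder_destroy(thing_recorder* r);
}

// --- Implementation: free to use whatever C++ it likes internally ---
struct thing_recorder
{
    std::vector<int> values;
};

extern "C" thing_recorder* thing_recorder_create(void)
{
    // Exceptions must not escape across the C boundary, so catch everything.
    try { return new thing_recorder(); }
    catch (...) { return 0; }
}

extern "C" int thing_recorder_respond(thing_recorder* r, int value)
{
    if (r == 0) { return -1; }
    try { r->values.push_back(value); return 0; }
    catch (...) { return -1; }
}

extern "C" int thing_recorder_count(const thing_recorder* r)
{
    return (r == 0) ? 0 : static_cast<int>(r->values.size());
}

extern "C" void thing_recorder_destroy(thing_recorder* r)
{
    delete r;   // deleting a null pointer is harmless
}

// Example of a caller that only ever sees the C interface.
int recordTwoThings()
{
    thing_recorder* r = thing_recorder_create();
    assert(r != 0);
    thing_recorder_respond(r, 7);
    thing_recorder_respond(r, 9);
    int count = thing_recorder_count(r);
    thing_recorder_destroy(r);
    return count;
}
```

The cost is the extra layer of wrapper functions and the loss of things like RAII at the boundary; the caller has to remember to call the destroy function.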

All of this makes me sad. I love C++; I like the philosophy behind it, and I believe that smart pointers trump any other form of garbage collection, that templates are great, and that it is generally just unforgiving enough to force people into actually thinking about what their code is doing rather than muddling through by trial and error. But I think the lack of a standard ABI (is one even possible? I don't know) is becoming such a large pain point that it will eventually (though hopefully not in my professional lifetime) kill the language off.