Thursday, December 29, 2005

An observation concerning functional programming in C++

The following code was extracted from a live program.

namespace
{
    class count_bytes
    {
    public:
        count_bytes(size_t& total)
            : total_(total)
        {
        }
        void operator()(
            const Element& element
            )
        {
            total_ += element.bytes();
        }
    private:
        size_t& total_;
    };
}
size_t
ElementSet::bytes() const
{
    size_t total = 0;
    std::for_each(elements_.begin(), elements_.end(), count_bytes(total));
    return total;
}

By my count that's 26 lines of code. To be fair I should admit that I tightened it up a bit. The original was well over 30 lines.

Compare that to:

size_t
ElementSet::bytes() const
{
    size_t total = 0;
    for(size_t i = 0; i < elements_.size(); ++i)
    {
        total += elements_[i].bytes();
    }
    return total;
}


Ten lines. (And yes, you could use an iterator rather than an index.)
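
For the curious, here is a rough sketch of the iterator variant mentioned in the parenthetical. The Element and ElementSet definitions are hypothetical stand-ins (the real classes are not shown in this post), and std::vector is only an assumption about what elements_ might be.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real Element class.
class Element
{
public:
    explicit Element(std::size_t bytes) : bytes_(bytes) {}
    std::size_t bytes() const { return bytes_; }
private:
    std::size_t bytes_;
};

// Hypothetical stand-in for the real ElementSet class.
class ElementSet
{
public:
    void add(const Element& element) { elements_.push_back(element); }
    std::size_t bytes() const;
private:
    std::vector<Element> elements_;  // assumed container type
};

std::size_t
ElementSet::bytes() const
{
    std::size_t total = 0;
    for(std::vector<Element>::const_iterator it = elements_.begin();
        it != elements_.end(); ++it)
    {
        total += it->bytes();
    }
    return total;
}

// Small helper used only to exercise the sketch.
std::size_t demo_total()
{
    ElementSet set;
    set.add(Element(3));
    set.add(Element(4));
    return set.bytes();
}
```

The loop body is the same as in the indexed version; only the traversal changes.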

Functional Programming Motto: You may have to type a whole lot more, but at least your code will be harder to understand.

Actually functional programming is just fine when you need it. The problem happens when it's the "Wrong Tool For The Job[TM]." There seems to be a lot of that goin' 'round [sigh].

Saturday, December 10, 2005

Thread return value/thread ID/handle/join/detach vs. C++

Two entries ago, I opined that in a C++ program there should be an object that represents every thread. One interesting consequence of this statement is that many of the "features" of the OS-supplied multithreading support become unnecessary and even counterproductive.

For example, on almost all platforms, start_thread (by whatever name) calls a function with a void* argument. That's ok. Almost all thread functions begin life with a cast. However the thread function is expected to return a value: usually an int but maybe a DWORD or even a void* or whatever depending on your platform. In C++ the proper return value from this function should be 0 -- always! (Actually it should be void, but it seems a shame to disappoint the OS that's eagerly awaiting the zero. (And besides the compiler won't let me get away with it.))

Why?

Because if there is an object associated with the thread, the thread has a much richer channel through which to return information -- the members of the object.

And speaking of return values, many times no one cares what the thread has to say as it exits. [No death-bed epigrams for you, Thread, you're outta here.] The joinable vs. detached concept in many OSs accommodates this desire on the part of the thread to get the last word in.

However, since the thread now has a whole object available through which to return values, and since anyone who cares can keep a smart pointer to the object (remember, the purpose of smart pointers is to manage object lifetimes), the whole joinable vs. detached issue becomes moot.

Threads in C++ should ALWAYS run detached. Rather than joining a thread, you can wait on a condition in the thread's object (safely, because you have a smart pointer to guarantee the condition will be around to be waited on).
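
A minimal sketch of the idea, under some loud assumptions: std::shared_ptr, std::thread, std::mutex and std::condition_variable stand in for boost::shared_ptr and whatever thread and condition primitives your platform or library provides, and the Worker type and function names are hypothetical, invented only for this illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical object representing one thread and whatever it has to report.
struct Worker
{
    Worker() : finished(false), answer(0) {}
    std::mutex mutex;
    std::condition_variable condition;
    bool finished;  // guarded by mutex
    int answer;     // the thread's "return value", guarded by mutex
};

typedef std::shared_ptr<Worker> WorkerPtr;

// Thread function: it holds its own WorkerPtr, so the Worker (and the
// condition inside it) cannot disappear while the thread is still using it.
void worker_main(WorkerPtr worker)
{
    int computed = 6 * 7;
    {
        std::lock_guard<std::mutex> lock(worker->mutex);
        worker->answer = computed;
        worker->finished = true;
    }
    worker->condition.notify_all();
}

// Starts a detached thread and waits on the condition instead of joining.
int run_detached_and_wait()
{
    WorkerPtr worker(new Worker);
    std::thread(worker_main, worker).detach();

    std::unique_lock<std::mutex> lock(worker->mutex);
    while(!worker->finished)
    {
        worker->condition.wait(lock);
    }
    return worker->answer;
}
```

Nothing here is joined and nothing comes back through the thread function's return value; the result travels through the object, and the smart pointers decide when the object goes away.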

This approach might be much kinder to your system resources. Many OSs hang on to lots of information about a terminated thread -- possibly even its entire stack, register safe storage, open FDs, etc. -- waiting for someone to join in and tell them the resources can be freed. Using an object can be a considerable savings.

Which brings up the issue of thread IDs. If your interaction with the thread is via the object, you don't need a thread ID to join the thread -- and you certainly don't need a thread ID to kill the thread (see blog entry n-1) -- so the thread ID becomes much less valuable. It still has some value in identifying the thread in log messages. (Ever tried to follow a log that didn't include thread IDs in a message? Me too, and I still regret it.) And the thread ID might also be involved in managing thread specific storage -- although many uses of TSS could be handled better by storing the data (or pointers thereto) in the thread's object.

Speaking of TSS. Think of it as FORTRAN COMMON for the thread-wielding-crowd. There's usually a better way, but sometimes the better way requires some thought <insert cynical comment here.>

Foot shooting

In my previous post I criticized ACE for allowing the programmer to shoot himself in the foot. I was wrong. The problem is not that ACE allows you to do dangerous things. The problem is it doesn't provide enough incentive to convince a lot of programmers NOT to do the dangerous things.

If I work really hard I can imagine a situation in which the only way to save my life would be to shoot myself in the foot. Witness the hiker a couple of years ago who cut off his trapped arm so he could hike out of the wilderness with the rest of his body parts still functioning. That doesn't mean that everyone who ventures into the wilderness should pack an amputation kit just in case, or if they do, it should be in a package that is clearly labeled: "For emergency use only." Once you open the package, you should find another package that says, "No, this is not an emergency, do it the right way." Only after opening THAT package should you find the Acme self-amputation, foot-targeting, and dangerous OS functions kit. [Pat. Pending]

So, if you want thread A to kill thread B, all you need to do is call the:
I_AM_A_BLINKING_IDIOT_FOR_USING_THIS::kill_thread() method.

Monday, December 05, 2005

Practical Threading

A while ago, I started to talk about multithreaded programs in this blog. Alas, I distracted myself into talking about "Why Thread" -- an important topic and one that is often misunderstood -- when I should have been talking about "How to thread" because "How" is done badly even more often than "Why".

My recent work with multithreading has been with ACE and boost threads, so I'll use them as examples (no offense, guys.)

So, "How to thread in C++"

C++ is an object-oriented language (or at least it can be used to write object-oriented programs, which is not quite the same thing but is close enough for now). An object-oriented program should have an object representing/corresponding to each entity the program is dealing with.

A thread is an entity that needs to be dealt with by a multi-threaded program.

Rule #1: There should be a one-to-one relationship between threads and thread-related objects in a C++ program.

This does not mean an application programmer (a programmer who is dealing with objects that represent "real-world" entities) should be thinking about thread objects. On the contrary the thread objects should be hidden so well that they do their job without distracting the application programmer from the real work to be done.

"But wait!" you say, "isn't that a lot of overhead? Especially," you add -- looking ahead in this entry -- "when you have to allocate the object on the heap."

"Don't be ridiculous," I calmly reply. "You're planning to start a thread with its own stack, register storage, and who knows what resources tied up in the OS, and you're worried about a simple malloc!"

But I digress (so what else is new)

Where was I? Oh, yeah. "There should be a one-to-one relationship between threads and thread-related objects." But many thread-support libraries (ACE) don't always do this. Instead they have objects like a thread group that represents some number of threads. As soon as you do this, you lose control of individual threads and that's a problem.

I'm not saying that there shouldn't be objects like thread pools, just that a thread pool should never interact directly with an OS thread. Instead it should interact with the C++ object that represents the thread -- of which there will be as many as there are threads.

Oh, didn't I mention rule 2: All interaction with a thread should be through its object. I guess that seems obvious to me, but again it's often not obvious enough to the authors of thread libraries [ACE] that they actually do it that way (sigh). I guess they think the programmers would object to having a gun that wouldn't fire when pointed directly at your foot (or more vital parts of your anatomy.)

Ok, it's time for rule three: The lifetime of the thread-related object must be longer than the lifetime of the thread. It should exist (however briefly) before the thread is started, and should continue to exist (however briefly) after the thread exits.

Again, this is something that many existing libraries (ACE, boost, et al.) get wrong.

Oh, dear. We've entered the hazardous realm of object lifetime management [sometimes misrepresented as an issue of object ownership, but thinking in terms of ownership muddles the issue.]

Fortunately object lifetime management is an area in which the *SILVER* *BULLET* solution has emerged -- reference counted pointers! Unfortunately, C++ does not provide the tools to do reference counted pointers well, but it is possible to come close. boost::shared_ptr and ACE_Strong_Ptr are examples of refcount pointers done pretty darn good if not perfectly.
[ACE_Refcounted_Autoptr on the other hand is a disaster waiting to happen -- please don't use it (at least not in an airplane I might fly in!)]

So, we're going to use boost::shared_ptr to manage the lifetime of the ThreadRelatedObject. Cool!

class ThreadRelatedObject;
// alias TRO

typedef boost::shared_ptr<ThreadRelatedObject> ThreadRelatedObjectPtr;
// alias TROPtr


That leads to the next rule (4 I think): The TRO must be the one to start the thread (deftly satisfying half of rule three by guaranteeing that the TRO exists before the thread does). The other half of rule three is handled by rule 5: The TRO must have its own, private, TROPtr so that it can be involved in its own lifetime management. We'll call this the self pointer. The last thing the thread does before exiting will be to reset its self pointer -- allowing the TRO to be deleted if no one else remembers it. [So if you're thinking object ownership you might not understand why an object needs to own itself, but it seems perfectly reasonable to think that an object might want to manage its own lifetime.]

And then of course, there's rule 6: Any object outside the TRO that wishes to interact with the thread must do so via a TROPtr. Otherwise it can't guarantee that the TRO still exists.

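A point of information: boost::shared_ptrs to the same object must touch each other, i.e. each one must be copied from another rather than built separately from the raw pointer. Code this: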
Widget * w = new Widget;
WidgetPtr p1(w);
WidgetPtr p2(w);
and you have a disaster because p1 didn't touch p2.
No one would code the above, but they might code:
WidgetPtr p3(new Widget);
which looks perfectly reasonable, and in fact is the preferred technique up until the point that the Widget does
class Widget
{
    Widget()
        : self_(this)
    {
    }
    WidgetPtr self_;
};
Kaboom.
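
To watch the two pointers fail to touch each other without actually blowing anything up, here is a small sketch. It assumes std::shared_ptr as a stand-in for boost::shared_ptr, and it uses a hypothetical CountingDeleter that only counts "deletions" instead of performing them, so the double delete can be observed safely.

```cpp
#include <cassert>
#include <memory>

struct Widget {};
typedef std::shared_ptr<Widget> WidgetPtr;

// Hypothetical deleter for illustration: counts calls, frees nothing.
struct CountingDeleter
{
    explicit CountingDeleter(int& calls) : calls_(&calls) {}
    void operator()(Widget*) const { ++*calls_; }
    int* calls_;
};

// Two pointers built separately from the same raw pointer.
int deletions_from_two_unrelated_owners()
{
    Widget w;       // lives on the stack; the deleter never frees it
    int calls = 0;
    {
        WidgetPtr p1(&w, CountingDeleter(calls));
        WidgetPtr p2(&w, CountingDeleter(calls));
        // Each pointer believes it is the only owner.
        assert(p1.use_count() == 1);
        assert(p2.use_count() == 1);
    }
    return calls;   // both pointers ran the deleter
}

// The second pointer is copied from the first, so they share one count.
int deletions_from_copied_owner()
{
    Widget w;
    int calls = 0;
    {
        WidgetPtr p1(&w, CountingDeleter(calls));
        WidgetPtr p2(p1);
        assert(p1.use_count() == 2);
    }
    return calls;   // the deleter ran once, when the last copy went away
}
```

With the default deleter, the first function would delete the same Widget twice. A self pointer constructed from this inside the constructor is the same mistake in disguise, because the WidgetPtr built by the caller knows nothing about it.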

Anyway, it's time for a very-important-point-that-*everybody*-gets-wrong. Rule 7: The TRO must not -- repeat must not -- start the thread in the constructor. Why?
Suppose the constructor:
1) creates the self pointer
2) starts the thread
3) returns the self pointer to the creator of the TRO (oops, see above!)

Never mind that point 3 doesn't work because the constructor can't return anything other than this, which is not a ThisPtr -- that's just a deficiency in the C++ language, and there are ways [hack] around that problem -- like private constructors with static create() methods [hack] and my favorite: passing to the constructor a reference to a TROPtr which the constructor initializes [hack].

The real reason for rule 7 shows up somewhere between step 2 and step 3 of the constructor, when the thread started by step 2 does what needs to be done and exits(!) before step 3 is executed. The static create method can't handle this. The pass-a-ref-to-ptr-to-the-constructor hack mentioned above can cope if done very carefully (a very careful hack, eh?) but it gets ugly -- especially when layers of inheritance happen. Actually it's kind of fun to get it wrong, then watch one of your coworkers try to figure out why the pointer returned from new points to an object that's already been deleted -- but only if you have a coworker who deserves it. Too bad at this job I don't have any candidates.

Why do ugly when there is a simple solution?

Rule 7a: There should be a start method on the TRO that actually starts the thread.

One of the fun things about programming is that when you find the right solution, stuff works! Separating object construction from thread initiation is one of those right solutions. The calling object gets the luxury of continuing to initialize the TRO after creating it and before starting it. Some things are best not done in a constructor.

Separating construction from thread initiation also makes it considerably easier to create a generic base class for handling thread related issues. The ugly create and/or pass-a-pointer hacks aren't necessary (go ahead, try to figure out how to do a static create method in a base class). This means that we can get the solution right once and never worry about it again.
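
What such a base class might look like, very roughly. This is a sketch under stated assumptions, not the ACE-based class described in this post: std::shared_ptr and std::thread stand in for boost::shared_ptr and the platform's thread calls, passing the TROPtr into start() is just one hypothetical way to set up the self pointer, and Summer, run(), wait_until_finished() and the other names are invented for the illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

class ThreadRelatedObject;
typedef std::shared_ptr<ThreadRelatedObject> TROPtr;

class ThreadRelatedObject
{
public:
    ThreadRelatedObject() : finished_(false) {}   // rule 7: no thread yet
    virtual ~ThreadRelatedObject() {}

    // Rule 7a: the thread starts here. The caller passes its own TROPtr so
    // the self pointer (rule 5) shares the caller's reference count.
    void start(const TROPtr& self)
    {
        assert(self.get() == this);
        self_ = self;
        std::thread(&ThreadRelatedObject::thread_main, this).detach();
    }

    // Used instead of join: wait on a condition inside the object.
    void wait_until_finished()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        while(!finished_) { condition_.wait(lock); }
    }

protected:
    virtual void run() = 0;   // the thread's real work, supplied by subclasses

private:
    void thread_main()
    {
        run();
        TROPtr last_reference;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            finished_ = true;
            last_reference.swap(self_);   // reset the self pointer
        }
        condition_.notify_all();
        // last_reference is released on return; if nobody else holds a
        // TROPtr, the object is deleted here, after the thread is done with it.
    }

    std::mutex mutex_;
    std::condition_variable condition_;
    bool finished_;
    TROPtr self_;
};

// Hypothetical subclass: sums 1..limit on its own thread.
class Summer : public ThreadRelatedObject
{
public:
    Summer() : limit_(0), sum_(0) {}
    void set_limit(int limit) { limit_ = limit; }  // call before start()
    int sum() const { return sum_; }               // call after waiting
protected:
    virtual void run() { for(int i = 1; i <= limit_; ++i) { sum_ += i; } }
private:
    int limit_;
    int sum_;
};

int sum_on_detached_thread(int limit)
{
    std::shared_ptr<Summer> summer(new Summer);
    summer->set_limit(limit);      // keep initializing after construction
    summer->start(summer);         // only now does the thread exist
    summer->wait_until_finished();
    return summer->sum();
}
```

Because start() runs after the most derived constructor has finished, the virtual run() call lands in Summer even with layers of inheritance, and the caller already holds its pointer before the thread has any chance to exit.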

So that's what I did.

My solution that's actually being used is based on ACE thread support. (ACE does provide good platform independent thread support [as long as you don't use Thread Specific Storage (grin)]; I just don't like the way it's packaged.) Unfortunately ACE tends to be a bit intrusive -- it's a shame to take on all that baggage just to get a few platform-neutral thread-related functions.

That's why I went looking at boost threads. Most of boost is truly high-class work. Boost threads, alas, is not. Although it passes the "use conditions rather than events" test [an altogether different topic.], it fails the object-lifetime management test.

So I guess I won't be publishing my "universal thread support the right way" base class quite yet. Maybe I'll just publish the ACE-based version and someone will point me to a platform independent thread library that separates object construction from thread initiation, supports conditions rather than events, and does not come with tons of baggage.

Just remember, the multicores are coming. Do you know what your threads are doing?