Monday, December 05, 2005

Practical Threading

A while ago, I started to talk about multithreaded programs in this blog. Alas, I distracted myself into talking about "Why Thread" -- an important topic and one that is often misunderstood -- when I should have been talking about "How to thread" because "How" is done badly even more often than "Why".

My recent work with multithreading has been with ACE and boost threads, so I'll use them as examples (no offense, guys.)

So, "How to thread in C++"

C++ is an object oriented language (or at least it can be used to write object oriented programs which is not quite the same thing but is close enough for now.) An object oriented program should have an object representing/corresponding to each entity the program is dealing with.

A thread is an entity that needs to be dealt with by a multi-threaded program.

Rule #1: There should be a one-to-one relationship between threads and thread-related objects in a C++ program.

This does not mean an application programmer (a programmer who is dealing with objects that represent "real-world" entities) should be thinking about thread objects. On the contrary, the thread objects should be hidden so well that they do their job without distracting the application programmer from the real work to be done.

"But wait!" you say, "isn't that a lot of overhead? Especially," you add -- looking ahead in this entry -- "when you have to allocate the object on the heap."

"Don't be ridiculous," I calmly reply. "You're planning to start a thread with its own stack, register storage, and who knows what resources tied up in the OS and you're worried about a simple malloc!"

But I digress (so what else is new)

Where was I? Oh, yeah. "There should be a one-to-one relationship between threads and thread-related objects." But many thread-support libraries (ACE) don't always do this. Instead they have objects like a thread group that represents some number of threads. As soon as you do this, you lose control of individual threads and that's a problem.

I'm not saying that there shouldn't be objects like thread pools, just that a thread pool should never interact directly with an OS thread. Instead it should interact with the C++ object that represents the thread -- of which there will be as many as there are threads.

Oh, didn't I mention rule 2: All interaction with a thread should be through its object. I guess that seems obvious to me, but again it's often not obvious enough to the authors of thread libraries [ACE] that they actually do it that way (sigh). I guess they think the programmers would object to having a gun that wouldn't fire when pointed directly at your foot (or more vital parts of your anatomy.)

OK, it's time for rule three: The lifetime of the thread-related object must be longer than the lifetime of the thread. It should exist (however briefly) before the thread is started, and should continue to exist (however briefly) after the thread exits.

Again, this is something that many existing libraries (ACE, boost, et al.) get wrong.

Oh, dear. We've entered the hazardous realm of object lifetime management [sometimes misrepresented as an issue of object ownership, but thinking in terms of ownership muddles the issue.]

Fortunately object lifetime management is an area in which the *SILVER* *BULLET* solution has emerged -- reference counted pointers! Unfortunately, C++ does not provide the tools to do reference counted pointers well, but it is possible to come close. boost::shared_ptr and ACE_Strong_Ptr are examples of refcount pointers done pretty darn good if not perfectly.
[ACE_Refcounted_Autoptr on the other hand is a disaster waiting to happen -- please don't use it (at least not in an airplane I might fly in!)]

So, we're going to use boost::shared_ptr to manage the lifetime of the ThreadRelatedObject. Cool!

class ThreadRelatedObject;
// alias TRO

typedef boost::shared_ptr<ThreadRelatedObject> ThreadRelatedObjectPtr;
// alias TROPtr


That leads to the next rule (4 I think): The TRO must be the one to start the thread (deftly satisfying half of rule three by guaranteeing that the TRO exists before the thread does.) The other half of rule three is handled by rule 5: The TRO must have its own, private, TROPtr so that it can be involved in its own lifetime management. We'll call this the self pointer. The last thing the thread does before exiting will be to reset its self pointer -- allowing the TRO to be deleted if no one else remembers it. [So if you're thinking object ownership you might not understand why an object needs to own itself, but it seems perfectly reasonable to think that an object might want to manage its own lifetime.]

And then of course, there's rule 6: Any object outside the TRO that wishes to interact with the thread must do so via a TROPtr. Otherwise it can't guarantee that the TRO still exists.

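A point of information: all boost::shared_ptrs to the same object must touch each other (each one must be copied or assigned from another, so that they share one reference count). Code this: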
Widget * w = new Widget;
WidgetPtr p1(w);
WidgetPtr p2(w);
and you have a disaster because p1 didn't touch p2.
No one would code the above, but they might code:
WidgetPtr p3(new Widget);
which looks perfectly reasonable, and in fact is the preferred technique, up until the point that the Widget does
class Widget
{
public:
    Widget()
        : self_(this) // starts its own reference count, separate from p3's
    {
    }
private:
    WidgetPtr self_;
};
Kaboom.
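
Here is a minimal sketch of the "touching" rule, using std::shared_ptr from C++11 as a stand-in for boost::shared_ptr (my substitution, so the snippet needs nothing but the standard library). The Widget and the two helper functions are made up for illustration; use_count() reports how many pointers share the one reference count.

```cpp
#include <memory>

struct Widget { int value = 0; };            // illustrative only
typedef std::shared_ptr<Widget> WidgetPtr;

// A pointer copied from another pointer joins the same reference count.
long countAfterCopy()
{
    WidgetPtr p1(new Widget);   // first owner: creates the reference count
    WidgetPtr p2(p1);           // copied from p1, so p2 "touches" p1
    return p1.use_count();      // 2: both pointers share one count
}

// When the copy goes away, the shared count drops back down.
long countAfterCopyReleased()
{
    WidgetPtr p1(new Widget);
    {
        WidgetPtr p2(p1);       // count is 2 inside this scope
    }                           // p2 destroyed here
    return p1.use_count();      // 1: the Widget is still alive, owned by p1
}
```

Building both pointers from the raw Widget * instead would give each its own count of 1, and each would eventually delete the same Widget.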

Anyway, it's time for a very-important-point-that-*everybody*-gets-wrong. Rule 7: The TRO must not -- repeat must not -- start the thread in the constructor. Why?
Suppose the constructor:
1) creates the self pointer
2) starts the thread
3) returns the self pointer to the creator of the TRO (oops, see above!)

Never mind that point 3 doesn't work because the constructor can't return anything other than this, which is not a ThisPtr -- that's just a deficiency in the C++ language, and there are ways[hack] around that problem -- like private constructors with static create() methods [hack] and my favorite: passing to the constructor a reference to a TROPtr which the constructor initializes[hack].
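
For readers who haven't met the private-constructor-plus-static-create() hack, here is a rough sketch of what it tends to look like. It uses std::shared_ptr in place of boost::shared_ptr, and Worker, WorkerPtr and finish() are names I made up for the example; no thread is involved, finish() just stands in for the moment the thread would exit.

```cpp
#include <memory>

class Worker;                                // hypothetical stand-in for a TRO
typedef std::shared_ptr<Worker> WorkerPtr;

class Worker
{
public:
    // The only way to make a Worker. The caller always gets a WorkerPtr,
    // and the self pointer is copied from it, so both share one count.
    static WorkerPtr create()
    {
        WorkerPtr p(new Worker);
        p->self_ = p;
        return p;
    }

    // Stands in for "the last thing the thread does before exiting".
    void finish() { self_.reset(); }

private:
    Worker() {}                              // private: nobody can bypass create()
    WorkerPtr self_;                         // the self pointer
};
```

The object stays alive until both the caller's pointer and the self pointer have let go, whichever happens last.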

The real reason for rule 7 shows up somewhere between step 2 and step 3 of the constructor when the thread started by step 2 does what needs to be done and exits(!) before step 3 is executed. The static create method can't handle this. The pass-a-ref-to-ptr-to-the-constructor hack mentioned above can cope if done very carefully (a very careful hack, eh?) but it gets ugly -- especially when layers of inheritance happen. Actually it's kind of fun to get it wrong, then watch one of your coworkers try to figure out why the pointer returned from new points to an object that's already been deleted -- but only if you have a coworker who deserves it. Too bad at this job I don't have any candidates.

Why do ugly when there is a simple solution?

Rule 7a: There should be a start method on the TRO that actually starts the thread.

One of the fun things about programming is that when you find the right solution, stuff works! Separating object construction from thread initiation is one of those right solutions. The calling object gets the luxury of continuing to initialize the TRO after creating it and before starting it. Some things are best not done in a constructor.
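
To make rules 4 through 7a concrete, here is one way the pieces might fit together. This is a sketch under my own assumptions, not the ACE-based class discussed in this entry: it uses C++11 std::thread, std::shared_ptr and std::enable_shared_from_this instead of ACE or boost, and setInput(), the doubled "work" and the std::future returned by start() are invented for the example.

```cpp
#include <cassert>
#include <future>
#include <memory>
#include <thread>
#include <utility>

class ThreadRelatedObject;
typedef std::shared_ptr<ThreadRelatedObject> TROPtr;

class ThreadRelatedObject
    : public std::enable_shared_from_this<ThreadRelatedObject>
{
public:
    // Rule 7: the constructor only builds the object; no thread yet.
    ThreadRelatedObject() : input_(0) {}

    // The caller may keep initializing between construction and start().
    void setInput(int value) { input_ = value; }

    // Rule 7a: start() launches the thread. The object must already be
    // owned by a TROPtr (rule 6) for shared_from_this() to work.
    std::future<int> start()
    {
        assert(!self_);                      // start() may be called once
        self_ = shared_from_this();          // rule 5: the self pointer
        std::future<int> done = result_.get_future();
        std::thread(&ThreadRelatedObject::threadMain, this).detach();
        return done;                         // no member access after detach()
    }

private:
    void threadMain()
    {
        int answer = input_ * 2;             // the "real work"
        std::promise<int> result(std::move(result_));
        self_.reset();                       // may delete *this: touch no members below
        result.set_value(answer);            // report completion to the caller
    }

    int input_;
    std::promise<int> result_;
    TROPtr self_;                            // keeps the TRO alive while the thread runs
};
```

A caller would create the object with TROPtr tro(new ThreadRelatedObject), call setInput(), then start(), and is free to drop its own pointer right away; the self pointer keeps the object alive until the thread is done with it.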

Separating construction from thread initiation also makes it considerably easier to create a generic base class for handling thread related issues. The ugly create and/or pass-a-pointer hacks aren't necessary (go ahead, try to figure out how to do a static create method in a base class). This means that we can get the solution right once and never worry about it again.

So that's what I did.

My solution that's actually being used is based on ACE thread support (ACE does provide good platform independent thread support [as long as you don't use Thread Specific Storage (grin)]; I just don't like the way it's packaged.) Unfortunately ACE tends to be a bit intrusive -- it's a shame to take on all that baggage just to get a few platform-neutral thread-related functions.

That's why I went looking at boost threads. Most of boost is truly high-class work. Boost threads, alas, is not. Although it passes the "use conditions rather than events" test [an altogether different topic.], it fails the object-lifetime management test.

So I guess I won't be publishing my "universal thread support the right way" base class quite yet. Maybe I'll just publish the ACE-based version and someone will point me to a platform independent thread library that separates object construction from thread initiation, supports conditions rather than events, and does not come with tons of baggage.

Just remember, the multicores are coming. Do you know what your threads are doing?

1 comment:

Anonymous said...

Bravo! Now get busy. :->