VSoft Technologies Blogs

rss

VSoft Technologies Blogs - posts about our products and software development.

FinalBuilder is written in a combination of Delphi and C#. This is a blog post about a Delphi RTL threading issue that I recently ran into and had to work around.

We recently had some bug reports of FinalBuilder deadlocking at shutdown under certain circumstances. I could reproduce the problem, but I was baffled for a long time because - according to the call stacks  - the program should have been fine.

The issue I'd run into has been around for years, and is summed up nicely in this Borland QC note: Synchronize and WaitFor methods of TThread can lead to a DeadLock in logically correct applications.[Dead link removed]

The Problem

The simple summary is: If you ever have a call stack which includes both Synchronize and WaitFor, and the thread you are waiting on also calls Synchronize at some point, you will occasionally get a deadlock.

Explaining why is kind of tricky, but if you read the QC case you'll get the gist of it. The problematic sequence of events looks like this:

  • Thread A calls Synchronize(MethodA)
  • Thread B calls Synchronize(MethodB)

Then, inside the context of the Main Thread:

  • Main thread calls CheckSynchronize() while processing messages
  • CheckSynchronize is implemented to batch-process all waiting calls(*). So it picks up the queue of waiting calls (containing MethodA and MethodB) and loops through them one by one.
  • MethodA executes in the main thread's context. Assume MethodA calls ThreadB.WaitFor
  • WaitFor calls CheckSynchronize to process any waiting calls to Synchronize
  • In theory, this should then process ThreadB's Synchronize(MethodB), allowing Thread B to complete.
  • However, MethodB is already a possession of the first CheckSynchronize call, so it never gets called.

DEADLOCK!

The Solution

The ideal solution is to refactor your program so it will never call CheckSynchronize or WaitFor from inside an already-synchronized method call. For FinalBuilder, this was not a good option - all it takes is one Synchronized method call to call ProcessMessages and you're letting in dozens of potential deadlocks.

The solution I came up with is to write a "SafeSynchronize" method that only allows one thread at a time to get on the Synchronize method queue. This comes with a performance hit, but prevents the deadlock.

Here's a new class which implements the SafeSynchronize method:

  TSafeSyncThread = class(TThread)
  private
    procedure SafeSynchronizeInner;
  protected
    procedure SafeSynchronize(const AThreadMethod: TThreadMethod);
  end;

implementation

{ TSafeSyncThread }

var
  FSingleSynchronize : TCriticalSection;
  FSingleSynchronizeTarget : TThreadMethod;

procedure TSafeSyncThread.SafeSynchronize(const AThreadMethod: TThreadMethod);
begin
  FSingleSynchronize.Acquire;
  FSingleSynchronizeTarget := AThreadMethod;
  Synchronize(SafeSynchronizeInner);
end;


procedure TSafeSyncThread.SafeSynchronizeInner;
var
  targetMethod : TThreadMethod;
begin
  targetMethod := FSingleSynchronizeTarget;
  FSingleSynchronize.Release; // Now we're inside a CheckSynchronize call, it's safe to let another thread join the queue
  targetMethod;  
end;

Basically, this serializes access to the Synchronize queue. The performance hit is potentially quite large if you have a lot of concurrent SafeSynchronize calls and a busy main thread. FinalBuilder is designed in a way which minimizes this, so it's less of an issue for us.

I put together a quick test project which demonstrates the issue (at least on Delphi 2007.) You can get it here if you want to have a look. It also contains SafeSyncThread.pas if you want to incorporate the workaround into your own projects. This code comes with no guarantees and no warranty, obviously.

If anyone has a better (preferably faster) solution to this problem, which doesn't require rewriting the RTL, please let us know. :).

PS The test project example contains some fairly awful cut-and-paste code reproduction, which - for the record - I normally avoid like the plague. I used it here because the methods are very simple, and I couldn't see an easy way to rearchitect it.

(*) FWIW, I actually quite like the way that CheckSynchronize is implemented, using an InterlockedExchange to swap the old queue object with a brand new one. That way you can process the entire old queue at once, without granular locking. Check the source code if you want to see what I mean. It's just a shame about the deadlock. (back)

Showing 4 Comments

Avatar
angus 16 years ago

Thanks guys. That's interesting to see.<br><br>Without going into too much depth about how we do things, we don't have any worker threads which block on Synchronize. What we do have is several asynchronous consumer threads, a couple of which consume messages to update the UI. These are the threads which call Synchronize - but they don't hold anything else up when they do so.<br><br>We did talk about the possibilities of changing these threads so we create our own message queue in the main thread, instead of using Synchronize. However, this kind of approach adds a lot of our own code without giving substantial extra performance or functionality.<br><br>The main advantage to our current approach is code reuse - we use the same consumer component for UI vs. non-UI elements, the only difference being that a UI consumer calls Synchronize(). <br><br>However, given the bug described above, we'll probably investigate this again in the near future. Thanks again for the info.<br><br>- Angus


Avatar
Hallvard Vassbotn 16 years ago

I whole-heartedly agree with Rossen - don't use Synchronize at all - communicate between threads and the mainthread using waitable work queues - see my TDM article in my blog here for details and code:<br><br>http://hallvards.blogspot.com/2008/03/tdm6-knitting-your-own-threads.html


Avatar
Rossen Assenov 16 years ago

Here is the correct CodeCentral link : http://cc.codegear.com/Item/17604


Avatar
Rossen Assenov 16 years ago

>>If anyone has a better (preferably faster) solution to this problem, which doesn't require rewriting the RTL, please let us know. :).<br><br>Better way : don't EVER use Synchronize and WaitFor :). Synchronize IMHO is just a hack which should never be used in real programs. The proper way is to never touch the UI directly from a worker thread. Instead write your own message queue which allows you to communicate between your main thread and your worker threads asynchronously - a worker thread should never be blocked waiting for the main thread to update the UI.<br><br>Take a look at my CodeCentral project for example : http://www.finalbuilder.com/Default.aspx?tabid=77&EntryID=259



Comments are closed.