Thread Synchronization with Kernel Objects
Posted by MHesham on August 29, 2011
Content
- Introduction
- Wait Functions
- Event Kernel Objects
- Waitable Timer Kernel Objects
- Semaphore Kernel Objects
- Mutex Kernel Objects
- Mutexes vs Critical Sections
Introduction
Although user-mode thread synchronization mechanisms offer great performance, they do have limitations, such as:
- You can use critical sections to place a thread in a wait state, but only to synchronize threads contained within a single process.
- You can easily get into deadlock situations with critical sections because you cannot specify a timeout value while waiting to enter the critical section.
The drawback of kernel objects is their performance. The transition from user mode to kernel mode is costly: an empty system call takes about 200 CPU cycles on the x86 platform, and this, of course, does not include the execution of the kernel-mode code that actually implements the function your thread is calling. What costs orders of magnitude more is the overhead of scheduling a new thread, with all the cache flushes and misses it entails. Here we're talking about tens of thousands of cycles.
The following kernel objects can be in a signaled or nonsignaled state:
- Processes
- Threads
- Jobs
- File and console standard input/output/error streams
- Events
- Waitable timers
- Semaphores
- Mutexes
Wait Functions
DWORD dw = WaitForSingleObject(hProcess, 5000);
switch (dw) {
case WAIT_OBJECT_0:
// The process terminated.
break;
case WAIT_TIMEOUT:
// The process did not terminate within 5000 milliseconds.
break;
case WAIT_FAILED:
// Bad call to function (invalid handle?)
break;
}
The preceding code tells the system that the calling thread should not be schedulable until either the specified process has terminated or 5000 milliseconds have expired, whichever comes first. So this call returns in less than 5000 milliseconds if the process terminates, and it returns in about 5000 milliseconds if the process hasn’t terminated. Note that you can pass 0 for the dwMilliseconds parameter. If you do this, WaitForSingleObject always returns immediately, even if the wait condition hasn’t been satisfied.
HANDLE h[3];
h[0] = hProcess1;
h[1] = hProcess2;
h[2] = hProcess3;
DWORD dw = WaitForMultipleObjects(3, h, FALSE, 5000);
switch (dw) {
case WAIT_FAILED:
// Bad call to function (invalid handle?)
break;
case WAIT_TIMEOUT:
// None of the objects became signaled within 5000 milliseconds.
break;
case WAIT_OBJECT_0 + 0:
// The process identified by h[0] (hProcess1) terminated.
break;
case WAIT_OBJECT_0 + 1:
// The process identified by h[1] (hProcess2) terminated.
break;
case WAIT_OBJECT_0 + 2:
// The process identified by h[2] (hProcess3) terminated.
break;
}
Successful Wait Side Effects
For some kernel objects, a successful call to WaitForSingleObject or WaitForMultipleObjects actually alters the state of the object. A successful call is one in which the function sees that the object was signaled and returns a value relative to WAIT_OBJECT_0. A call is unsuccessful if the function returns WAIT_TIMEOUT or WAIT_FAILED. Objects never have their state altered for unsuccessful calls.
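To make "a value relative to WAIT_OBJECT_0" concrete, here is a minimal sketch of how a return value from WaitForMultipleObjects (called with bWaitAll set to FALSE) could be classified. The names classify_wait_result, wait_outcome, and the MODEL_ constants are hypothetical; the constants are local copies of the values I understand the Win32 headers to define, so the sketch compiles without windows.h.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the Win32 return codes (assumed values); the MODEL_
   prefix marks them as illustrative, not the real header definitions. */
#define MODEL_WAIT_OBJECT_0    0x00000000u
#define MODEL_WAIT_ABANDONED_0 0x00000080u
#define MODEL_WAIT_TIMEOUT     0x00000102u
#define MODEL_WAIT_FAILED      0xFFFFFFFFu

typedef enum {
    WAIT_KIND_SIGNALED,
    WAIT_KIND_ABANDONED,
    WAIT_KIND_TIMEOUT,
    WAIT_KIND_FAILED
} wait_kind;

typedef struct {
    wait_kind kind;
    int index;   /* position in the handle array, or -1 when not applicable */
} wait_outcome;

/* Hypothetical helper: classify the DWORD returned by a wait on `count` handles. */
wait_outcome classify_wait_result(uint32_t dw, uint32_t count)
{
    /* Anything not recognized below, including MODEL_WAIT_FAILED, is a failure. */
    wait_outcome out = { WAIT_KIND_FAILED, -1 };

    assert(count >= 1 && count <= 64);   /* 64 is the per-call handle limit */

    if (dw == MODEL_WAIT_TIMEOUT) {
        out.kind = WAIT_KIND_TIMEOUT;
    } else if (dw < MODEL_WAIT_OBJECT_0 + count) {
        out.kind = WAIT_KIND_SIGNALED;
        out.index = (int)(dw - MODEL_WAIT_OBJECT_0);
    } else if (dw >= MODEL_WAIT_ABANDONED_0 && dw < MODEL_WAIT_ABANDONED_0 + count) {
        out.kind = WAIT_KIND_ABANDONED;
        out.index = (int)(dw - MODEL_WAIT_ABANDONED_0);
    }
    return out;
}
```

Only the signaled and abandoned outcomes are successful waits, so only those can carry side effects on the object.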
Let’s look at an example. Two threads call WaitForMultipleObjects in exactly the same way:
HANDLE h[2];
h[0] = hAutoResetEvent1; // Initially nonsignaled
h[1] = hAutoResetEvent2; // Initially nonsignaled
WaitForMultipleObjects(2, h, TRUE, INFINITE);
When WaitForMultipleObjects is called, both event objects are nonsignaled; this forces both threads to enter a wait state. Then the hAutoResetEvent1 object becomes signaled. Both threads see that the event has become signaled, but neither can wake up because the hAutoResetEvent2 object is still nonsignaled. Because neither thread has successfully waited yet, no side effect happens to the hAutoResetEvent1 object.
Next, the hAutoResetEvent2 object becomes signaled. At this point, one of the two threads detects that both objects it is waiting for have become signaled. The wait is successful, both event objects are set to the nonsignaled state, and the thread is schedulable. But what about the other thread? It continues to wait until it sees that both event objects are signaled. Even though it originally detected that hAutoResetEvent1 was signaled, it now sees this object as nonsignaled.
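The all-or-nothing behavior in this scenario can be sketched with a toy, single-threaded model. This is not how Windows implements the wait; model_wait_all is a hypothetical function that only mimics the rule that a wait on all objects alters nothing until every object is signaled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a wait-all on n auto-reset events, each represented by a
   bool (true = signaled). Returns true only if every event is signaled,
   and only then applies the auto-reset side effect to all of them. */
bool model_wait_all(bool signaled[], size_t n)
{
    size_t i;

    assert(signaled != NULL);

    for (i = 0; i < n; ++i) {
        if (!signaled[i]) {
            return false;        /* unsuccessful wait: no state is altered */
        }
    }
    for (i = 0; i < n; ++i) {
        signaled[i] = false;     /* successful wait: every event is reset */
    }
    return true;
}
```

Calling model_wait_all once per thread reproduces the story above: with only the first event signaled both calls fail and the event stays signaled; once both are signaled, the first call succeeds and resets both, so the second call fails again.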
If multiple threads wait for a single kernel object, which thread does the system decide to wake up when the object becomes signaled? The official answer is "The algorithm is fair," which means that if multiple threads are waiting, each should get its own chance to wake up each time the object becomes signaled.
Event Kernel Objects
Events signal that an operation has completed. There are two different types of event objects: manual-reset events and auto-reset events. When a manual-reset event is signaled, all threads waiting on the event become schedulable. When an auto-reset event is signaled, only one of the threads waiting on the event becomes schedulable.
Once an event is created, you control its state directly. When you call SetEvent, you change the event to the signaled state:
BOOL SetEvent(HANDLE hEvent);
When you call ResetEvent, you change the event to the nonsignaled state:
BOOL ResetEvent(HANDLE hEvent);
It’s that easy.
An auto-reset event is automatically reset to the nonsignaled state when a thread successfully waits on the object.
Waitable Timer Kernel Objects
Waitable timers are kernel objects that signal themselves at a certain time or at regular intervals. They are most commonly used to have some operation performed at a certain time.
Waitable timer objects are always created in the nonsignaled state. You must call the SetWaitableTimer function to tell the timer when you want it to become signaled.
The following code sets a timer to go off for the first time on January 1, 2008, at 1:00 P.M., and then to go off every six hours after that:
// Declare our local variables.
HANDLE hTimer;
SYSTEMTIME st;
FILETIME ftLocal, ftUTC;
LARGE_INTEGER liUTC;
// Create an auto-reset timer.
hTimer = CreateWaitableTimer(NULL, FALSE, NULL);
// First signaling is at January 1, 2008, at 1:00 P.M. (local time).
st.wYear = 2008; // Year
st.wMonth = 1; // January
st.wDayOfWeek = 0; // Ignored
st.wDay = 1; // The first of the month
st.wHour = 13; // 1PM
st.wMinute = 0; // 0 minutes into the hour
st.wSecond = 0; // 0 seconds into the minute
st.wMilliseconds = 0; // 0 milliseconds into the second
SystemTimeToFileTime(&st, &ftLocal);
// Convert local time to UTC time.
LocalFileTimeToFileTime(&ftLocal, &ftUTC);
// Convert FILETIME to LARGE_INTEGER because of different alignment.
liUTC.LowPart = ftUTC.dwLowDateTime;
liUTC.HighPart = ftUTC.dwHighDateTime;
// Set the timer.
SetWaitableTimer(hTimer, &liUTC, 6 * 60 * 60 * 1000, NULL, NULL, FALSE);
...
Instead of setting an absolute time that the timer should first go off, you can have the timer go off at a time relative to calling SetWaitableTimer. You simply pass a negative value in the pDueTime parameter. The value you pass must be in 100-nanosecond intervals. Because we don't normally think in intervals of 100 nanoseconds, you might find this useful: 1 second = 1,000 milliseconds = 1,000,000 microseconds = 10,000,000 100-nanosecond intervals.
The following code sets a timer to initially go off 5 seconds after the call to SetWaitableTimer:
// Declare our local variables.
HANDLE hTimer;
LARGE_INTEGER li;
// Create an auto-reset timer.
hTimer = CreateWaitableTimer(NULL, FALSE, NULL);
// Set the timer to go off 5 seconds after calling SetWaitableTimer.
// Timer unit is 100 nanoseconds.
const int nTimerUnitsPerSecond = 10000000;
// Negate the time so that SetWaitableTimer knows we
// want relative time instead of absolute time.
li.QuadPart = -(5 * nTimerUnitsPerSecond);
// Set the timer.
SetWaitableTimer(hTimer, &li, 6 * 60 * 60 * 1000, NULL, NULL, FALSE);
...
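The unit conversion for relative due times can be isolated in a small helper. This is a sketch under my own naming (relative_due_time_from_ms and the TIMER_UNITS macro are hypothetical); it uses 64-bit arithmetic throughout, on the assumption that the result will be stored in the QuadPart member of a LARGE_INTEGER.

```c
#include <assert.h>
#include <stdint.h>

/* One millisecond expressed in 100-nanosecond timer units. */
#define TIMER_UNITS_PER_MILLISECOND 10000LL

/* Hypothetical helper: convert a delay in milliseconds to the negative
   100-nanosecond count that denotes a relative due time. */
int64_t relative_due_time_from_ms(int64_t milliseconds)
{
    /* Reject negative delays and values whose product would overflow. */
    assert(milliseconds >= 0);
    assert(milliseconds <= INT64_MAX / TIMER_UNITS_PER_MILLISECOND);

    return -(milliseconds * TIMER_UNITS_PER_MILLISECOND);
}
```

A 5-second delay becomes relative_due_time_from_ms(5000), which is -50,000,000 units. Doing the multiplication in 64 bits matters for longer delays: with a 32-bit int, a product of seconds and 10,000,000 overflows once the delay exceeds roughly 214 seconds.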
Waitable Timers vs User Timers
The biggest difference is that User timers require a lot of additional user interface infrastructure in your application, which makes them more resource intensive. Also, waitable timers are kernel objects, which means that they can be shared by multiple threads and are securable.
- User timers generate WM_TIMER messages that come back to the thread that called SetTimer (for callback timers) or the thread that created the window (for window-based timers). So only one thread is notified when a User timer goes off. Multiple threads, on the other hand, can wait on waitable timers, and several threads can be scheduled if the timer is a manual-reset timer.
- With waitable timers, you’re more likely to be notified when the time actually expires. The WM_TIMER messages are always the lowest-priority messages and are retrieved when no other messages are in a thread’s queue.
Semaphore Kernel Objects
Semaphore kernel objects are used for resource counting. They contain a usage count, as all kernel objects do, but they also contain two additional signed 32-bit values: a maximum resource count and a current resource count. The maximum resource count identifies the maximum number of resources that the semaphore can control; the current resource count indicates the number of these resources that are currently available.
The rules for a semaphore are as follows:
- If the current resource count is greater than 0, the semaphore is signaled.
- If the current resource count is 0, the semaphore is nonsignaled.
- The system never allows the current resource count to be negative.
- The current resource count can never be greater than the maximum resource count.
A thread gains access to a resource by calling a wait function, passing the handle of the semaphore guarding the resource. Internally, the wait function checks the semaphore's current resource count and if its value is greater than 0 (the semaphore is signaled), the counter is decremented by 1 and the calling thread remains schedulable. If the count is 0 (the semaphore is nonsignaled), the system places the calling thread in a wait state until another thread increments the count.
Unfortunately, there is just no way to get the current resource count of a semaphore without altering it.
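The counting rules above can be captured in a toy model. The names below are hypothetical and the model is single-threaded; model_sem_release follows my understanding of how ReleaseSemaphore behaves (it fails and changes nothing if the new count would exceed the maximum), which is an assumption not spelled out in this post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a semaphore's two resource counts. */
typedef struct {
    long current;   /* resources available now; never negative */
    long maximum;   /* upper bound for current */
} model_semaphore;

/* Signaled means at least one resource is available. */
bool model_sem_is_signaled(const model_semaphore *s)
{
    assert(s != NULL);
    return s->current > 0;
}

/* Models a zero-timeout wait: succeeds and decrements only when signaled. */
bool model_sem_try_wait(model_semaphore *s)
{
    assert(s != NULL);
    if (s->current <= 0) {
        return false;            /* nonsignaled: the count is left at 0 */
    }
    s->current -= 1;
    return true;
}

/* Models a release: adds `count` resources unless that would exceed the
   maximum, in which case nothing changes. Reports the previous count. */
bool model_sem_release(model_semaphore *s, long count, long *previous)
{
    assert(s != NULL);
    if (count <= 0 || count > s->maximum - s->current) {
        return false;
    }
    if (previous != NULL) {
        *previous = s->current;
    }
    s->current += count;
    return true;
}
```

In this model the previous count is only reported by a release that succeeds, which mirrors the point that the count cannot be observed without changing it.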
Mutex Kernel Objects
Mutex kernel objects ensure that a thread has mutually exclusive access to a single resource. A mutex object contains a usage count, thread ID, and recursion counter. Mutexes behave identically to critical sections. However, mutexes are kernel objects, while critical sections are user-mode synchronization objects.
This means that mutexes are slower than critical sections. But it also means that threads in different processes can access a single mutex, and it means that a thread can specify a timeout value while waiting to gain access to a resource.
The rules for a mutex are as follows:
- If the thread ID is 0 (an invalid thread ID), the mutex is not owned by any thread and is signaled.
- If the thread ID is a nonzero value, a thread owns the mutex and the mutex is nonsignaled.
A thread gains access to the shared resource by calling a wait function, passing the handle of the mutex guarding the resource. Internally, the wait function checks the thread ID to see if it is 0 (the mutex is signaled). If the thread ID is 0, the thread ID is set to the calling thread's ID, the recursion counter is set to 1, and the calling thread remains schedulable. If another thread owns the mutex, the calling thread enters a wait state until the mutex is released.
Every time a thread successfully waits on a mutex, the object’s recursion counter is incremented. The only way the recursion counter can have a value greater than 1 is if the thread waits on the same mutex multiple times.
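The ownership and recursion rules can be sketched as a toy model. The names are hypothetical, thread IDs are plain integers, and nothing here blocks; a failed model_mutex_try_wait stands for the case where a real thread would enter a wait state. The release function follows my understanding of ReleaseMutex (only the owner may release, and the mutex becomes free when the recursion counter reaches 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a mutex: owner thread ID (0 = unowned) and recursion counter. */
typedef struct {
    unsigned long owner;
    unsigned long recursion;
} model_mutex;

/* Models a zero-timeout wait by the thread with the given nonzero ID. */
bool model_mutex_try_wait(model_mutex *m, unsigned long thread_id)
{
    assert(m != NULL && thread_id != 0);

    if (m->owner == 0) {             /* signaled: take ownership */
        m->owner = thread_id;
        m->recursion = 1;
        return true;
    }
    if (m->owner == thread_id) {     /* owner waits again: recursion grows */
        m->recursion += 1;
        return true;
    }
    return false;                    /* owned by another thread */
}

/* Models a release: only the owner may release; ownership is given up
   when the recursion counter drops to 0. */
bool model_mutex_release(model_mutex *m, unsigned long thread_id)
{
    assert(m != NULL && thread_id != 0);

    if (m->owner != thread_id) {
        return false;
    }
    m->recursion -= 1;
    if (m->recursion == 0) {
        m->owner = 0;                /* mutex is signaled again */
    }
    return true;
}
```

A thread that waits on the mutex twice must also release it twice before any other thread can acquire it.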
Abandonment Issues
If a thread owning a mutex terminates (using ExitThread, TerminateThread, ExitProcess, or TerminateProcess) before releasing the mutex, the system considers the mutex to be abandoned: the thread that owns it can never release it because the thread has died.
Because the system keeps track of all mutex and thread kernel objects, it knows exactly when mutexes become abandoned. When a mutex becomes abandoned, the system automatically resets the mutex object's thread ID to 0 and its recursion counter to 0. Then the system checks to see whether any threads are currently waiting for the mutex. If so, the system selects one waiting thread, sets the thread ID to that thread's ID, sets the recursion counter to 1, and makes the thread schedulable.
This is the same as a normal acquisition except that the wait function does not return the usual WAIT_OBJECT_0 value to the thread. Instead, the wait function returns the special value of WAIT_ABANDONED. This special return value (which applies only to mutex objects) indicates that the mutex the thread was waiting on was owned by another thread that was terminated before it finished using the shared resource. The newly scheduled thread has no idea what state the resource is currently in; the resource might be totally corrupt.
In real life, most applications never check explicitly for the WAIT_ABANDONED return value because a thread is rarely just terminated. (This whole discussion provides another great example of why you should never call the TerminateThread function.)
Mutex vs Critical Section
| Characteristic | Mutex | Critical Section |
| Mode | Kernel Mode | User Mode |
| Performance | Slow | Fast |
| Can be used across process boundaries? | Yes | No |
| Declaration | HANDLE hMutex; | CRITICAL_SECTION cs; |
| Initialization | hMutex = CreateMutex(NULL, FALSE, NULL); | InitializeCriticalSection(&cs); |
| Cleanup | CloseHandle(hMutex); | DeleteCriticalSection(&cs); |
| Infinite Wait | WaitForSingleObject(hMutex, INFINITE); | EnterCriticalSection(&cs); |
| 0 Wait | WaitForSingleObject(hMutex, 0); | TryEnterCriticalSection(&cs); |
| Arbitrary Wait | WaitForSingleObject(hMutex, dwMilliseconds); | N/A |
| Release | ReleaseMutex(hMutex); | LeaveCriticalSection(&cs); |
| Can be waited on with other kernel objects? | Yes (e.g., WaitForMultipleObjects or similar) | No |
This entry was posted on August 29, 2011 at 2:20 AM and is filed under Book Snapshot, Operating Systems, Windows Programming, Windows via C/C++. Tagged: events, kernel mode, mutext, process synchronization, semaphore, thread synchronization, timers, winapi, windows programming.