<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Muhamad Hesham&#039;s T-Blog</title>
	<atom:link href="http://mhesham.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://mhesham.wordpress.com</link>
	<description>A growing computer scientist mind</description>
	<lastBuildDate>Thu, 01 Dec 2011 09:46:47 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='mhesham.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Muhamad Hesham&#039;s T-Blog</title>
		<link>http://mhesham.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://mhesham.wordpress.com/osd.xml" title="Muhamad Hesham&#039;s T-Blog" />
	<atom:link rel='hub' href='http://mhesham.wordpress.com/?pushpress=hub'/>
		<item>
		<title>How to Achieve High Marks</title>
		<link>http://mhesham.wordpress.com/2011/09/30/how-to-achieve-high-marks/</link>
		<comments>http://mhesham.wordpress.com/2011/09/30/how-to-achieve-high-marks/#comments</comments>
		<pubDate>Thu, 29 Sep 2011 22:47:56 +0000</pubDate>
		<dc:creator>MHesham</dc:creator>
				<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[college]]></category>
		<category><![CDATA[education]]></category>
		<category><![CDATA[faculty]]></category>
		<category><![CDATA[success]]></category>

		<guid isPermaLink="false">https://mhesham.wordpress.com/2011/09/30/how-to-achieve-high-marks/</guid>
		<description><![CDATA[Someone once asked me how to get high marks in our college. I replied with the following answer, and I&#8217;d like to share it with you. First of all, we must admit that high marks in our college are not the measure of a person, and low marks are not a good indicator of a person either. I [...]]]></description>
			<content:encoded><![CDATA[<p><font size="2"><b>Someone once asked me how to get high marks in our college. I replied with the following answer, and I&#8217;d like to share it with you.</b>       <br />First of all, we must admit that high marks in our college are not the measure of a person, and low marks are not a good indicator of a person either.       <br />I am not presenting the one right &quot;how to&quot; here, and I don&#8217;t know whether what I am saying applies to everyone. What I did suited me, and it may not suit others because environments and characters differ from one person to another. So I am sharing my personal thoughts and ideas with you.       <br />High marks are achieved through 3 points:</font></p>
<ol>
<li><b><u><font size="2">Spiritual</font></u></b>
<ul>
<li><font size="2">This is the most important axis; it can govern the other points and outweigh them in effect. </font></li>
<li><font size="2">Being close to Allah is what one should care about. The Fajr (dawn) prayer is the thermometer of a Muslim, and surely we know how to be close to Allah; there are many ways to achieve this.</font> </li>
<li><font size="2">Trust that your hard work will not be wasted. </font></li>
<li><font size="2">Never stay disappointed and angry for a long time because you did something wrong; this is what &quot;Shetan&quot; wants. A wrong deed should be followed by a good one. Such a state of disappointment tends to leave one unable to produce anything. NEVER TRAP YOURSELF IN SUCH A STATE.</font> </li>
<li><font size="2">A high spiritual state helps one study faster, in a comfortable and stable state of mind. </font></li>
<li><font size="2">A comfortable and stable state of mind tends to give a person the advantage of answering well in the exam.</font> </li>
<li><font size="2">Being close to Allah will help you in exam correction, and this can&#8217;t be denied. God truly is generous in this regard, and I say this from experience.</font> </li>
</ul>
</li>
<li><b><u><font size="2">Practical</font></u></b>
<ul>
<li><font size="2"><b><u>Study</u></b><b><u>:</u></b></font>
<ul>
<li><font size="2">Do you want to be someone who makes a change one day? Such a person cannot do without science; your beliefs alone will not change anything. </font></li>
<li><font size="2">To gain knowledge in our college there is one and only one way: reading books. A book is your great teacher. </font></li>
<li><font size="2">It is not easy to read every book; this requires a lot of effort and time. But try to read as many books as you can. </font></li>
<li><font size="2">Choose 3 or 4 books to read in the term. </font></li>
<li><font size="2">Don&#8217;t depend on the lecture for every subject. </font></li>
<li><font size="2">Some subjects can be studied from lectures and sections only, while others are worth studying more from books (Numerical, OR, etc.).</font> </li>
<li><font size="2">Don&#8217;t waste your time in a lecture or a section you will not benefit from. Studying from the book is sometimes more effective and saves time. Your time is more valuable than a missed attendance mark. </font></li>
<li><font size="2">Ask your mates what they covered in a section similar to yours that week. If the section does not seem useful and you may already know everything they covered, then you may skip it.</font> </li>
</ul>
</li>
<li><b><u><font size="2">Projects:</font></u></b>
<ul>
<li><font size="2">A team of more than 3-4 people is not practical; it will not be effective. </font></li>
<li><font size="2">Divide the work among you. Give each task to the one who really knows how to do it; others may get involved with him, but he is the one mainly responsible for it. </font></li>
<li><font size="2">Go for bonuses as much as you can. They will help your marks a lot and will make up for the attendance marks you missed. </font></li>
<li><font size="2">For one who wants to be a scientist, bonuses are the golden result of projects: they are extra work for extra knowledge and extra marks.</font> </li>
</ul>
</li>
<li><b><u><font size="2">Exams:</font></u></b>
<ul>
<li><font size="2">If you don&#8217;t attend lectures, then make sure you know what the doctors will focus on in exams by asking people who attend. Doctors tend to give hints to those who attend the last lectures. </font></li>
<li><font size="2">From my observations, in final exams doctors focus more on what was studied after the midterm, since they have already tested you on what came before it.</font> </li>
<li><font size="2">Make sure to check the exams of previous years to know what you need to stress; sometimes this is very helpful for testing your self-study knowledge. </font></li>
<li><font size="2">Make sure to check the slides; they tell you what the doctor actually gave in the lectures. The doctor skips some sections of the chapter, and this helps you avoid spending study time on them, especially if they are definitions. </font></li>
<li><font size="2">On the answer sheet, don&#8217;t hesitate to add hints and illustrations about your solution in pencil when applicable. </font></li>
<li><font size="2">Sometimes you solve using some special notation. Make a box and write your notation in it; this is OK as long as you don&#8217;t go off topic. </font></li>
<li><font size="2">There is something I used to do: when I receive the answer sheet I immediately flip it over and recite some &quot;Quran&quot;, &quot;Azkar&quot;, and any du&#8217;a (supplication) to comfort myself.</font> </li>
</ul>
</li>
</ul>
</li>
<li><b><u><font size="2">Talents and Skills</font></u></b>
<ul>
<li><font size="2">It is very important to know your skills and abilities and to benefit from them. </font></li>
<li><font size="2">If you find yourself able to understand things quickly, then you may use this skill to focus more on projects than on study. </font></li>
<li><font size="2">It sounds insane, but if you are a bookworm, then it is perfectly OK to read from books to build understanding during exam days. You will get a deeper understanding, which will help your answers, and you will be proud of yourself.</font> </li>
</ul>
</li>
</ol>
<p><font size="2">I hope that I covered the most important things I remember.      <br />And Allah is the One whose help is sought.</font></p>
<h1>Useful posts discussing the same subject</h1>
<ul>
<li><a href="http://abumuslimamr.wordpress.com/2011/02/14/educational-tips-and-guidelines/">Educational Tips and Guide lines</a> – Amr Abu Muslim</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://mhesham.wordpress.com/2011/09/30/how-to-achieve-high-marks/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebdc8c831f2ae1f6a876629e325c99f6?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mhesham</media:title>
		</media:content>
	</item>
		<item>
		<title>Windows Memory Architecture</title>
		<link>http://mhesham.wordpress.com/2011/09/10/windows-memory-architecture/</link>
		<comments>http://mhesham.wordpress.com/2011/09/10/windows-memory-architecture/#comments</comments>
		<pubDate>Sat, 10 Sep 2011 13:41:08 +0000</pubDate>
		<dc:creator>MHesham</dc:creator>
				<category><![CDATA[Book Snapshot]]></category>
		<category><![CDATA[Operating Systems]]></category>
		<category><![CDATA[Windows Programming]]></category>
		<category><![CDATA[Windows via C/C++]]></category>
		<category><![CDATA[memory architecture]]></category>
		<category><![CDATA[memory management]]></category>
		<category><![CDATA[paging]]></category>
		<category><![CDATA[virtual memory]]></category>
		<category><![CDATA[winapi]]></category>
		<category><![CDATA[windows programming]]></category>

		<guid isPermaLink="false">https://mhesham.wordpress.com/2011/09/10/windows-memory-architecture/</guid>
		<description><![CDATA[Content How a Virtual Address Space Is Partitioned Regions in an Address Space Committing Physical Storage Within a Region Physical Storage and the Paging File Page Protection Attributes The Importance of Data Alignment Introduction Every process is given its very own virtual address space. For 32-bit processes, this address space is 4 GB because a [...]]]></description>
			<content:encoded><![CDATA[<h1>Content</h1>
<ol>
<li>How a Virtual Address Space Is Partitioned </li>
<li>Regions in an Address Space </li>
<li>Committing Physical Storage Within a Region </li>
<li>Physical Storage and the Paging File </li>
<li>Page Protection Attributes </li>
<li>The Importance of Data Alignment </li>
</ol>
<h1>Introduction</h1>
<p>Every process is given its very own virtual address space. For 32-bit processes, this address space is 4 GB because a 32-bit pointer can have any value from 0x00000000 through 0xFFFFFFFF. </p>
<p>Because each address space is private, Process A can have a data structure stored in its address space at address 0x12345678, while Process B can have a totally different data structure stored in <i>its</i> address space, also at address 0x12345678. When threads running in Process A access memory at address 0x12345678, these threads are accessing Process A&#8217;s data structure. When threads running in Process B access memory at address 0x12345678, these threads are accessing Process B&#8217;s data structure. Threads running in Process A cannot access the data structure in Process B&#8217;s address space, and vice versa.</p>
<p>This address space is simply a range of memory addresses. Physical storage needs to be assigned or mapped to portions of the address space before you can successfully access data without raising access violations.</p>
<h1>How a Virtual Address Space Is Partitioned</h1>
<p>Each process&#8217; virtual address space is split into partitions. The address space is partitioned based on the underlying implementation of the operating system.</p>
<p><a href="http://mhesham.files.wordpress.com/2011/09/image2.png"><img style="background-image:none;padding-left:0;padding-right:0;display:block;float:none;padding-top:0;border-width:0;margin:5px auto;" title="image" border="0" alt="image" src="http://mhesham.files.wordpress.com/2011/09/image_thumb2.png?w=640&#038;h=286" width="640" height="286" /></a></p>
<p>The partition of the process&#8217; address space from 0x00000000 to 0x0000FFFF inclusive is set aside to help programmers catch NULL-pointer assignments. If a thread in your process attempts to read from or write to a memory address in this partition, an access violation is raised.</p>
<p>The user-mode partition is where the process&#8217; private address space resides. The usable address range and approximate size of the user-mode partition depend on the CPU architecture.</p>
<p><a href="http://mhesham.files.wordpress.com/2011/09/image3.png"><img style="background-image:none;padding-left:0;padding-right:0;display:block;float:none;padding-top:0;border-width:0;margin:5px auto;" title="image" border="0" alt="image" src="http://mhesham.files.wordpress.com/2011/09/image_thumb3.png?w=640&#038;h=182" width="640" height="182" /></a></p>
<p>The kernel-mode partition is where the operating system&#8217;s code resides. The code for thread scheduling, memory management, file systems support, networking support, and all device drivers is loaded in this partition. Everything residing in this partition is shared among all processes. Although this partition is just above the user-mode partition in every process, all code and data in this partition are completely protected. If your application code attempts to read or write to a memory address in this partition, your thread raises an access violation.</p>
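<p>To make the partitions concrete, here is a minimal sketch that classifies a 32-bit address. Only the NULL-pointer range is stated in the text above; the other boundaries are an assumption based on the commonly documented default layout of a 32-bit x86 Windows process, and the names <b>Partition</b> and <b>classify_address</b> are hypothetical helpers, not Windows APIs.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Assumed default partitions of a 32-bit x86 Windows process.
   Only the NULL-pointer range comes from the text; the rest is an assumption. */
typedef enum {
    PART_NULL_POINTER,  /* 0x00000000 - 0x0000FFFF */
    PART_USER_MODE,     /* 0x00010000 - 0x7FFEFFFF */
    PART_OFF_LIMITS,    /* 0x7FFF0000 - 0x7FFFFFFF (64-KB guard band) */
    PART_KERNEL_MODE    /* 0x80000000 - 0xFFFFFFFF */
} Partition;

/* Return the partition that contains the given 32-bit virtual address. */
Partition classify_address(uint32_t addr) {
    if (addr <= 0x0000FFFFu) return PART_NULL_POINTER;
    if (addr <= 0x7FFEFFFFu) return PART_USER_MODE;
    if (addr <= 0x7FFFFFFFu) return PART_OFF_LIMITS;
    return PART_KERNEL_MODE;
}

/* Human-readable name for a partition, for logging or debugging. */
const char *partition_name(Partition p) {
    switch (p) {
        case PART_NULL_POINTER: return "NULL-pointer assignment";
        case PART_USER_MODE:    return "user-mode";
        case PART_OFF_LIMITS:   return "64-KB off-limits";
        case PART_KERNEL_MODE:  return "kernel-mode";
    }
    assert(0 && "unknown partition");
    return "unknown";
}
```

<p>With this layout, the example address 0x12345678 falls in the user-mode partition, while any access below 0x00010000 lands in the NULL-pointer partition and would raise an access violation.</p>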
<h1>Regions in an Address Space</h1>
<p>When a process is created and given its address space, the bulk of this usable address space is <i>free</i>, or unallocated. To use portions of this address space, you must allocate regions within it by calling <strong>VirtualAlloc</strong>. The act of allocating a region is called <strong>reserving</strong>.</p>
<p>Whenever you reserve a region of address space:</p>
<ul>
<li>The system ensures that the region begins on an <i>allocation granularity</i> boundary. All the CPU platforms use the same allocation granularity of 64 KB—that is, allocation requests are rounded to a <strong>64-KB boundary</strong>. </li>
<li>The system ensures that the size of the region is a multiple of the system&#8217;s <i>page</i> size. A page is a unit of memory that the system uses in managing memory. The page size depends on the CPU: the <i>x</i>86 and <i>x</i>64 systems use a <strong>4-KB page size</strong>, but the IA-64 uses an 8-KB page size. </li>
</ul>
<p>If you attempt to reserve a 10-KB region of address space, the system will automatically round up your request and reserve a region whose size is a multiple of the page size. This means that on <i>x</i>86 and <i>x</i>64 systems, the system will reserve a region that is 12 KB.</p>
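<p>The rounding arithmetic can be sketched in a few lines of portable C. This is only an illustration of the math, not what the system actually runs; <b>round_up_to</b>, <b>round_down_to</b>, and the two constants are hypothetical names chosen for this example.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Values taken from the text: 4-KB pages on x86/x64, 64-KB allocation granularity. */
static const size_t X86_PAGE_SIZE          = 4u * 1024u;
static const size_t ALLOCATION_GRANULARITY = 64u * 1024u;

/* Round `value` up to the next multiple of `unit`.
   `unit` must be a power of two; very large values may overflow. */
size_t round_up_to(size_t value, size_t unit) {
    assert(unit != 0 && (unit & (unit - 1)) == 0);
    return (value + unit - 1) & ~(unit - 1);
}

/* Round `value` down to the previous multiple of `unit` (power of two). */
size_t round_down_to(size_t value, size_t unit) {
    assert(unit != 0 && (unit & (unit - 1)) == 0);
    return value & ~(unit - 1);
}
```

<p>For example, <b>round_up_to(10 * 1024, X86_PAGE_SIZE)</b> gives 12288 bytes (12 KB), matching the 10-KB request above, and <b>round_down_to(0x12345678, ALLOCATION_GRANULARITY)</b> gives 0x12340000, a 64-KB boundary.</p>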
<p>When your program&#8217;s algorithms no longer need to access a reserved region of address space, the region should be freed. This process is called <i>releasing</i> the region of address space and is accomplished by calling the <b>VirtualFree</b> function.</p>
<h1>Committing Physical Storage Within a Region</h1>
<p>To use a reserved region of address space, you must allocate physical storage and then map this storage to the reserved region. This process is called <i>committing</i> physical storage. Physical storage is always committed in pages. To commit physical storage to a reserved region, you again call the <b>VirtualAlloc</b> function.</p>
<p>When your program&#8217;s algorithms no longer need to access committed physical storage in the reserved region, the physical storage should be freed. This process is called <i>decommitting</i> the physical storage and is accomplished by calling the <b>VirtualFree</b> function.</p>
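<p>The reserve, commit, decommit, and release steps form a small lifecycle. The sketch below is a toy state machine for a single page, written only to show which transitions make sense in that lifecycle. It does not call <b>VirtualAlloc</b> or <b>VirtualFree</b>, it is stricter than the real functions in some cases, and all names in it are hypothetical.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one page's state; not a Windows data structure. */
typedef enum { STATE_FREE, STATE_RESERVED, STATE_COMMITTED } PageState;

/* Reserving: free address space becomes a reserved region. */
bool reserve_page(PageState *s) {
    assert(s != NULL);
    if (*s != STATE_FREE) return false;
    *s = STATE_RESERVED;
    return true;
}

/* Committing: physical storage is mapped to a reserved page. */
bool commit_page(PageState *s) {
    assert(s != NULL);
    if (*s != STATE_RESERVED) return false;
    *s = STATE_COMMITTED;
    return true;
}

/* Decommitting: storage is given back, the address range stays reserved. */
bool decommit_page(PageState *s) {
    assert(s != NULL);
    if (*s != STATE_COMMITTED) return false;
    *s = STATE_RESERVED;
    return true;
}

/* Releasing: the address range itself becomes free again. */
bool release_page(PageState *s) {
    assert(s != NULL);
    if (*s == STATE_FREE) return false;
    *s = STATE_FREE;
    return true;
}
```

<p>In this model, committing a page that was never reserved is rejected, which mirrors the rule that storage is committed to a reserved region.</p>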
<h1>Physical Storage and the Paging File</h1>
<p>Windows can use disk space to make it look as if the machine has more RAM than it does. The file on the disk that provides this is typically called a <i>paging file</i>, and it contains the virtual memory that is available to all processes.</p>
<p>When an application commits physical storage to a region of address space by calling the <b>VirtualAlloc</b> function, space is actually allocated from a file on the hard disk. The size of the system&#8217;s paging file is the most important factor in determining how much physical storage is available to applications; the amount of RAM you have has very little effect.</p>
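<p>Now, when a thread in your process attempts to access a block of data in the process&#8217; address space, one of two things can happen.</p>
<p>In the first possibility, the data that the thread is attempting to access is already in RAM. In this case, the CPU maps the data&#8217;s virtual memory address to the physical address in memory, and then the desired access is performed.</p>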
<p>In the second possibility, the data that the thread is attempting to access is not in RAM but is contained somewhere in the paging file. In this case, the attempted access is called a <i>page fault</i>, and the CPU notifies the operating system of the attempted access. The operating system then locates a free page of memory in RAM; if a free page cannot be found, the system must free one. If a page has not been modified, the system can simply free the page. But if the system needs to free a page that was modified, it must first copy the page from RAM to the paging file. Next the system goes to the paging file, locates the block of data that needs to be accessed, and loads the data into the free page of memory. The operating system then updates its table indicating that the data&#8217;s virtual memory address now maps to the appropriate physical memory address in RAM. The CPU now retries the instruction that generated the initial page fault, but this time the CPU is able to map the virtual memory address to a physical RAM address and access the block of data.</p>
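<p>The page-fault path can be sketched as a tiny simulation. This is a simplified model under stated assumptions: RAM has only two frames, the page to free is chosen round-robin (the text does not say which page the system frees), and the paging file is represented only by a write-back counter. All names are hypothetical.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FRAME_COUNT 2     /* tiny "RAM" so that evictions happen quickly */
#define NO_PAGE     (-1)

typedef struct {
    int  page;    /* virtual page held by this frame, or NO_PAGE */
    bool dirty;   /* true if the page was modified since it was loaded */
} Frame;

typedef struct {
    Frame frames[FRAME_COUNT];
    int   next_victim;   /* round-robin cursor (assumed eviction policy) */
    int   page_faults;
    int   write_backs;   /* modified pages copied to the paging file */
} Ram;

void ram_init(Ram *ram) {
    assert(ram != NULL);
    for (int i = 0; i < FRAME_COUNT; i++) {
        ram->frames[i].page = NO_PAGE;
        ram->frames[i].dirty = false;
    }
    ram->next_victim = 0;
    ram->page_faults = 0;
    ram->write_backs = 0;
}

/* Access one virtual page. Returns true if the access caused a page fault. */
bool ram_access(Ram *ram, int page, bool is_write) {
    assert(ram != NULL);
    /* First possibility: the page is already in RAM. */
    for (int i = 0; i < FRAME_COUNT; i++) {
        if (ram->frames[i].page == page) {
            if (is_write) ram->frames[i].dirty = true;
            return false;
        }
    }
    /* Second possibility: page fault. Look for a free frame first. */
    ram->page_faults++;
    int slot = -1;
    for (int i = 0; i < FRAME_COUNT; i++) {
        if (ram->frames[i].page == NO_PAGE) { slot = i; break; }
    }
    if (slot < 0) {
        /* No free frame: free one, writing it back only if it was modified. */
        slot = ram->next_victim;
        ram->next_victim = (ram->next_victim + 1) % FRAME_COUNT;
        if (ram->frames[slot].dirty) ram->write_backs++;
    }
    /* Load the requested page and retry the access. */
    ram->frames[slot].page = page;
    ram->frames[slot].dirty = is_write;
    return true;
}
```

<p>Evicting a clean page costs nothing extra in this model, while evicting a modified page increments <b>write_backs</b>, which is the copy from RAM to the paging file described above.</p>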
<p>The more often the system needs to copy pages of memory to the paging file and vice versa, the more your hard disk thrashes and the slower the system runs. (<i>Thrashing</i> means that the operating system spends all its time swapping pages in and out of memory instead of running programs.) </p>
<p>When you invoke an application, the system opens the application&#8217;s .exe file and determines the size of the application&#8217;s code and data. Then the system reserves a region of address space and notes that the physical storage associated with this region is the .exe file itself. That&#8217;s right—instead of allocating space from the paging file, the system uses the actual contents, or <i>image</i>, of the .exe file as the program&#8217;s reserved region of address space. This, of course, makes loading an application very fast and allows the size of the paging file to remain small.</p>
<p>When a program&#8217;s file image (that is, an .exe or a DLL file) on the hard disk is used as the physical storage for a region of address space, it is called a <i>memory-mapped file</i>. When an .exe or a DLL is loaded, the system automatically reserves a region of address space and maps the file&#8217;s image to this region.</p>
<h1>Page Protection Attributes</h1>
<p>Individual pages of allocated physical storage can be assigned different protection attributes.</p>
<p><a href="http://mhesham.files.wordpress.com/2011/09/image4.png"><img style="background-image:none;padding-left:0;padding-right:0;display:block;float:none;padding-top:0;border-width:0;margin:5px auto;" title="image" border="0" alt="image" src="http://mhesham.files.wordpress.com/2011/09/image_thumb4.png?w=640&#038;h=426" width="640" height="426" /></a></p>
<p>Some malware applications write code into areas of memory intended for data (such as a thread&#8217;s stack) and then the application executes the malicious code. Windows&#8217; <i>Data Execution Prevention (DEP)</i> feature provides protection against this type of malware attack. With DEP enabled, the operating system uses the <b>PAGE_EXECUTE_*</b> protections only on regions of memory that are intended to have code execute; other protections (typically <b>PAGE_READWRITE</b>) are used for regions of memory intended to have data in them (such as thread stacks and the application&#8217;s heaps).</p>
<p>Windows supports a mechanism that allows two or more processes to share a single block of storage. So if 10 instances of Notepad are running, all instances share the application&#8217;s code and data pages.</p>
<p>When an .exe or a .dll module is mapped into an address space, the system calculates how many pages are writable. (Usually, the pages containing code are marked as <b>PAGE_EXECUTE_READ</b> while the pages containing data are marked <b>PAGE_READWRITE</b>.) Then the system allocates storage from the paging file to accommodate these writable pages. This paging file storage is not used unless the module&#8217;s writable pages are actually written to.</p>
<p>When a thread in one process attempts to write to a shared block, the system intervenes and performs the following steps:</p>
<ol>
<li>
<p>The system finds a free page of memory in RAM. </p>
</li>
<li>
<p>The system copies the contents of the page attempting to be modified (in the image) to the free page found in step 1. This free page will be assigned either <b>PAGE_READWRITE</b> or <b>PAGE_EXECUTE_READWRITE</b> protection. The original page&#8217;s protection and data does not change at all.</p>
</li>
<li>
<p>The system then updates the process&#8217; page tables so that the accessed virtual address now translates to the new page of RAM.</p>
</li>
</ol>
<p>After the system has performed these steps, the process can access its own private instance of this page of storage.</p>
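<p>The three copy-on-write steps can be sketched in user-mode C. This is a simplified model, not how the memory manager is implemented: <b>malloc</b> stands in for finding a free page of RAM, a pointer swap stands in for the page-table update, and the 16-byte page size and all names are assumptions made for this example.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SIM_PAGE_SIZE 16   /* tiny page for illustration, not a real page size */

typedef struct {
    unsigned char *data;        /* page this "process" currently maps */
    int            is_private;  /* 0 while still sharing the image page */
} Mapping;

/* Map the shared image page into a process without copying it. */
Mapping map_shared(unsigned char *shared_page) {
    Mapping m = { shared_page, 0 };
    return m;
}

/* Write one byte. The first write copies the page (copy-on-write).
   Returns 0 on success, -1 if no free page could be allocated. */
int cow_write(Mapping *m, size_t offset, unsigned char value) {
    assert(m != NULL && offset < SIM_PAGE_SIZE);
    if (!m->is_private) {
        unsigned char *copy = malloc(SIM_PAGE_SIZE);  /* step 1: find a free page */
        if (copy == NULL) return -1;
        memcpy(copy, m->data, SIM_PAGE_SIZE);         /* step 2: copy the contents */
        m->data = copy;                               /* step 3: remap to the copy */
        m->is_private = 1;
    }
    m->data[offset] = value;
    return 0;
}

/* Drop the mapping, freeing the private copy if one was made. */
void unmap(Mapping *m) {
    assert(m != NULL);
    if (m->is_private) free(m->data);
    m->data = NULL;
    m->is_private = 0;
}
```

<p>After one process writes, only that process sees the change; the original page and every other mapping of it keep the old contents.</p>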
<p>A <em>memory</em> <i>block</i> is a set of contiguous pages that all have the same protection attributes and that are all backed by the same type of physical storage.</p>
<p>Protection attributes are given to a region for the sake of efficiency only, and they are always overridden by protection attributes assigned to physical storage.</p>
<p>A block&#8217;s protection attributes override the protection attributes of the region that contains the block. </p>
<h1>The Importance of Data Alignment</h1>
<p>Data alignment is not so much a part of the operating system&#8217;s memory architecture as it is a part of the CPU&#8217;s architecture.</p>
<p>CPUs operate most efficiently when they access properly aligned data. Data is aligned when the memory address of the data modulo the data&#8217;s size is 0. For example, a <b>WORD</b> value should always start on an address that is evenly divisible by 2, a <b>DWORD</b> value should always start on an address that is evenly divisible by 4, and so on. When the CPU attempts to read a data value that is not properly aligned, it will do one of two things: either raise an exception or perform multiple aligned memory accesses to read the full misaligned data value.</p>
<p><u>Here is some code that accesses misaligned data:</u></p>
<pre class="csharpcode">VOID SomeFunc(PVOID pvDataBuffer) {

   <span class="rem">// The first byte in the buffer is some byte of information</span>
   <span class="kwrd">char</span> c = * (PBYTE) pvDataBuffer;

   <span class="rem">// Increment past the first byte in the buffer</span>
   pvDataBuffer = (PVOID)((PBYTE) pvDataBuffer + 1);

   <span class="rem">// Bytes 2-5 contain a double-word value</span>
   DWORD dw = * (DWORD *) pvDataBuffer;

   <span class="rem">// The line above raises a data misalignment exception on some CPUs</span>
...</pre>
<p>Obviously, if the CPU performs multiple memory accesses, the performance of your application is hampered. At best, it will take the system twice as long to access a misaligned value as it will to access an aligned value—but the access time could be even worse! To get the best performance for your application, you&#8217;ll want to write your code so that the data is properly aligned.</p>
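<p>One way to stay safe is to test alignment before dereferencing, and to copy the bytes out when the address is misaligned. The sketch below is portable C that uses <b>uint32_t</b> in place of <b>DWORD</b>; <b>is_aligned</b> and <b>read_u32</b> are hypothetical helper names, not part of the Windows API.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Alignment rule from the text: address modulo the data's size must be 0. */
int is_aligned(const void *ptr, size_t size) {
    assert(size != 0);
    return ((uintptr_t)ptr % size) == 0;
}

/* Read a 32-bit value from any address. memcpy copies the bytes one by one
   as far as the language is concerned, so no misaligned dereference occurs. */
uint32_t read_u32(const void *src) {
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

<p>In the earlier <b>SomeFunc</b> example, the pointer that has been advanced by one byte would fail <b>is_aligned(p, 4)</b> whenever the buffer itself starts on a 4-byte boundary, and <b>read_u32</b> could be used there instead of the cast to <b>DWORD *</b>.</p>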
<h1>References</h1>
<blockquote>
<p><a href="http://www.microsoft.com/learning/en/us/book.aspx?ID=11241&amp;locale=en-us">Windows® via C/C++, Fifth Edition</a></p>
</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://mhesham.wordpress.com/2011/09/10/windows-memory-architecture/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebdc8c831f2ae1f6a876629e325c99f6?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mhesham</media:title>
		</media:content>

		<media:content url="http://mhesham.files.wordpress.com/2011/09/image_thumb2.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://mhesham.files.wordpress.com/2011/09/image_thumb3.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://mhesham.files.wordpress.com/2011/09/image_thumb4.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>
	</item>
		<item>
		<title>I/O Completion Ports</title>
		<link>http://mhesham.wordpress.com/2011/09/07/io-completion-ports/</link>
		<comments>http://mhesham.wordpress.com/2011/09/07/io-completion-ports/#comments</comments>
		<pubDate>Wed, 07 Sep 2011 14:21:27 +0000</pubDate>
		<dc:creator>MHesham</dc:creator>
				<category><![CDATA[Book Snapshot]]></category>
		<category><![CDATA[Operating Systems]]></category>
		<category><![CDATA[Windows Programming]]></category>
		<category><![CDATA[Windows via C/C++]]></category>
		<category><![CDATA[I/O completion ports]]></category>
		<category><![CDATA[thread pool]]></category>
		<category><![CDATA[winapi]]></category>
		<category><![CDATA[windows programming]]></category>

		<guid isPermaLink="false">https://mhesham.wordpress.com/2011/09/07/io-completion-ports/</guid>
		<description><![CDATA[Content Introduction Creating an I/O Completion Port Associating a Device with an I/O Completion Port How the I/O Completion Port Manages the Thread Pool Simulating Completed I/O Requests Introduction Service application architecture can be one of the following models: Serial model A single thread waits for a client to make a request (usually over the [...]]]></description>
			<content:encoded><![CDATA[<h1>Content</h1>
<ol>
<li>Introduction </li>
<li>Creating an I/O Completion Port </li>
<li>Associating a Device with an I/O Completion Port </li>
<li>How the I/O Completion Port Manages the Thread Pool </li>
<li>Simulating Completed I/O Requests </li>
</ol>
<h1>Introduction</h1>
<p>Service application architecture can be one of the following models:</p>
<ul>
<li>
<p><b>Serial model:</b> A single thread waits for a client to make a request (usually over the network). When the request comes in, the thread wakes and handles the client&#8217;s request.</p>
</li>
<li>
<p><b>Concurrent model:</b> A single thread waits for a client request and then creates a new thread to handle the request. While the new thread is handling the client&#8217;s request, the original thread loops back around and waits for another client request. When the thread handling the client&#8217;s request has finished processing it, that thread dies.</p>
</li>
</ul>
<p>The problem with the serial model is that it does not handle multiple, simultaneous requests well. If two clients make requests at the same time, only one can be processed at a time; the second request must wait for the first request to finish processing. A service that is designed using the serial approach cannot take advantage of multiprocessor machines. Obviously, the serial model is good only for the simplest of server applications, in which few client requests are made and requests can be handled very quickly. A Ping server is a good example of a serial server.</p>
<p>Because of the limitations in the serial model, the concurrent model is extremely popular. In the concurrent model, a thread is created to handle each client request. The advantage is that the thread waiting for incoming requests has very little work to do. Most of the time, this thread is sleeping. When a client request comes in, the thread wakes, creates a new thread to handle the request, and then waits for another client request. This means that incoming client requests are handled expediently. Also, because each client request gets its own thread, the server application scales well and can easily take advantage of multiprocessor machines. </p>
<p>Service applications using the concurrent model were implemented on Windows. The Windows team noticed that application performance was not as high as desired. In particular, the team noticed that handling many simultaneous client requests meant that many threads were running in the system concurrently. Because all these threads were <i>runnable</i> (not suspended and waiting for something to happen), Microsoft realized that the Windows kernel spent too much time context switching between the running threads, and the threads were not getting as much CPU time to do their work. To make Windows an awesome server environment, Microsoft needed to address this problem. The result is the I/O completion port kernel object.</p>
<h1>Creating an I/O Completion Port</h1>
<p>The theory behind I/O Completion Ports states the following:</p>
<ol>
<li>The number of threads running concurrently must have an upper bound; that is, 500 simultaneous client requests must not result in 500 runnable threads. A sensible upper bound is the number of CPUs on the host machine, because more runnable threads than CPUs only adds context switching. </li>
<li>I/O completion ports were designed to work with a pool of threads. A pool of threads is created when the application initializes, and these threads hang around for the duration of the application. How many threads should the pool contain? As a rule of thumb, take the number of CPUs on the host machine and multiply it by 2. So on a dual-processor machine, you should create a pool of four threads. </li>
</ol>
<pre class="csharpcode">HANDLE CreateNewCompletionPort(DWORD dwNumberOfConcurrentThreads) {

   <span class="kwrd">return</span>(CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0,
      dwNumberOfConcurrentThreads));
}</pre>
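<p>As a rough illustration of the two sizing rules above, the sketch below computes both numbers from a CPU count. It is portable C++ rather than Win32 code, and <b>PoolSizing</b>, <b>SizePoolFor</b>, and <b>SizePoolForThisMachine</b> are hypothetical names introduced here; only the "number of CPUs" and "CPUs times 2" rules come from the text.</p>

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper type: the two numbers the rules of thumb produce.
struct PoolSizing {
    unsigned concurrentThreads; // candidate value for dwNumberOfConcurrentThreads
    unsigned poolThreads;       // number of pool threads to create at startup
};

// Applies the rules of thumb to a given CPU count.
inline PoolSizing SizePoolFor(unsigned cpuCount) {
    cpuCount = std::max(1u, cpuCount); // treat "unknown" (0) as one CPU
    PoolSizing s;
    s.concurrentThreads = cpuCount;
    s.poolThreads = cpuCount * 2;
    return s;
}

// Same, using the CPU count reported by the C++ runtime (0 if unknown).
inline PoolSizing SizePoolForThisMachine() {
    return SizePoolFor(std::thread::hardware_concurrency());
}
```

<p>For a dual-processor machine, <b>SizePoolFor(2)</b> gives a concurrency value of 2 and a pool of four threads; the first number is the one you would pass to <b>CreateNewCompletionPort</b>.</p>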
<h1>Associating a Device with an I/O Completion Port</h1>
<pre class="csharpcode">BOOL AssociateDeviceWithCompletionPort(
   HANDLE hCompletionPort, HANDLE hDevice, ULONG_PTR CompletionKey) {

   <span class="rem">// On success, CreateIoCompletionPort returns the existing port's handle</span>
   HANDLE h = CreateIoCompletionPort(hDevice, hCompletionPort, CompletionKey, 0);
   <span class="kwrd">return</span>(h == hCompletionPort);
}</pre>
<h1>I/O Completion Port Internal Data Structures</h1>
<p><strong>Device List: </strong>contains a set of device handles associated with the completion port.</p>
<p><strong>I/O Completion Queue (FIFO):</strong> When an asynchronous I/O request for a device completes, the system checks to see whether the device is associated with a completion port and, if it is, the system appends the completed I/O request entry to the end of the completion port&#8217;s I/O completion queue.</p>
<p><strong>Waiting Thread Stack (LIFO):</strong> As each thread in the thread pool calls <b>GetQueuedCompletionStatus</b>, the ID of the calling thread is placed on this waiting thread stack, enabling the I/O completion port kernel object to always know which threads are currently waiting to handle completed I/O requests. When an entry appears in the port&#8217;s I/O completion queue, the completion port wakes the thread that started waiting most recently. This thread gets the pieces of information that make up a completed I/O entry.</p>
<p><strong>Released Thread List and Paused Thread List:</strong> When a completion port wakes a thread, the completion port places the thread&#8217;s ID in the released thread list. This allows the completion port to remember which threads it awakened and to monitor the execution of these threads. If a released thread calls any function that places the thread in a wait state, the completion port detects this and updates its internal data structures by moving the thread&#8217;s ID from the released thread list to the paused thread list.</p>
<p>All the threads in the pool should execute the same function. Typically, this thread function performs some sort of initialization and then enters a loop that should terminate when the service process is instructed to stop. Inside the loop, the thread puts itself to sleep waiting for device I/O requests to complete to the completion port.</p>
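<p>The shape of that loop can be sketched without any Win32 calls. In this single-threaded model, <b>FakePort</b> stands in for the completion port and its <b>Get</b> method stands in for <b>GetQueuedCompletionStatus</b> (which would block in real code instead of returning false). <b>Completion</b>, <b>kShutdownKey</b>, and <b>WorkerLoop</b> are likewise hypothetical names, and reserving key 0 to mean "stop" is an assumption of this sketch, not something the API requires.</p>

```cpp
#include <cstdint>
#include <deque>

// Stand-in for one completed I/O entry (hypothetical, not a Win32 type).
struct Completion {
    std::uintptr_t key;   // plays the role of the completion key
    std::uint32_t  bytes; // plays the role of the bytes-transferred count
};

// Assumed convention for this sketch: key 0 tells a worker to stop.
const std::uintptr_t kShutdownKey = 0;

// Stand-in for the port's FIFO I/O completion queue.
class FakePort {
public:
    void Post(const Completion& c) { queue_.push_back(c); }

    // Real code would sleep here until an entry arrives; the model simply
    // reports false when nothing is queued.
    bool Get(Completion* out) {
        if (queue_.empty()) return false;
        *out = queue_.front();
        queue_.pop_front();
        return true;
    }

private:
    std::deque<Completion> queue_;
};

// The function every pool thread would run: initialize, then loop until
// told to stop. Returns the total number of bytes it "processed".
inline std::uint64_t WorkerLoop(FakePort& port) {
    std::uint64_t handled = 0;
    Completion c;
    while (port.Get(&c)) {
        if (c.key == kShutdownKey) break; // service is stopping
        handled += c.bytes;               // placeholder for real work
    }
    return handled;
}
```

<p>Entries queued behind the shutdown entry are left in the queue for other workers, which is why one shutdown entry per pool thread is needed.</p>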
<h1>How the I/O Completion Port Manages the Thread Pool</h1>
<p>The goal of the completion port is to keep as many entries in the released thread list as are specified by the concurrent number of threads value used when creating the completion port. If a released thread enters a wait state for any reason, the released thread list shrinks and the completion port releases another waiting thread. If a paused thread wakes, it leaves the paused thread list and reenters the released thread list. This means that the released thread list can now have more entries in it than are allowed by the maximum concurrency value.</p>
<p>Once a thread calls <b>GetQueuedCompletionStatus</b>, the thread is &quot;assigned&quot; to the specified completion port. The system assumes that all assigned threads are doing work on behalf of the completion port. The completion port wakes threads from the pool only if the number of running assigned threads is less than the completion port&#8217;s maximum concurrency value.</p>
<p><u>You can break the thread/completion port assignment in one of three ways:</u></p>
<ol>
<li>Have the thread exit. </li>
<li>Have the thread call <b>GetQueuedCompletionStatus</b>, passing the handle of a different I/O completion port. </li>
<li>Destroy the I/O completion port that the thread is currently assigned to. </li>
</ol>
<p>Let&#8217;s tie all of this together now. Say that we are again running on a machine with two CPUs. We create a completion port that allows no more than two threads to wake concurrently, and we create four threads that are waiting for completed I/O requests. If three completed I/O requests get queued to the port, only two threads are awakened to process the requests, reducing the number of runnable threads and saving context-switching time. Now if one of the running threads calls <b>Sleep</b>, <b>WaitForSingleObject</b>, <b>WaitForMultipleObjects</b>, <b>SignalObjectAndWait</b>, a synchronous I/O call, or any function that would cause the thread not to be runnable, the I/O completion port would detect this and wake a third thread immediately. The goal of the completion port is to keep the CPUs saturated with work.</p>
<p>Eventually, the first thread will become runnable again. When this happens, the number of runnable threads will be higher than the number of CPUs in the system. However, the completion port again is aware of this and will not allow any additional threads to wake up until the number of threads drops below the number of CPUs. The I/O completion port architecture presumes that the number of runnable threads will stay above the maximum for only a short time and will die down quickly as the threads loop around and again call <b>GetQueuedCompletionStatus</b>. This explains why the thread pool should contain more threads than the concurrent thread count set in the completion port.</p>
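<p>The walkthrough above (two CPUs, four pool threads, three completed requests) can be replayed with a small counting model. This is only a sketch of the bookkeeping as described in the text, not the kernel&#8217;s actual implementation; <b>PortModel</b> and all of its member names are hypothetical.</p>

```cpp
#include <cstddef>

// Toy model of the port's thread lists, tracked as simple counts.
class PortModel {
public:
    PortModel(std::size_t maxConcurrency, std::size_t poolThreads)
        : max_(maxConcurrency), waiting_(poolThreads),
          queued_(0), released_(0), paused_(0) {}

    // A completed I/O entry is appended to the completion queue.
    void Complete() { ++queued_; Dispatch(); }

    // A released thread enters a wait state (Sleep, synchronous I/O, ...).
    void ReleasedThreadBlocks() { --released_; ++paused_; Dispatch(); }

    // A paused thread becomes runnable again; this may exceed the maximum.
    void PausedThreadResumes() { --paused_; ++released_; }

    // A released thread finishes its work and waits on the port again.
    void ReleasedThreadLoopsBack() { --released_; ++waiting_; Dispatch(); }

    std::size_t released() const { return released_; }
    std::size_t paused() const { return paused_; }
    std::size_t waiting() const { return waiting_; }
    std::size_t queued() const { return queued_; }

private:
    // Wake waiting threads while entries exist and released < maximum.
    void Dispatch() {
        while (queued_ > 0 && waiting_ > 0 && released_ < max_) {
            --queued_;
            --waiting_;
            ++released_;
        }
    }

    std::size_t max_;
    std::size_t waiting_;
    std::size_t queued_;
    std::size_t released_;
    std::size_t paused_;
};
```

<p>With <b>PortModel port(2, 4)</b>, three calls to <b>Complete</b> release only two threads and leave one entry queued. A call to <b>ReleasedThreadBlocks</b> then releases a third thread, and <b>PausedThreadResumes</b> briefly pushes the released count to three until the threads loop back.</p>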
<h1>Simulating Completed I/O Requests</h1>
<p>I/O completion ports do not have to be used with device I/O at all. They can also be used for inter-thread communication.</p>
<p>The <b>PostQueuedCompletionStatus</b> function is incredibly useful: it gives you a way to communicate with all the threads in your pool. For example, when the user terminates a service application, you want all the threads to exit cleanly. But if the threads are waiting on the completion port and no I/O requests are coming in, the threads can&#8217;t wake up. By calling <b>PostQueuedCompletionStatus</b> once for each thread in the pool, each thread can wake up, examine the values returned from <b>GetQueuedCompletionStatus</b>, see that the application is terminating, and clean up and exit appropriately.</p>
<p>You must be careful when using a thread termination technique like the one I just described. My example works because the threads in the pool are dying and not calling <b>GetQueuedCompletionStatus</b> again. However, if you want to notify each of the pool&#8217;s threads of something and have them loop back around to call <b>GetQueuedCompletionStatus</b> again, you will have a problem because the threads wake up in a LIFO order. So you will have to employ some additional thread synchronization in your application to ensure that each pool thread gets the opportunity to see its simulated I/O entry. Without this additional thread synchronization, one thread might see the same notification several times.</p>
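<p>To see why LIFO wake-up causes this, consider a toy model of the waiting thread stack. The sketch assumes the worst case, in which each woken thread handles its entry and is waiting again before the next entry is posted; real timing will vary. <b>DeliverLifo</b> is a hypothetical helper, not part of any API.</p>

```cpp
#include <vector>

// Returns, for each posted notification, the ID of the thread that saw it.
// waitingStack lists thread IDs in the order they started waiting, so the
// last element is the most recent waiter.
inline std::vector<int> DeliverLifo(std::vector<int> waitingStack,
                                    int notifications) {
    std::vector<int> seenBy;
    for (int i = 0; i < notifications && !waitingStack.empty(); ++i) {
        int thread = waitingStack.back(); // most recent waiter wakes first
        waitingStack.pop_back();
        seenBy.push_back(thread);
        waitingStack.push_back(thread);   // it loops back and waits again
    }
    return seenBy;
}
```

<p>With four waiting threads and four posted entries, this model hands all four entries to the same thread, which is exactly the situation the extra synchronization has to prevent.</p>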
<h1>API Table</h1>
<table border="2" cellspacing="3" cellpadding="4" width="900">
<tbody>
<tr>
<td valign="top" width="516">
<pre>HANDLE CreateIoCompletionPort(</pre>
<pre>&#160;&#160; HANDLE&#160;&#160;&#160; hFile,</pre>
<pre>&#160;&#160; HANDLE&#160;&#160;&#160; hExistingCompletionPort,</pre>
<pre>&#160;&#160; ULONG_PTR CompletionKey,</pre>
<pre>&#160;&#160; DWORD&#160;&#160;&#160;&#160; dwNumberOfConcurrentThreads);</pre>
</td>
<td valign="top" width="371">
<p>This function performs two different tasks: it creates an I/O completion port, and it associates a device with an I/O completion port.</p>
<p>The <b>dwNumberOfConcurrentThreads</b> parameter tells the I/O completion port the maximum number of threads that should be runnable at the same time. If you pass <b>0</b> for the <b>dwNumberOfConcurrentThreads</b> parameter, the completion port defaults to allowing as many concurrent threads as there are CPUs on the host machine.</p>
</td>
</tr>
<tr>
<td valign="top" width="516">
<pre>BOOL GetQueuedCompletionStatus(</pre>
<pre>&#160;&#160; HANDLE&#160;&#160;&#160;&#160;&#160;&#160; hCompletionPort,</pre>
<pre>&#160;&#160; PDWORD&#160;&#160;&#160;&#160;&#160;&#160; pdwNumberOfBytesTransferred,</pre>
<pre>&#160;&#160; PULONG_PTR&#160;&#160; pCompletionKey,</pre>
<pre>&#160;&#160; OVERLAPPED** ppOverlapped,</pre>
<pre>&#160;&#160; DWORD&#160;&#160;&#160;&#160;&#160;&#160;&#160; dwMilliseconds);</pre>
</td>
<td valign="top" width="371">
<p>Puts the calling thread to sleep until a device I/O request completes to the completion port or the <b>dwMilliseconds</b> timeout elapses.</p>
</td>
</tr>
<tr>
<td valign="top" width="516">
<pre>BOOL GetQueuedCompletionStatusEx(</pre>
<pre>&#160; HANDLE hCompletionPort,</pre>
<pre>&#160; LPOVERLAPPED_ENTRY pCompletionPortEntries,</pre>
<pre>&#160; ULONG ulCount,</pre>
<pre>&#160; PULONG pulNumEntriesRemoved,</pre>
<pre>&#160; DWORD dwMilliseconds,</pre>
<pre>&#160; BOOL bAlertable);</pre>
</td>
<td valign="top" width="371">
<p>Available starting with Windows Vista. If you expect a large number of I/O requests to be constantly submitted, you can retrieve the results of several I/O requests in a single call, instead of increasing the number of threads waiting on the completion port and incurring the cost of the corresponding context switches.</p>
</td>
</tr>
<tr>
<td valign="top" width="516">
<pre>BOOL PostQueuedCompletionStatus(</pre>
<pre>&#160;&#160; HANDLE&#160;&#160;&#160;&#160;&#160; hCompletionPort,</pre>
<pre>&#160;&#160; DWORD&#160;&#160;&#160;&#160;&#160;&#160; dwNumBytes,</pre>
<pre>&#160;&#160; ULONG_PTR&#160;&#160; CompletionKey,</pre>
<pre>&#160;&#160; OVERLAPPED* pOverlapped);</pre>
</td>
<td valign="top" width="416">
<p>This function appends a completed I/O notification to an I/O completion port&#8217;s queue. The first parameter, <b>hCompletionPort</b>, identifies the completion port that you want to queue the entry for. The remaining three parameters—<b>dwNumBytes</b>, <b>CompletionKey</b>, and <b>pOverlapped</b>—indicate the values that should be returned by a thread&#8217;s call to <b>GetQueuedCompletionStatus</b>. When a thread pulls a simulated entry from the I/O completion queue, <b>GetQueuedCompletionStatus</b> returns <b>TRUE</b>, indicating a successfully executed I/O request.</p>
</td>
</tr>
</tbody>
</table>
<h1>References</h1>
<blockquote>
<p><a href="http://www.microsoft.com/learning/en/us/book.aspx?ID=11241&amp;locale=en-us">Windows® via C/C++, Fifth Edition</a></p>
</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://mhesham.wordpress.com/2011/09/07/io-completion-ports/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebdc8c831f2ae1f6a876629e325c99f6?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mhesham</media:title>
		</media:content>
	</item>
		<item>
		<title>Synchronous and Asynchronous Device I/O</title>
		<link>http://mhesham.wordpress.com/2011/09/07/synchronous-and-asynchronous-device-io/</link>
		<comments>http://mhesham.wordpress.com/2011/09/07/synchronous-and-asynchronous-device-io/#comments</comments>
		<pubDate>Wed, 07 Sep 2011 12:57:30 +0000</pubDate>
		<dc:creator>MHesham</dc:creator>
				<category><![CDATA[Book Snapshot]]></category>
		<category><![CDATA[Operating Systems]]></category>
		<category><![CDATA[Windows Programming]]></category>
		<category><![CDATA[Windows via C/C++]]></category>
		<category><![CDATA[alertable I/O]]></category>
		<category><![CDATA[asynchronous I/O]]></category>
		<category><![CDATA[device]]></category>
		<category><![CDATA[file]]></category>
		<category><![CDATA[synchronous I/O]]></category>
		<category><![CDATA[winapi]]></category>
		<category><![CDATA[windows programming]]></category>

		<guid isPermaLink="false">https://mhesham.wordpress.com/2011/09/07/synchronous-and-asynchronous-device-io/</guid>
		<description><![CDATA[Content Introduction Synchronous I/O Basics of Asynchronous Device I/O Receiving Completed I/O Request Notifications Introduction A scalable application handles a large number of concurrent operations as efficiently as it handles a small number of concurrent operations. One of the strengths of Windows is the sheer number of devices that it supports. In the context of [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=mhesham.wordpress.com&amp;blog=9661348&amp;post=398&amp;subd=mhesham&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<h1>Content</h1>
<ol>
<li>Introduction</li>
<li>Synchronous I/O </li>
<li>Basics of Asynchronous Device I/O</li>
<li>Receiving Completed I/O Request Notifications</li>
</ol>
<h1>Introduction</h1>
<p>A scalable application handles a large number of concurrent operations as efficiently as it handles a small number of concurrent operations.</p>
<p>One of the strengths of Windows is the sheer number of devices that it supports. In the context of this discussion, we define a device to be anything that allows communication. The table below lists some devices and their most common uses.</p>
<p><a href="http://mhesham.files.wordpress.com/2011/09/image.png"><img style="background-image:none;padding-left:0;padding-right:0;display:block;float:none;padding-top:0;border-width:0;margin:5px auto;" title="image" border="0" alt="image" src="http://mhesham.files.wordpress.com/2011/09/image_thumb.png?w=640&#038;h=467" width="640" height="467" /></a></p>
<p>To perform any type of I/O, you must first open the desired device and get a handle to it. The way you get the handle to a device depends on the particular device. The table below lists various devices and the functions you should call to open them.</p>
<p><a href="http://mhesham.files.wordpress.com/2011/09/image1.png"><img style="background-image:none;padding-left:0;padding-right:0;display:block;float:none;padding-top:0;border-width:0;margin:5px auto;" title="image" border="0" alt="image" src="http://mhesham.files.wordpress.com/2011/09/image_thumb1.png?w=639&#038;h=768" width="639" height="768" /></a></p>
<p>Synchronous I/O is what most developers are used to. When you read data from a file, your thread is suspended, waiting for the information to be read. Once the information has been read, the thread regains control and continues executing. </p>
<p>Because device I/O is slow when compared with most other operations, you might want to consider communicating with some devices asynchronously. Here&#8217;s how it works: Basically, you call a function to tell the operating system to read or write data, but instead of waiting for the I/O to complete, your call returns immediately, and the operating system completes the I/O on your behalf using its own threads. When the operating system has finished performing your requested I/O, you can be notified. Asynchronous I/O is the key to creating high-performance, scalable, responsive, and robust applications. </p>
<p>Most Windows functions that return a handle return <b>NULL</b> when the function fails. However, <b>CreateFile</b> returns <b>INVALID_HANDLE_VALUE</b> (defined as -1) instead, so the following check is incorrect:</p>
<pre>HANDLE hFile = CreateFile(...);
if (hFile == NULL) {
   // We'll never get in here
} else {
   // File might or might not be created OK
}</pre>
<p>Here&#8217;s the correct way to check for an invalid file handle:</p>
<pre>HANDLE hFile = CreateFile(...);
if (hFile == INVALID_HANDLE_VALUE) {
   // File not created
} else {
   // File created OK
}</pre>
<h2>Synchronous I/O Cancellation</h2>
<p>Functions that do synchronous I/O are easy to use, but they block any other operations from occurring on the thread that issued the I/O until the request is completed. A great example of this is a <b>CreateFile</b> operation. When a user performs mouse and keyboard input, window messages are inserted into a queue that is associated with the thread that created the window that the input is destined for. If that thread is stuck inside a call to <b>CreateFile</b>, waiting for <b>CreateFile</b> to return, the window messages are not getting processed and all the windows created by the thread are frozen. The most common reason why applications hang is that their threads are stuck waiting for synchronous I/O operations to complete! </p>
<p>To build a responsive application, you should try to perform asynchronous I/O operations as much as possible. This typically also allows you to use very few threads in your application, thereby saving resources (such as thread kernel objects and stacks). Also, it is usually easy to offer your users the ability to cancel an operation when you initiate it asynchronously.</p>
<p>Starting with Windows Vista, the following function allows you to cancel a pending synchronous I/O request for a given thread:</p>
<pre>BOOL CancelSynchronousIo(HANDLE hThread);</pre>
<p>The <b>hThread</b> parameter is a handle of the thread that is suspended waiting for the synchronous I/O request to complete. This handle must have been opened with <b>THREAD_TERMINATE</b> access. If this is not the case, <b>CancelSynchronousIo</b> fails and <b>GetLastError</b> returns <b>ERROR_ACCESS_DENIED</b>. When you create the thread yourself by using <b>CreateThread</b> or <b>_beginthreadex</b>, the returned handle has <b>THREAD_ALL_ACCESS</b>, which includes <b>THREAD_TERMINATE</b> access. </p>
<p>If the specified thread was suspended waiting for a synchronous I/O operation to complete, <b>CancelSynchronousIo</b> wakes the suspended thread and the operation it was trying to perform returns failure; calling <b>GetLastError</b> returns <b>ERROR_OPERATION_ABORTED</b>. Also, <b>CancelSynchronousIo</b> returns <b>TRUE</b> to its caller. </p>
<p>Note that the thread calling <b>CancelSynchronousIo</b> doesn&#8217;t really know where the thread that called the synchronous operation is. The thread could have been pre-empted and it has yet to actually communicate with the device; it could be suspended, waiting for the device to respond; or the device could have just responded, and the thread is in the process of returning from its call. If <b>CancelSynchronousIo</b> is called when the specified thread is not actually suspended waiting for the device to respond, <b>CancelSynchronousIo</b> returns <b>FALSE</b> and <b>GetLastError</b> returns <b>ERROR_NOT_FOUND</b>. </p>
<h1>Basics of Asynchronous Device I/O</h1>
<p>Compared to most other operations carried out by a computer, device I/O is one of the slowest and most unpredictable. The CPU performs arithmetic operations and even paints the screen much faster than it reads data from or writes data to a file or across a network. However, using asynchronous device I/O enables you to better use resources and thus create more efficient applications. </p>
<p>To access a device asynchronously, you must first open the device by calling <b>CreateFile</b>, specifying the <b>FILE_FLAG_OVERLAPPED</b> flag in the <b>dwFlagsAndAttributes</b> parameter. This flag notifies the system that you intend to access the device asynchronously. </p>
<h2>The <i>OVERLAPPED</i> Structure</h2>
<p>When performing asynchronous device I/O, you must pass the address of an initialized <b>OVERLAPPED</b> structure via the <b>pOverlapped</b> parameter. A simplified view of the structure:</p>
<pre class="csharpcode">typedef <span class="kwrd">struct</span> _OVERLAPPED {
   ULONG_PTR Internal;     <span class="rem">// [out] Error code</span>
   ULONG_PTR InternalHigh; <span class="rem">// [out] Number of bytes transferred</span>
   DWORD     Offset;       <span class="rem">// [in]  Low 32-bit file offset</span>
   DWORD     OffsetHigh;   <span class="rem">// [in]  High 32-bit file offset</span>
   HANDLE    hEvent;       <span class="rem">// [in]  Event handle or data</span>
} OVERLAPPED;</pre>
<ul>
<li>
<p><b>Offset and OffsetHigh</b> When a file is being accessed, these members indicate the 64-bit offset in the file where you want the I/O operation to begin. Recall that each file kernel object has a file pointer associated with it. When issuing a synchronous I/O request, the system knows to start accessing the file at the location identified by the file pointer. After the operation is complete, the system updates the file pointer automatically so that the next operation can pick up where the last operation left off. </p>
<p>When performing asynchronous I/O, this file pointer is ignored by the system. Imagine what would happen if your code placed two asynchronous calls to <b>ReadFile</b> (for the same file kernel object). In this scenario, the system wouldn&#8217;t know where to start reading for the second call to <b>ReadFile</b>. You probably wouldn&#8217;t want to start reading the file at the same location used by the first call to <b>ReadFile</b>. You might want to start the second read at the byte in the file that followed the last byte that was read by the first call to <b>ReadFile</b>. To avoid the confusion of multiple asynchronous calls to the same object, all asynchronous I/O requests must specify the starting file offset in the <b>OVERLAPPED</b> structure. </p>
<p>Note that the <b>Offset</b> and <b>OffsetHigh</b> members are not ignored for nonfile devices—you must initialize both members to <b>0</b> or the I/O request will fail and <b>GetLastError</b> will return <b>ERROR_INVALID_PARAMETER</b>.</p>
</li>
<li>
<p><b>hEvent</b> This member is used by one of the four methods available for receiving I/O completion notifications. When using the alertable I/O notification method, this member can be used for your own purposes. </p>
</li>
<li>
<p><b>Internal</b> This member holds the processed I/O&#8217;s error code. As soon as you issue an asynchronous I/O request, the device driver sets <b>Internal</b> to <b>STATUS_PENDING</b>, indicating that no error has occurred because the operation has not started.</p>
</li>
<li>
<p><b>InternalHigh</b> When an asynchronous I/O request completes, this member holds the number of bytes transferred.</p>
</li>
</ul>
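<p>Because a 64-bit file offset is split across <b>Offset</b> and <b>OffsetHigh</b>, a small helper is often handy. The sketch below uses a stand-in struct so that it compiles without <b>windows.h</b>; <b>OffsetFields</b>, <b>SplitOffset</b>, and <b>JoinOffset</b> are hypothetical names, and in real code you would assign the two halves to the members of your <b>OVERLAPPED</b> variable instead.</p>

```cpp
#include <cstdint>

// Stand-in holding the two offset members, named as in OVERLAPPED.
struct OffsetFields {
    std::uint32_t Offset;     // low 32 bits of the file offset
    std::uint32_t OffsetHigh; // high 32 bits of the file offset
};

// Splits a 64-bit file offset into its low and high halves.
inline OffsetFields SplitOffset(std::uint64_t fileOffset) {
    OffsetFields f;
    f.Offset     = static_cast<std::uint32_t>(fileOffset & 0xFFFFFFFFu);
    f.OffsetHigh = static_cast<std::uint32_t>(fileOffset >> 32);
    return f;
}

// Recombines the two halves into the original 64-bit offset.
inline std::uint64_t JoinOffset(const OffsetFields& f) {
    return (static_cast<std::uint64_t>(f.OffsetHigh) << 32) | f.Offset;
}
```

<p>For example, an offset of 4 GB plus 345 bytes (0x100000159) splits into <b>Offset</b> = 345 and <b>OffsetHigh</b> = 1. For nonfile devices both halves stay 0.</p>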
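<p>To carry more useful contextual information along with an I/O request, you can extend the <b>OVERLAPPED</b> structure, for example by deriving a C++ class from it or by embedding it as the first member of a larger structure.</p>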
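<p>One way to attach context to a request is to embed the overlapped structure as the first member of a larger per-request structure and convert the pointer back when the request completes. The sketch below shows that idea with a stand-in type so that it compiles anywhere; <b>FakeOverlapped</b>, <b>IoKind</b>, <b>IoRequest</b>, and <b>RequestFromOverlapped</b> are hypothetical names, and real code would use the Win32 <b>OVERLAPPED</b> type in place of <b>FakeOverlapped</b>.</p>

```cpp
#include <cstdint>

// Stand-in for OVERLAPPED (hypothetical; same member names, simplified).
struct FakeOverlapped {
    std::uintptr_t Internal;
    std::uintptr_t InternalHigh;
    std::uint32_t  Offset;
    std::uint32_t  OffsetHigh;
    void*          hEvent;
};

enum class IoKind { Read, Write };

// Per-request context. The overlapped part must be the FIRST member so that
// a pointer to it is also a valid pointer to the whole request.
struct IoRequest {
    FakeOverlapped ov;
    IoKind         kind;
    char           buffer[64];
};

// What a completion handler would do with the pointer it gets back.
inline IoRequest* RequestFromOverlapped(FakeOverlapped* po) {
    return reinterpret_cast<IoRequest*>(po);
}
```

<p>The request (including its buffer) must stay alive and in place until the I/O completes, so such structures are usually heap-allocated or kept in a pool rather than placed on the stack of a function that returns early.</p>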
<h2>Asynchronous Device I/O Caveats</h2>
<ul>
<li>The device driver doesn&#8217;t have to process queued I/O requests in a first-in first-out (FIFO) fashion. </li>
<li>When attempting to queue an asynchronous I/O request, the device driver might choose to process the request synchronously. This can occur if you&#8217;re reading from a file and the system checks whether the data you want is already in the system&#8217;s cache. If the data is available, your I/O request is not queued to the device driver; instead, the system copies the data from the cache to your buffer, and the I/O operation is complete. </li>
<li>The data buffer and <b>OVERLAPPED</b> structure used to issue the asynchronous I/O request must not be moved or destroyed until the I/O request has completed. </li>
</ul>
<h2>Canceling Queued Device I/O Requests</h2>
<ul>
<li>You can call <b>CancelIo</b> to cancel all I/O requests queued by the calling thread for the specified handle: BOOL CancelIo(HANDLE hFile); </li>
<li>You can cancel all queued I/O requests, regardless of which thread queued the request, by <strong>closing the handle</strong> to a device itself. </li>
<li>When a thread dies, the system automatically cancels all I/O requests issued by the thread, except for requests made to handles that have been associated with an I/O completion port. </li>
<li>If you need to cancel a single, specific I/O request submitted on a given file handle, you can call <b>CancelIoEx</b>: BOOL CancelIoEx(HANDLE hFile, LPOVERLAPPED pOverlapped); With <b>CancelIoEx</b>, you are able to cancel pending I/O requests issued by a thread other than the calling thread. This function marks as canceled all I/O requests that are pending on <b>hFile</b> and associated with the given <b>pOverlapped</b> parameter. Because each outstanding I/O request should have its own <b>OVERLAPPED</b> structure, each call to <b>CancelIoEx</b> should cancel just one outstanding request. However, if the <b>pOverlapped</b> parameter is <b>NULL</b>, <b>CancelIoEx</b> cancels all outstanding I/O requests for the specified <b>hFile</b>. </li>
</ul>
<h1>Receiving Completed I/O Request Notifications</h1>
<ol>
<li><strong>Signaling a device kernel object:</strong> Not useful for performing multiple simultaneous I/O requests against a single device. Allows one thread to issue an I/O request and another thread to process it. </li>
<li><strong>Signaling an event kernel object: </strong>Allows multiple simultaneous I/O requests against a single device. Allows one thread to issue an I/O request and another thread to process it. </li>
<li><strong>Using alertable I/O:</strong> Allows multiple simultaneous I/O requests against a single device. The thread that issued an I/O request must also process it. </li>
<li><strong>Using I/O completion ports: </strong>Allows multiple simultaneous I/O requests against a single device. Allows one thread to issue an I/O request and another thread to process it. This technique is highly scalable and has the most flexibility. </li>
</ol>
<h2>1. Signaling a Device Kernel Object</h2>
<p>A thread can determine whether an asynchronous I/O request has completed by calling either <b>WaitForSingleObject</b> or <b>WaitForMultipleObjects</b>. Here is a simple example:</p>
<pre class="csharpcode">HANDLE hFile = CreateFile(..., FILE_FLAG_OVERLAPPED, ...);
BYTE bBuffer[100];
OVERLAPPED o = { 0 };
o.Offset = 345;

BOOL bReadDone = ReadFile(hFile, bBuffer, 100, NULL, &amp;o);
DWORD dwError = GetLastError();

<span class="kwrd">if</span> (!bReadDone &amp;&amp; (dwError == ERROR_IO_PENDING)) {
   <span class="rem">// The I/O is being performed asynchronously; wait for it to complete</span>
   WaitForSingleObject(hFile, INFINITE);
   bReadDone = TRUE;
}

<span class="kwrd">if</span> (bReadDone) {
   <span class="rem">// o.Internal contains the I/O error</span>
   <span class="rem">// o.InternalHigh contains the number of bytes transferred</span>
   <span class="rem">// bBuffer contains the read data</span>
} <span class="kwrd">else</span> {
   <span class="rem">// An error occurred; see dwError</span>
}</pre>
<h2>2. Signaling an Event Kernel Object</h2>
<p>The following code demonstrates this approach:</p>
<pre class="csharpcode">HANDLE hFile = CreateFile(..., FILE_FLAG_OVERLAPPED, ...);

BYTE bReadBuffer[10];
OVERLAPPED oRead = { 0 };
oRead.Offset = 0;
oRead.hEvent = CreateEvent(...);
ReadFile(hFile, bReadBuffer, 10, NULL, &amp;oRead);

BYTE bWriteBuffer[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
OVERLAPPED oWrite = { 0 };
oWrite.Offset = 10;
oWrite.hEvent = CreateEvent(...);
WriteFile(hFile, bWriteBuffer, _countof(bWriteBuffer), NULL, &amp;oWrite);
...

HANDLE h[2];
h[0] = oRead.hEvent;
h[1] = oWrite.hEvent;
DWORD dw = WaitForMultipleObjects(2, h, FALSE, INFINITE);
<span class="kwrd">switch</span> (dw - WAIT_OBJECT_0) {
   <span class="kwrd">case</span> 0:   <span class="rem">// Read completed</span>
      <span class="kwrd">break</span>;

   <span class="kwrd">case</span> 1:   <span class="rem">// Write completed</span>
      <span class="kwrd">break</span>;
}</pre>
<h2>3. Alertable I/O</h2>
<p>Whenever a thread is created, the system also creates a queue that is associated with the thread. This queue is called the asynchronous procedure call (APC) queue. When issuing an I/O request, you can tell the device driver to append an entry to the calling thread&#8217;s APC queue. To have completed I/O notifications queued to your thread&#8217;s APC queue, you call the <b>ReadFileEx</b> and <b>WriteFileEx</b> functions.</p>
<p>Unlike <b>ReadFile</b> and <b>WriteFile</b>, the <b>*Ex</b> functions require that you pass the address of a callback function, called a <i>completion routine</i>. This routine must have the following prototype:</p>
<pre>VOID WINAPI CompletionRoutine(DWORD dwError, DWORD dwNumBytes, OVERLAPPED* po);</pre>
<p>When you issue an asynchronous I/O request with <b>ReadFileEx</b> and <b>WriteFileEx</b>, the functions pass the address of this function to the device driver. When the device driver has completed the I/O request, it appends an entry to the issuing thread&#8217;s APC queue. This entry contains the address of the completion routine function and the address of the <b>OVERLAPPED</b> structure used to initiate the I/O request.</p>
<p>When the thread is in an alertable state (discussed shortly), the system examines its APC queue and, for every entry in the queue, the system calls the completion function, passing it the I/O error code, the number of bytes transferred, and the address of the <b>OVERLAPPED</b> structure.</p>
<p>To process entries in your thread&#8217;s APC queue, the thread must put itself in an alertable state. This simply means that your thread has reached a position in its execution where it can handle being interrupted.</p>
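<p>The mechanism described above can be pictured with a toy, single-threaded model: completed requests are queued as entries, and nothing runs until the thread enters an alertable state (in real code, by calling a function such as <b>SleepEx</b> or <b>WaitForSingleObjectEx</b> with the alertable flag set). <b>ApcQueueModel</b>, <b>ApcEntry</b>, <b>RequestRecord</b>, and <b>MarkFinished</b> are hypothetical names for this sketch, and the callback signature only mirrors the completion routine prototype.</p>

```cpp
#include <cstdint>
#include <deque>

// Signature modeled on the completion routine: error, bytes, overlapped.
typedef void (*CompletionRoutine)(std::uint32_t error, std::uint32_t bytes,
                                  void* overlapped);

// One queued notification: the routine to call and its arguments.
struct ApcEntry {
    CompletionRoutine routine;
    std::uint32_t     error;
    std::uint32_t     bytes;
    void*             overlapped;
};

// Toy per-thread APC queue.
class ApcQueueModel {
public:
    // "Driver" side: a request finished, so queue its notification.
    void QueueCompletion(const ApcEntry& e) { entries_.push_back(e); }

    // "Thread" side: entering an alertable state drains the queue, calling
    // each completion routine in turn. Returns how many routines ran.
    int EnterAlertableState() {
        int ran = 0;
        while (!entries_.empty()) {
            ApcEntry e = entries_.front();
            entries_.pop_front();
            e.routine(e.error, e.bytes, e.overlapped);
            ++ran;
        }
        return ran;
    }

private:
    std::deque<ApcEntry> entries_;
};

// Example per-request record and routine used to observe the callbacks.
struct RequestRecord {
    std::uint32_t bytesDone;
    bool          finished;
};

inline void MarkFinished(std::uint32_t error, std::uint32_t bytes,
                         void* overlapped) {
    RequestRecord* r = static_cast<RequestRecord*>(overlapped);
    r->bytesDone = bytes;
    r->finished  = (error == 0);
}
```

<p>Note that every routine runs on the thread that owns the queue, which is the root of the threading problem discussed next.</p>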
<h4>The Bad and the Good of Alertable I/O</h4>
<ul>
<li>
<p><b>Callback functions</b> Alertable I/O requires that you create callback functions, which makes implementing your code much more difficult. These callback functions typically don&#8217;t have enough contextual information about a particular problem to guide you, so you end up placing a lot of information in global variables. </p>
</li>
<li>
<p><b>Threading issues</b> The real big problem with alertable I/O is this: The thread issuing the I/O request must also handle the completion notification. If a thread issues several requests, that thread must respond to each request&#8217;s completion notification, even if other threads are sitting completely idle. Because there is no load balancing, the application doesn&#8217;t scale well.</p>
</li>
</ul>
<p>Both of these problems are pretty severe, so using alertable I/O for device I/O is strongly discouraged. </p>
<h1>API Table</h1>
<table border="2" cellspacing="2" cellpadding="3" width="1328">
<tbody>
<tr>
<td valign="top" width="10">
<pre>DWORD GetFileType(HANDLE hDevice);</pre>
</td>
<td valign="top" width="1318">
<p>Returns the type of device that a handle refers to. The possible return values are:</p>
<ul>
<li>FILE_TYPE_UNKNOWN: The type of the specified file is unknown. </li>
<li>FILE_TYPE_DISK: The specified file is a disk file. </li>
<li>FILE_TYPE_CHAR: The specified file is a character file, typically an LPT device or a console. </li>
<li>FILE_TYPE_PIPE: The specified file is either a named pipe or an anonymous pipe. </li>
</ul>
</td>
</tr>
<tr>
<td valign="top" width="10">
<pre>HANDLE CreateFile(</pre>
<pre>&#160;&#160; PCTSTR pszName,</pre>
<pre>&#160;&#160; DWORD dwDesiredAccess,</pre>
<pre>&#160;&#160; DWORD dwShareMode,</pre>
<pre>&#160;&#160; PSECURITY_ATTRIBUTES psa,</pre>
<pre>&#160;&#160; DWORD dwCreationDisposition,</pre>
<pre>&#160;&#160; DWORD dwFlagsAndAttributes,</pre>
<pre>&#160;&#160; HANDLE hFileTemplate);</pre>
</td>
<td valign="top" width="1318">
<p>Creates or opens disk files. Don&#8217;t let the name fool you: it opens lots of other devices as well.</p>
</td>
</tr>
<tr>
<td valign="top" width="10">
<pre>BOOL WINAPI GetDiskFreeSpace(</pre>
<pre>&#160; __in&#160;&#160; LPCTSTR lpRootPathName,</pre>
<pre>&#160; __out&#160; LPDWORD lpSectorsPerCluster,</pre>
<pre>&#160; __out&#160; LPDWORD lpBytesPerSector,</pre>
<pre>&#160; __out&#160; LPDWORD lpNumberOfFreeClusters,</pre>
<pre>&#160; __out&#160; LPDWORD lpTotalNumberOfClusters</pre>
<pre>);</pre>
</td>
<td valign="top" width="1318">
<p>Retrieves information about the specified disk, including the number of bytes per sector.</p>
</td>
</tr>
<tr>
<td valign="top" width="10">
<pre>BOOL GetFileSizeEx(</pre>
<pre>&#160;&#160; HANDLE&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; hFile,</pre>
<pre>&#160;&#160; PLARGE_INTEGER pliFileSize);</pre>
</td>
<td valign="top" width="1318">
<p>Retrieves the file&#8217;s size. The first parameter, <b>hFile</b>, is the handle of an opened file, and the <b>pliFileSize</b> parameter is the address of a <b>LARGE_INTEGER</b> union that receives the size.</p>
</td>
</tr>
<tr>
<td valign="top" width="10">
<pre>BOOL SetFilePointerEx(</pre>
<pre>&#160;&#160; HANDLE&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; hFile,</pre>
<pre>&#160;&#160; LARGE_INTEGER&#160; liDistanceToMove,</pre>
<pre>&#160;&#160; PLARGE_INTEGER pliNewFilePointer,</pre>
<pre>&#160;&#160; DWORD&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; dwMoveMethod);</pre>
</td>
<td valign="top" width="1318">
<p>If you need to access a file randomly, you will need to alter the file pointer associated with the file&#8217;s kernel object.</p>
<p>The <b>hFile</b> parameter identifies the file kernel object whose file pointer you want to change. The <b>liDistanceToMove</b> parameter tells the system by how many bytes you want to move the pointer. The number you specify is added to the current value of the file&#8217;s pointer, so a negative number has the effect of stepping backward in the file. The last parameter of <b>SetFilePointerEx</b>, <b>dwMoveMethod</b>, tells <b>SetFilePointerEx</b> how to interpret the <b>liDistanceToMove</b> parameter.</p>
</td>
</tr>
<tr>
<td valign="top" width="10">
<pre>BOOL SetEndOfFile(HANDLE hFile);</pre>
<pre>&#160;</pre>
</td>
<td valign="top" width="1318">
<p>This <b>SetEndOfFile</b> function truncates or extends a file&#8217;s size to the size indicated by the file object&#8217;s file pointer. For example, if you wanted to force a file to be 1024 bytes long, you&#8217;d use <b>SetEndOfFile</b> this way:</p>
<pre>HANDLE hFile = CreateFile(...);</pre>
<pre>LARGE_INTEGER liDistanceToMove;</pre>
<pre>liDistanceToMove.QuadPart = 1024;</pre>
<pre>SetFilePointerEx(hFile, liDistanceToMove, NULL, FILE_BEGIN);</pre>
<pre>SetEndOfFile(hFile);</pre>
<pre>CloseHandle(hFile);</pre>
</td>
</tr>
<tr>
<td valign="top" width="10">
<pre>BOOL ReadFile(</pre>
<pre>&#160;&#160; HANDLE&#160;&#160;&#160;&#160;&#160; hFile,</pre>
<pre>&#160;&#160; PVOID&#160;&#160;&#160;&#160;&#160;&#160; pvBuffer,</pre>
<pre>&#160;&#160; DWORD&#160;&#160;&#160;&#160;&#160;&#160; nNumBytesToRead,</pre>
<pre>&#160;&#160; PDWORD&#160;&#160;&#160;&#160;&#160; pdwNumBytes,</pre>
<pre>&#160;&#160; OVERLAPPED* pOverlapped);</pre>
<pre>&#160;</pre>
<pre>BOOL WriteFile(</pre>
<pre>&#160;&#160; HANDLE&#160;&#160;&#160;&#160;&#160; hFile,</pre>
<pre>&#160;&#160; CONST VOID&#160; *pvBuffer,</pre>
<pre>&#160;&#160; DWORD&#160;&#160;&#160;&#160;&#160;&#160; nNumBytesToWrite,</pre>
<pre>&#160;&#160; PDWORD&#160;&#160;&#160;&#160;&#160; pdwNumBytes,</pre>
<pre>&#160;&#160; OVERLAPPED* pOverlapped);</pre>
</td>
<td valign="top" width="1318">
<p>The <b>hFile</b> parameter identifies the handle of the device you want to access. When the device is opened, you must not specify the <b>FILE_FLAG_OVERLAPPED</b> flag, or the system will think that you want to perform asynchronous I/O with the device. The <b>pvBuffer</b> parameter points to the buffer to which the device&#8217;s data should be read or to the buffer containing the data that should be written to the device. The <b>nNumBytesToRead</b> and <b>nNumBytesToWrite</b> parameters tell <b>ReadFile</b> and <b>WriteFile</b> how many bytes to read from the device and how many bytes to write to the device, respectively.</p>
<p>The <b>pdwNumBytes</b> parameters indicate the address of a <b>DWORD</b> that the functions fill with the number of bytes successfully transmitted to and from the device. The last parameter, <b>pOverlapped</b>, should be <b>NULL</b> when performing synchronous I/O. You&#8217;ll examine this parameter in more detail shortly when asynchronous I/O is discussed.</p>
<p>Both <b>ReadFile</b> and <b>WriteFile</b> return <b>TRUE</b> if successful. By the way, <b>ReadFile</b> can be called only for devices that were opened with the <b>GENERIC_READ</b> flag. Likewise, <b>WriteFile</b> can be called only when the device is opened with the <b>GENERIC_WRITE</b> flag.</p>
</td>
</tr>
<tr>
<td valign="top" width="10">
<pre>BOOL FlushFileBuffers(HANDLE hFile);</pre>
<pre>&#160;</pre>
</td>
<td valign="top" width="1318">
<p>Forces the system to write cached data to the device.</p>
<p>The <b>FlushFileBuffers</b> function forces all the buffered data associated with a device that is identified by the <b>hFile</b> parameter to be written. For this to work, the device has to be opened with the <b>GENERIC_WRITE</b> flag. If the function is successful, <b>TRUE</b> is returned.</p>
</td>
</tr>
<tr>
<td valign="top" width="10">
<pre>BOOL SetFileCompletionNotificationModes(HANDLE hFile, UCHAR uFlags);</pre>
</td>
<td valign="top" width="1318">
<p>To improve performance slightly, you can tell Windows not to signal the file object when the operation completes.</p>
<p>The <b>hFile</b> parameter identifies a file handle, and the <b>uFlags</b> parameter indicates how Windows should modify its normal behavior with respect to completing an I/O operation. If you pass the <b>FILE_SKIP_SET_EVENT_ON_HANDLE</b> flag, Windows will not signal the file handle when operations on the file complete.</p>
</td>
</tr>
</tbody>
</table>
<h1>References</h1>
<blockquote>
<p><a href="http://www.microsoft.com/learning/en/us/book.aspx?ID=11241&amp;locale=en-us">Windows® via C/C++, Fifth Edition</a></p>
</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://mhesham.wordpress.com/2011/09/07/synchronous-and-asynchronous-device-io/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebdc8c831f2ae1f6a876629e325c99f6?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mhesham</media:title>
		</media:content>

		<media:content url="http://mhesham.files.wordpress.com/2011/09/image_thumb.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://mhesham.files.wordpress.com/2011/09/image_thumb1.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>
	</item>
		<item>
		<title>Thread Synchronization with Kernel Objects</title>
		<link>http://mhesham.wordpress.com/2011/08/29/thread-synchronization-with-kernel-objects/</link>
		<comments>http://mhesham.wordpress.com/2011/08/29/thread-synchronization-with-kernel-objects/#comments</comments>
		<pubDate>Sun, 28 Aug 2011 23:20:56 +0000</pubDate>
		<dc:creator>MHesham</dc:creator>
				<category><![CDATA[Book Snapshot]]></category>
		<category><![CDATA[Operating Systems]]></category>
		<category><![CDATA[Windows Programming]]></category>
		<category><![CDATA[Windows via C/C++]]></category>
		<category><![CDATA[events]]></category>
		<category><![CDATA[kernel mode]]></category>
		<category><![CDATA[mutex]]></category>
		<category><![CDATA[process synchronization]]></category>
		<category><![CDATA[semaphore]]></category>
		<category><![CDATA[thread synchronization]]></category>
		<category><![CDATA[timers]]></category>
		<category><![CDATA[winapi]]></category>
		<category><![CDATA[windows programming]]></category>

		<guid isPermaLink="false">https://mhesham.wordpress.com/2011/08/29/thread-synchronization-with-kernel-objects/</guid>
		<description><![CDATA[Content Introduction Wait Functions Event Kernel Objects Waitable Timer Kernel Objects Semaphore Kernel Objects Mutex Kernel Objects Mutexes vs Critical Sections Introduction Although user-mode thread synchronization mechanisms offer great performance, they do have limitations, such as: You can use critical sections to place a thread in a wait state, but you can use them only [...]]]></description>
			<content:encoded><![CDATA[<h1>Content</h1>
<ol>
<li>Introduction </li>
<li>Wait Functions </li>
<li>Event Kernel Objects </li>
<li>Waitable Timer Kernel Objects </li>
<li>Semaphore Kernel Objects </li>
<li>Mutex Kernel Objects </li>
<li>Mutexes vs Critical Sections </li>
</ol>
<h1>Introduction</h1>
<p>Although user-mode thread synchronization mechanisms offer great performance, they do have limitations, <u>such as:</u></p>
<ul>
<li>You can use critical sections to place a thread in a wait state, but you can use them only to synchronize threads contained within a single process </li>
<li>You can easily get into deadlock situations with critical sections because you cannot specify a timeout value while waiting to enter the critical section. </li>
</ul>
<p>The drawback of kernel objects is their performance. The transition from <em>user mode</em> to <em>kernel mode</em> is costly: an empty system call takes about 200 CPU cycles on the <i>x</i>86 platform, and this, of course, does not include the execution of the kernel-mode code that actually implements the function your thread is calling. What takes orders of magnitude more is the overhead of scheduling a new thread, with all the cache flushes and misses it entails. Here we&#8217;re talking about tens of thousands of cycles.</p>
<p><u>The following kernel objects can be in a signaled or nonsignaled state:</u> </p>
<ul>
<li>Processes </li>
<li>Threads </li>
<li>Jobs </li>
<li>File and console standard input/output/error streams </li>
<li>Events </li>
<li>Waitable timers </li>
<li>Semaphores </li>
<li>Mutexes </li>
</ul>
<h1>Wait Functions</h1>
<pre>DWORD dw = WaitForSingleObject(hProcess, 5000);
switch (dw) {

   case WAIT_OBJECT_0:
      // The process terminated.
      break;

   case WAIT_TIMEOUT:
      // The process did not terminate within 5000 milliseconds.
      break;

   case WAIT_FAILED:
      // Bad call to function (invalid handle?)
      break;
}</pre>
<p>The preceding code tells the system that the calling thread should not be schedulable until either the specified process has terminated or 5000 milliseconds have expired, whichever comes first. So this call returns in less than 5000 milliseconds if the process terminates, and it returns in about 5000 milliseconds if the process hasn&#8217;t terminated. Note that you can pass <b>0</b> for the <b>dwMilliseconds</b> parameter. If you do this, <b>WaitForSingleObject</b> always returns immediately, even if the wait condition hasn&#8217;t been satisfied.</p>
<pre>HANDLE h[3];
h[0] = hProcess1;
h[1] = hProcess2;
h[2] = hProcess3;
DWORD dw = WaitForMultipleObjects(3, h, FALSE, 5000);
switch (dw) {
   case WAIT_FAILED:
      // Bad call to function (invalid handle?)
      break;

   case WAIT_TIMEOUT:
      // None of the objects became signaled within 5000 milliseconds.
      break;

   case WAIT_OBJECT_0 + 0:
      // The process identified by h[0] (hProcess1) terminated.
      break;

   case WAIT_OBJECT_0 + 1:
      // The process identified by h[1] (hProcess2) terminated.
      break;

   case WAIT_OBJECT_0 + 2:
      // The process identified by h[2] (hProcess3) terminated.
      break;
}</pre>
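<p>When the array holds many handles, writing one <b>case</b> per index gets tedious. As a rough sketch, the return value can instead be turned into an array index with a small helper. The helper name <b>SignaledIndex</b> is hypothetical, and the three constants are mirrored from the Windows headers only so that the sketch builds without them; real code would include <b>windows.h</b> and use the SDK definitions.</p>
<pre>// Values mirrored from the Windows headers so the sketch builds anywhere.
#define WAIT_OBJECT_0   0x00000000UL
#define WAIT_TIMEOUT    0x00000102UL
#define WAIT_FAILED     0xFFFFFFFFUL

// Returns the index (0 .. dwCount-1) of the signaled handle after a
// successful wait with bWaitAll == FALSE, or -1 for a timeout, a failure,
// or any other return value.
int SignaledIndex(unsigned long dw, unsigned long dwCount) {
   if (dw == WAIT_FAILED || dw == WAIT_TIMEOUT) return -1;
   if (dw - WAIT_OBJECT_0 &lt; dwCount) return (int)(dw - WAIT_OBJECT_0);
   return -1;
}</pre>
<p>With the three process handles above, <b>SignaledIndex(dw, 3)</b> gives the position in <b>h</b> of the process that terminated, or -1 if the wait timed out or failed.</p>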
<h2>Successful Wait Side Effects</h2>
<p>For some kernel objects, a successful call to <b>WaitForSingleObject</b> or <b>WaitForMultipleObjects</b> actually alters the state of the object. A successful call is one in which the function sees that the object was signaled and returns a value relative to <b>WAIT_OBJECT_0</b>. A call is unsuccessful if the function returns <b>WAIT_TIMEOUT</b> or <b>WAIT_FAILED</b>. Objects never have their state altered for unsuccessful calls. </p>
<p><u>Let&#8217;s look at an example. Two threads call <b>WaitForMultipleObjects</b> in exactly the same way:</u></p>
<pre>HANDLE h[2];
h[0] = hAutoResetEvent1;   // Initially nonsignaled
h[1] = hAutoResetEvent2;   // Initially nonsignaled
WaitForMultipleObjects(2, h, TRUE, INFINITE);</pre>
<p>When <b>WaitForMultipleObjects</b> is called, both event objects are nonsignaled; this forces both threads to enter a wait state. Then the <b>hAutoResetEvent1</b> object becomes signaled. Both threads see that the event has become signaled, but neither can wake up because the <b>hAutoResetEvent2</b> object is still nonsignaled. Because neither thread has successfully waited yet, no side effect happens to the <b>hAutoResetEvent1</b> object. </p>
<p>Next, the <b>hAutoResetEvent2</b> object becomes signaled. At this point, one of the two threads detects that both objects it is waiting for have become signaled. The wait is successful, both event objects are set to the nonsignaled state, and the thread is schedulable. But what about the other thread? It continues to wait until it sees that both event objects are signaled. Even though it originally detected that <b>hAutoResetEvent1</b> was signaled, it now sees this object as nonsignaled. </p>
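<p>The all-or-nothing behavior in this example can be replayed with a toy model. This is a sketch under stated assumptions, not the kernel&#8217;s implementation: <b>AutoResetEvent</b> and <b>TryWaitAll</b> are hypothetical names, and each call to <b>TryWaitAll</b> stands for one thread checking whether its wait can be satisfied.</p>
<pre>// Toy model: 1 = signaled, 0 = nonsignaled.
typedef struct { int signaled; } AutoResetEvent;

// Models one attempt to satisfy a wait with bWaitAll == TRUE.
// Returns 1 only if every event is signaled; only then does the side
// effect happen and all events are reset. On failure nothing changes.
int TryWaitAll(AutoResetEvent* events[], int count) {
   for (int i = 0; i &lt; count; ++i)
      if (!events[i]-&gt;signaled) return 0;   // No side effects
   for (int i = 0; i &lt; count; ++i)
      events[i]-&gt;signaled = 0;              // Auto-reset on success
   return 1;
}</pre>
<p>Signaling only the first event leaves both &quot;threads&quot; waiting and the event untouched. Once the second event is signaled, the first call to <b>TryWaitAll</b> succeeds and resets both events, so a second call fails again, just like the second thread in the description above.</p>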
<p>If multiple threads wait for a single kernel object, which thread does the system wake up when the object becomes signaled? The answer is that &quot;the algorithm is fair,&quot; which means that if multiple threads are waiting, each should get its own chance to wake up each time the object becomes signaled. </p>
<h1>Event Kernel Objects</h1>
<p>Events signal that an operation has completed. There are two different types of event objects: manual-reset events and auto-reset events. When a manual-reset event is signaled, all threads waiting on the event become schedulable. When an auto-reset event is signaled, only one of the threads waiting on the event becomes schedulable. </p>
<p>Once an event is created, you control its state directly. When you call <b>SetEvent</b>, you change the event to the signaled state:</p>
<pre>BOOL SetEvent(HANDLE hEvent);</pre>
<p>When you call <b>ResetEvent</b>, you change the event to the nonsignaled state:</p>
<pre>BOOL ResetEvent(HANDLE hEvent);</pre>
<p>It&#8217;s that easy. </p>
<p>An auto-reset event is automatically reset to the nonsignaled state when a thread successfully waits on the object. </p>
<h1>Waitable Timer Kernel Objects</h1>
<p>Waitable timers are kernel objects that signal themselves at a certain time or at regular intervals. They are most commonly used to have some operation performed at a certain time. </p>
<p>Waitable timer objects are always created in the nonsignaled state. You must call the <b>SetWaitableTimer</b> function to tell the timer when you want it to become signaled. </p>
<p>The following code sets a timer to go off for the first time on January 1, 2008, at 1:00 P.M., and then to go off every six hours after that:</p>
<pre>// Declare our local variables.
HANDLE hTimer;
SYSTEMTIME st;
FILETIME ftLocal, ftUTC;
LARGE_INTEGER liUTC;

// Create an auto-reset timer.
hTimer = CreateWaitableTimer(NULL, FALSE, NULL);

// First signaling is at January 1, 2008, at 1:00 P.M. (local time).
st.wYear         = 2008; // Year
st.wMonth        = 1;    // January
st.wDayOfWeek    = 0;    // Ignored
st.wDay          = 1;    // The first of the month
st.wHour         = 13;   // 1PM
st.wMinute       = 0;    // 0 minutes into the hour
st.wSecond       = 0;    // 0 seconds into the minute
st.wMilliseconds = 0;    // 0 milliseconds into the second

SystemTimeToFileTime(&amp;st, &amp;ftLocal);

// Convert local time to UTC time.
LocalFileTimeToFileTime(&amp;ftLocal, &amp;ftUTC);
// Convert FILETIME to LARGE_INTEGER because of different alignment.
liUTC.LowPart  = ftUTC.dwLowDateTime;
liUTC.HighPart = ftUTC.dwHighDateTime;
// Set the timer.
SetWaitableTimer(hTimer, &amp;liUTC, 6 * 60 * 60 * 1000,
   NULL, NULL, FALSE); <b>...</b></pre>
<p>Instead of setting an absolute time at which the timer should first go off, you can have the timer go off at a time relative to the call to <b>SetWaitableTimer</b>. You simply pass a negative value in the <b>pDueTime</b> parameter. The value you pass must be in 100-nanosecond intervals. Because we don&#8217;t normally think in intervals of 100 nanoseconds, you might find this useful: 1 second = 1,000 milliseconds = 1,000,000 microseconds = 10,000,000 100-nanosecond intervals. </p>
<p>The following code sets a timer to initially go off 5 seconds after the call to <b>SetWaitableTimer</b>:</p>
<pre>// Declare our local variables.
HANDLE hTimer;
LARGE_INTEGER li;

// Create an auto-reset timer.
hTimer = CreateWaitableTimer(NULL, FALSE, NULL);

// Set the timer to go off 5 seconds after calling SetWaitableTimer.
// Timer unit is 100 nanoseconds.
const int nTimerUnitsPerSecond = 10000000;

// Negate the time so that SetWaitableTimer knows we
// want relative time instead of absolute time.
li.QuadPart = -(5 * nTimerUnitsPerSecond);

// Set the timer.
SetWaitableTimer(hTimer, &amp;li, 6 * 60 * 60 * 1000,
   NULL, NULL, FALSE); <b>...</b></pre>
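<p>The unit conversion is easy to get wrong by a factor of ten, so it can help to keep it in one place. Here is a small sketch; the names <b>RelativeDueTime</b> and <b>TIMER_UNITS_PER_MS</b> are hypothetical and are not part of the Windows API.</p>
<pre>// Number of 100-nanosecond timer units in one millisecond.
#define TIMER_UNITS_PER_MS 10000LL

// Converts a delay in milliseconds into the value to store in the
// LARGE_INTEGER's QuadPart. The result is negated so that
// SetWaitableTimer treats it as relative rather than absolute time.
long long RelativeDueTime(long long delayMs) {
   return -(delayMs * TIMER_UNITS_PER_MS);
}</pre>
<p>Writing <b>li.QuadPart = RelativeDueTime(5000);</b> produces the same due time as the 5-second example above.</p>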
<h2>Waitable Timers vs User Timers </h2>
<p>The biggest difference is that User timers require a lot of additional user interface infrastructure in your application, which makes them more resource intensive. Also, waitable timers are kernel objects, which means that they can be shared by multiple threads and are securable.</p>
<ul>
<li>User timers generate <b>WM_TIMER</b> messages that come back to the thread that called <b>SetTimer</b> (for callback timers) or the thread that created the window (for window-based timers). So only one thread is notified when a User timer goes off. Multiple threads, on the other hand, can wait on waitable timers, and several threads can be scheduled if the timer is a manual-reset timer. </li>
<li>With waitable timers, you&#8217;re more likely to be notified when the time actually expires. The <b>WM_TIMER</b> messages are always the lowest-priority messages and are retrieved when no other messages are in a thread&#8217;s queue. </li>
</ul>
<h1>Semaphore Kernel Objects</h1>
<p>Semaphore kernel objects are used for resource counting. They contain a usage count, as all kernel objects do, but they also contain two additional signed 32-bit values: a maximum resource count and a current resource count. The maximum resource count identifies the maximum number of resources that the semaphore can control; the current resource count indicates the number of these resources that are currently available.</p>
<p><u>The rules for a semaphore are as follows:</u></p>
<ul>
<li>If the current resource count is greater than <b>0</b>, the semaphore is signaled. </li>
<li>If the current resource count is <b>0</b>, the semaphore is nonsignaled. </li>
<li>The system never allows the current resource count to be negative. </li>
<li>The current resource count can never be greater than the maximum resource count. </li>
</ul>
<p>A thread gains access to a resource by calling a wait function, passing the handle of the semaphore guarding the resource. Internally, the wait function checks the semaphore&#8217;s current resource count and if its value is greater than <b>0</b> (the semaphore is signaled), the counter is decremented by <b>1</b> and the calling thread remains schedulable. </p>
<p>Unfortunately, there is just no way to get the current resource count of a semaphore without altering it.</p>
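<p>The four rules above fit in a few lines of code. The following is a toy model of the bookkeeping only, not the kernel object itself; <b>SemaphoreModel</b> and the <b>Sem_*</b> functions are hypothetical names. The release step assumes the documented behavior of <b>ReleaseSemaphore</b>, which fails and leaves the count unchanged if the release would push the count past the maximum.</p>
<pre>// Toy model of a semaphore's two counts.
typedef struct {
   long maxCount;       // Maximum resource count
   long currentCount;   // Resources currently available
} SemaphoreModel;

// The semaphore is signaled while the current count is greater than 0.
int Sem_IsSignaled(const SemaphoreModel* s) {
   return s-&gt;currentCount &gt; 0;
}

// Models a successful wait: takes one resource if any is available.
// Returns 1 on success, 0 if the caller would have to wait.
int Sem_TryAcquire(SemaphoreModel* s) {
   if (s-&gt;currentCount == 0) return 0;   // The count never goes negative
   s-&gt;currentCount -= 1;
   return 1;
}

// Models releasing n resources. Returns 0 and changes nothing if the
// result would exceed the maximum resource count.
int Sem_Release(SemaphoreModel* s, long n) {
   if (n &lt;= 0 || n &gt; s-&gt;maxCount - s-&gt;currentCount) return 0;
   s-&gt;currentCount += n;
   return 1;
}</pre>
<p>Starting from a model with both counts set to 2, two acquisitions succeed, a third would have to wait, and an attempt to release three resources is rejected without touching the count.</p>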
<h1>Mutex Kernel Objects</h1>
<p>Mutex kernel objects ensure that a thread has mutually exclusive access to a single resource. A mutex object contains a usage count, a thread ID, and a recursion counter. Mutexes behave identically to critical sections. However, mutexes are kernel objects, while critical sections are user-mode synchronization objects.</p>
<p>This means that mutexes are slower than critical sections. But it also means that threads in different processes can access a single mutex, and it means that a thread can specify a timeout value while waiting to gain access to a resource.</p>
<p><u>The rules for a mutex are as follows:</u></p>
<ul>
<li>If the thread ID is <b>0</b> (an invalid thread ID), the mutex is not owned by any thread and is signaled. </li>
<li>If the thread ID is a nonzero value, a thread owns the mutex and the mutex is nonsignaled. </li>
</ul>
<p>A thread gains access to the shared resource by calling a wait function, passing the handle of the mutex guarding the resource. Internally, the wait function checks the thread ID to see if it is <b>0</b> (the mutex is signaled). If the thread ID is <b>0</b>, the thread ID is set to the calling thread&#8217;s ID, the recursion counter is set to <b>1</b>, and the calling thread remains schedulable.</p>
<p>Every time a thread successfully waits on a mutex, the object&#8217;s recursion counter is incremented. The only way the recursion counter can have a value greater than <b>1</b> is if the thread waits on the same mutex multiple times.</p>
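<p>The ownership and recursion rules can be sketched the same way. This is a toy model of the mutex&#8217;s bookkeeping, not the kernel object; <b>MutexModel</b>, <b>Mutex_TryAcquire</b>, and <b>Mutex_Release</b> are hypothetical names. The release step assumes the documented behavior of <b>ReleaseMutex</b>: only the owning thread can release, and the mutex becomes signaled again once the recursion counter drops back to 0.</p>
<pre>// Toy model of a mutex's thread ID and recursion counter.
typedef struct {
   unsigned long ownerThreadId;    // 0 means not owned, that is, signaled
   unsigned long recursionCount;
} MutexModel;

// Models a wait by threadId. Returns 1 if the thread now owns the mutex
// (first or recursive acquisition), 0 if it would have to wait.
int Mutex_TryAcquire(MutexModel* m, unsigned long threadId) {
   if (m-&gt;ownerThreadId == 0) {
      m-&gt;ownerThreadId = threadId;
      m-&gt;recursionCount = 1;
      return 1;
   }
   if (m-&gt;ownerThreadId == threadId) {
      m-&gt;recursionCount += 1;
      return 1;
   }
   return 0;
}

// Models a release by threadId. Returns 0 if the caller is not the owner.
int Mutex_Release(MutexModel* m, unsigned long threadId) {
   if (m-&gt;ownerThreadId != threadId || m-&gt;recursionCount == 0) return 0;
   if (--m-&gt;recursionCount == 0) m-&gt;ownerThreadId = 0;
   return 1;
}</pre>
<p>A thread that acquires the model twice must release it twice before another thread can get in, which is why every successful wait needs a matching release.</p>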
<h2>Abandonment Issues</h2>
<p>If a thread that owns a mutex terminates (using <b>ExitThread</b>, <b>TerminateThread</b>, <b>ExitProcess</b>, or <b>TerminateProcess</b>) before releasing the mutex, the system considers the mutex to be <i>abandoned</i>: the thread that owns it can never release it because the thread has died.</p>
<p>Because the system keeps track of all mutex and thread kernel objects, it knows exactly when mutexes become abandoned. When a mutex becomes abandoned, the system automatically resets the mutex object&#8217;s thread ID to <b>0</b> and its recursion counter to <b>0</b>. Then the system checks to see whether any threads are currently waiting for the mutex. If so, the system selects a waiting thread, sets the thread ID to the selected thread&#8217;s ID, sets the recursion counter to <b>1</b>, and makes the selected thread schedulable. </p>
<p>This is the same as a normal successful wait except that the wait function does not return the usual <b>WAIT_OBJECT_0</b> value to the thread. Instead, the wait function returns the special value of <b>WAIT_ABANDONED</b>. This special return value (which applies only to mutex objects) indicates that the mutex the thread was waiting on was owned by another thread that was terminated before it finished using the shared resource. The newly scheduled thread has no idea what state the resource is currently in; the resource might be totally corrupt. </p>
<p>In real life, most applications never check explicitly for the <b>WAIT_ABANDONED</b> return value because a thread is rarely just terminated. (This whole discussion provides another great example of why you should never call the <b>TerminateThread</b> function.)</p>
<h1>Mutexes vs Critical Sections</h1>
<table border="1" cellspacing="2" cellpadding="3" width="990">
<tbody>
<tr>
<td valign="top" width="201">
<p align="center"><strong>Characteristic</strong></p>
</td>
<td valign="top" width="398">
<p align="center"><strong>Mutex</strong></p>
</td>
<td valign="top" width="381">
<p align="center"><strong>Critical Section</strong></p>
</td>
</tr>
<tr>
<td valign="top" width="204"><strong>Mode</strong></td>
<td valign="top" width="397">Kernel Mode</td>
<td valign="top" width="380">User Mode</td>
</tr>
<tr>
<td valign="top" width="206"><strong>Performance</strong></td>
<td valign="top" width="396">Slow</td>
<td valign="top" width="379">Fast</td>
</tr>
<tr>
<td valign="top" width="208"><strong>Can be used across process boundaries?</strong></td>
<td valign="top" width="395">Yes</td>
<td valign="top" width="379">No</td>
</tr>
<tr>
<td valign="top" width="209"><strong>Declaration</strong></td>
<td valign="top" width="395"><strong>HANDLE</strong> hMutex;</td>
<td valign="top" width="378"><strong>CRITICAL_SECTION</strong> cs;</td>
</tr>
<tr>
<td valign="top" width="210"><strong>Initialization</strong></td>
<td valign="top" width="394">hMutex = <strong>CreateMutex</strong>(NULL, FALSE, NULL);</td>
<td valign="top" width="378"><strong>InitializeCriticalSection</strong>(&amp;cs);</td>
</tr>
<tr>
<td valign="top" width="211"><strong>Cleanup</strong></td>
<td valign="top" width="394"><strong>CloseHandle</strong>(hMutex);</td>
<td valign="top" width="377"><strong>DeleteCriticalSection</strong>(&amp;cs);</td>
</tr>
<tr>
<td valign="top" width="211"><strong>Infinite Wait</strong></td>
<td valign="top" width="394"><strong>WaitForSingleObject</strong>(hMutex, INFINITE);</td>
<td valign="top" width="377"><strong>EnterCriticalSection</strong>(&amp;cs);</td>
</tr>
<tr>
<td valign="top" width="211"><strong>0 Wait</strong></td>
<td valign="top" width="394"><strong>WaitForSingleObject</strong>(hMutex, 0);</td>
<td valign="top" width="377"><strong>TryEnterCriticalSection</strong>(&amp;cs);</td>
</tr>
<tr>
<td valign="top" width="211"><strong>Arbitrary Wait</strong></td>
<td valign="top" width="394"><strong>WaitForSingleObject</strong>(hMutex, dwMilliseconds);</td>
<td valign="top" width="377">N/A</td>
</tr>
<tr>
<td valign="top" width="211"><strong>Release</strong></td>
<td valign="top" width="394"><strong>ReleaseMutex</strong>(hMutex);</td>
<td valign="top" width="377"><strong>LeaveCriticalSection</strong>(&amp;cs);</td>
</tr>
<tr>
<td valign="top" width="211"><strong>Can be waited on with other kernel objects?</strong></td>
<td valign="top" width="394">Yes (e.g., with WaitForMultipleObjects or a similar function)</td>
<td valign="top" width="380">No</td>
</tr>
</tbody>
</table>
<p>&#160;</p>
<h1>API Table</h1>
<table border="1" cellspacing="2" cellpadding="3" width="999">
<tbody>
<tr>
<td valign="top" width="500">
<div align="center">
<pre><strong><font face="Verdana">Function</font></strong></pre>
</div>
</td>
<td valign="top" width="491">
<p align="center"><strong>Description</strong></p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>DWORD <strong>WaitForSingleObject</strong>(</pre>
<pre>&#160;&#160; HANDLE hObject,</pre>
<pre>&#160;&#160; DWORD dwMilliseconds);</pre>
</td>
<td valign="top" width="491">
<p>When a thread calls this function, the first parameter, <b>hObject</b>, identifies a kernel object that supports being signaled/nonsignaled. (Any object in the list given in the Introduction works.) The second parameter, <b>dwMilliseconds</b>, allows the thread to indicate how long it is willing to wait for the object to become signaled.</p>
<p>The following function call tells the system that the calling thread wants to wait until the process identified by the <b>hProcess</b> handle terminates:</p>
<pre>WaitForSingleObject(hProcess, INFINITE);</pre>
<p>The second parameter tells the system that the calling thread is willing to wait forever (an infinite amount of time) for this process to terminate.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>DWORD <strong>WaitForMultipleObjects</strong>(</pre>
<pre>&#160;&#160; DWORD dwCount,</pre>
<pre>&#160;&#160; CONST HANDLE* phObjects,</pre>
<pre>&#160;&#160; BOOL bWaitAll,</pre>
<pre>&#160;&#160; DWORD dwMilliseconds);</pre>
</td>
<td valign="top" width="491">
<p><b>WaitForMultipleObjects</b> is similar to <b>WaitForSingleObject</b> except that it allows the calling thread to check the signaled state of several kernel objects simultaneously.</p>
<p>The <b>dwCount</b> parameter indicates the number of kernel objects you want the function to check. This value must be between 1 and <b>MAXIMUM_WAIT_OBJECTS</b> (defined as <b>64</b> in the WinNT.h header file). The <b>phObjects</b> parameter is a pointer to an array of kernel object handles. The <b>bWaitAll</b> parameter selects how the function works: pass <b>TRUE</b> to wait until all of the objects are signaled, or <b>FALSE</b> to wait until any one of them is signaled. The <b>dwMilliseconds</b> parameter works exactly as it does for <b>WaitForSingleObject</b>.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>HANDLE <strong>CreateEvent</strong>(</pre>
<pre>&#160;&#160; PSECURITY_ATTRIBUTES psa,</pre>
<pre>&#160;&#160; BOOL bManualReset,</pre>
<pre>&#160;&#160; BOOL bInitialState,</pre>
<pre>&#160;&#160; PCTSTR pszName);</pre>
</td>
<td valign="top" width="491">
<p>Creates an event kernel object.</p>
<p>The <b>bManualReset</b> parameter is a Boolean value that tells the system whether to create a manual-reset event (<b>TRUE</b>) or an auto-reset event (<b>FALSE</b>). The <b>bInitialState</b> parameter indicates whether the event should be initialized to signaled (<b>TRUE</b>) or nonsignaled (<b>FALSE</b>).</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>HANDLE <strong>CreateEventEx</strong>(</pre>
<pre>&#160;&#160; PSECURITY_ATTRIBUTES psa,</pre>
<pre>&#160;&#160; PCTSTR pszName,</pre>
<pre>&#160;&#160; DWORD dwFlags,</pre>
<pre>&#160;&#160; DWORD dwDesiredAccess);</pre>
</td>
<td valign="top" width="491">
<p>Allows you to create an event, or open one that already exists, while requesting reduced access through the <b>dwDesiredAccess</b> parameter.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>HANDLE <strong>OpenEvent</strong>(</pre>
<pre>&#160;&#160; DWORD dwDesiredAccess,</pre>
<pre>&#160;&#160; BOOL bInherit,</pre>
<pre>&#160;&#160; PCTSTR pszName);</pre>
</td>
<td valign="top" width="491">
<p>Threads in other processes can gain access to an existing named event by calling <b>OpenEvent</b> with the same name that was passed in the <b>pszName</b> parameter of <b>CreateEvent</b>.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>BOOL <strong>SetEvent</strong>(HANDLE hEvent);</pre>
</td>
<td valign="top" width="491">
<p>Changes the event to the signaled state.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>BOOL <strong>ResetEvent</strong>(HANDLE hEvent);</pre>
</td>
<td valign="top" width="491">
<p>Changes the event to the nonsignaled state.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>BOOL <strong>PulseEvent</strong>(HANDLE hEvent);</pre>
</td>
<td valign="top" width="491">
<p>Makes an event signaled and then immediately nonsignaled; it&#8217;s just like calling <b>SetEvent</b> immediately followed by <b>ResetEvent</b>. If you call <b>PulseEvent</b> on a manual-reset event, any and all threads waiting on the event when it is pulsed are schedulable. If you call <b>PulseEvent</b> on an auto-reset event, only one waiting thread becomes schedulable. If no threads are waiting on the event when it is pulsed, there is no effect.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>HANDLE <strong>CreateWaitableTimer</strong>(</pre>
<pre>&#160;&#160; PSECURITY_ATTRIBUTES psa,</pre>
<pre>&#160;&#160; BOOL bManualReset,</pre>
<pre>&#160;&#160; PCTSTR pszName);</pre>
</td>
<td valign="top" width="491">
<p>Creates a waitable timer.</p>
<p>The <b>bManualReset</b> parameter indicates whether to create a manual-reset timer (<b>TRUE</b>) or an auto-reset timer (<b>FALSE</b>). When a manual-reset timer is signaled, all threads waiting on the timer become schedulable. When an auto-reset timer is signaled, only one waiting thread becomes schedulable.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>HANDLE <strong>OpenWaitableTimer</strong>(</pre>
<pre>&#160;&#160; DWORD dwDesiredAccess,</pre>
<pre>&#160;&#160; BOOL bInheritHandle,</pre>
<pre>&#160;&#160; PCTSTR pszName);</pre>
</td>
<td valign="top" width="491">
<p>A process can obtain its own process-relative handle to an existing waitable timer by calling this function.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>BOOL <strong>SetWaitableTimer</strong>(</pre>
<pre>&#160;&#160; HANDLE hTimer,</pre>
<pre>&#160;&#160; const LARGE_INTEGER *pDueTime,</pre>
<pre>&#160;&#160; LONG lPeriod,</pre>
<pre>&#160;&#160; PTIMERAPCROUTINE pfnCompletionRoutine,</pre>
<pre>&#160;&#160; PVOID pvArgToCompletionRoutine,</pre>
<pre>&#160;&#160; BOOL bResume);</pre>
</td>
<td valign="top" width="491">
<p>Tells the timer when you want it to become signaled.</p>
<p>The <b>hTimer</b> parameter indicates the timer that you want to set. The next two parameters, <b>pDueTime</b> and <b>lPeriod</b>, are used together. The <b>pDueTime</b> parameter indicates when the timer should go off for the first time, and the <b>lPeriod</b> parameter indicates how frequently (in milliseconds) the timer should go off after that.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>BOOL <strong>CancelWaitableTimer</strong>(HANDLE hTimer);</pre>
</td>
<td valign="top" width="491">
<p>This simple function takes the handle of a timer and cancels it so that the timer never goes off unless there is a subsequent call to <b>SetWaitableTimer</b> to reset the timer.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>HANDLE <strong>CreateSemaphore</strong>(</pre>
<pre>&#160;&#160; PSECURITY_ATTRIBUTES psa,</pre>
<pre>&#160;&#160; LONG lInitialCount,</pre>
<pre>&#160;&#160; LONG lMaximumCount,</pre>
<pre>&#160;&#160; PCTSTR pszName);</pre>
</td>
<td valign="top" width="491">
<p>Creates a semaphore kernel object.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>HANDLE <strong>CreateSemaphoreEx</strong>(</pre>
<pre>&#160;&#160; PSECURITY_ATTRIBUTES psa,</pre>
<pre>&#160;&#160; LONG lInitialCount,</pre>
<pre>&#160;&#160; LONG lMaximumCount,</pre>
<pre>&#160;&#160; PCTSTR pszName,</pre>
<pre>&#160;&#160; DWORD dwFlags,</pre>
<pre>&#160;&#160; DWORD dwDesiredAccess);</pre>
</td>
<td valign="top" width="491">
<p>Same as <i>CreateSemaphore</i>, except that you can directly provide access rights in the <b>dwDesiredAccess</b> parameter. Notice that the <b>dwFlags</b> parameter is reserved and should be set to <b>0</b>.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>HANDLE <strong>OpenSemaphore</strong>(</pre>
<pre>&#160;&#160; DWORD dwDesiredAccess,</pre>
<pre>&#160;&#160; BOOL bInheritHandle,</pre>
<pre>&#160;&#160; PCTSTR pszName);</pre>
</td>
<td valign="top" width="491">
<p>Another process can obtain its own process-relative handle to an existing semaphore by calling this function.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>BOOL <strong>ReleaseSemaphore</strong>(</pre>
<pre>&#160;&#160; HANDLE hSemaphore,</pre>
<pre>&#160;&#160; LONG lReleaseCount,</pre>
<pre>&#160;&#160; PLONG plPreviousCount);</pre>
</td>
<td valign="top" width="491">
<p>A thread calls this function to increment a semaphore&#8217;s current resource count.</p>
<p>The function simply adds the value in <b>lReleaseCount</b> to the semaphore&#8217;s current resource count.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>HANDLE <strong>CreateMutex</strong>(</pre>
<pre>&#160;&#160; PSECURITY_ATTRIBUTES psa,</pre>
<pre>&#160;&#160; BOOL bInitialOwner,</pre>
<pre>&#160;&#160; PCTSTR pszName);</pre>
</td>
<td valign="top" width="491">
<p>To use a mutex, one process must first create the mutex.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>HANDLE <strong>CreateMutexEx</strong>(</pre>
<pre>&#160;&#160; PSECURITY_ATTRIBUTES psa,</pre>
<pre>&#160;&#160; PCTSTR pszName,</pre>
<pre>&#160;&#160; DWORD dwFlags,</pre>
<pre>&#160;&#160; DWORD dwDesiredAccess);</pre>
</td>
<td valign="top" width="491">
<p>Same as <b>CreateMutex</b>, except that you can directly provide access rights in the <b>dwDesiredAccess</b> parameter. The <b>dwFlags</b> parameter replaces the <b>bInitialOwner</b> parameter of <b>CreateMutex</b>: <b>0</b> means <b>FALSE</b>, and <b>CREATE_MUTEX_INITIAL_OWNER</b> is equivalent to <b>TRUE</b>.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>HANDLE <strong>OpenMutex</strong>(</pre>
<pre>&#160;&#160; DWORD dwDesiredAccess,</pre>
<pre>&#160;&#160; BOOL bInheritHandle,</pre>
<pre>&#160;&#160; PCTSTR pszName);</pre>
</td>
<td valign="top" width="491">
<p>Another process can obtain its own process-relative handle to an existing mutex by calling this function.</p>
</td>
</tr>
<tr>
<td valign="top" width="500">
<pre>BOOL <strong>ReleaseMutex</strong>(HANDLE hMutex);</pre>
</td>
<td valign="top" width="492">
<p>When the thread that currently has access to the resource no longer needs its access, it must release the mutex by calling this function.</p>
</td>
</tr>
</tbody>
</table>
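<p>The difference between manual-reset and auto-reset events described in the table can be sketched in portable standard C++. This is only an illustration under stated assumptions: the <b>Event</b> class and its <b>Set</b>, <b>Reset</b> and <b>Wait</b> members are hypothetical stand-ins for <b>SetEvent</b>, <b>ResetEvent</b> and <b>WaitForSingleObject</b>, built from a mutex and a condition variable. It is not how Windows implements event kernel objects, and it works only inside one process.</p>

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for a Windows event object (single process only).
class Event {
public:
    Event(bool manualReset, bool initialState)
        : m_manualReset(manualReset), m_signaled(initialState) {}

    // Counterpart of SetEvent: move to the signaled state and wake waiters.
    void Set() {
        {
            std::lock_guard<std::mutex> guard(m_lock);
            m_signaled = true;
        }
        if (m_manualReset) m_cv.notify_all();  // every waiter may proceed
        else               m_cv.notify_one();  // only one waiter may proceed
    }

    // Counterpart of ResetEvent: move to the nonsignaled state.
    void Reset() {
        std::lock_guard<std::mutex> guard(m_lock);
        m_signaled = false;
    }

    // Counterpart of WaitForSingleObject(hEvent, timeout).
    // Returns true if the event was signaled before the timeout expired.
    bool Wait(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_lock);
        bool signaled = m_cv.wait_for(lock, timeout,
                                      [this] { return m_signaled; });
        // A successful wait on an auto-reset event resets it.
        if (signaled && !m_manualReset) m_signaled = false;
        return signaled;
    }

private:
    const bool m_manualReset;
    bool m_signaled;
    std::mutex m_lock;
    std::condition_variable m_cv;
};

// Single-threaded walk-through of the two reset modes.
void DemoEventStates() {
    using std::chrono::milliseconds;
    Event manualEvent(true, false);
    manualEvent.Set();
    assert(manualEvent.Wait(milliseconds(0)));
    assert(manualEvent.Wait(milliseconds(0)));  // stays signaled until Reset
    Event autoEvent(false, true);
    assert(autoEvent.Wait(milliseconds(0)));
    assert(!autoEvent.Wait(milliseconds(1)));   // the first wait reset it
}
```

<p>A manual-reset event stays signaled until <b>Reset</b> is called, so any number of waits succeed. An auto-reset event lets exactly one wait through per <b>Set</b>.</p>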
<h1>References</h1>
<blockquote>
<p><a href="http://www.microsoft.com/learning/en/us/book.aspx?ID=11241&amp;locale=en-us">Windows® via C/C++, Fifth Edition</a></p>
</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://mhesham.wordpress.com/2011/08/29/thread-synchronization-with-kernel-objects/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebdc8c831f2ae1f6a876629e325c99f6?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mhesham</media:title>
		</media:content>
	</item>
		<item>
		<title>Thread Synchronization in User Mode</title>
		<link>http://mhesham.wordpress.com/2011/08/15/thread-synchronization-in-user-mode/</link>
		<comments>http://mhesham.wordpress.com/2011/08/15/thread-synchronization-in-user-mode/#comments</comments>
		<pubDate>Mon, 15 Aug 2011 12:45:24 +0000</pubDate>
		<dc:creator>MHesham</dc:creator>
				<category><![CDATA[Book Snapshot]]></category>
		<category><![CDATA[Operating Systems]]></category>
		<category><![CDATA[Windows Programming]]></category>
		<category><![CDATA[Windows via C/C++]]></category>
		<category><![CDATA[critical section]]></category>
		<category><![CDATA[interlocked]]></category>
		<category><![CDATA[multithreading]]></category>
		<category><![CDATA[slim read write lock]]></category>
		<category><![CDATA[spin locks]]></category>
		<category><![CDATA[thread synchronization]]></category>
		<category><![CDATA[threads]]></category>

		<guid isPermaLink="false">https://mhesham.wordpress.com/2011/08/15/thread-synchronization-in-user-mode/</guid>
		<description><![CDATA[Threads need to communicate with each other in two basic situations: When you have multiple threads accessing a shared resource in such a way that the resource does not become corrupt. When one thread needs to notify one or more other threads that a specific task has been completed. Atomic Access: The Interlocked Family of [...]]]></description>
			<content:encoded><![CDATA[<p>Threads need to communicate with each other in two basic situations:</p>
<ul>
<li>When multiple threads access a shared resource and must do so in a way that does not corrupt the resource. </li>
<li>When one thread needs to notify one or more other threads that a specific task has been completed. </li>
</ul>
<h1>Atomic Access: The <em>Interlocked</em> Family of Functions</h1>
<p>A big part of thread synchronization has to do with <i>atomic access</i>—a thread&#8217;s ability to access a resource with the guarantee that no other thread will access that same resource at the same time.</p>
<p>Consider the following:</p>
<pre>// Define a global variable.
long g_x = 0;

DWORD WINAPI ThreadFunc1(PVOID pvParam) {
   g_x++;
   return(0);
}

DWORD WINAPI ThreadFunc2(PVOID pvParam) {
   g_x++;
   return(0);
}</pre>
<p>We create two threads: one thread executes <b>ThreadFunc1</b>, and the other thread executes <b>ThreadFunc2</b>.</p>
<p>If one thread executes this code followed by another thread, here is what effectively executes:</p>
<pre>MOV EAX, [g_x]       ; Thread 1: Move 0 into a register.
INC EAX              ; Thread 1: Increment the register to 1.
MOV [g_x], EAX       ; Thread 1: Store 1 back in g_x.

MOV EAX, [g_x]       ; Thread 2: Move 1 into a register.
INC EAX              ; Thread 2: Increment the register to 2.
MOV [g_x], EAX       ; Thread 2: Store 2 back in g_x.</pre>
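<p>However, Windows is a preemptive, multithreaded environment, so a thread can be switched away from at any time and another thread can continue executing. The code might therefore execute like this instead, leaving <b>g_x</b> at 1 rather than 2 because one of the increments is lost:</p>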
<pre>MOV EAX, [g_x]       ; Thread 1: Move 0 into a register.
INC EAX              ; Thread 1: Increment the register to 1.

MOV EAX, [g_x]       ; Thread 2: Move 0 into a register.
INC EAX              ; Thread 2: Increment the register to 1.
MOV [g_x], EAX       ; Thread 2: Store 1 back in g_x.

MOV [g_x], EAX       ; Thread 1: Store 1 back in g_x.</pre>
<p>To solve the problem just presented, we need to guarantee that the value is incremented atomically, that is, without interruption. The interlocked family of functions provides the solution we need: every function in the family manipulates a value atomically. Take a look at <b>InterlockedExchangeAdd</b> and its sibling <b>InterlockedExchangeAdd64</b>, which works on <b>LONGLONG</b> values; both are listed in the API table at the end of this post.</p>
<p>No thread should ever attempt to modify the shared variable by using simple C++ statements:</p>
<pre>// The long variable shared by many threads
LONG g_x; <b>...</b>

// Incorrect way to increment the long
g_x++; <b>...</b>

// Correct way to increment the long
InterlockedExchangeAdd(&amp;g_x, 1);</pre>
<p>You must also ensure that the variable addresses that you pass to these functions are properly aligned or the functions might fail. The C run-time library offers an <b>_aligned_malloc</b> function that you can use to allocate a block of memory that is properly aligned.</p>
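<p>To see the lost-update problem disappear, here is a minimal sketch in portable standard C++. It assumes <b>std::atomic&lt;long&gt;</b> as a stand-in for a <b>LONG</b> that is updated with <b>InterlockedExchangeAdd</b>; the helper names <b>IncrementWorker</b> and <b>RunTwoWorkers</b> are made up for this illustration.</p>

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Shared counter. std::atomic<long> plays the role of a LONG that is
// only ever modified through InterlockedExchangeAdd.
std::atomic<long> g_x{0};

// Hypothetical worker: adds 1 to g_x `iterations` times.
void IncrementWorker(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        // fetch_add is one indivisible read-modify-write and returns the
        // previous value, much like InterlockedExchangeAdd(&g_x, 1).
        g_x.fetch_add(1);
    }
}

// Runs two workers concurrently and returns the final counter value.
long RunTwoWorkers(int iterations) {
    assert(iterations >= 0);
    g_x.store(0);
    std::thread t1(IncrementWorker, iterations);
    std::thread t2(IncrementWorker, iterations);
    t1.join();
    t2.join();
    return g_x.load();
}
```

<p>Calling <b>RunTwoWorkers(100000)</b> yields exactly 200000 because no increment can be lost. With a plain <b>long</b> and <b>g_x++</b> the result would be unpredictable.</p>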
<p><b>InterlockedExchange</b> is extremely useful when you implement a spinlock.</p>
<pre>// Global variable indicating whether a shared resource is in use or not.
// InterlockedExchange operates on a LONG, so the flag is declared as one.
LONG volatile g_fResourceInUse = FALSE; <b>...</b>
void Func1() {
   // Wait to access the resource.
   while (InterlockedExchange (&amp;g_fResourceInUse, TRUE) == TRUE)
      Sleep(0);

   // Access the resource.
   ...

   // We no longer need to access the resource.
   InterlockedExchange(&amp;g_fResourceInUse, FALSE);
}</pre>
<ul>
<li>This code assumes that all threads using the spinlock run at the same priority level. You might also want to disable thread priority boosting. </li>
<li>You should ensure that the lock variable and the data that the lock protects are maintained in different cache lines. </li>
<li>You should avoid using spinlocks on single-CPU machines. If a thread is spinning, it&#8217;s wasting precious CPU time, which prevents the other thread from changing the value. </li>
</ul>
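<p>The spinlock idea above can be sketched as a small reusable class. This is a portable standard C++ analog, not the Win32 code itself: <b>std::atomic&lt;bool&gt;::exchange</b> is assumed here in place of <b>InterlockedExchange</b>, <b>std::this_thread::yield</b> roughly takes the place of <b>Sleep(0)</b>, and the names <b>SpinLock</b> and <b>CountWithSpinLock</b> are hypothetical.</p>

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

class SpinLock {
public:
    void Lock() {
        // exchange() writes true and returns the old value in one atomic
        // step, so exactly one thread observes false and gets the lock.
        while (m_inUse.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();  // give up the rest of the time slice
        }
    }

    void Unlock() {
        m_inUse.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> m_inUse{false};
};

// Several threads increment a plain (non-atomic) counter that is
// protected only by the spinlock. Returns the final total.
long CountWithSpinLock(int threadCount, int iterations) {
    assert(threadCount > 0 && iterations >= 0);
    SpinLock lock;
    long total = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < threadCount; ++t) {
        threads.emplace_back([&lock, &total, iterations] {
            for (int i = 0; i < iterations; ++i) {
                lock.Lock();
                ++total;        // the protected resource
                lock.Unlock();
            }
        });
    }
    for (auto& th : threads) th.join();
    return total;
}
```

<p>The counter is a plain variable on purpose: only the lock keeps the increments from overlapping.</p>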
<p>You have access to a series of functions that allow you to easily manipulate a stack called an <i>Interlocked</i> <i>Singly Linked List</i>. Each operation, such as pushing or popping an element, is assured to be executed in an atomic way.</p>
<h1>Cache Lines</h1>
<p>If you want to build a high-performance application that runs on multiprocessor machines, you must be aware of CPU cache lines. When a CPU reads a byte from memory, it does not just fetch the single byte; it fetches enough bytes to fill a cache line. Cache lines consist of 32 (for older CPUs), 64, or even 128 bytes (depending on the CPU), and they are always aligned on 32-byte, 64-byte, or 128-byte boundaries, respectively. Cache lines exist to improve performance. Usually, an application manipulates a set of adjacent bytes. If these bytes are in the cache, the CPU does not have to access the memory bus, which requires much more time.</p>
<p>However, cache lines make memory updates more difficult in a multiprocessor environment, as you can see in this example:</p>
<ol>
<li>CPU1 reads a byte, causing this byte and its adjacent bytes to be read into CPU1&#8217;s cache line. </li>
<li>CPU2 reads the same byte, which causes the same bytes in step 1 to be read into CPU2&#8217;s cache line. </li>
<li>CPU1 changes the byte in memory, causing the byte to be written to CPU1&#8217;s cache line. But the information is not yet written to RAM. </li>
<li>CPU2 reads the same byte again. Because this byte was already in CPU2&#8217;s cache line, it doesn&#8217;t have to access memory. But CPU2 will not see the new value of the byte in memory. </li>
</ol>
<p>What all this means is that you should group your application&#8217;s data together in cache-line-size chunks and on cache-line boundaries. The goal is to make sure that different CPUs access different memory addresses separated by at least a cache-line boundary. Also, you should separate your read-only data (or infrequently read data) from read-write data. And you should group together pieces of data that are accessed around the same time. Here is an example of a poorly designed data structure:</p>
<pre>struct CUSTINFO {
   DWORD    dwCustomerID;     // Mostly read-only
   int      nBalanceDue;      // Read-write
   wchar_t  szName[100];      // Mostly read-only
   FILETIME ftLastOrderDate;  // Read-write
};</pre>
<p>You can use the C/C++ compiler&#8217;s <b>__declspec(align(#))</b> directive to control field alignment. Here is an improved version of this structure:</p>
<pre>#define CACHE_ALIGN 64

// Force each structure to be in a different cache line.
struct __declspec(align(CACHE_ALIGN)) CUSTINFO {
   DWORD    dwCustomerID;     // Mostly read-only
   wchar_t  szName[100];      // Mostly read-only

   // Force the following members to be in a different cache line.
   __declspec(align(CACHE_ALIGN))
   int nBalanceDue;           // Read-write
   FILETIME ftLastOrderDate;  // Read-write
};</pre>
<p>It is best for data to be always accessed by a single thread (function parameters and local variables are the easiest way to ensure this) or for the data to be always accessed by a single CPU (using thread affinity). If you do either of these, you avoid cache-line issues entirely.</p>
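<p>The layout rules above can be checked at compile time. The sketch below uses the standard C++ <b>alignas</b> specifier instead of <b>__declspec(align(#))</b>. It assumes a 64-byte cache line, and the names <b>CustInfo</b>, <b>SameCacheLine</b> and <b>VerifyLayout</b> are invented for this example; the 64-bit integer merely stands in for <b>FILETIME</b>.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLine = 64;  // assumed cache-line size

// Each CustInfo starts on its own cache line.
struct alignas(kCacheLine) CustInfo {
    std::uint32_t customerId;   // mostly read-only
    wchar_t       name[100];    // mostly read-only

    // The read-write members start on a fresh cache line.
    alignas(kCacheLine) int balanceDue;  // read-write
    std::uint64_t lastOrderDate;         // read-write (stand-in for FILETIME)
};

static_assert(alignof(CustInfo) == kCacheLine,
              "structure must start on a cache-line boundary");
static_assert(offsetof(CustInfo, balanceDue) % kCacheLine == 0,
              "read-write members must start on a cache-line boundary");
static_assert(sizeof(CustInfo) % kCacheLine == 0,
              "array elements must not share a cache line");

// Returns true when two byte offsets fall into the same cache line.
bool SameCacheLine(std::size_t a, std::size_t b) {
    return a / kCacheLine == b / kCacheLine;
}

// Runtime check of the same layout claims; aborts if one of them fails.
void VerifyLayout() {
    assert(!SameCacheLine(offsetof(CustInfo, customerId),
                          offsetof(CustInfo, balanceDue)));
    assert(SameCacheLine(offsetof(CustInfo, balanceDue),
                         offsetof(CustInfo, lastOrderDate)));
}
```

<p>If someone later reorders the members or changes the assumed line size, the <b>static_assert</b> lines fail at compile time instead of silently reintroducing false sharing.</p>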
<h1>Critical Sections</h1>
<p>A <i>critical section</i> is a small section of code that requires exclusive access to some shared resource before the code can execute. This is a way to have several lines of code &quot;atomically&quot; manipulate a resource. By <i>atomic</i>, I mean that the code knows that no other thread will access the resource. Of course, the system can still preempt your thread and schedule other threads. However, it will not schedule any other threads that want to access the same resource until your thread leaves the critical section.</p>
<p>Here is some problematic code that demonstrates what happens without the use of a critical section:</p>
<pre>const int COUNT = 1000;
int g_nSum = 0;

DWORD WINAPI FirstThread(PVOID pvParam) {
   g_nSum = 0;
   for (int n = 1; n &lt;= COUNT; n++) {
      g_nSum += n;
   }
   return(g_nSum);
}

DWORD WINAPI SecondThread(PVOID pvParam) {
   g_nSum = 0;
   for (int n = 1; n &lt;= COUNT; n++) {
      g_nSum += n;
   }
   return(g_nSum);
}</pre>
<p>Let&#8217;s correct the code using a critical section:</p>
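<pre>const int COUNT = 1000;
int g_nSum = 0;
CRITICAL_SECTION g_cs;   // Call InitializeCriticalSection before starting the threads

DWORD WINAPI FirstThread(PVOID pvParam) {
   EnterCriticalSection(&amp;g_cs);
   g_nSum = 0;
   for (int n = 1; n &lt;= COUNT; n++) {
      g_nSum += n;
   }
   int nResult = g_nSum;   // Copy the result while we still own the critical section
   LeaveCriticalSection(&amp;g_cs);
   return(nResult);
}

DWORD WINAPI SecondThread(PVOID pvParam) {
   EnterCriticalSection(&amp;g_cs);
   g_nSum = 0;
   for (int n = 1; n &lt;= COUNT; n++) {
      g_nSum += n;
   }
   int nResult = g_nSum;   // Copy the result while we still own the critical section
   LeaveCriticalSection(&amp;g_cs);
   return(nResult);
}</pre>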
<p>The great thing about critical sections is that they are easy to use and they use the interlocked functions internally, so they execute quickly. The major disadvantage of critical sections is that you cannot use them to synchronize threads in multiple processes.</p>
<p><u>To use critical sections:</u></p>
<ul>
<li>All threads that want to access the resource must know the address of the <b>CRITICAL_SECTION</b> structure that protects the resource. </li>
<li>The members within the <b>CRITICAL_SECTION</b> structure must be initialized before any threads attempt to access the protected resource. The structure is initialized via a call to VOID <strong>InitializeCriticalSection</strong>(PCRITICAL_SECTION pcs); </li>
<li>When you know that your process&#8217; threads will no longer attempt to access the shared resource, you should clean up the <b>CRITICAL_SECTION</b> structure by calling this function: VOID <strong>DeleteCriticalSection</strong>(PCRITICAL_SECTION pcs); </li>
<li>When you write code that touches a shared resource, you must prefix that code with a call to: VOID <strong>EnterCriticalSection</strong>(PCRITICAL_SECTION pcs); </li>
<li>At the end of your code that touches the shared resource, you must call this function: VOID <strong>LeaveCriticalSection</strong>(PCRITICAL_SECTION pcs); </li>
</ul>
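<p>The enter/leave discipline in the steps above can be sketched with a scoped guard so that the lock is always released, even on an early return. This is a portable standard C++ analog under stated assumptions: <b>std::mutex</b> takes the place of the <b>CRITICAL_SECTION</b>, <b>std::lock_guard</b> performs the enter and leave calls, and the function names are hypothetical.</p>

```cpp
#include <cassert>
#include <mutex>
#include <thread>

const int COUNT = 1000;
int g_nSum = 0;
std::mutex g_lock;  // plays the role of CRITICAL_SECTION g_cs

// Computes 1 + 2 + ... + COUNT in the shared variable while holding the lock.
int SumUnderLock() {
    // The constructor locks (EnterCriticalSection); the destructor
    // unlocks (LeaveCriticalSection) when the function returns.
    std::lock_guard<std::mutex> guard(g_lock);
    g_nSum = 0;
    for (int n = 1; n <= COUNT; n++) {
        g_nSum += n;
    }
    return g_nSum;  // the value is copied before the guard releases the lock
}

// Runs the same work on two threads and checks that neither result was
// disturbed by the other thread.
bool BothThreadsSeeFullSum() {
    int r1 = 0;
    int r2 = 0;
    std::thread t1([&r1] { r1 = SumUnderLock(); });
    std::thread t2([&r2] { r2 = SumUnderLock(); });
    t1.join();
    t2.join();
    const int expected = COUNT * (COUNT + 1) / 2;
    assert(expected == 500500);
    return r1 == expected && r2 == expected;
}
```

<p>Unlike a critical section, a <b>std::mutex</b> needs no explicit initialize or delete call; its constructor and destructor handle that.</p>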
<h2>Critical Sections and Spin Locks</h2>
<p>When a thread attempts to enter a critical section owned by another thread, the calling thread is placed immediately into a wait state. This means that the thread must transition from user mode to kernel mode (about 1000 CPU cycles). This transition is very expensive. On a multiprocessor machine, the thread that currently owns the resource might execute on a different processor and might relinquish control of the resource shortly. In fact, the thread that owns the resource might release it before the other thread has completed executing its transition into kernel mode. If this happens, a lot of CPU time is wasted.</p>
<p>To improve the performance of critical sections, Microsoft has incorporated spinlocks into them. So when <b>EnterCriticalSection</b> is called, it loops using a spinlock to try to acquire the resource some number of times. Only if all the attempts fail does the thread transition to kernel mode to enter a wait state.</p>
<p>To use a spinlock with a critical section, you should initialize the critical section by calling this function:</p>
<pre>BOOL InitializeCriticalSectionAndSpinCount(
   PCRITICAL_SECTION pcs,
   DWORD dwSpinCount);</pre>
<h1>Slim Reader-Writer Locks</h1>
<p>An <b>SRWLock</b> has the same purpose as a simple critical section: to protect a single resource against access made by different threads. However, unlike a critical section, an <b>SRWLock</b> allows you to distinguish between threads that simply want to read the value of the resource (the readers) and other threads that are trying to update this value (the writers). It should be possible for all readers to access the shared resource at the same time because there is no risk of data corruption if you only read the value of a resource. The need for synchronization begins when a writer thread wants to update the resource. In that case, the access should be exclusive: no other thread, neither a reader nor a writer, should be allowed to access the resource. This is exactly what an <b>SRWLock</b> allows you to do in your code and in a very explicit way.</p>
<table border="0" cellspacing="0" cellpadding="2" width="401">
<tbody>
<tr>
<td valign="top" width="145">&nbsp;</td>
<td valign="top" width="254" colspan="2" align="center"><strong>Current SRW owner</strong></td>
</tr>
<tr>
<td valign="top" width="145"><strong>Requesting thread</strong></td>
<td valign="top" width="138" align="center"><font color="#000000"><strong>Reader</strong></font></td>
<td valign="top" width="116" align="center"><font color="#000000"><strong>Writer</strong></font></td>
</tr>
<tr>
<td valign="top" width="145" align="right"><strong><font color="#000000">Reader</font></strong></td>
<td valign="top" width="138" align="center"><font color="#00d000">Allow</font></td>
<td valign="top" width="116" align="center"><font color="#ff0000">Block</font></td>
</tr>
<tr>
<td valign="top" width="145" align="right"><strong><font color="#000000">Writer</font></strong></td>
<td valign="top" width="138" align="center"><font color="#ff0000">Block</font></td>
<td valign="top" width="116" align="center"><font color="#ff0000">Block</font></td>
</tr>
</tbody>
</table>
<p>&#160;</p>
<p>As the table shows, SRWLocks are well suited to resources that have more readers than writers.</p>
<p>This article is a very good introduction to SRWLocks: <a href="http://blogs.msdn.com/b/matt_pietrek/archive/2006/10/19/slim-reader-writer-locks.aspx">http://blogs.msdn.com/b/matt_pietrek/archive/2006/10/19/slim-reader-writer-locks.aspx</a></p>
<p><u>To use <strong>SRWLocks</strong>:</u></p>
<ol>
<li>First, you allocate an <b>SRWLOCK</b> structure and initialize it with the <b>InitializeSRWLock</b> function: VOID <strong>InitializeSRWLock</strong>(PSRWLOCK SRWLock); </li>
<li><u>For writers:</u>
<ol>
<li>A writer thread acquires exclusive access to the resource protected by the <b>SRWLock</b> by calling <b>AcquireSRWLockExclusive</b> with the address of the <b>SRWLOCK</b> object as its parameter: VOID <strong>AcquireSRWLockExclusive</strong>(PSRWLOCK SRWLock); </li>
<li>When the resource has been updated, the lock is released by calling <b>ReleaseSRWLockExclusive</b> with the address of the <b>SRWLOCK</b> object as its parameter: VOID <strong>ReleaseSRWLockExclusive</strong>(PSRWLOCK SRWLock); </li>
</ol>
</li>
<li><u>For readers:</u>
<ol>
<li>The same two-step scenario occurs, but with these two functions, which acquire and release the lock in shared mode: VOID <strong>AcquireSRWLockShared</strong>(PSRWLOCK SRWLock); VOID <strong>ReleaseSRWLockShared</strong>(PSRWLOCK SRWLock); </li>
</ol>
</li>
</ol>
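<p>The reader/writer split above can be sketched in portable C++17. This is an analog, not the Win32 API: <b>std::shared_mutex</b> is assumed in place of the <b>SRWLOCK</b>, a <b>std::shared_lock</b> corresponds to the shared acquire/release pair and a <b>std::unique_lock</b> to the exclusive pair. The names <b>SharedValue</b> and <b>RunReadersAndWriters</b> are made up for this example.</p>

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

class SharedValue {
public:
    // Readers: many threads may hold the lock in shared mode at once.
    int Read() const {
        std::shared_lock<std::shared_mutex> lock(m_lock);
        return m_value;
    }

    // Writers: exclusive mode blocks every other reader and writer.
    void Add(int delta) {
        std::unique_lock<std::shared_mutex> lock(m_lock);
        m_value += delta;
    }

private:
    mutable std::shared_mutex m_lock;
    int m_value = 0;
};

// Starts writer threads that each add 1 `addsPerWriter` times and reader
// threads that poll the value. Returns the final value.
int RunReadersAndWriters(int writers, int readers, int addsPerWriter) {
    assert(writers >= 0 && readers >= 0 && addsPerWriter >= 0);
    SharedValue value;
    std::vector<std::thread> threads;
    for (int w = 0; w < writers; ++w) {
        threads.emplace_back([&value, addsPerWriter] {
            for (int i = 0; i < addsPerWriter; ++i) value.Add(1);
        });
    }
    for (int r = 0; r < readers; ++r) {
        threads.emplace_back([&value, addsPerWriter] {
            for (int i = 0; i < addsPerWriter; ++i) {
                int seen = value.Read();
                assert(seen >= 0);  // a reader never sees a torn value
                (void)seen;
            }
        });
    }
    for (auto& th : threads) th.join();
    return value.Read();
}
```

<p>Readers never block one another here, so adding reader threads costs little; every call to <b>Add</b>, however, briefly stops all of them.</p>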
<p>If you want to get the best performance in an application, you should try to use nonshared data first and then, in order of preference, volatile reads, volatile writes, interlocked APIs, <b>SRWLocks</b>, and critical sections. If none of these work for your situation, then and only then, use kernel objects.</p>
<h1>Condition Variables</h1>
<p>You have seen that an <b>SRWLock</b> is used when you want to allow producer and consumer threads access to the same resource either in exclusive or shared mode. In these kinds of situations, if there is nothing to consume for a reader thread, it should release the lock and wait until there is something new produced by a writer thread. If the data structure used to receive the items produced by a writer thread becomes full, the lock should also be released and the writer thread put to sleep until reader threads have emptied the data structure.</p>
<p>Condition variables are used when a thread has to atomically release a lock on a resource and block until a condition is met. The thread does this by calling <b>SleepConditionVariableCS</b> or <b>SleepConditionVariableSRW</b>.</p>
<p>A thread blocked inside these <b>Sleep*</b> functions is awakened when <b>WakeConditionVariable</b> or <b>WakeAllConditionVariable</b> is called by another thread that detects that the right condition is satisfied, such as the presence of an element to consume for a reader thread or enough room to <a name="457"></a><a name="IDX-2281C4AB577-C32E-4A16-BF0F-07F3124E103E"></a>insert a produced element for a writer thread.</p>
<p>This article solves the well known consumer/producer problem using condition variables with critical sections.</p>
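<p>The producer/consumer hand-off described above can be sketched with a bounded queue. This is a portable standard C++ analog under stated assumptions: <b>std::condition_variable::wait</b> stands in for the <b>Sleep*</b> functions (it releases the lock while the thread sleeps and reacquires it before returning), <b>notify_one</b> stands in for <b>WakeConditionVariable</b>, and the names <b>BoundedQueue</b> and <b>ProduceAndConsume</b> are hypothetical.</p>

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>

class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : m_capacity(capacity) {
        assert(capacity > 0);
    }

    // Writer side: sleeps while the queue is full.
    void Push(int item) {
        std::unique_lock<std::mutex> lock(m_lock);
        m_notFull.wait(lock, [this] { return m_items.size() < m_capacity; });
        m_items.push(item);
        m_notEmpty.notify_one();  // there is now something to consume
    }

    // Reader side: sleeps while the queue is empty.
    int Pop() {
        std::unique_lock<std::mutex> lock(m_lock);
        m_notEmpty.wait(lock, [this] { return !m_items.empty(); });
        int item = m_items.front();
        m_items.pop();
        m_notFull.notify_one();   // there is now room for a new item
        return item;
    }

private:
    std::mutex m_lock;
    std::condition_variable m_notEmpty;
    std::condition_variable m_notFull;
    std::queue<int> m_items;
    const std::size_t m_capacity;
};

// One producer pushes 1..itemCount, one consumer pops the same number of
// items and adds them up. Returns the consumer's sum.
long ProduceAndConsume(int itemCount, std::size_t capacity) {
    BoundedQueue queue(capacity);
    long sum = 0;
    std::thread producer([&queue, itemCount] {
        for (int i = 1; i <= itemCount; ++i) queue.Push(i);
    });
    std::thread consumer([&queue, &sum, itemCount] {
        for (int i = 0; i < itemCount; ++i) sum += queue.Pop();
    });
    producer.join();
    consumer.join();
    return sum;
}
```

<p>Each wait rechecks its condition after waking, which guards against wakeups that arrive when the condition is not actually satisfied.</p>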
<h1>API Table</h1>
<table border="1" cellspacing="2" cellpadding="5" width="915">
<tbody>
<tr>
<td valign="top" width="563">
<p align="center"><strong>Function</strong></p>
</td>
<td valign="top" width="344">
<p align="center"><strong>Description</strong></p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>LONG <strong>InterlockedExchangeAdd</strong>(</pre>
<pre>&#160;&#160; PLONG volatile plAddend,</pre>
<pre>&#160;&#160; LONG lIncrement);</pre>
<pre>LONGLONG InterlockedExchangeAdd64(</pre>
<pre>&#160;&#160; PLONGLONG volatile pllAddend,</pre>
<pre>&#160;&#160; LONGLONG llIncrement);</pre>
</td>
<td valign="top" width="344">
<p>Atomically adds <b>lIncrement</b> to the 32-bit value at the address in <b>plAddend</b> and returns the original value.</p>
<p>To operate on 64-bit values, use <b>InterlockedExchangeAdd64</b>.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>void * <strong>_aligned_malloc</strong>(size_t size, size_t alignment);</pre>
</td>
<td valign="top" width="344">
<p>Used to allocate a block of memory that is properly aligned.</p>
<p>The <b>size</b> argument identifies the number of bytes you want to allocate, and the <b>alignment</b> argument indicates the byte boundary that you want the block aligned on. The value you pass for the <b>alignment</b> argument must be an integer power of 2.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>LONG <strong>InterlockedExchange</strong>(</pre>
<pre>&#160;&#160; PLONG volatile plTarget,</pre>
<pre>&#160;&#160; LONG lValue);</pre>
<pre><a name="429"></a><a name="pg2111C4AB577-C32E-4A16-BF0F-07F3124E103"></a>LONGLONG <strong>InterlockedExchange64</strong>(</pre>
<pre>&#160;&#160; PLONGLONG volatile plTarget,</pre>
<pre>&#160;&#160; LONGLONG lValue);</pre>
<pre>PVOID <strong>InterlockedExchangePointer</strong>(</pre>
<pre>&#160;&#160; PVOID* volatile ppvTarget,</pre>
<pre>&#160;&#160; PVOID pvValue);</pre>
</td>
<td valign="top" width="344">
<p>Replaces the value whose address is passed in the first parameter with the value passed in the second parameter.</p>
<p>For a 32-bit application, <b>InterlockedExchange</b> and <b>InterlockedExchangePointer</b> both replace a 32-bit value with another 32-bit value. But for a 64-bit application, <b>InterlockedExchange</b> replaces a 32-bit value while <b>InterlockedExchangePointer</b> replaces a 64-bit value. <b>InterlockedExchange64</b> always works on 64-bit values. All of these functions return the original value.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>LONG <strong>InterlockedCompareExchange</strong>(</pre>
<pre>&#160;&#160; PLONG plDestination,</pre>
<pre>&#160;&#160; LONG lExchange,</pre>
<pre>&#160;&#160; LONG lComparand);</pre>
<pre>PVOID <strong>InterlockedCompareExchangePointer</strong>(</pre>
<pre>&#160;&#160; PVOID* ppvDestination,</pre>
<pre>&#160;&#160; PVOID pvExchange,</pre>
<pre>&#160;&#160; PVOID pvComparand);</pre>
</td>
<td valign="top" width="344">
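<p>These two functions perform an atomic test-and-set operation: for a 32-bit application, both functions operate on 32-bit values, but in a 64-bit application, <b>InterlockedCompareExchange</b> operates on 32-bit values while <b>InterlockedCompareExchangePointer</b> operates on 64-bit values. The function compares the current value at the destination address with the comparand. If the two are equal, the destination is changed to the exchange value; otherwise it is left unchanged. In both cases the function returns the original value of the destination.</p>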
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>LONG <strong>InterlockedIncrement</strong>(PLONG plAddend);</pre>
<pre>&#160;</pre>
<pre>LONG <strong>InterlockedDecrement</strong>(PLONG plAddend);</pre>
</td>
<td valign="top" width="344">
<p>These two functions atomically increment or decrement a 32-bit value by 1 and return the resulting value.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>VOID <strong>InitializeCriticalSection</strong>(PCRITICAL_SECTION pcs);</pre>
</td>
<td valign="top" width="344">
<p>This function initializes the members of a <b>CRITICAL_SECTION</b> structure (pointed to by <b>pcs</b>).</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>VOID <strong>DeleteCriticalSection</strong>(PCRITICAL_SECTION pcs);</pre>
</td>
<td valign="top" width="344">
<p>Resets the member variables inside the structure. Naturally, you should not delete a critical section if any threads are still using it.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>VOID <strong>EnterCriticalSection</strong>(PCRITICAL_SECTION pcs);</pre>
</td>
<td valign="top" width="344">
<p>When you write code that touches a shared resource, you should prefix that code with a call to this function.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>BOOL <strong>TryEnterCriticalSection</strong>(PCRITICAL_SECTION pcs);</pre>
</td>
<td valign="top" width="344">
<p><b>TryEnterCriticalSection</b> never allows the calling thread to enter a wait state. Instead, its return value indicates whether the calling thread was able to gain access to the resource. So if <b>TryEnterCriticalSection</b> sees that the resource is being accessed by another thread, it returns <b>FALSE</b>. In all other cases, it returns <b>TRUE</b>.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>VOID <strong>LeaveCriticalSection</strong>(PCRITICAL_SECTION pcs);</pre>
</td>
<td valign="top" width="344">
<p>Call this function at the end of your code that touches the shared resource.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>BOOL <strong>InitializeCriticalSectionAndSpinCount</strong>(</pre>
<pre>&#160;&#160; PCRITICAL_SECTION pcs,</pre>
<pre>&#160;&#160; DWORD dwSpinCount);</pre>
</td>
<td valign="top" width="344">
<p>Initializes a critical section and sets its spin count, so that a thread spins for a while before entering a wait state.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>DWORD <strong>SetCriticalSectionSpinCount</strong>(</pre>
<pre>&#160;&#160; PCRITICAL_SECTION pcs,</pre>
<pre>&#160;&#160; DWORD dwSpinCount);</pre>
</td>
<td valign="top" width="344">
<p>Changes a critical section&#8217;s spin count.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>BOOL <strong>SleepConditionVariableCS</strong>(</pre>
<pre>&#160;&#160; PCONDITION_VARIABLE pConditionVariable,</pre>
<pre>&#160;&#160; PCRITICAL_SECTION pCriticalSection,</pre>
<pre>&#160;&#160; DWORD dwMilliseconds);</pre>
</td>
<td valign="top" width="344">
<p>Sleeps on the specified condition variable and releases the specified critical section as an atomic operation.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>BOOL <strong>SleepConditionVariableSRW</strong>(</pre>
<pre>&#160;&#160; PCONDITION_VARIABLE pConditionVariable,</pre>
<pre>&#160;&#160; PSRWLOCK pSRWLock,</pre>
<pre>&#160;&#160; DWORD dwMilliseconds,</pre>
<pre>&#160;&#160; ULONG Flags);</pre>
</td>
<td valign="top" width="344">
<p>Sleeps on the specified condition variable and releases the specified SRW lock as an atomic operation.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>VOID <strong>WakeConditionVariable</strong>(</pre>
<pre>&#160;&#160; PCONDITION_VARIABLE ConditionVariable);</pre>
</td>
<td valign="top" width="344">
<p>Wakes a single thread waiting on the specified condition variable.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>VOID <strong>WakeAllConditionVariable</strong>(</pre>
<pre>&#160;&#160; PCONDITION_VARIABLE ConditionVariable);</pre>
</td>
<td valign="top" width="344">
<p>Wakes all threads waiting on the specified condition variable.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>VOID <strong>InitializeSRWLock</strong>(PSRWLOCK SRWLock);</pre>
</td>
<td valign="top" width="344">
<p>Initialize an SRW lock.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>VOID <strong>AcquireSRWLockExclusive</strong>(PSRWLOCK SRWLock);</pre>
</td>
<td valign="top" width="344">
<p>Acquires an SRW lock in exclusive mode.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>VOID <strong>ReleaseSRWLockExclusive</strong>(PSRWLOCK SRWLock);</pre>
</td>
<td valign="top" width="344">
<p>Releases an SRW lock that was acquired in exclusive mode.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>VOID <strong>AcquireSRWLockShared</strong>(PSRWLOCK SRWLock);</pre>
</td>
<td valign="top" width="344">
<p>Acquires an SRW lock in shared mode.</p>
</td>
</tr>
<tr>
<td valign="top" width="563">
<pre>VOID <strong>ReleaseSRWLockShared</strong>(PSRWLOCK SRWLock);</pre>
</td>
<td valign="top" width="344">
<p>Releases an SRW lock that was acquired in shared mode.</p>
</td>
</tr>
</tbody>
</table>
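<p>The compare-and-exchange behavior listed in the table can be sketched in portable C++. This is only an illustration under stated assumptions: it uses <b>std::atomic</b> rather than the Windows <b>Interlocked</b> functions, and the names <b>compareExchange</b> and <b>SpinLock</b> are hypothetical.</p>

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper that mirrors the semantics of InterlockedCompareExchange
// using std::atomic (an assumption for portability, not the Windows API).
// If *destination equals comparand, exchange is stored; in every case the
// value that was in *destination before the call is returned.
long compareExchange(std::atomic<long>* destination, long exchange, long comparand) {
    assert(destination != nullptr);
    long expected = comparand;
    // On success 'expected' is unchanged and equals the old value (comparand);
    // on failure compare_exchange_strong writes the current value into it.
    destination->compare_exchange_strong(expected, exchange);
    return expected;
}

// A minimal spin lock built on the helper: 0 means free, 1 means owned.
class SpinLock {
public:
    void lock() {
        // Keep trying until we observe 0 and replace it with 1.
        while (compareExchange(&state_, 1, 0) != 0) {
            // busy-wait
        }
    }
    bool tryLock() { return compareExchange(&state_, 1, 0) == 0; }
    void unlock() { state_.store(0); }

private:
    std::atomic<long> state_{0};
};
```

<p>A thread would call <b>lock</b> before touching the shared resource and <b>unlock</b> afterwards; <b>tryLock</b> plays the same role as <b>TryEnterCriticalSection</b> in that it never waits.</p>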
<p>&#160;</p>
<h1>References</h1>
<blockquote>
<p><a href="http://www.microsoft.com/learning/en/us/book.aspx?ID=11241&amp;locale=en-us">Windows® via C/C++, Fifth Edition</a></p>
</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://mhesham.wordpress.com/2011/08/15/thread-synchronization-in-user-mode/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebdc8c831f2ae1f6a876629e325c99f6?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mhesham</media:title>
		</media:content>
	</item>
		<item>
		<title>Changing Thread Path of Execution</title>
		<link>http://mhesham.wordpress.com/2011/08/11/changing-thread-path-of-execution/</link>
		<comments>http://mhesham.wordpress.com/2011/08/11/changing-thread-path-of-execution/#comments</comments>
		<pubDate>Thu, 11 Aug 2011 16:31:50 +0000</pubDate>
		<dc:creator>MHesham</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Operating Systems]]></category>
		<category><![CDATA[Windows Programming]]></category>
		<category><![CDATA[context]]></category>
		<category><![CDATA[thread context switching]]></category>
		<category><![CDATA[threads]]></category>
		<category><![CDATA[winapi]]></category>
		<category><![CDATA[windows programming]]></category>

		<guid isPermaLink="false">https://mhesham.wordpress.com/2011/08/11/changing-thread-path-of-execution/</guid>
		<description><![CDATA[Every thread has a context structure, which is maintained inside the thread&#8217;s kernel object. This context structure reflects the state of the thread&#8217;s CPU registers when the thread was last executing. Every 20 milliseconds or so (as returned by the second parameter of the GetSystemTimeAdjustment function), Windows looks at all the thread kernel objects currently [...]]]></description>
			<content:encoded><![CDATA[<p>Every thread has a context structure, which is maintained inside the thread&#8217;s kernel object. This context structure reflects the state of the thread&#8217;s CPU registers when the thread was last executing.</p>
<p>Every 20 milliseconds or so (as returned by the second parameter of the <b>GetSystemTimeAdjustment</b> function), Windows looks at all the thread kernel objects currently in existence. Of these objects, only some are considered schedulable. Windows selects one of the schedulable thread kernel objects and loads the CPU&#8217;s registers with the values that were last saved in the thread&#8217;s context. This action is called a <i>context switch</i>. </p>
<p>In the code below, the primary thread (the main function) creates a new thread whose entry point is ThreadFunc1. While that secondary thread is running, the primary thread suspends it and changes its path of execution to the address of another function, ThreadFunc2.</p>
<h1>Code</h1>
<pre>DWORD WINAPI ThreadFunc1(PVOID pvParam)
{
    _tprintf_s(_T(&quot;I am ThreadFunc1\n&quot;));
    // Spin forever so the primary thread has time to redirect this thread.
    while(1)
    {
    }
    // Never reached: the loop above does not exit.
    _tprintf_s(_T(&quot;Exiting ThreadFunc1\n&quot;));

    return 0;
}

DWORD WINAPI ThreadFunc2(PVOID pvParam)
{
    _tprintf_s(_T(&quot;I am ThreadFunc2\n&quot;));
    while(1)
    {
    }
    // Never reached: the loop above does not exit.
    _tprintf_s(_T(&quot;Exiting ThreadFunc2\n&quot;));

    return 0;
}

int _tmain(int argc, TCHAR* argv[])
{
    // create a new thread with ThreadFunc1 as its entry-point
    HANDLE hThread = chBEGINTHREADEX(NULL, 0, ThreadFunc1, NULL, 0, NULL);

    if(!hThread)
    {
        PrintLastError();
        return 1;   // nothing to redirect if the thread was not created
    }

    // lets give the thread some time to do some work
    Sleep(2000);

    SuspendThread(hThread);

    CONTEXT cThread;
    // get control registers such as EIP (instruction pointer)
    cThread.ContextFlags = CONTEXT_CONTROL;
    GetThreadContext(hThread, &amp;cThread);

    // change the target thread path of execution to ThreadFunc2
    // (Eip exists only in the 32-bit x86 CONTEXT, so this is an x86 build)
    cThread.Eip = (DWORD)ThreadFunc2;
    SetThreadContext(hThread, &amp;cThread);

    ResumeThread(hThread);

    WaitForSingleObject(hThread, INFINITE);
    CloseHandle(hThread);

    return 0;
}</pre>
<h1>Output</h1>
<p><a href="http://mhesham.files.wordpress.com/2011/08/image5.png"><img style="background-image:none;border-bottom:0;border-left:0;padding-left:0;padding-right:0;display:inline;border-top:0;border-right:0;padding-top:0;margin:5px;" title="image" border="0" alt="image" src="http://mhesham.files.wordpress.com/2011/08/image_thumb5.png?w=862&#038;h=572" width="862" height="572" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://mhesham.wordpress.com/2011/08/11/changing-thread-path-of-execution/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebdc8c831f2ae1f6a876629e325c99f6?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mhesham</media:title>
		</media:content>

		<media:content url="http://mhesham.files.wordpress.com/2011/08/image_thumb5.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>
	</item>
		<item>
		<title>Thread Scheduling, Priorities and Affinities</title>
		<link>http://mhesham.wordpress.com/2011/08/11/thread-scheduling-priorities-and-affinities/</link>
		<comments>http://mhesham.wordpress.com/2011/08/11/thread-scheduling-priorities-and-affinities/#comments</comments>
		<pubDate>Thu, 11 Aug 2011 18:35:15 +0000</pubDate>
		<dc:creator>MHesham</dc:creator>
				<category><![CDATA[Book Snapshot]]></category>
		<category><![CDATA[Operating Systems]]></category>
		<category><![CDATA[Windows Programming]]></category>
		<category><![CDATA[Windows via C/C++]]></category>
		<category><![CDATA[affinities]]></category>
		<category><![CDATA[context]]></category>
		<category><![CDATA[priorities]]></category>
		<category><![CDATA[process]]></category>
		<category><![CDATA[scheduling]]></category>
		<category><![CDATA[thread context switching]]></category>
		<category><![CDATA[threads]]></category>
		<category><![CDATA[winapi]]></category>
		<category><![CDATA[windows programming]]></category>

		<guid isPermaLink="false">https://mhesham.wordpress.com/2011/08/11/thread-scheduling-priorities-and-affinities/</guid>
		<description><![CDATA[Windows is called a preemptive multithreaded operating system because a thread can be stopped at any time and another thread can be scheduled. Creating a thread in the suspended state allows you to alter the thread&#8217;s environment before the thread has a chance to execute any code. Once you alter [...]]]></description>
			<content:encoded><![CDATA[<p>Windows is called a <em>preemptive multithreaded operating system</em> because a thread can be stopped at any time and another thread can be scheduled.</p>
<h1>Suspending and Resuming a Thread</h1>
<p>Creating a thread in the suspended state allows you to alter the thread&#8217;s environment before the thread has a chance to execute any code. Once you alter the thread&#8217;s environment, you must make the thread schedulable. You do this by calling <strong>ResumeThread</strong> and passing it the thread handle returned by the call to <strong>CreateThread</strong> (or the thread handle from the structure pointed to by the <strong>ppiProcInfo</strong> parameter passed to <strong>CreateProcess</strong>):</p>
<pre>DWORD ResumeThread(HANDLE hThread);</pre>
<p>A single thread can be suspended several times. If a thread is suspended three times, it must be resumed three times before it is eligible for assignment to a CPU. In addition to using the <strong>CREATE_SUSPENDED</strong> flag when you create a thread, you can suspend a thread by calling <strong>SuspendThread</strong>:</p>
<pre>DWORD SuspendThread(HANDLE hThread);</pre>
<p><strong>SuspendThread</strong> is asynchronous with respect to kernel-mode execution, but user-mode execution does not occur until the thread is resumed.</p>
<p>In real life, an application must be careful when it calls <strong>SuspendThread</strong> because you have no idea what the thread might be doing when you attempt to suspend it.</p>
<p><strong>SuspendThread</strong> is safe only if you know exactly what the target thread is (or might be) doing and you take extreme measures to avoid problems or deadlocks caused by suspending the thread.</p>
<p><strong><span style="text-decoration:underline;">Note:</span></strong> The concept of suspending or resuming a <em>process</em> doesn&#8217;t exist for Windows because processes are never scheduled CPU time.</p>
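<p>The suspend count described above can be sketched as a small portable model. This is an illustration only, not the Windows API: the names <strong>SuspendCountModel</strong>, <strong>suspend</strong>, <strong>resume</strong> and <strong>isSchedulable</strong> are hypothetical. The return values follow <strong>SuspendThread</strong> and <strong>ResumeThread</strong>, which both return the thread&#8217;s previous suspend count.</p>

```cpp
#include <cassert>

// Toy model of a thread's suspend count (not the Windows API).
class SuspendCountModel {
public:
    // Like SuspendThread: increments the count, returns the previous count.
    unsigned suspend() { return count_++; }

    // Like ResumeThread: decrements the count if it is above zero and
    // returns the previous count.
    unsigned resume() {
        unsigned previous = count_;
        if (count_ > 0) --count_;
        return previous;
    }

    // The thread is eligible for a CPU only when the count is back to zero.
    bool isSchedulable() const { return count_ == 0; }

private:
    unsigned count_ = 0;
};

// Usage: a thread suspended three times needs three resumes.
inline void suspendCountExample() {
    SuspendCountModel thread;
    thread.suspend();
    thread.suspend();
    thread.suspend();
    thread.resume();
    thread.resume();
    assert(!thread.isSchedulable());  // still suspended once
    thread.resume();
    assert(thread.isSchedulable());   // count is back to zero
}
```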
<h1>Sleeping</h1>
<p>A thread can also tell the system that it does not want to be schedulable for a certain amount of time. This is accomplished by calling <strong>Sleep</strong>:</p>
<pre>VOID Sleep(DWORD dwMilliseconds);</pre>
<p><span style="text-decoration:underline;">There are a few important things to notice about <strong>Sleep</strong>:</span></p>
<ul>
<li>Calling <strong>Sleep</strong> allows the thread to voluntarily give up the remainder of its time slice.</li>
<li>The system makes the thread not schedulable for <em>approximately</em> the number of milliseconds specified. That&#8217;s right—if you tell the system you want to sleep for 100 milliseconds, you will sleep approximately that long, but possibly several seconds or minutes more.</li>
<li>You can call <strong>Sleep</strong> and pass <strong>INFINITE</strong> for the <strong>dwMilliseconds</strong> parameter. This tells the system to never schedule the thread.</li>
<li>You can pass <strong>0</strong> to <strong>Sleep</strong>. This tells the system that the calling thread relinquishes the remainder of its time slice, and it forces the system to schedule another thread.</li>
</ul>
<p>Windows is not a real-time operating system. Your thread will probably wake up at the right time, but whether it does depends on what else is going on in the system.</p>
<h1>Switching to Another Thread</h1>
<p>The system offers a function called <strong>SwitchToThread</strong> that allows another schedulable thread to run if one exists:</p>
<pre>BOOL SwitchToThread();</pre>
<p>When you call this function, the system checks to see whether there is a thread that is being starved of CPU time. If no thread is starving, <strong>SwitchToThread</strong> returns immediately. If there is a starving thread, <strong>SwitchToThread</strong> schedules that thread (which might have a lower priority than the thread calling <strong>SwitchToThread</strong>). The starving thread is allowed to run for one time quantum and then the system scheduler operates as usual.</p>
<p>This function allows a thread that wants a resource to force a lower-priority thread that might currently own the resource to relinquish the resource. If no other thread can run when <strong>SwitchToThread</strong> is called, the function returns <strong>FALSE</strong>; otherwise, it returns a nonzero value.</p>
<h1>A Thread&#8217;s Execution Times</h1>
<p>Sometimes you want to time how long it takes a thread to perform a particular task. What many people do is write code similar to the following, taking advantage of the new <strong>GetTickCount64</strong> function:</p>
<pre>// Get the current time (start time).
ULONGLONG qwStartTime = GetTickCount64();

// Perform complex algorithm here.

// Subtract start time from current time to get duration.
ULONGLONG qwElapsedTime = GetTickCount64() - qwStartTime;</pre>
<p>This code makes a simple assumption: it won&#8217;t be interrupted. However, in a preemptive operating system, you never know when your thread will be scheduled CPU time. When CPU time is taken away from your thread, it becomes more difficult to time how long it takes your thread to perform various tasks. What we need is a function that returns the amount of CPU time that the thread has received. Fortunately, even before Windows Vista, the operating system offered a function called <strong>GetThreadTimes</strong> that returns this information:</p>
<pre>BOOL GetThreadTimes(
   HANDLE hThread,
   PFILETIME pftCreationTime,
   PFILETIME pftExitTime,
   PFILETIME pftKernelTime,
   PFILETIME pftUserTime);</pre>
<p>Using this function, you can determine the amount of time needed to execute a complex algorithm by using code such as the following.</p>
<pre>__int64 FileTimeToQuadWord (PFILETIME pft) {
   return(Int64ShllMod32(pft-&gt;dwHighDateTime, 32) | pft-&gt;dwLowDateTime);
}

void PerformLongOperation () {

   FILETIME ftKernelTimeStart, ftKernelTimeEnd;
   FILETIME ftUserTimeStart,   ftUserTimeEnd;
   FILETIME ftDummy;
   __int64 qwKernelTimeElapsed, qwUserTimeElapsed,
      qwTotalTimeElapsed;

   // Get starting times.
   GetThreadTimes(GetCurrentThread(), &amp;ftDummy, &amp;ftDummy,
      &amp;ftKernelTimeStart, &amp;ftUserTimeStart);

   // Perform complex algorithm here.

   // Get ending times.
   GetThreadTimes(GetCurrentThread(), &amp;ftDummy, &amp;ftDummy,
      &amp;ftKernelTimeEnd, &amp;ftUserTimeEnd);

   // Get the elapsed kernel and user times by converting the start
   // and end times from FILETIMEs to quad words, and then subtract
   // the start times from the end times.
   qwKernelTimeElapsed = FileTimeToQuadWord(&amp;ftKernelTimeEnd) -
      FileTimeToQuadWord(&amp;ftKernelTimeStart);

   qwUserTimeElapsed = FileTimeToQuadWord(&amp;ftUserTimeEnd) -
      FileTimeToQuadWord(&amp;ftUserTimeStart);

   // Get total time duration by adding the kernel and user times.
   qwTotalTimeElapsed = qwKernelTimeElapsed + qwUserTimeElapsed;

   // The total elapsed time is in qwTotalTimeElapsed.
}</pre>
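<p>The conversion step above can be tried without the Windows headers. The sketch below is an assumption-laden stand-in: <strong>FileTimeParts</strong>, <strong>toQuadWord</strong> and <strong>elapsedMilliseconds</strong> are hypothetical names, and the struct only imitates the two 32-bit halves of a <strong>FILETIME</strong>. It relies on the fact that these times are counted in 100-nanosecond intervals.</p>

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the two 32-bit halves of a Windows FILETIME (hypothetical
// type so the sketch compiles on any platform).
struct FileTimeParts {
    std::uint32_t lowDateTime;
    std::uint32_t highDateTime;
};

// Combine the halves into one 64-bit count of 100-nanosecond intervals.
std::uint64_t toQuadWord(const FileTimeParts& ft) {
    return (static_cast<std::uint64_t>(ft.highDateTime) << 32) | ft.lowDateTime;
}

// Elapsed time between two readings, in whole milliseconds.
// 1 millisecond = 10,000 intervals of 100 nanoseconds.
std::uint64_t elapsedMilliseconds(const FileTimeParts& start,
                                  const FileTimeParts& end) {
    std::uint64_t startTicks = toQuadWord(start);
    std::uint64_t endTicks = toQuadWord(end);
    assert(endTicks >= startTicks);  // accumulated CPU time does not decrease
    return (endTicks - startTicks) / 10000;
}
```

<p>With the real API, the kernel and user times returned by <strong>GetThreadTimes</strong> would each be converted this way and then added together, as in the code above.</p>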
<h1>Thread Context</h1>
<p>The <strong>CONTEXT</strong> structure allows the system to remember a thread&#8217;s state so that the thread can pick up where it left off the next time it has a CPU to run on.</p>
<p>Windows actually lets you look inside a thread&#8217;s kernel object and grab its current set of CPU registers. To do this, you simply call <strong>GetThreadContext</strong>:</p>
<pre>BOOL GetThreadContext(
   HANDLE hThread,
   PCONTEXT pContext);</pre>
<p>You should call <strong>SuspendThread</strong> before calling <strong>GetThreadContext</strong>; otherwise, the thread might be scheduled and the thread&#8217;s context might be different from what you get back.</p>
<p>It&#8217;s amazing how much power Windows offers the developer! But, if you think that&#8217;s cool, you&#8217;re gonna love this: Windows lets you change the members in the <strong>CONTEXT</strong> structure and then place the new register values back into the thread&#8217;s kernel object by calling <strong>SetThreadContext</strong>:</p>
<pre>BOOL SetThreadContext(
   HANDLE hThread,
   CONST CONTEXT *pContext);</pre>
<p>Again, the thread whose context you&#8217;re changing should be suspended first or the results will be unpredictable.</p>
<p>Before calling <strong>SetThreadContext</strong>, you must initialize the <strong>ContextFlags</strong> member of <strong>CONTEXT</strong> again, as shown here:</p>
<pre>CONTEXT Context;

// Stop the thread from running.
SuspendThread(hThread);

// Get the thread's context registers.
Context.ContextFlags = CONTEXT_CONTROL;
GetThreadContext(hThread, &amp;Context);

// Make the instruction pointer point to the address of your choice.
// Here I've arbitrarily set the instruction pointer to
// 0x00010000.
Context.Eip = 0x00010000;

// Set the thread's registers to reflect the changed values.
// It's not really necessary to reset the ContextFlags member
// because it was set earlier.
Context.ContextFlags = CONTEXT_CONTROL;
SetThreadContext(hThread, &amp;Context);
// Resuming the thread will cause it to begin execution
// at address 0x00010000.
ResumeThread(hThread);</pre>
<p>This will probably cause an access violation in the remote thread; the unhandled exception message box will be presented to the user, and the remote process will be terminated. That&#8217;s right—the remote process will be terminated, not your process. You will have successfully crashed another process while yours continues to execute just fine!</p>
<h1>Thread Priorities</h1>
<p>Every thread is assigned a priority number ranging from 0 (the lowest) to 31 (the highest). When the system decides which thread to assign to a CPU, it examines the priority 31 threads first and schedules them in a round-robin fashion. If a priority 31 thread is schedulable, it is assigned to a CPU. At the end of this thread&#8217;s time slice, the system checks to see whether there is another priority 31 thread that can run; if so, it allows that thread to be assigned to a CPU.</p>
<p>Starvation occurs when higher-priority threads use so much CPU time that they prevent lower-priority threads from executing.</p>
<p>Higher-priority threads always preempt lower-priority threads, regardless of what the lower-priority threads are executing.</p>
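<p>The selection rule described above can be sketched as a toy model. This is not how the Windows scheduler is implemented; <strong>ToyScheduler</strong>, <strong>makeSchedulable</strong> and <strong>pickNext</strong> are hypothetical names, and the model ignores time slices, multiple CPUs and priority boosts.</p>

```cpp
#include <array>
#include <cassert>
#include <deque>

// Toy model of priority scheduling: one ready queue per priority level 0..31.
class ToyScheduler {
public:
    void makeSchedulable(int threadId, int priority) {
        assert(priority >= 0 && priority <= 31);
        ready_[priority].push_back(threadId);
    }

    // Picks the next thread: scan from priority 31 down to 0 and take the
    // thread at the front of the first non-empty queue. The thread is moved
    // to the back of its queue, which gives round-robin within one level.
    // Returns -1 when nothing is schedulable.
    int pickNext() {
        for (int priority = 31; priority >= 0; --priority) {
            std::deque<int>& queue = ready_[priority];
            if (!queue.empty()) {
                int threadId = queue.front();
                queue.pop_front();
                queue.push_back(threadId);
                return threadId;
            }
        }
        return -1;
    }

private:
    std::array<std::deque<int>, 32> ready_;
};
```

<p>If threads 1 and 2 are schedulable at priority 8 and thread 3 at priority 4, repeated calls to <strong>pickNext</strong> alternate between 1 and 2 and never return 3. That is the starvation described above.</p>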
<p>When the system boots, it creates a special thread called the <em>zero page thread</em>. This thread is assigned priority 0 and is the only thread in the entire system that runs at priority 0. The zero page thread is responsible for zeroing any free pages of RAM in the system when there are no other threads that need to perform work.</p>
<p>Application developers never work with priority levels. Instead, the system maps the process&#8217; priority class and a thread&#8217;s relative priority to a priority level. It is precisely this mapping that Microsoft does not want to commit to. In fact, this mapping has changed between versions of the system.</p>
<p><a href="http://mhesham.files.wordpress.com/2011/08/image2.png"><img style="padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;margin:5px;" title="image" src="http://mhesham.files.wordpress.com/2011/08/image_thumb2.png?w=1401&#038;h=403" alt="image" width="1401" height="403" border="0" /></a></p>
<p><a href="http://mhesham.files.wordpress.com/2011/08/image3.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;margin:5px;" title="image" src="http://mhesham.files.wordpress.com/2011/08/image_thumb3.png?w=1056&#038;h=300" alt="image" width="1056" height="300" border="0" /></a></p>
<p><a href="http://mhesham.files.wordpress.com/2011/08/image4.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;margin:5px;" title="image" src="http://mhesham.files.wordpress.com/2011/08/image_thumb4.png?w=954&#038;h=337" alt="image" width="954" height="337" border="0" /></a></p>
<p>In general, a thread with a high priority level should not be schedulable most of the time. When the thread has something to do, it quickly gets CPU time. At this point, the thread should execute as few CPU instructions as possible and go back to sleep, waiting to be schedulable again. In contrast, a thread with a low priority level can remain schedulable and execute a lot of CPU instructions to do its work. If you follow these rules, the entire operating system will be responsive to its users.</p>
<p>To create a thread with an idle relative thread priority, you execute code similar to the following:</p>
<pre>DWORD dwThreadID;
HANDLE hThread = CreateThread(NULL, 0, ThreadFunc, NULL,
   CREATE_SUSPENDED, &amp;dwThreadID);
SetThreadPriority(hThread, THREAD_PRIORITY_IDLE);
ResumeThread(hThread);
CloseHandle(hThread);</pre>
<h1>Dynamically Boosting Thread Priority Levels</h1>
<p>The system determines the thread&#8217;s priority level by combining a thread&#8217;s relative priority with the priority class of the thread&#8217;s process. This is sometimes referred to as the thread&#8217;s <em>base priority level</em>. Occasionally, the system boosts the priority level of a thread—usually in response to some I/O event such as a window message or a disk read.</p>
<p>For example, a thread with a normal thread priority in a high-priority class process has a base priority level of 13. If the user presses a key, the system places a <strong>WM_KEYDOWN</strong> message in the thread&#8217;s queue. Because a message has appeared in the thread&#8217;s queue, the thread is schedulable. In addition, the keyboard device driver can tell the system to temporarily boost the thread&#8217;s level. So the thread might be boosted by 2 and have a current priority level of 15.</p>
<p>The thread is scheduled for one time slice at priority 15. Once that time slice expires, the system drops the thread&#8217;s priority by 1 to 14 for the next time slice. The thread&#8217;s third time slice is executed with a priority level of 13. Any additional time slices required by the thread are executed at priority level 13, the thread&#8217;s base priority level.</p>
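<p>The decay in this example is simple arithmetic and can be sketched as follows. <strong>priorityForSlice</strong> is a hypothetical helper that models only the behavior described in the text (one level lost per time slice, never dropping below the base level); it is not a Windows function.</p>

```cpp
#include <algorithm>
#include <cassert>

// Priority level used for a given time slice after a boost.
// sliceIndex is 0 for the first time slice after the boost.
int priorityForSlice(int basePriority, int boost, int sliceIndex) {
    assert(boost >= 0 && sliceIndex >= 0);
    // Lose one level per slice, but never go below the base priority level.
    return std::max(basePriority, basePriority + boost - sliceIndex);
}
```

<p>For the thread above (base level 13, boosted by 2), slices 0, 1 and 2 yield levels 15, 14 and 13, and every later slice stays at 13.</p>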
<p>Another situation causes the system to dynamically boost a thread&#8217;s priority level. Imagine a priority 4 thread that is ready to run but cannot because a priority 8 thread is constantly schedulable. In this scenario, the priority 4 thread is being starved of CPU time. When the system detects that a thread has been starved of CPU time for about three to four seconds, it dynamically boosts the starving thread&#8217;s priority to 15 and allows that thread to run for twice its time quantum. When the double time quantum expires, the thread&#8217;s priority immediately returns to its base priority.</p>
<h1>API Table</h1>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td valign="top" width="551">
<p align="center"><strong>Function</strong></p>
</td>
<td valign="top" width="551">
<p align="center"><strong>Description</strong></p>
</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>DWORD ResumeThread(HANDLE hThread);</pre>
</td>
<td valign="top" width="551">Resumes the specified thread, making it schedulable once it has been resumed as many times as it was suspended.</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>DWORD SuspendThread(HANDLE hThread);</pre>
</td>
<td valign="top" width="551">Suspends the specified thread.</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>VOID Sleep(DWORD dwMilliseconds);</pre>
</td>
<td valign="top" width="551">Makes the calling thread not schedulable for approximately the specified number of milliseconds.</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>BOOL SwitchToThread();</pre>
</td>
<td valign="top" width="551">Allows another schedulable thread to run, if one exists.</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>BOOL GetThreadTimes(
   HANDLE hThread,
   PFILETIME pftCreationTime,
   PFILETIME pftExitTime,
   PFILETIME pftKernelTime,
   PFILETIME pftUserTime);</pre>
</td>
<td valign="top" width="551">Returns the amount of CPU time that the thread has received.</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>BOOL GetThreadContext(
   HANDLE hThread,
   PCONTEXT pContext);</pre>
</td>
<td valign="top" width="551">Lets you look inside a thread&#8217;s kernel object and grab its current set of CPU registers.</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>BOOL SetPriorityClass(
   HANDLE hProcess,
   DWORD fdwPriority);</pre>
</td>
<td valign="top" width="551">Changes the priority class of a process; a running process can also use it to change its own priority class.</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>DWORD GetPriorityClass(HANDLE hProcess);</pre>
</td>
<td valign="top" width="551">Queries the priority class of a process.</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>BOOL SetThreadPriority(
   HANDLE hThread,
   int nPriority);</pre>
</td>
<td valign="top" width="551">Sets a thread&#8217;s relative priority.</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>int GetThreadPriority(HANDLE hThread);</pre>
</td>
<td valign="top" width="551">Gets a thread&#8217;s relative priority.</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>BOOL SetProcessPriorityBoost(
   HANDLE hProcess,
   BOOL bDisablePriorityBoost);</pre>
</td>
<td valign="top" width="551">Tells the system to enable or disable priority boosting for all threads within a process.</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>BOOL GetProcessPriorityBoost(
   HANDLE hProcess,
   PBOOL pbDisablePriorityBoost);</pre>
</td>
<td valign="top" width="551">Determines whether priority boosting is enabled or disabled for a process.</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>BOOL SetThreadPriorityBoost(
   HANDLE hThread,
   BOOL bDisablePriorityBoost);</pre>
</td>
<td valign="top" width="551">Enables or disables priority boosting for an individual thread.</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>BOOL GetThreadPriorityBoost(
   HANDLE hThread,
   PBOOL pbDisablePriorityBoost);</pre>
</td>
<td valign="top" width="551">Determines whether priority boosting is enabled or disabled for a thread.</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>BOOL SetProcessAffinityMask(
   HANDLE hProcess,
   DWORD_PTR dwProcessAffinityMask);</pre>
</td>
<td valign="top" width="551">Limits the threads of a single process to run on a subset of the available CPUs.</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>BOOL GetProcessAffinityMask(
   HANDLE hProcess,
   PDWORD_PTR pdwProcessAffinityMask,
   PDWORD_PTR pdwSystemAffinityMask);</pre>
</td>
<td valign="top" width="551">Returns a process&#8217; affinity mask, along with the system affinity mask.</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>DWORD_PTR SetThreadAffinityMask(
   HANDLE hThread,
   DWORD_PTR dwThreadAffinityMask);</pre>
</td>
<td valign="top" width="551">Sets the affinity mask for an individual thread.</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>DWORD SetThreadIdealProcessor(
   HANDLE hThread,
   DWORD dwIdealProcessor);</pre>
</td>
<td valign="top" width="551">Sets an ideal CPU for a thread. This tells the system that you want the thread to run on a particular CPU, while still allowing the thread to migrate to another CPU if one is available.</td>
</tr>
</tbody>
</table>
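<p>The affinity functions in the table take a bitmask in which each bit stands for one CPU (bit 0 is CPU 0), so a mask of 0x5 allows CPUs 0 and 2. The following is only a minimal sketch of that arithmetic: <b>AffinityMask</b>, <b>MakeAffinityMask</b> and <b>MaskAllowsCpu</b> are hypothetical names, and <b>AffinityMask</b> is an assumed stand-in for <b>DWORD_PTR</b> so that the code compiles without the Windows headers.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Assumed stand-in for the Windows DWORD_PTR type (an unsigned,
// pointer-sized integer); not part of the original post.
using AffinityMask = std::uintptr_t;

// Build a mask with one bit set per allowed CPU index (bit 0 = CPU 0).
AffinityMask MakeAffinityMask(std::initializer_list<unsigned> cpus) {
    AffinityMask mask = 0;
    for (unsigned cpu : cpus) {
        // Shifting by the full width of the type or more is undefined.
        assert(cpu < sizeof(AffinityMask) * 8);
        mask |= AffinityMask{1} << cpu;
    }
    return mask;
}

// True if the mask permits scheduling on the given CPU index.
bool MaskAllowsCpu(AffinityMask mask, unsigned cpu) {
    return cpu < sizeof(AffinityMask) * 8 && ((mask >> cpu) & 1u) != 0;
}
```

<p>A value built this way, for example <b>MakeAffinityMask({0, 2})</b>, is the kind of argument you would pass as <b>dwProcessAffinityMask</b> or <b>dwThreadAffinityMask</b>.</p>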
<h1>References</h1>
<blockquote><p><a href="http://www.microsoft.com/learning/en/us/book.aspx?ID=11241&amp;locale=en-us">Windows® via C/C++, Fifth Edition</a></p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://mhesham.wordpress.com/2011/08/11/thread-scheduling-priorities-and-affinities/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebdc8c831f2ae1f6a876629e325c99f6?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mhesham</media:title>
		</media:content>

		<media:content url="http://mhesham.files.wordpress.com/2011/08/image_thumb2.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://mhesham.files.wordpress.com/2011/08/image_thumb3.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://mhesham.files.wordpress.com/2011/08/image_thumb4.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>
	</item>
		<item>
		<title>Thread Basics</title>
		<link>http://mhesham.wordpress.com/2011/08/10/thread-basics/</link>
		<comments>http://mhesham.wordpress.com/2011/08/10/thread-basics/#comments</comments>
		<pubDate>Wed, 10 Aug 2011 11:08:52 +0000</pubDate>
		<dc:creator>MHesham</dc:creator>
				<category><![CDATA[Book Snapshot]]></category>
		<category><![CDATA[Operating Systems]]></category>
		<category><![CDATA[Windows Programming]]></category>
		<category><![CDATA[Windows via C/C++]]></category>
		<category><![CDATA[multitasking]]></category>
		<category><![CDATA[multithreading]]></category>
		<category><![CDATA[threads]]></category>
		<category><![CDATA[winapi]]></category>
		<category><![CDATA[windows programming]]></category>

		<guid isPermaLink="false">https://mhesham.wordpress.com/2011/08/10/thread-basics/</guid>
		<description><![CDATA[A thread consists of two components: A kernel object that the operating system uses to manage the thread. The kernel object is also where the system keeps statistical information about the thread. A thread stack that maintains all the function parameters and local variables required as the thread executes code. Threads are always created in [...]]]></description>
			<content:encoded><![CDATA[<p><u>A thread consists of two components:</u></p>
<ol>
<li>A kernel object that the operating system uses to manage the thread. The kernel object is also where the system keeps statistical information about the thread. </li>
<li>A thread stack that maintains all the function parameters and local variables required as the thread executes code. </li>
</ol>
<p>Threads are always created in the context of some process and live their entire life within that process. What this really means is that the thread executes code and manipulates data within its process&#8217; address space. So if you have two or more threads running in the context of a single process, the threads share a single address space. The threads can execute the same code and manipulate the same data. Threads can also share kernel object handles because the handle table exists for each process, not each thread.</p>
<h1>Your First Thread Function</h1>
<p>Every thread must have an entry-point function where it begins execution. We already discussed this entry-point function for your primary thread: <b>_tmain</b> or <b>_tWinMain</b>. If you want to create a secondary thread in your process, it must also have an entry-point function, which should look something like this:</p>
<pre>DWORD WINAPI ThreadFunc(PVOID pvParam) {
   DWORD dwResult = 0;
   ...
   return(dwResult);
}</pre>
<h1>The <em>CreateThread</em> Function</h1>
<p>If you want to create one or more secondary threads, you simply have an already running thread call <b>CreateThread</b>:</p>
<pre>HANDLE CreateThread(
   PSECURITY_ATTRIBUTES psa,
   DWORD cbStackSize,
   PTHREAD_START_ROUTINE pfnStartAddr,
   PVOID pvParam,
   DWORD dwCreateFlags,
   PDWORD pdwThreadID);</pre>
<p>The <b>CreateThread</b> function is the Windows function that creates a thread. However, if you are writing C/C++ code, you should never call <b>CreateThread</b>. Instead, you should use the Microsoft C++ run-time library function <b>_beginthreadex</b>.</p>
<h2>Thread Stack Size</h2>
<p>The <b>cbStackSize</b> parameter specifies how much address space the thread can use for its own stack. Every thread owns its own stack. When <b>CreateProcess</b> starts a process, it internally calls <b>CreateThread</b> to initialize the process&#8217; primary thread. For the <b>cbStackSize</b> parameter, <b>CreateProcess</b> uses a value stored inside the executable file. You can control this value using the linker&#8217;s <b>/STACK</b> switch:</p>
<pre>/STACK:[<i>reserve</i>][,<i>commit</i>]</pre>
<p>The <b>reserve</b> argument sets the amount of address space the system should reserve for the thread&#8217;s stack. The default is 1 MB. The <b>commit</b> argument specifies the amount of physical storage that should be initially committed to the stack&#8217;s reserved region. </p>
<p>When you call <b>CreateThread</b>, passing a value other than 0 for <b>cbStackSize</b> causes the function to reserve and commit all storage for the thread&#8217;s stack. The amount of reserved space is either the amount specified by the <b>/STACK</b> linker switch or the value of <b>cbStackSize</b>, whichever is larger. If you pass <b>0</b> for the <b>cbStackSize</b> parameter, <b>CreateThread</b> reserves a region and commits the amount of storage indicated by the <b>/STACK</b> linker switch information embedded in the .exe file by the linker.</p>
<h1>Thread Termination</h1>
<p><u>A thread can be terminated in four ways:</u></p>
<ol>
<li>The thread function returns. (This is highly recommended.) </li>
<li>The thread kills itself by calling the <b>ExitThread</b> function. (Avoid this method.) </li>
<li>A thread in the same process or in another one calls the <b>TerminateThread</b> function. (Avoid this method.) </li>
<li>The process containing the thread terminates. (Avoid this method.) </li>
</ol>
<h2>The Thread Function Returns</h2>
<p>You should always design your thread functions so that they return when you want the thread to terminate. This is the only way to guarantee that all your thread&#8217;s resources are cleaned up properly.</p>
<p><u>Having your thread function return ensures the following:</u></p>
<ul>
<li>All C++ objects created in your thread function will be destroyed properly via their destructors. </li>
<li>The operating system will properly free the memory used by the thread&#8217;s stack. </li>
<li>The system will set the thread&#8217;s exit code (maintained in the thread&#8217;s kernel object) to your thread function&#8217;s return value. </li>
<li>The system will decrement the usage count of the thread&#8217;s kernel object. </li>
</ul>
<p>When a thread dies by returning or calling <b>ExitThread</b>, the stack for the thread is destroyed. However, if <b>TerminateThread</b> is used, the system does not destroy the thread&#8217;s stack until the process that owned the thread terminates.</p>
<p>If several threads run concurrently in your application, you need to explicitly handle how each one stops before the main thread returns. Otherwise, all other running threads will die abruptly and silently.</p>
<h2>When a Thread Terminates</h2>
<p>The following actions occur when a thread terminates:</p>
<ol>
<li>All User object handles owned by the thread are freed. In Windows, most objects are owned by the process containing the thread that creates the objects. However, a thread owns two User objects: windows and hooks. When a thread dies, the system automatically destroys any windows and uninstalls any hooks that were created or installed by the thread. Other objects are destroyed only when the owning process terminates. </li>
<li>The thread&#8217;s exit code changes from <b>STILL_ACTIVE</b> to the code passed to <b>ExitThread</b> or <b>TerminateThread</b>. </li>
<li>The state of the thread kernel object becomes signaled. </li>
<li>If the thread is the last active thread in the process, the system considers the process terminated as well. </li>
<li>The thread kernel object&#8217;s usage count is decremented by 1. </li>
</ol>
<h1>Working with C/C++ Run Time Libraries</h1>
<p>To create a new thread, you must not call the operating system&#8217;s <b>CreateThread</b> function; instead, you must call the C/C++ run-time library function <b>_beginthreadex</b>:</p>
<pre>uintptr_t _beginthreadex(
   void *security,
   unsigned stack_size,
   unsigned (__stdcall *start_address)(void *),
   void *arglist,
   unsigned initflag,
   unsigned *thrdaddr);</pre>
<p>The <b>_beginthreadex</b> function has the same parameter list as the <b>CreateThread</b> function, but the parameter names and types are not exactly the same.</p>
<p>If you really want to forcibly kill your thread, you can have it call <b>_endthreadex</b> (instead of <b>ExitThread</b>).</p>
<p>The C/C++ run-time library also places synchronization primitives around certain functions. For example, without synchronization, two threads that call <b>malloc</b> simultaneously could corrupt the heap. The C/C++ run-time library prevents two threads from allocating memory from the heap at the same time by making the second thread wait until the first has returned from <b>malloc</b>. Then the second thread is allowed to enter. Obviously, all this additional work affects the performance of the multithreaded version of the C/C++ run-time library.</p>
<h1>Gaining a Sense of One&#8217;s Own Identity</h1>
<p>Windows offers functions that make it easy for a thread to refer to its process kernel object or to its own thread kernel object:</p>
<pre>HANDLE GetCurrentProcess();
HANDLE GetCurrentThread();</pre>
<p>The following functions allow a thread to query its process&#8217; unique ID or its own unique ID:</p>
<pre>DWORD GetCurrentProcessId();
DWORD GetCurrentThreadId();</pre>
<h2>Converting a Pseudohandle to a Real Handle</h2>
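<p>Usually, you use the <strong>DuplicateHandle</strong> function to create a new process-relative handle from a kernel object handle that is relative to another process. A pseudohandle always refers to whichever thread uses it, so it cannot be handed to another thread as-is. However, <strong>DuplicateHandle</strong> can be used in an unusual way to convert a pseudohandle to a real handle:</p>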
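<pre>// Forward declaration so that ParentThread can refer to ChildThread
DWORD WINAPI ChildThread(PVOID pvParam);

DWORD WINAPI ParentThread(PVOID pvParam) {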
   HANDLE hThreadParent;

   DuplicateHandle(
      GetCurrentProcess(),     // Handle of process that thread
                               // pseudohandle is relative to

      GetCurrentThread(),      // Parent thread's pseudohandle
      GetCurrentProcess(),     // Handle of process that the new, real,
                               // thread handle is relative to

      &amp;hThreadParent,          // Will receive the new, real, handle
                               // identifying the parent thread
      0,                       // Ignored due to DUPLICATE_SAME_ACCESS
      FALSE,                   // New thread handle is not inheritable
      DUPLICATE_SAME_ACCESS);  // New thread handle has same
                               // access as pseudohandle

   CreateThread(NULL, 0, ChildThread, (PVOID) hThreadParent, 0, NULL);
   // Function continues...
}
DWORD WINAPI ChildThread(PVOID pvParam) {
   HANDLE hThreadParent = (HANDLE) pvParam;
   FILETIME ftCreationTime, ftExitTime, ftKernelTime, ftUserTime;
   GetThreadTimes(hThreadParent,
      &amp;ftCreationTime, &amp;ftExitTime, &amp;ftKernelTime, &amp;ftUserTime);
   CloseHandle(hThreadParent);
   // Function continues...
}</pre>
<p>Now when the parent thread executes, it converts the ambiguous pseudohandle identifying the parent thread to a new, real handle that unambiguously identifies the parent thread, and it passes this real handle to <b>CreateThread</b>. When the child thread starts executing, its <b>pvParam</b> parameter contains the real thread handle. Any calls to functions passing this handle will affect the parent thread, not the child thread.</p>
<p>Because <b>DuplicateHandle</b> increments the usage count of the specified kernel object, it is important to decrement the object&#8217;s usage count by passing the target handle to <b>CloseHandle</b> when you finish using the duplicated object handle.</p>
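<p>The child thread above receives its results from <b>GetThreadTimes</b> as <b>FILETIME</b> values, each a 64-bit count of 100-nanosecond intervals split into two 32-bit halves. The following is a minimal sketch of turning the kernel and user times into milliseconds. <b>FileTimeParts</b>, <b>ToHundredNanoseconds</b> and <b>CpuTimeMilliseconds</b> are hypothetical names; <b>FileTimeParts</b> is an assumed stand-in that mirrors the two fields of <b>FILETIME</b> so that the code compiles without the Windows headers.</p>

```cpp
#include <cstdint>

// Assumed stand-in mirroring the two 32-bit fields of the Windows FILETIME
// structure; not part of the original post.
struct FileTimeParts {
    std::uint32_t dwLowDateTime;
    std::uint32_t dwHighDateTime;
};

// Combine the two halves into a single count of 100-nanosecond intervals.
std::uint64_t ToHundredNanoseconds(const FileTimeParts& ft) {
    return (static_cast<std::uint64_t>(ft.dwHighDateTime) << 32) | ft.dwLowDateTime;
}

// Total CPU time (kernel + user) in whole milliseconds.
std::uint64_t CpuTimeMilliseconds(const FileTimeParts& kernel,
                                  const FileTimeParts& user) {
    const std::uint64_t intervalsPerMs = 10000;  // 10,000 x 100 ns = 1 ms
    return (ToHundredNanoseconds(kernel) + ToHundredNanoseconds(user)) / intervalsPerMs;
}
```

<p>In the child thread, the values of <b>ftKernelTime</b> and <b>ftUserTime</b> would play the role of the two arguments.</p>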
<h1>API Table</h1>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td valign="top" width="551">
<p align="center"><strong>Function</strong></p>
</td>
<td valign="top" width="551">
<p align="center"><strong>Description</strong></p>
</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>HANDLE CreateThread(
   PSECURITY_ATTRIBUTES psa,
   DWORD cbStackSize,
   PTHREAD_START_ROUTINE pfnStartAddr,
   PVOID pvParam,
   DWORD dwCreateFlags,
   PDWORD pdwThreadID);</pre>
</td>
<td valign="top" width="551">
<p>Creates a new thread. To create one or more secondary threads, you simply have an already running thread call <b>CreateThread</b>.</p>
</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>VOID ExitThread(DWORD dwExitCode);</pre>
</td>
<td valign="top" width="551">
<p>Forces the calling thread to terminate.</p>
</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>BOOL TerminateThread(
   HANDLE hThread,
   DWORD dwExitCode);</pre>
</td>
<td valign="top" width="551">
<p>Unlike <b>ExitThread</b>, which always kills the calling thread, <b>TerminateThread</b> can kill any thread.</p>
</td>
</tr>
<tr>
<td valign="top" width="551">
<pre>BOOL GetExitCodeThread(
   HANDLE hThread,
   PDWORD pdwExitCode);</pre>
</td>
<td valign="top" width="551">
<p>Check whether the thread identified by <b>hThread</b> has terminated and, if it has, determine its exit code.</p>
<p>The exit code value is returned in the <b>DWORD</b> pointed to by <b>pdwExitCode</b>. If the thread hasn&#8217;t terminated when <b>GetExitCodeThread</b> is called, the function fills the <b>DWORD</b> with the <b>STILL_ACTIVE</b> identifier (defined as 0x103). If the function is successful, <b>TRUE</b> is returned.</p>
</td>
</tr>
</tbody>
</table>
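<p>The <b>STILL_ACTIVE</b> check described for <b>GetExitCodeThread</b> comes down to a single comparison. The following is a minimal sketch of it: <b>kStillActive</b> and <b>ThreadHasTerminated</b> are hypothetical names, and the constant repeats the 0x103 value quoted in the table so that the code compiles without the Windows headers.</p>

```cpp
#include <cstdint>

// Same value the table gives for STILL_ACTIVE (0x103, which is 259 decimal).
constexpr std::uint32_t kStillActive = 0x103;

// Interpret the DWORD that GetExitCodeThread writes through pdwExitCode.
// Caveat: a thread that really exits with code 259 cannot be told apart
// from one that is still running.
bool ThreadHasTerminated(std::uint32_t exitCode) {
    return exitCode != kStillActive;
}
```

<p>Because of the caveat in the comment, it seems safest to avoid 259 as a thread function&#8217;s return value.</p>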
<h1>References</h1>
<blockquote>
<p><a href="http://www.microsoft.com/learning/en/us/book.aspx?ID=11241&amp;locale=en-us">Windows® via C/C++, Fifth Edition</a></p>
</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://mhesham.wordpress.com/2011/08/10/thread-basics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebdc8c831f2ae1f6a876629e325c99f6?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mhesham</media:title>
		</media:content>
	</item>
		<item>
		<title>How process can skip from its JOB &#8211; A taste of dirty application</title>
		<link>http://mhesham.wordpress.com/2011/08/08/how-process-can-skip-from-its-job-a-taste-of-dirty-application/</link>
		<comments>http://mhesham.wordpress.com/2011/08/08/how-process-can-skip-from-its-job-a-taste-of-dirty-application/#comments</comments>
		<pubDate>Mon, 08 Aug 2011 10:09:58 +0000</pubDate>
		<dc:creator>MHesham</dc:creator>
				<category><![CDATA[Operating Systems]]></category>
		<category><![CDATA[Windows Programming]]></category>
		<category><![CDATA[jobs]]></category>
		<category><![CDATA[process]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[windows programming]]></category>

		<guid isPermaLink="false">https://mhesham.wordpress.com/2011/08/08/how-process-can-skip-from-its-job-a-taste-of-dirty-application/</guid>
		<description><![CDATA[I am writing this post because I can’t quite believe that what I did is normal. I will describe what I did, then pose some questions that really need answers. A job object is a very powerful concept in Windows; it is used to impose limitations on the processes assigned to it, some [...]]]></description>
			<content:encoded><![CDATA[<p>I am writing this post because I can’t quite believe that what I did is normal. I will describe what I did, then pose some questions that really need answers.</p>
<p>A <strong>job object</strong> is a very powerful concept in Windows. It is used to impose limitations on the processes assigned to it, and some of these limitations enhance process security. For more information about jobs, refer to the <a href="http://mhesham.wordpress.com/2011/08/07/jobs/">Jobs post</a>.</p>
<p><u>An important fact you should know about jobs:</u></p>
<p>Once a process is part of a job, it cannot be moved to another job and it cannot become jobless (so to speak). Also note that when a process that is part of a job spawns another process, the new process is automatically made part of the parent’s job. However, this behavior can be altered.</p>
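<p>As far as I understand the job-object documentation, the way this behavior gets altered is through two job limit flags, together with the <b>CREATE_BREAKAWAY_FROM_JOB</b> flag passed to <b>CreateProcess</b>. The following is only a sketch of that rule as a plain function, not a call into the real API: <b>ChildPlacement</b> and <b>PlaceChild</b> are hypothetical names, and the constants repeat the values from the Windows SDK headers so that the code compiles without them.</p>

```cpp
#include <cstdint>

// Values as defined in the Windows SDK headers, repeated here as an assumption
// so that this sketch does not need <Windows.h>.
constexpr std::uint32_t kJobObjectLimitBreakawayOk       = 0x00000800; // JOB_OBJECT_LIMIT_BREAKAWAY_OK
constexpr std::uint32_t kJobObjectLimitSilentBreakawayOk = 0x00001000; // JOB_OBJECT_LIMIT_SILENT_BREAKAWAY_OK
constexpr std::uint32_t kCreateBreakawayFromJob          = 0x01000000; // CREATE_BREAKAWAY_FROM_JOB

enum class ChildPlacement { InParentJob, OutsideJob, CreateFails };

// Model where a child ends up when a process that is inside a job spawns it.
ChildPlacement PlaceChild(std::uint32_t jobLimitFlags, std::uint32_t creationFlags) {
    // Silent breakaway: children are never associated with the job.
    if (jobLimitFlags & kJobObjectLimitSilentBreakawayOk) {
        return ChildPlacement::OutsideJob;
    }
    // Explicit breakaway is honored only if the job permits it.
    if (creationFlags & kCreateBreakawayFromJob) {
        return (jobLimitFlags & kJobObjectLimitBreakawayOk)
                   ? ChildPlacement::OutsideJob
                   : ChildPlacement::CreateFails;
    }
    // Default: the child joins the parent's job.
    return ChildPlacement::InParentJob;
}
```

<p>Under this model, a job created without either breakaway flag would cause a <b>CreateProcess</b> call that requests breakaway to fail rather than produce a jobless child.</p>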
<p><u>Let’s consider this scenario:</u></p>
<p>You write a host application that runs and manages other applications. These applications may have unknown implementations (you are not their author), and you create each of them as a child process (client process) of your host process. Imagine that some of these client processes are infected by a virus that needs to cross the limitations imposed by your job (e.g., spawning other child processes, accessing USER objects outside the process, etc.). To ensure that your client processes run in a safe environment, you impose the necessary restrictions.</p>
<p>Now imagine that one (infected) application was written to escape from the job assigned to it and run as a jobless process with no restrictions! How would you control such a thing? This is horrible, and it is exactly what I managed to do. I thought that such a thing would not be possible because it is so dangerous, but I could do it.</p>
<h1>Dirty application code</h1>
<pre>#include &lt;iostream&gt;
#include &lt;Windows.h&gt;
#include &lt;tchar.h&gt;
#include &lt;strsafe.h&gt;
#include "XWinAssist.h"   // Author's helper header (not shown); presumably declares PrintLastError()
using namespace std;

#define HostJobName _T("XHostJob")
#define AppName _T("cmd.exe")
#define AppCount 5

void CreateJoblessSelf()
{
    const int cchSize = 128;
    DWORD dwSize = cchSize;
    TCHAR szProcessName[cchSize];
    QueryFullProcessImageName(GetCurrentProcess(), 0, szProcessName, &amp;dwSize);

    STARTUPINFO si = { sizeof(si) };
    PROCESS_INFORMATION pi;

    // Here I create myself again in the same console window
    // I specify that I want to create myself and skip the current job
    BOOL fCreate = CreateProcess(NULL, szProcessName, NULL, NULL, FALSE, CREATE_BREAKAWAY_FROM_JOB, NULL, NULL, &amp;si, &amp;pi);

    if(!fCreate)
    {
        _tprintf_s(_T("Failed to create jobless self\n"));
        PrintLastError();
        return;   // pi holds no valid handles when CreateProcess fails
    }

    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
}

void StartSomeProcess()
{
    const int iSize = 128;
    TCHAR szCommandLine[iSize];
    _tcscpy_s(szCommandLine, iSize, AppName);

    STARTUPINFO si = { sizeof(si) };
    PROCESS_INFORMATION pi;

    // Start some cmd.exe process; this can be any process you imagine
    BOOL fCreate = CreateProcess(NULL, szCommandLine, NULL, NULL, FALSE, CREATE_NEW_CONSOLE, NULL, NULL, &amp;si, &amp;pi);

    if(!fCreate)
    {
        _tprintf_s(_T("Failed to start some process\n"));
        PrintLastError();
        return;   // pi holds no valid handles when CreateProcess fails
    }

    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
}

int _tmain(int argc, TCHAR* argv[], TCHAR* envp[])
{
    BOOL bInJob = FALSE;

    // Find out whether we are already assigned to a job
    IsProcessInJob(GetCurrentProcess(), NULL, &amp;bInJob);

    // I will try to skip from the job already assigned to me
    if (bInJob)
    {
        _tprintf_s(_T("\nProcess already in a job\n"));
        _tprintf_s(_T("Spawning JOBLESS SELF ...\n"));

        // Instantiating myself but with no restrictions
        CreateJoblessSelf();

        _tprintf_s(_T("\nRestarting ...\n"));
        return 0;
    }
    else
    {
        // Reaching this line means that I am totally free and have no job restrictions
        _tprintf_s(_T("\nMUHAHAHAHA ... I am JOBLESS Process (Evil Laugh)\n"));
    }

    // Spawn some processes to consume system resources
    // this could be some other infectious job
    for(int i = 0; i &lt; AppCount; ++i)
    {
        StartSomeProcess();
    }

    return(0);
}</pre>
<h2>Environment:</h2>
<ul>
<li>Microsoft Windows 7 Professional</li>
<li>Visual Studio 2008 Team Suite</li>
</ul>
<h1>Questions and Exclamations</h1>
<p>I have some questions that I need answers to. I tried searching online but found no useful information.</p>
<ol>
<li>Are there any permissions or access rights required to allow a process to spawn another process outside its job? I think there should be some.</li>
<li>Can this behavior be really dangerous and be used to wreak havoc in the system?</li>
<li>Is this considered as a security flaw?</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://mhesham.wordpress.com/2011/08/08/how-process-can-skip-from-its-job-a-taste-of-dirty-application/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebdc8c831f2ae1f6a876629e325c99f6?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mhesham</media:title>
		</media:content>
	</item>
	</channel>
</rss>
