Windows Memory Architecture

Content

  1. How a Virtual Address Space Is Partitioned
  2. Regions in an Address Space
  3. Committing Physical Storage Within a Region
  4. Physical Storage and the Paging File
  5. Page Protection Attributes
  6. The Importance of Data Alignment

Introduction

Every process is given its very own virtual address space. For 32-bit processes, this address space is 4 GB because a 32-bit pointer can have any value from 0x00000000 through 0xFFFFFFFF.

Because each address space is private, Process A can have a data structure stored in its address space at address 0x12345678, while Process B can have a totally different data structure stored in its own address space at the same address, 0x12345678. When threads running in Process A access memory at address 0x12345678, these threads are accessing Process A’s data structure. When threads running in Process B access memory at address 0x12345678, these threads are accessing Process B’s data structure. Threads running in Process A cannot access the data structure in Process B’s address space, and vice versa.

This address space is simply a range of memory addresses. Physical storage needs to be assigned or mapped to portions of the address space before you can successfully access data without raising access violations.

How a Virtual Address Space Is Partitioned

Each process’ virtual address space is split into partitions. The address space is partitioned based on the underlying implementation of the operating system.


The partition of the process’ address space from 0x00000000 to 0x0000FFFF inclusive is set aside to help programmers catch NULL-pointer assignments. If a thread in your process attempts to read from or write to a memory address in this partition, an access violation is raised.

The user-mode partition is where the process’ private address space resides. The usable address range and approximate size of the user-mode partition depend on the CPU architecture.


The kernel-mode partition is where the operating system’s code resides. The code for thread scheduling, memory management, file system support, networking support, and all device drivers is loaded in this partition. Everything residing in this partition is shared among all processes. Although this partition is just above the user-mode partition in every process, all code and data in this partition are completely protected. If your application code attempts to read from or write to a memory address in this partition, your thread raises an access violation.

Regions in an Address Space

When a process is created and given its address space, the bulk of this usable address space is free, or unallocated. To use portions of this address space, you must allocate regions within it by calling VirtualAlloc. The act of allocating a region is called reserving.

Whenever you reserve a region of address space:

  • The system ensures that the region begins on an allocation granularity boundary. All the CPU platforms use the same allocation granularity of 64 KB—that is, allocation requests are rounded to a 64-KB boundary.
  • The system ensures that the size of the region is a multiple of the system’s page size. A page is a unit of memory that the system uses in managing memory. Unlike the allocation granularity, the page size varies by CPU: x86 and x64 systems use a 4-KB page size, but the IA-64 uses an 8-KB page size.

If you attempt to reserve a 10-KB region of address space, the system will automatically round up your request and reserve a region whose size is a multiple of the page size. This means that on x86 and x64 systems, the system will reserve a region that is 12 KB.
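The rounding described above is plain arithmetic, and a minimal sketch may make it concrete. The helper names round_down and round_up and the three constants are made up for this illustration; they are not Windows APIs. The sketch assumes the alignment is a power of two, which holds for the 64-KB granularity and the 4-KB and 8-KB page sizes mentioned in the text. Rounding a requested base address down follows the documented behavior of VirtualAlloc when reserving.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the text; the names are hypothetical. */
#define ALLOCATION_GRANULARITY (64u * 1024u) /* 64 KB on all platforms */
#define PAGE_SIZE_X86_X64      (4u * 1024u)  /* 4 KB on x86 and x64    */
#define PAGE_SIZE_IA64         (8u * 1024u)  /* 8 KB on IA-64          */

/* Round value down to a multiple of alignment (must be a power of two). */
static uintptr_t round_down(uintptr_t value, uintptr_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return value & ~(alignment - 1);
}

/* Round value up to a multiple of alignment (must be a power of two). */
static uintptr_t round_up(uintptr_t value, uintptr_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}
```

With these helpers, round_up(10 * 1024, PAGE_SIZE_X86_X64) gives 12 KB, matching the example above, while the same request with PAGE_SIZE_IA64 gives 16 KB. A requested base address such as 0x00012345 rounds down to 0x00010000 with round_down and ALLOCATION_GRANULARITY.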

When your program’s algorithms no longer need to access a reserved region of address space, the region should be freed. This process is called releasing the region of address space and is accomplished by calling the VirtualFree function.

Committing Physical Storage Within a Region

To use a reserved region of address space, you must allocate physical storage and then map this storage to the reserved region. This process is called committing physical storage. Physical storage is always committed in pages. To commit physical storage to a reserved region, you again call the VirtualAlloc function.

When your program’s algorithms no longer need to access committed physical storage in the reserved region, the physical storage should be freed. This process is called decommitting the physical storage and is accomplished by calling the VirtualFree function.
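The life cycle of a page (free, reserved, committed, and back again) can be sketched as a small state machine. This is a toy model for illustration only, not how Windows tracks pages internally: every name in it (toy_space, toy_reserve, and so on) is hypothetical. In real code the four transitions correspond to VirtualAlloc with MEM_RESERVE or MEM_COMMIT and VirtualFree with MEM_DECOMMIT or MEM_RELEASE. One simplification to note: the real MEM_RELEASE always releases an entire region, while the toy lets you release any range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_COUNT 16

enum page_state { PAGE_FREE, PAGE_RESERVED, PAGE_COMMITTED };

/* Hypothetical model: one state per page of a tiny address space. */
struct toy_space { enum page_state pages[TOY_PAGE_COUNT]; };

static bool range_ok(size_t first, size_t count) {
    return first < TOY_PAGE_COUNT && count <= TOY_PAGE_COUNT - first;
}

static bool any_page_is(const struct toy_space *s, size_t first,
                        size_t count, enum page_state state) {
    assert(s != NULL);
    for (size_t i = first; i < first + count; i++)
        if (s->pages[i] == state) return true;
    return false;
}

static void set_pages(struct toy_space *s, size_t first, size_t count,
                      enum page_state state) {
    assert(s != NULL);
    for (size_t i = first; i < first + count; i++) s->pages[i] = state;
}

/* Reserve: only free pages can be reserved. */
static bool toy_reserve(struct toy_space *s, size_t first, size_t count) {
    if (!range_ok(first, count)) return false;
    if (any_page_is(s, first, count, PAGE_RESERVED) ||
        any_page_is(s, first, count, PAGE_COMMITTED)) return false;
    set_pages(s, first, count, PAGE_RESERVED);
    return true;
}

/* Commit: every page in the range must already be reserved. */
static bool toy_commit(struct toy_space *s, size_t first, size_t count) {
    if (!range_ok(first, count) || any_page_is(s, first, count, PAGE_FREE))
        return false;
    set_pages(s, first, count, PAGE_COMMITTED);
    return true;
}

/* Decommit: storage is given back, but the addresses stay reserved. */
static bool toy_decommit(struct toy_space *s, size_t first, size_t count) {
    if (!range_ok(first, count) || any_page_is(s, first, count, PAGE_FREE))
        return false;
    set_pages(s, first, count, PAGE_RESERVED);
    return true;
}

/* Release: the addresses become free again. */
static bool toy_release(struct toy_space *s, size_t first, size_t count) {
    if (!range_ok(first, count) || any_page_is(s, first, count, PAGE_FREE))
        return false;
    set_pages(s, first, count, PAGE_FREE);
    return true;
}
```

The point of the model is the ordering rule: committing a range that was never reserved fails, and decommitting leaves the range reserved until it is released.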

Physical Storage and the Paging File

When an application commits physical storage to a region of address space by calling the VirtualAlloc function, space is actually allocated from a file on the hard disk. This file is typically called a paging file, and it contains the virtual memory that is available to all processes. The size of the system’s paging file is the most important factor in determining how much physical storage is available to applications; the amount of RAM you have has very little effect.

Now when a thread in your process attempts to access a block of data in the process’ address space, one of two things can happen.

In the first possibility, the data that the thread is attempting to access is already in RAM. In this case, the CPU maps the data’s virtual memory address to the physical address in memory, and then the desired access is performed.

In the second possibility, the data that the thread is attempting to access is not in RAM but is contained somewhere in the paging file. In this case, the attempted access is called a page fault, and the CPU notifies the operating system of the attempted access. The operating system then locates a free page of memory in RAM; if a free page cannot be found, the system must free one. If a page has not been modified, the system can simply free the page. But if the system needs to free a page that was modified, it must first copy the page from RAM to the paging file. Next the system goes to the paging file, locates the block of data that needs to be accessed, and loads the data into the free page of memory. The operating system then updates its table indicating that the data’s virtual memory address now maps to the appropriate physical memory address in RAM. The CPU now retries the instruction that generated the initial page fault, but this time the CPU is able to map the virtual memory address to a physical RAM address and access the block of data.
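The two possibilities can be sketched with a toy simulation. Everything here is hypothetical (toy_ram, toy_access, a RAM of only two frames), and the choice of which page to free is an assumption: the text does not say how the system picks one, so the sketch simply evicts the oldest page (first in, first out). It counts page faults and how many modified pages had to be copied to the paging file before their frame could be reused.

```c
#include <assert.h>
#include <stdbool.h>

#define FRAME_COUNT 2 /* a deliberately tiny "RAM" */

struct frame {
    int  page;  /* virtual page held in this frame, or -1 if free */
    bool dirty; /* true if the page was modified while in RAM     */
};

struct toy_ram {
    struct frame frames[FRAME_COUNT];
    int next_victim;       /* FIFO eviction (an assumption)          */
    int page_faults;
    int pages_written_out; /* modified pages copied to the paging file */
};

static void toy_ram_init(struct toy_ram *ram) {
    for (int i = 0; i < FRAME_COUNT; i++) {
        ram->frames[i].page = -1;
        ram->frames[i].dirty = false;
    }
    ram->next_victim = 0;
    ram->page_faults = 0;
    ram->pages_written_out = 0;
}

/* Access a page; returns true if the access caused a page fault. */
static bool toy_access(struct toy_ram *ram, int page, bool is_write) {
    assert(page >= 0);

    /* First possibility: the page is already in RAM. */
    for (int i = 0; i < FRAME_COUNT; i++) {
        if (ram->frames[i].page == page) {
            ram->frames[i].dirty = ram->frames[i].dirty || is_write;
            return false;
        }
    }

    /* Second possibility: page fault. Look for a free frame first. */
    ram->page_faults++;
    int slot = -1;
    for (int i = 0; i < FRAME_COUNT; i++) {
        if (ram->frames[i].page == -1) { slot = i; break; }
    }
    if (slot < 0) {
        /* No free frame: free one. A modified page must be saved first. */
        slot = ram->next_victim;
        ram->next_victim = (ram->next_victim + 1) % FRAME_COUNT;
        if (ram->frames[slot].dirty) ram->pages_written_out++;
    }

    /* Load the page and let the retried access complete. */
    ram->frames[slot].page = page;
    ram->frames[slot].dirty = is_write;
    return true;
}
```

Writing to page 0, then reading pages 1, 0, 2, and 3 in that order produces four page faults but only one write to the paging file, because only page 0 was modified before being evicted.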

The more often the system needs to copy pages of memory to the paging file and vice versa, the more your hard disk thrashes and the slower the system runs. (Thrashing means that the operating system spends all its time swapping pages in and out of memory instead of running programs.)

When you invoke an application, the system opens the application’s .exe file and determines the size of the application’s code and data. Then the system reserves a region of address space and notes that the physical storage associated with this region is the .exe file itself. That’s right—instead of allocating space from the paging file, the system uses the actual contents, or image, of the .exe file as the program’s reserved region of address space. This, of course, makes loading an application very fast and allows the size of the paging file to remain small.

When a program’s file image (that is, an .exe or a DLL file) on the hard disk is used as the physical storage for a region of address space, it is called a memory-mapped file. When an .exe or a DLL is loaded, the system automatically reserves a region of address space and maps the file’s image to this region.

Page Protection Attributes

Individual pages of allocated physical storage can be assigned different protection attributes.


Some malware applications write code into areas of memory intended for data (such as a thread’s stack) and then the application executes the malicious code. Windows’ Data Execution Prevention (DEP) feature provides protection against this type of malware attack. With DEP enabled, the operating system uses the PAGE_EXECUTE_* protections only on regions of memory that are intended to have code execute; other protections (typically PAGE_READWRITE) are used for regions of memory intended to have data in them (such as thread stacks and the application’s heaps).

Windows supports a mechanism that allows two or more processes to share a single block of storage. So if 10 instances of Notepad are running, all instances share the application’s code and data pages.

When an .exe or a .dll module is mapped into an address space, the system calculates how many pages are writable. (Usually, the pages containing code are marked as PAGE_EXECUTE_READ while the pages containing data are marked PAGE_READWRITE.) Then the system allocates storage from the paging file to accommodate these writable pages. This paging file storage is not used unless the module’s writable pages are actually written to.

When a thread in one process attempts to write to a shared block, the system intervenes and performs the following steps:

  1. The system finds a free page of memory in RAM.

  2. The system copies the contents of the page that the thread is attempting to modify (in the image) to the free page found in step 1. This free page will be assigned either PAGE_READWRITE or PAGE_EXECUTE_READWRITE protection. The original page’s protection and data do not change at all.

  3. The system then updates the process’ page tables so that the accessed virtual address now translates to the new page of RAM.

After the system has performed these steps, the process can access its own private instance of this page of storage.
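The three steps above can be sketched in a few lines. This is only a toy model of copy-on-write: toy_mapping and toy_write are hypothetical names, malloc stands in for "finding a free page of RAM", and a pointer swap stands in for updating the page tables. The real mechanism is done by the memory manager and the CPU, not by application code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096

/* Hypothetical stand-in for one page-table entry of one process. */
struct toy_mapping {
    unsigned char *page; /* page the virtual address currently maps to  */
    int is_private;      /* nonzero once the process has its own copy   */
};

/* Write one byte through the mapping. Returns 0 on success, -1 if no
   memory is available for the private copy. */
static int toy_write(struct toy_mapping *m, size_t offset,
                     unsigned char value) {
    assert(m != NULL && offset < TOY_PAGE_SIZE);
    if (!m->is_private) {
        /* Step 1: find a free page. */
        unsigned char *copy = malloc(TOY_PAGE_SIZE);
        if (copy == NULL) return -1;
        /* Step 2: copy the shared page; the original is left untouched. */
        memcpy(copy, m->page, TOY_PAGE_SIZE);
        /* Step 3: point this process's mapping at the new page. */
        m->page = copy;
        m->is_private = 1;
    }
    m->page[offset] = value;
    return 0;
}
```

If two mappings start out pointing at the same shared page and one of them writes, only the writer ends up with a private copy; the other mapping and the original page still see the old data.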

A memory block is a set of contiguous pages that all have the same protection attributes and that are all backed by the same type of physical storage.

Protection attributes are given to a region for the sake of efficiency only, and they are always overridden by protection attributes assigned to physical storage.

A block’s protection attributes override the protection attributes of the region that contains the block.

The Importance of Data Alignment

Data alignment is not so much a part of the operating system’s memory architecture as it is a part of the CPU’s architecture.

CPUs operate most efficiently when they access properly aligned data. Data is aligned when the memory address of the data modulo the data’s size is 0. For example, a WORD value should always start on an address that is evenly divisible by 2, a DWORD value should always start on an address that is evenly divisible by 4, and so on. When the CPU attempts to read a data value that is not properly aligned, the CPU will do one of two things. It will either raise an exception or perform multiple, aligned memory accesses to read the full misaligned data value.
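The alignment rule is a one-line check. The helper below is a sketch with a made-up name (is_aligned); it assumes that converting a pointer to uintptr_t yields the numeric address, which is true on mainstream platforms but is implementation-defined in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if address is a multiple of size, that is, address modulo size is 0. */
static bool is_aligned(const void *address, size_t size) {
    assert(size != 0);
    return (uintptr_t)address % size == 0;
}
```

For example, address 0x1002 is aligned for a 2-byte WORD but not for a 4-byte DWORD, while 0x1000 is aligned for both.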

Here is some code that accesses misaligned data:

VOID SomeFunc(PVOID pvDataBuffer) {

   // The first byte in the buffer is some byte of information
   char c = * (PBYTE) pvDataBuffer;

   // Increment past the first byte in the buffer
   pvDataBuffer = (PVOID)((PBYTE) pvDataBuffer + 1);

   // Bytes 2-5 contain a double-word value
   DWORD dw = * (DWORD *) pvDataBuffer;

   // The line above raises a data misalignment exception on some CPUs
...

Obviously, if the CPU performs multiple memory accesses, the performance of your application is hampered. At best, it will take the system twice as long to access a misaligned value as it will to access an aligned value—but the access time could be even worse! To get the best performance for your application, you’ll want to write your code so that the data is properly aligned.
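When the layout of a buffer is outside your control, as in the function above, one common portable way to read a possibly misaligned value is to copy its bytes into a properly aligned local variable. This approach is not taken from the book; it is a sketch, and read_u32_unaligned is a hypothetical helper name. It uses uint32_t where the Windows code would use DWORD.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address that may not be 4-byte aligned.
   memcpy has no alignment requirement, and the local variable is
   always properly aligned. */
static uint32_t read_u32_unaligned(const void *source) {
    uint32_t value;
    assert(source != NULL);
    memcpy(&value, source, sizeof value);
    return value;
}
```

Compilers typically turn such a fixed-size memcpy into whatever load sequence is safe on the target CPU, so the copy usually costs little on hardware that tolerates misaligned loads.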

References

Windows® via C/C++, Fifth Edition

Operating Systems Overview

I first dealt with computers in elementary school around 1998, when I used to play Sky Roads on DOS computers. In 2000 I convinced my father to buy a computer for our home. Some guy came over and installed Windows so that we could use the computer. This Windows was a mysterious thing, like a ghost that controlled my PC. I didn’t really try to learn what Windows was beyond its being a kind of “Operating System”, but as an end user I was very good at using it.

Now, as a computer science student, I have figured out what an Operating System (abbreviated OS) is all about. This article intends to give a brief overview of Operating Systems.


What is an OS?

  • A program that controls the execution of application programs and acts as an interface between the applications and the computer hardware.
  • It has 3 objectives:
    • Convenience: makes the computer more convenient to use.
    • Efficiency: allows computer resources to be used in an efficient manner.
    • Ability to evolve: permits the development and introduction of new system functions without interfering with service.

OS as a User/Computer Interface

  • The OS provides a variety of services in the following areas:
    1. Program development: editors and debuggers are tools supplied with the OS.
    2. Program execution: automates the steps needed to execute a program.
    3. Access to I/O devices: acts as a uniform façade to I/O devices.
    4. Controlled access to files: provides protection mechanisms to control access to files.
    5. System access: protects system data and resources from unauthorized users.
    6. Error detection and response: detects program and hardware errors and clears the error condition with the least impact on the running applications.
    7. Accounting: monitors system resources and collects usage statistics, which help in judging whether the resources need to be upgraded.

OS as a Resource Manager

  • Memory allocation is controlled by the OS and the MMU (Memory Management Unit).
  • The OS decides when an I/O device can be used by a program in execution.
  • The OS controls access to and use of files.
  • The processor itself is controlled by the OS: the OS decides how much processor time a particular program gets.

What makes an OS evolve?

  1. Hardware upgrades and new types of hardware: new types of hardware require that the OS be able to deal with them, so the OS should be built with support for that hardware.
  2. New services: the OS offers new services demanded by users or system managers.
  3. Bug fixes: bugs appear over time and are detected by users, so the OS must be fixed, and sometimes fixing one bug introduces another.

OS Evolution

In the dark ages, from the late 1940s to the mid-1950s, there was no OS (I call these years “before OS”, abbreviated BOS, like BC), and the programmer had to deal directly with the computer hardware. These computers were run from a console consisting of display lights, toggle switches, and some form of input device.

  • Serial Processing

    • Users had access to the computer in series.
    • These systems presented two main problems:
      1. Scheduling: users had to use a sign-up sheet to reserve computer time, and they couldn’t know precisely how long their program would take to finish.
      2. Setup time: a single program, called a job, could involve loading the compiler and the source program, saving the compiled (object) program, and then loading and linking the object program before it could run.
  • Simple Batch Systems

    • Monitor: a software program that handles executing the jobs provided by the user on tapes or disks.
  • Multiprogrammed Batch Systems

    • The I/O devices are much slower than the processor, leaving the processor idle most of the time waiting for the I/O devices to finish their operations.
    • Uniprogramming: the processor starts executing a certain program, and when it reaches an I/O instruction, it must wait until that I/O instruction is fully executed before proceeding.
    • Multiprogramming: in contrast to uniprogramming, when a job needs to wait for an I/O operation, the processor switches to another job and executes it until the first job’s I/O completes. The processor continues to switch between jobs each time one of them reaches an I/O operation.
    • A multiprogrammed batch system must rely on certain hardware capabilities to support switching between programs.
    • Interrupt-driven I/O or DMA helps a lot in multiprogramming environments, allowing the processor to issue an I/O command and proceed with executing another program.

  • Time-Sharing Systems

    • Just as multiprogramming allows the processor to handle multiple batch jobs at a time, it can also allow the processor to handle multiple interactive jobs at a time, through time sharing.
    • Time slicing: a system clock generates interrupts at a constant rate, allowing the OS to regain control and assign the processor to another process.

Time sharing and multiprogramming raise a host of new problems:

  • If multiple jobs are in memory they must be protected from interfering with each other.
  • File systems must be protected from access by unauthorized users.
  • Contention among programs for resources (mass storage, printers, … ) must be handled by the OS.

Major Achievements in OS

  • The Process
    • Possible Definitions:
      • A program in execution.
      • An instance of a program running on a computer.
    • The interrupt helped programmers in developing early multiprogramming and multiuser interactive systems.
    • The main causes of errors when handling more than one process at a time are:
      1. Improper synchronization
      2. Failed mutual exclusion: only one routine at a time should be allowed to perform an update against a shared resource such as a file; errors occur when this is not enforced.
      3. Non-determinate program operation: programs may interfere with each other when they share memory and their execution is interleaved by the processor.
      4. Deadlocks: two or more programs are hung up waiting for each other to release a resource.
    • The execution context, or process state, is the internal data by which the OS is able to control the process.
    • The context contains the contents of the processor registers, as well as information used by the OS, such as the priority of the process.
  • Memory Management
    • Process Isolation: OS should prevent independent processes from interfering with each other.
    • Automatic allocation and management: programs should be dynamically allocated across the memory hierarchy as required, and allocation should be transparent to the programmer.
    • Support of modular programming: programmers should be able to define program modules and to create, destroy, and alter the size of modules dynamically.
    • Long-term storage: saving information for extended periods of time.
    • Protection and access control.
  • Information Protection and Security
    • The growing use of time-sharing systems and computer networks has brought concern for the protection of information.
    • We are concerned with the problem of controlling access to the computer system.
    • Work in this area can be grouped into:
      • Availability: protect the system against interruption.
      • Confidentiality: users cannot read data for which access is unauthorized.
      • Data integrity: protection of data from unauthorized modification.
      • Authenticity: verification of the identity of users and the validity of messages.
  • Scheduling and Resource Management
    • Any resource allocation and scheduling policy must consider:
      1. Fairness: jobs of the same class competing for a resource are to be given equal and fair access to that resource.
      2. Differential responsiveness: the OS may need to discriminate among different classes of jobs with different service requirements.
      3. Efficiency: OS should maximize processor utilization, minimize response time, and accommodate as many users as possible.
    • OS elements involved in the scheduling of processes and the allocation of resources in a multiprogramming environment:
      • Short-term queue: contains processes that are in main memory and ready to run as soon as the processor is made available.
      • Short-term scheduler: decides which process in the short-term queue gets the processor next; a common strategy is to give each process some time in turn (the round-robin technique).
      • Long-term queue: a list of all new jobs waiting to use the processor; the OS adds jobs to the system by transferring a process from the long-term queue to the short-term queue.
      • I/O queues: each device has a queue for the processes waiting to use that device; it is the OS that decides which process to assign to an available I/O device.
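The round-robin technique mentioned under the short-term scheduler is easy to sketch. The function below is a simplified illustration, not taken from the reference: round_robin is a made-up name, all jobs are assumed to be in the short-term queue from the start, and I/O waits and the cost of switching between processes are ignored. Each job gets at most one time slice (quantum) per turn until it finishes.

```c
#include <assert.h>
#include <stddef.h>

/* Run the jobs round-robin. remaining[i] is the processor time job i
   still needs (must be > 0) and is counted down to 0. The indices of
   the jobs are written to finish_order in order of completion.
   Returns the total time elapsed. */
static int round_robin(int *remaining, size_t job_count, int quantum,
                       size_t *finish_order) {
    assert(quantum > 0);
    for (size_t i = 0; i < job_count; i++) assert(remaining[i] > 0);

    int clock = 0;
    size_t finished = 0;
    while (finished < job_count) {
        for (size_t i = 0; i < job_count; i++) {
            if (remaining[i] == 0) continue; /* job already done */
            /* Run for one quantum, or less if the job finishes early. */
            int slice = remaining[i] < quantum ? remaining[i] : quantum;
            remaining[i] -= slice;
            clock += slice;
            if (remaining[i] == 0) finish_order[finished++] = i;
        }
    }
    return clock;
}
```

With three jobs needing 5, 2, and 4 time units and a quantum of 2, the jobs finish in the order 1, 2, 0 after 11 time units in total. The short job is not stuck behind the long one, which is the kind of fairness the technique is after.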

References

Operating Systems: Internals and Design Principles (6th Edition), William Stallings
