Debugging Memory Leaks in Windows
Unlike managed languages such as C# and Java, where the garbage collector automatically frees memory that is no longer needed, native languages like C and C++ require explicit memory management. This means that the programmer must decide when and how to allocate memory for objects, and when and how to free it once those objects are no longer in use.
Though managed programs are also prone to various memory-related bugs, native languages make such bugs much more likely to occur. This article will describe various types of memory-related problems that are common in native programs, and show how to fix them.
Since memory management and the tools available to fix memory-related bugs vary from one operating system to another, this article will concentrate on debugging memory leaks on Windows.
Memory Management in Windows
Before discussing the finer points of debugging memory-related problems in Windows, let’s have a quick overview of Windows’ memory management system.
Virtual Memory
Today’s desktop computer operating systems are extremely complex, allowing hundreds of applications to run simultaneously by sharing the system’s limited resources. If each one of these programs were to directly control the computer’s physical memory, all hell would break loose and the system would become inoperable in a matter of seconds. Every program would compete with every other program to scoop up every last shred of memory, and one tyrannical application could hog all the memory at the expense of all the others.
Virtual memory, or virtual address space, to be more precise, is a technique by which the operating system gives each application a range of contiguous virtual addresses for its exclusive use, while backing these addresses with physical memory according to the overall needs of all the applications running on the system.
When certain blocks of memory are frequently accessed by programs, the operating system keeps them in the ultra-fast physical memory; less frequently accessed blocks are instead placed in the slower swap file, on the hard drive.
The operating system can move a memory block at any time, whether within physical memory, within the swap file, or between the two. Since the addresses used by the process are virtual, programs won’t know the difference: objects continue to reside at the same virtual address, while the operating system keeps track of the mapping between virtual addresses and actual storage locations.
This memory management technique enables optimal use of limited resources according to the overall needs of all the programs running on the system at any given time.
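To make the idea of a stable virtual address concrete, here is a minimal sketch of a lookup table that maps virtual pages to their current backing location. It is a toy model for illustration only and says nothing about how Windows actually implements paging; the names ToyAddressSpace, PageLocation and Backing are hypothetical.

```cpp
#include <cstdint>
#include <unordered_map>

// Where the contents of a page currently live (toy model, two options only).
enum class Backing { PhysicalRam, PageFile };

struct PageLocation {
    Backing backing;
    std::uint64_t slot;  // frame number in RAM, or slot number in the swap file
};

// Hypothetical per-process table: virtual page number -> current location.
class ToyAddressSpace {
public:
    // Associates a virtual page with a backing location.
    void map(std::uint64_t virtual_page, PageLocation location) {
        table_[virtual_page] = location;
    }

    // Models the operating system moving a page: only the table entry
    // changes, the virtual page number seen by the program does not.
    void relocate(std::uint64_t virtual_page, PageLocation new_location) {
        table_.at(virtual_page) = new_location;
    }

    // Returns where the page currently lives (throws if it was never mapped).
    PageLocation lookup(std::uint64_t virtual_page) const {
        return table_.at(virtual_page);
    }

private:
    std::unordered_map<std::uint64_t, PageLocation> table_;
};
```

In this sketch, a program that holds virtual page 7 keeps using the number 7 before and after a call to relocate; only the result of lookup changes.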
Memory-Related Bugs
There are various types of memory-related problems; the following is a description of the two most common ones.
Memory Leak
A memory leak is a gradual loss of available memory caused by memory blocks that are not released when they are no longer needed.
Leaky applications will require more and more memory, until the program is shut down.
On 32-bit Windows, where each process has by default only 2 gigabytes of user address space (up to 3 gigabytes with special configuration), leaks can increase the memory required by an application to the point where memory runs out and the application crashes.
Though 64-bit applications have much more memory space, memory leaks can still cause significant problems. For example, the operating system will have less and less physical memory at its disposal, making increasing use of the slower swap file on the hard drive, and slowing down all applications on the system.
Memory Corruption
Memory corruption can occur in many situations, but is usually caused by a program writing to the wrong memory location. This can happen when a program writes past the end of a block that is too small for its needs, or when an object is released before all parts of the program are finished with it.
The effects of memory corruption can be particularly insidious. The simplest cases (from the standpoint of the developer doing the debugging) involve an access to a protected memory area. In this case, the application crashes with an access violation at the moment of the faulty access, which makes the cause of the error much easier to identify.
However, in most cases, the erroneous write will be allowed and not flagged, corrupting the memory space with faulty data. Some time later, and sometimes much, much later, the program will reference the faulty data in the memory space, causing the application to misbehave. Depending on the program’s reading of the corrupt data, accessing this space can cause a crash, an incorrect calculation, or no problem at all. The worst thing is that the failure can happen at any time and in random areas in the code between application launches, without any kind of consistency. This is when debugging gets tricky.
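As a small illustration of why an immediate failure is so much easier to deal with than a silent overwrite, the sketch below routes writes through the bounds-checked std::vector::at, which reports an out-of-range index on the spot instead of letting the write land in a neighbouring block. The helper name checked_write is hypothetical, and this only covers the out-of-bounds case, not objects that are released too early.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: writes value at index if the index is valid.
// Returns true when the write was performed, false when the index was
// out of range (the buffer is left untouched in that case).
bool checked_write(std::vector<int>& buffer, std::size_t index, int value) {
    try {
        buffer.at(index) = value;  // at() validates the index before writing
        return true;
    } catch (const std::out_of_range&) {
        // The faulty access is detected here, at the point where it happens,
        // rather than much later when the corrupt data is read back.
        return false;
    }
}
```

With a raw array and an unchecked index, the same mistake would typically go unnoticed until some unrelated part of the program read the overwritten data.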
Identifying memory leaks with UMDH
Though there are many software applications that specialize in analyzing memory leaks, one of the most useful and simple tools is UMDH, provided free of charge by Microsoft with its Debugging Tools for Windows (see my article Introduction to WinDBG for further details).
Unlike other products with very complex user interfaces, UMDH is a simple command-line interface program that only does two things, but does them very well.
- It generates a snapshot of all of an application’s memory allocations
- It compares two snapshots to identify the source of leaks
Preparation
Using UMDH requires some preparation. First, you have to configure Windows to capture the call stack of every allocation made by the targeted application. Then, you have to ensure access to the debugging symbols needed to match the call stacks to the names of program functions and to locations in the source files.
Activating call stack capture
The simplest way to activate call stack capture for any given application is to use the Global Flags application, also provided with Debugging Tools for Windows. Once in Global Flags, just click on the third tab, Image File, enter the process name (for example, example.exe) in the Image: field, hit [TAB], and check the Create user mode stack trace database box.
This configuration will be saved in the Windows registry until manually deactivated. To deactivate this debugging option, just repeat the steps and uncheck the box.
Configuring symbol access
To access debugging symbols, you must define an environment variable called _NT_SYMBOL_PATH whose value should be set as follows:
c:\application\pdb\file\location;srv*c:\temporary\folder*http://msdl.microsoft.com/download/symbols
In this example, c:\application\pdb\file\location is the path, or paths (separated by semicolons), to the PDB files of the application to be debugged, and srv*c:\temporary\[…]/download/symbols points to Microsoft’s public symbol server for Windows and its components. c:\temporary\folder can be any folder on the computer; the Windows symbol files downloaded from the server are cached in this folder to speed up later access.
Debugging
Once your prep work is done, using UMDH requires a few simple steps:
1) Start and run the targeted application until its memory use is known and stable
2) Take a snapshot of memory allocations with UMDH
Proceed as follows to take a snapshot of the memory allocations with UMDH:
umdh -p:PID [-f:ExitFile.txt]
You can find the application’s Process ID (PID) by using Windows Task Manager. Use Windows’ online help to find out the exact procedure for your version of Windows.
3) Perform the operation that produces the memory leak
You should repeat the leaky operation several times, since the results of the analysis will be sorted by decreasing memory size. The more often the leak is reproduced, the higher it will be on the list and the easier it will be to identify.
4) Take a second snapshot of memory allocation with UMDH
This second snapshot should be taken at a time when you would expect the amount of memory used to be the same as in step 2. For example, if the application creates 10 objects when you press a button to display a dialogue box, you can expect these 10 objects to be released once the dialogue box is closed. By taking the first snapshot before you press the button and the second one after the dialogue box is closed, you can determine whether this is the case.
5) Use UMDH to compare the two snapshots
To compare the two snapshots, just use UMDH a little differently than in steps 2 and 4:
umdh [-l] file1.txt file2.txt > Comparablefile.txt
Here, option -l prints the source file names as well as the line numbers in the call stacks during the analysis. Redirecting to a text file ( > Comparablefile.txt ) is optional, but strongly recommended, given the usual volume of data generated by UMDH.
The comparison of the UMDH snapshots can take from just a few seconds to several minutes, depending on the time lapse between snapshots and the severity of the memory leak.
The analysis will provide a list of all call stacks where memory has been allocated but not released in the intervening time between the two snapshots.
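The before/after principle behind this comparison can be sketched in a few lines of C++. The toy code below is only an analogy and not how UMDH works internally: it replaces the global operator new and operator delete to count live heap blocks, and reports how many blocks allocated during an operation are still alive afterwards. All names (g_live_blocks, blocks_leaked_by, leaky_operation, clean_operation) are hypothetical, the counter is not thread-safe, and no call stacks are recorded.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Number of heap blocks currently allocated through operator new.
// Toy bookkeeping only: not thread-safe, and it records no call stacks.
static long g_live_blocks = 0;

// Replacement global allocation functions that maintain the counter.
// The default new[] and delete[] forward to these.
void* operator new(std::size_t size) {
    void* block = std::malloc(size != 0 ? size : 1);
    if (block == nullptr) throw std::bad_alloc();
    ++g_live_blocks;
    return block;
}

void operator delete(void* block) noexcept {
    if (block != nullptr) {
        --g_live_blocks;
        std::free(block);
    }
}

void operator delete(void* block, std::size_t) noexcept {
    operator delete(block);
}

// Hypothetical operations standing in for "the action under test".
int* g_last_buffer = nullptr;

void leaky_operation() {
    // Any buffer previously stored here is overwritten without delete[].
    g_last_buffer = new int[1000]();
}

void clean_operation() {
    int* scratch = new int[1000]();
    scratch[0] = 42;
    delete[] scratch;
}

// Takes a "snapshot" before and after the operation and returns the
// number of blocks allocated in between that are still alive.
long blocks_leaked_by(void (*operation)()) {
    const long before = g_live_blocks;
    operation();
    const long after = g_live_blocks;
    return after - before;
}
```

Calling blocks_leaked_by(clean_operation) gives a difference of zero, while blocks_leaked_by(leaky_operation) leaves one block outstanding per call. UMDH gives you the same kind of difference, with the added benefit of the call stack of each outstanding allocation.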
Example:
Here is a simple C++ program with a memory leak:
#include <iostream>
using namespace std;

int main(int argc, char *argv[])
{
    cout << "Take a snapshot, then press ENTER" << endl;
    cin.ignore();

    // Allocation of memory
    int* onethousand = new int[1000];
    int* tenthousand = new int[10000];

    // This memory should be released as follows to avoid leakage:
    // delete[] onethousand;
    // delete[] tenthousand;

    cout << "Take the second snapshot, then press ENTER" << endl;
    cin.ignore();
}
The UMDH analysis of this program produces the following result:
+ 9c64 ( 9c64 - 0) 1 allocs BackTrace267DF9C
+ 1 ( 1 - 0) BackTrace267DF9C allocations
ntdll!RtlpCallInterceptRoutine+26
ntdll!RtlpAllocateHeapInternal+4D95B
ntdll!RtlAllocateHeap+28
ucrtbased!_toupper+248
ucrtbased!_toupper+56
ucrtbased!_malloc_dbg+1A
ucrtbased!malloc+14
leak!operator new+D (f:\dd\vctools\crt\vcstartup\src\heap\new_scalar.cpp, 19)
leak!operator new[]+C (f:\dd\vctools\crt\vcstartup\src\heap\new_array.cpp, 15)
leak!main+8C (d:\users\françois\documents\visual studio 2015\projects\leak\leak\main.cpp, 11)
leak!invoke_main+1E (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl, 74)
leak!__scrt_common_main_seh+15A (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl, 264)
leak!__scrt_common_main+D (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl, 309)
leak!mainCRTStartup+8 (f:\dd\vctools\crt\vcstartup\src\startup\exe_main.cpp, 17)
KERNEL32!BaseThreadInitThunk+24
ntdll!__RtlUserThreadStart+2F
ntdll!_RtlUserThreadStart+1B
+ fc4 ( fc4 - 0) 1 allocs BackTrace267DF08
+ 1 ( 1 - 0) BackTrace267DF08 allocations
ntdll!RtlpCallInterceptRoutine+26
ntdll!RtlpAllocateHeapInternal+4D95B
ntdll!RtlAllocateHeap+28
ucrtbased!_toupper+248
ucrtbased!_toupper+56
ucrtbased!_malloc_dbg+1A
ucrtbased!malloc+14
leak!operator new+D (f:\dd\vctools\crt\vcstartup\src\heap\new_scalar.cpp, 19)
leak!operator new[]+C (f:\dd\vctools\crt\vcstartup\src\heap\new_array.cpp, 15)
leak!main+70 (d:\users\françois\documents\visual studio 2015\projects\leak\leak\main.cpp, 10)
leak!invoke_main+1E (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl, 74)
leak!__scrt_common_main_seh+15A (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl, 264)
leak!__scrt_common_main+D (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl, 309)
leak!mainCRTStartup+8 (f:\dd\vctools\crt\vcstartup\src\startup\exe_main.cpp, 17)
KERNEL32!BaseThreadInitThunk+24
ntdll!__RtlUserThreadStart+2F
ntdll!_RtlUserThreadStart+1B
(Result summarized for easier reading)
Reading the results from the top, you can see two call stacks, which lead back to lines 11 and 10 of the program’s main.cpp file, respectively. For each call stack, the two top lines starting with + represent
- The difference in bytes allocated at this point in the code between the first and second snapshots,
- The difference in number of allocations at this point between the first and second snapshots.
If you look at the first call stack, you can see a leak of 9c64 bytes (in hexadecimal, or 40,036 bytes in decimal), allocated all at once. Knowing that we allocated 10,000 int-type elements at this point, and that one int-type element in a 32-bit application on Windows (which is the case in this example) takes up 4 bytes, we can deduce that the tool detected a leak equivalent to 10,009 elements: the 40,000 bytes we requested, plus 36 extra bytes.
The extra 36 bytes do not affect the usefulness of this method for finding and fixing memory leaks. In this case, the additional memory is a header added to each memory allocation for management purposes; the ucrtbased!_malloc_dbg frame in the call stacks suggests that it comes from the debug C runtime used by this build. Different builds and different versions of Windows will provide different results.
The second call stack tells us that, as expected, we have a leak of fc4 bytes (4,036 in decimal). Once again, we have an extra 36 bytes, which is consistent with my assumption regarding the allocation header.
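The arithmetic for both call stacks can be double-checked with a few lines of C++. This is just a worked check of the numbers quoted in this article; the names kIntSize and overhead_bytes are hypothetical, and the 4-byte int size is the assumption stated above for this 32-bit build.

```cpp
#include <cstddef>

// Size of one int in the build analysed in this article (assumed: 4 bytes).
constexpr std::size_t kIntSize = 4;

// Bytes reported by UMDH beyond what the program actually requested.
constexpr std::size_t overhead_bytes(std::size_t reported_bytes,
                                     std::size_t element_count) {
    return reported_bytes - element_count * kIntSize;
}

// First call stack: 0x9c64 = 40,036 bytes reported for 10,000 ints.
static_assert(overhead_bytes(0x9c64, 10000) == 36, "unexpected overhead");

// Second call stack: 0xfc4 = 4,036 bytes reported for 1,000 ints.
static_assert(overhead_bytes(0xfc4, 1000) == 36, "unexpected overhead");
```

Both checks pass at compile time, so the same 36-byte overhead applies to each allocation regardless of its size.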
Once the leak site has been identified, the hardest part of the work is done. All you need to do is to check the code to see how to fix it so that it properly releases memory once it is no longer needed.
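For the example program, the most direct fix is to uncomment the two delete[] lines. As an alternative, here is a minimal sketch that lets standard library types own the memory, so that the release happens automatically. The function name allocate_without_leaking is hypothetical; it stands in for the allocation section of main() and returns a count purely for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Stands in for the allocation section of main() in the example program.
// Returns the total number of ints allocated, for illustration only.
std::size_t allocate_without_leaking() {
    // Option 1: keep new[], but hand ownership to std::unique_ptr so that
    // delete[] runs automatically when the variable goes out of scope.
    std::unique_ptr<int[]> onethousand(new int[1000]);

    // Option 2: let std::vector own the storage.
    std::vector<int> tenthousand(10000);

    onethousand[0] = 1;  // use the buffers as usual
    tenthousand[0] = 2;

    return 1000 + tenthousand.size();
}  // both buffers are released here, on every exit path
```

Taking the two UMDH snapshots around code written this way should no longer show the two call stacks from main.cpp in the comparison.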
Happy debugging!