Saturday, July 26, 2008

Heap usage

I haven't posted in a while, due to lack of time and too much work, and I think this post has been rather overdue.
One aspect of AIX's malloc (and, for that matter, of malloc on most other operating systems) is that memory you free is not reflected in the svmon output. This is because the malloc subsystem caches the freed memory so that it can be reused by later malloc calls.
An easy way to see how much memory your application is using (the memory malloced by it, plus the memory in the free pool maintained by the malloc subsystem) is to use the variable process_brk, which is exported by libc.
The way I usually go about it is to use the dbx subcommand
(dbx) p &process_brk
This gives me the address to dump, which I dump using a command similar to the one below
(dbx) 0x12345678/3X
This prints three words, for example:
12345678 12345678 1
The first word is the brk value before the first malloc was done, and the second word is the brk value after the most recent allocation. The third word tells how many sbrk()s were done. The difference between the second and first words gives me a very good estimate of the total memory used by my program.
Another benefit of this is checking for heap/stack collisions. To check whether there has been a heap/stack collision in my 32-bit app, I dump the stack pointer and check whether it falls between the minimum and maximum process_brk values.
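The arithmetic behind both checks can be sketched in a few lines of Python. This is only an illustration of the calculation, not something dbx or libc provides: the function names are made up for this post, and the sample addresses are invented rather than taken from a real process.

```python
def parse_word(word: str) -> int:
    """Parse one hexadecimal word as printed by dbx's /X format."""
    return int(word, 16)


def heap_bytes(brk_start: str, brk_current: str) -> int:
    """Bytes between the initial break and the current break."""
    return parse_word(brk_current) - parse_word(brk_start)


def stack_in_heap(stack_pointer: str, brk_start: str, brk_current: str) -> bool:
    """True if the stack pointer lies inside the heap range (a collision)."""
    sp = parse_word(stack_pointer)
    return parse_word(brk_start) <= sp < parse_word(brk_current)


# Invented sample values (hypothetical, not from a real dump).
start, current = "20001000", "20401000"
print(heap_bytes(start, current))                  # 4194304, i.e. 4 MB
print(stack_in_heap("2ff22a40", start, current))   # False: no collision
```

Paste in the first two words of the process_brk dump and the stack pointer from your own session to run the same checks.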
Hope this helps.

Tuesday, July 8, 2008

A few dbx quickies

Here are some of the dbx commands I frequently use to speed things up:

1. addcmd : Whenever I hit a breakpoint, I need to display the stack. I use addcmd to automatically display the stack when a breakpoint is hit.

2. set $repeat : I usually get tired of pressing 's' again and again while stepping through a program. I set this variable to automatically re-execute the last command when enter is pressed.

3. set -o vi : same as ksh's set -o vi

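4. -E option : I often have to debug programs with the malloc debug variables (MALLOCTYPE and MALLOCDEBUG) exported. However, I don't want dbx itself to run with malloc debug on, so I pass the variables to the debugged program with -E and invoke dbx as follows: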
dbx -E MALLOCTYPE=debug -E MALLOCDEBUG=report_allocations,catch_overflow ./a.out

5. goto : Useful if I want to skip executing some code. I usually do this when the program is about to read some user input or go to sleep; I just skip all that by using goto, which directly sets the instruction pointer.

I hope you find these tips useful. I'll post some more tips later.

Sunday, June 22, 2008

Identifying memory leaks on AIX

Memory leaks are often a pain, especially if the program runs for a long time. The pain is increased by the difficulty of pin-pointing the exact place where the memory leak occurs. There are many commercial tools available which do an excellent job of pin-pointing memory leaks, but most of them come at a cost. If you use AIX, though, you have a plain jane but very flexible and usable alternative: the malloc debug option called report_allocations.
With every call to malloc, a log entry is added which stores the stack the allocation was made from (yes, it does some dbx-like stack walk thingy), and with every call to free() the corresponding entry is deleted. An atexit handler is registered which prints all the entries that are still logged, that is, the allocations that were never freed. I guess this clarifies how report_allocations works.
To enable report_allocations, one needs to export the following environment variables:

export MALLOCTYPE=debug
export MALLOCDEBUG="report_allocations,stack_depth:4"

stack_depth indicates how much of the stack information is to be stored, the default being 4 and the highest being 32.
Malloc debug does not aggregate allocations of a similar kind (that is, allocations of the same size from the same stack trace), but prints all allocations one by one. This is sometimes a drawback. However, one can easily write a script around malloc debug to aggregate the data. One such script is readily available in a developerworks article. The author of the script warns that it may break in some corner cases, but I guess we can fix those issues. Over to developerworks for more info:
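To give an idea of what such an aggregation looks like, here is a minimal Python sketch. It is not the developerworks script. The report layout in SAMPLE (an "Allocation #N" header, an "Allocation size" line, then one "address  symbol" line per stack frame) is an assumption made for illustration, and the function names in it are invented, so the regular expressions will need adjusting to whatever your malloc debug output actually looks like.

```python
import re
from collections import defaultdict

# Assumed record layout; adapt these patterns to the real report.
HEADER_RE = re.compile(r"^[ \t]*Allocation #\d+:.*$", re.MULTILINE)
SIZE_RE = re.compile(r"Allocation size:\s*(0x[0-9A-Fa-f]+)")
FRAME_RE = re.compile(r"^[ \t]*0x[0-9A-Fa-f]+[ \t]+(\S+)", re.MULTILINE)


def aggregate(report: str) -> dict:
    """Count allocations that share the same size and call stack."""
    totals = defaultdict(int)
    # Everything before the first header is preamble, so drop it.
    for record in HEADER_RE.split(report)[1:]:
        size_match = SIZE_RE.search(record)
        if size_match is None:
            continue  # malformed record, skip it
        size = int(size_match.group(1), 16)
        frames = tuple(FRAME_RE.findall(record))
        totals[(size, frames)] += 1
    return dict(totals)


# Hypothetical report text, made up for this example.
SAMPLE = """\
Allocation #0: 0x20000A68
    Allocation size: 0x4
    Allocation traceback:
    0xD03EF7A8  malloc
    0x100003B0  make_node
Allocation #1: 0x20000A88
    Allocation size: 0x4
    Allocation traceback:
    0xD03EF7A8  malloc
    0x100003B0  make_node
Allocation #2: 0x20000AA8
    Allocation size: 0x10
    Allocation traceback:
    0xD03EF7A8  malloc
    0x10000410  read_config
"""

# Largest total first, so the biggest suspected leak is at the top.
ranked = sorted(aggregate(SAMPLE).items(),
                key=lambda item: item[0][0] * item[1], reverse=True)
for (size, frames), count in ranked:
    print(f"{count} x {size} bytes via {' <- '.join(frames)}")
```

Sorting by total bytes rather than by count is just one choice; sorting by count is more useful when hunting many small leaks from the same place.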

Monday, June 16, 2008

The beginnings: Speeding up C++ STL Vector applications

Having started this blog quite some time ago, I didn't have any idea what I was going to post in it. I only knew that it would contain tips and tricks for AIX which one might find handy, some of which one might find elsewhere, some of which one may not. I am still not sure what sort of tips and tricks I'm going to put up, but I'll put them up, that's for sure. Don't ask me about the frequency.

This first post is for guys who use STL Vectors heavily in their C++ code. If you guys want an improvement in performance, you might want to use the 3.1 malloc allocation policy. All you have to do is export one of the following environment variables before running your code:

export MALLOCTYPE=3.1 #for 32-bit applications
export MALLOCTYPE=3.1_64BIT #for 64-bit applications

How does it help? Well, it appears that a vector keeps growing as and when it needs to. Hence a lot of calls to realloc(). What happens with the 3.1 allocation policy is that requests are rounded up to the next higher power of 2. That is, if you request 1026 bytes, you get 2048 bytes. Hence when a realloc is done, it often returns without doing any work, since the memory is already there.
3.1 is also faster than the default allocation policy. The default policy uses a tree-based data structure, which essentially forces an average time of O(log(n)), whereas 3.1 is a bucket-based allocator with time complexity O(1).
The caveat to using 3.1 is that it is wasteful of memory, so it's up to you to decide what you want to compromise on: memory usage or speed.
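The rounding, and the waste that comes with it, can be modelled in a few lines of Python. This is a simplified model of the idea only, not the AIX allocator itself: it ignores any per-block bookkeeping overhead and any minimum bucket size, and the function names are made up for this post.

```python
def bucket_size(request: int) -> int:
    """Smallest power of two that is >= request (request must be >= 1)."""
    if request < 1:
        raise ValueError("request must be positive")
    return 1 << (request - 1).bit_length()


def grows_in_place(old_request: int, new_request: int) -> bool:
    """True if the new size still fits in the block given to the old size."""
    return new_request <= bucket_size(old_request)


print(bucket_size(1026))            # 2048, the example from this post
print(grows_in_place(1026, 2000))   # True: realloc has nothing to move
print(grows_in_place(1026, 2049))   # False: a bigger bucket is needed
print(bucket_size(1026) - 1026)     # 1022 bytes handed out but not asked for
```

The last line is the trade-off in a nutshell: a request just over a power of 2 gets nearly double what it asked for.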

How did we find this out? Well, we were running some benchmark tests, and on exporting MALLOCTYPE=3.1 we found the performance shot up by 40%. Of course, the benchmark we were running was memory bound.