Programming Tips

What is an assert and how do I use it?

Asserts are a means of debugging programs: an assert verifies that a certain condition is met. You use asserts when your code makes some assumptions and you would like to verify that those assumptions are not violated.

For example, suppose you have a function f() that receives a pointer. You wrote the program in such a way that f() is never supposed to receive a NULL pointer. You would write it in the following way:

#include <assert.h>  /* assert */
#include <stddef.h>  /* NULL */

void f(int* p)
{
  assert(p != NULL);
  ...
}

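When you compile the program with assertions enabled (the default), the call to assert verifies at run time that p is really not NULL; if the check fails, the program reports the failed expression together with the file name and line number, and aborts. When you compile the program for release with NDEBUG defined (for example -DNDEBUG in gcc; how a release build defines it is compiler dependent), the assert does nothing. In fact, it produces no code at all.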

Here is the source of assert.h, so you can see how it works:


#ifdef NDEBUG
#undef assert
#define assert(EX) ((void)0)

#else

extern void __assert(const char *, const char *, int);
#define assert(EX)  ((EX)?((void)0):__assert( # EX , __FILE__, __LINE__))
#endif /* NDEBUG */

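Asserts are NOT used for stopping a program in case of a run-time error (bad input, a missing file, a failed malloc). They disappear in a release build, so such errors must be handled by ordinary code.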
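
The difference between an assumption and a run-time error can be sketched as follows. This is only an illustration: parse_percent is a hypothetical function, and its return convention (0 for success, -1 for bad input) is an assumption made for the example. The asserts guard the programmer's assumptions, while bad input is rejected by code that stays in the release build.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal percentage (0..100) from text.
   Returns 0 and stores the value in *out on success, -1 on bad input. */
int parse_percent(const char *text, int *out)
{
    char *end;
    long value;

    /* Programmer assumptions: callers never pass NULL pointers.
       These checks vanish when NDEBUG is defined. */
    assert(text != NULL);
    assert(out != NULL);

    /* Run-time errors: the text comes from outside the program, so it
       is validated with ordinary code that is always compiled in. */
    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE)
        return -1;                      /* not a number, or trailing junk */
    if (value < 0 || value > 100)
        return -1;                      /* out of range */

    *out = (int)value;
    return 0;
}
```

A caller would test the return value, for example if (parse_percent(input, &pct) != 0), and report the problem to the user instead of relying on an assert.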

Why does the following program dump core on some Unix systems, but not on others?

#include <stdio.h>   /* printf */
#include <stdlib.h>  /* malloc */

int printi(char *buffer); 


int main()
{
  char *buffer;
  int i;

  buffer = (char *)malloc(20);	// malloc always returns a suitably aligned buffer
  if (buffer == NULL) return 1;
  i = printi(buffer);
  return 1;
}

int printi(char * buffer)
{
  int* array;
  buffer[0] = 'i';
          
  array = (int *)(buffer + 1);	// This is the problem: array is unaligned.
  array[0] = 2;			// The program fails at this point.
  printf("the num is %d\n", array[0]);
  return 1;
}

In a nutshell, the problem is that some CPUs can load a multi-byte value only from a memory address that is aligned to 2, 4 or 8 bytes (depending on the CPU). This simplifies the CPU and makes memory access faster. Addresses that are a multiple of 4 (or 2, or 8 on some machines) are called aligned addresses, and those that are not are called unaligned addresses. When you try to access unaligned memory on such a CPU, it generates an exception, which results in a core dump. Other CPUs (x86, for example) tolerate unaligned access, which is why the same program runs on some machines and fails on others.

How come the program works if I have an array of characters? After all, in an array of characters, most characters are not at aligned memory addresses. The answer is that a single character needs no alignment beyond one byte. On CPUs that cannot load a single byte directly, the compiler generates code that loads the appropriate aligned 32-bit word and extracts the byte you asked for (this takes several machine instructions).

What can I do? In most cases this indicates a problem in your program, so make sure you really need to do this.
If you do, the answer is not simple. Some compilers allow you to define pointers to unaligned memory (Visual C on the PC, the Digital compiler on Alpha). Unfortunately, it seems that gcc (g++) does not support such a feature, which forces you to write some hacky code.
If you would like gcc to warn you about such potential problems, you can use the -Wcast-align warning.
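
One commonly used portable workaround, which is not part of the original program, is to copy the bytes with memcpy instead of dereferencing an unaligned int pointer. The sketch below assumes that approach; store_int_unaligned and load_int_unaligned are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: write an int to an address of any alignment.
   memcpy copies byte by byte as far as the language is concerned,
   so no aligned access is required. */
void store_int_unaligned(char *dst, int value)
{
    memcpy(dst, &value, sizeof value);
}

/* Hypothetical helper: read an int back from an address of any alignment. */
int load_int_unaligned(const char *src)
{
    int value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

In printi() above, the two lines that use array would become store_int_unaligned(buffer + 1, 2) and a printf of load_int_unaligned(buffer + 1). The cost is a small copy on every access.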

C and C++ linkage, or
what does extern "C" mean?

Sometimes we see header files that have the following lines:
...

#ifdef __cplusplus
extern "C" {
#endif

// The main of the header

#ifdef __cplusplus
}
#endif

The #ifdef makes sure that the extern "C" block is seen only when compiling C++ programs, and not C programs (a C compiler does not understand extern "C"). This allows both C and C++ programs to share the same header file.

Defining a C++ function as extern "C" tells the compiler to generate function names and function calls that are compatible with C. This way a C program can call C++ functions and the other way round.

When compiling C programs, for the function int foo(int, int), the compiler will generate a symbol in the object file with the name _foo (for the sake of completeness, the names of the symbols are compiler dependent). When compiling the same function in C++, the compiler will generate a name that encodes the name of the function, the number and types of its arguments, and possibly other things (the namespace, for example, and with some compilers the return type). An extern "C" symbol contains only the function name, which also means that functions defined as extern "C" cannot be overloaded.
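
As an illustration, here is a minimal sketch of the header pattern applied to the function int foo(int, int) mentioned above. The header and the definition are shown in one file to keep the example self-contained; in a real project the guarded declaration would live in a shared header. The body of foo (adding its arguments) is an assumption made only so that the example does something.

```c
#include <assert.h>

/* ---- would normally be in a shared header ---- */
#ifdef __cplusplus
extern "C" {            /* seen only by a C++ compiler */
#endif

int foo(int a, int b);  /* one declaration, usable from C and C++ */

#ifdef __cplusplus
}
#endif
/* ---- end of the header part ---- */

/* Definition. Compiled as C or as C++, the symbol keeps the plain,
   unmangled C name because of the declaration above. */
int foo(int a, int b)
{
    return a + b;       /* body is an assumption for the example */
}
```

Because the declaration is wrapped in extern "C", a second foo with different argument types could not be added in a C++ file that includes this header.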

Here is a sample dump of what the compiler generates, first for ordinary C++ functions and then for the same functions declared extern "C".

C++ version:
/* test.cpp */

typedef void* EXHandle;
typedef double VALUE_TYPE;
VALUE_TYPE EXEval(EXHandle exr)
{
}

int foo(int i, float f)
{
}

[shacharf@soul]$ objdump --syms test.o

test.o:     file format elf32-i386

SYMBOL TABLE:
*ABS*  00000000 ext.cpp
.text  00000000 
.data  00000000 
.bss   00000000 
.text  00000000 gcc2_compiled.
.eh_frame      00000000 
.eh_frame      00000000 __FRAME_BEGIN__
.note  00000000 
.comment       00000000 
.text  00000009 EXEval__FPv
.text  00000009 foo__Fif

C style functions (extern "C"):
/* test.cpp */

typedef void* EXHandle;
typedef double VALUE_TYPE;
extern "C" VALUE_TYPE EXEval(EXHandle expr)
{
}

extern "C" int foo(int i, float f)
{
}

[shacharf@soul proj99w]$ objdump --syms test.o

test.o:     file format elf32-i386

SYMBOL TABLE:
*ABS*  00000000 ext.cpp
.text  00000000 
.data  00000000 
.bss   00000000 
.text  00000000 gcc2_compiled.
.eh_frame      00000000 
.eh_frame      00000000 __FRAME_BEGIN__
.note  00000000 
.comment       00000000 
.text  00000009 EXEval
.text  00000009 foo

My program runs fine on Windows but core dumps on Linux

Sometimes your program runs fine on one operating system and fails on another. The problem may appear while calling system functions like malloc, free or strcpy. Such behavior usually does not imply a problem in the system, but rather a bug in your program.

The most common reason for these bugs is that you allocate a memory buffer of K bytes and write more than K bytes.
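
A typical instance of this bug is allocating strlen(s) bytes for a copy of a string and then writing strlen(s) + 1 bytes, because the terminating '\0' was forgotten. The sketch below shows a corrected version; copy_string is a hypothetical name used only for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap-allocated copy of src, or NULL
   if the allocation fails. The caller must free() the result.

   The buggy variant would be:
       dst = malloc(strlen(src));   // K bytes allocated
       strcpy(dst, src);            // K + 1 bytes written          */
char *copy_string(const char *src)
{
    size_t len = strlen(src);
    char *dst = (char *)malloc(len + 1);   /* + 1 for the '\0' */

    if (dst == NULL)
        return NULL;
    memcpy(dst, src, len + 1);             /* copies the '\0' as well */
    return dst;
}
```

The one-byte overrun in the buggy variant may go unnoticed on one system and corrupt the allocator's bookkeeping on another, depending on how each malloc lays out its memory.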

Why does a problem occur in malloc or free?

Whenever you allocate a buffer, malloc uses a few bytes before (and perhaps after) the buffer it assigned to your program. If you write over those bytes, malloc's data structures get corrupted, and the next time you call malloc it fails.

The table below shows 100 bytes that malloc assigned to a user program at address 1000, i.e. malloc(100) returned 1000.
If you write to address 1100 and up, you overwrite malloc's data structures, which results in a segmentation fault (access violation) during a later malloc or free (new or delete).

Address   Usage
980       20 bytes for malloc data structures
1000      User buffer, what malloc returns
1100      20 bytes for malloc data structures
1120

...
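
The layout in the table also suggests how an overrun can be detected: surround the user buffer with known bytes and check later that they are intact. The sketch below illustrates that idea only. It is not how any particular malloc or debug heap is implemented, the names (guarded_malloc, guarded_check, guarded_free) are hypothetical, and the guard size and fill byte are arbitrary choices. It does not check for overflow when adding the sizes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_SIZE  16                 /* guard bytes on each side (arbitrary) */
#define GUARD_BYTE  0xFD               /* fill pattern (arbitrary) */
#define HEADER_SIZE (2 * GUARD_SIZE)   /* stored size + front guard */

/* Raw layout: [size, padded to GUARD_SIZE][front guard][user buffer][back guard] */
void *guarded_malloc(size_t size)
{
    unsigned char *raw = (unsigned char *)malloc(HEADER_SIZE + size + GUARD_SIZE);

    if (raw == NULL)
        return NULL;
    memset(raw, 0, GUARD_SIZE);
    memcpy(raw, &size, sizeof size);                           /* remember the size */
    memset(raw + GUARD_SIZE, GUARD_BYTE, GUARD_SIZE);          /* front guard */
    memset(raw + HEADER_SIZE + size, GUARD_BYTE, GUARD_SIZE);  /* back guard */
    return raw + HEADER_SIZE;
}

/* Returns 1 if both guards are intact, 0 if something wrote over them. */
int guarded_check(const void *user)
{
    const unsigned char *raw = (const unsigned char *)user - HEADER_SIZE;
    size_t size, i;

    memcpy(&size, raw, sizeof size);
    for (i = 0; i < GUARD_SIZE; i++) {
        if (raw[GUARD_SIZE + i] != GUARD_BYTE)
            return 0;
        if (raw[HEADER_SIZE + size + i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}

/* Checks the guards, then releases the whole block. */
void guarded_free(void *user)
{
    if (user == NULL)
        return;
    assert(guarded_check(user));
    free((unsigned char *)user - HEADER_SIZE);
}
```

With a 100-byte buffer p from guarded_malloc(100), writing p[100] lands in the back guard, and the next guarded_check(p) returns 0 instead of the damage showing up much later inside malloc or free.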

 

 

How can I find where the problem is?

In MSDN (Visual C), search for "Debugging Techniques, Problems, and Solutions", "debug heap" or _malloc_dbg, and read about the heap and how to debug such problems. The short way of doing it is as follows:

Add the following lines to an include file that is used by your whole project:

#define _CRTDBG_MAP_ALLOC
#include "crtdbg.h"

In the main, add the following lines:

#ifdef WIN32
#ifdef _DEBUG
{
    /* enable memory allocation debugging */

    /* Get current flag */
    int tmpFlag = _CrtSetDbgFlag( _CRTDBG_REPORT_FLAG );

    /* Check heap integrity on every allocation and deallocation */
    tmpFlag |= _CRTDBG_CHECK_ALWAYS_DF;

    /* Set flag to the new value */
    _CrtSetDbgFlag( tmpFlag );
}
#endif
#endif // WIN32

An assertion will fire if the debug version of malloc or free finds that any of the heap's data structures was corrupted. The assertion will report the address where the problem occurred. Note that your program may slow down significantly.

How to use this information

You have two options:

  1. Inspect the code. The problem is somewhere between the last malloc or free that passed the check and the point where the problem was found.
  2. Run the program once and write down the address that was corrupted. The second time you run the program, put a data breakpoint on that address, so that the debugger stops when it is written. Search for "pointers, corrupting memory addresses" in the MSDN.