Catching Exceptions and Printing Stack Traces for C on Windows, Linux, & Mac

Debugging C can be a real pain, especially when all you have to go by is that it was a segfault. Great!

In an effort to make testing C code a little less painful, I’ve recently added stack trace support to Unity (for gcc on Windows or Posix systems). That way, when a test crashes, I will at least know where it crashed. I learned quite a bit in the process and thought I’d share.

A Troubled Program

First, let’s start with a test program that will emit some of the signals we want to catch. It is entirely self-contained except for a set_signal_handler() that we will define separately for Windows and Posix systems, with a slight variation on the Posix version for OS X.

The test program:

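#include <stdio.h>
#include <signal.h>
#include <assert.h>
#include <stdbool.h> /* for the false used in assert(false) below */

void set_signal_handler(void); /* defined separately for each platform */
int  divide_by_zero();
void cause_segfault();
void stack_overflow();
void infinite_loop();
void illegal_instruction();
void cause_calamity();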

static char const * icky_global_program_name;

int main(int argc, char * argv[])
{
  (void)argc;

  /* store off program path so we can use it later */
  icky_global_program_name = argv[0];

  set_signal_handler();

  cause_calamity();

  puts("OMG! Nothing bad happened!");
  return 0;
}

void cause_calamity()
{
  /* uncomment one of the following error conditions to cause a calamity of 
   your choosing! */

  // (void)divide_by_zero();
  // cause_segfault();
  // assert(false);
  // infinite_loop();
  // illegal_instruction();
  // stack_overflow();
}

int divide_by_zero()
{
  int a = 1;
  int b = 0; 
  return a / b;
}

void cause_segfault()
{
  int * p = (int*)0x12345678;
  *p = 0;
}

void stack_overflow();
void stack_overflow()
{
  int foo[1000]; //allocate something big on the stack
  (void)foo;
  stack_overflow();
}

/* break out with ctrl+c to test SIGINT handling */
void infinite_loop()
{
  while(1) {};
}

void illegal_instruction()
{
  /* I couldn't find an easy way to cause this one, so I'm cheating */
  raise(SIGILL);
}

Just Catching Exceptions

There is basically no portable way to get stack traces. However, if you’re just interested in catching the exceptions and not doing much more than printing an error message, there is a semi-portable C99 solution.

Here’s a simple implementation:

#include <stdio.h>
#include <signal.h>
#include <stdlib.h>

void almost_c99_signal_handler(int sig)
{
  switch(sig)
  {
    case SIGABRT:
      fputs("Caught SIGABRT: usually caused by an abort() or assert()\n", stderr);
      break;
    case SIGFPE:
      fputs("Caught SIGFPE: arithmetic exception, such as divide by zero\n",
            stderr);
      break;
    case SIGILL:
      fputs("Caught SIGILL: illegal instruction\n", stderr);
      break;
    case SIGINT:
      fputs("Caught SIGINT: interactive attention signal, probably a ctrl+c\n",
            stderr);
      break;
    case SIGSEGV:
      fputs("Caught SIGSEGV: segfault\n", stderr);
      break;
    case SIGTERM:
    default:
      fputs("Caught SIGTERM: a termination request was sent to the program\n",
            stderr);
      break;
  }
  _Exit(1);
}

void set_signal_handler()
{
  signal(SIGABRT, almost_c99_signal_handler);
  signal(SIGFPE,  almost_c99_signal_handler);
  signal(SIGILL,  almost_c99_signal_handler);
  signal(SIGINT,  almost_c99_signal_handler);
  signal(SIGSEGV, almost_c99_signal_handler);
  signal(SIGTERM, almost_c99_signal_handler);
}

You can paste this into our test program and compile it with the usual: gcc main.c. If you need anything more than that, you’re going to have to stick with something specific for your platform.

This being C, there are of course loads of caveats. The C99 standard guarantees basically nothing about the signals. The only mechanism for invoking your signal handlers that is guaranteed to work is calling raise(). (Note, the abort() function must call raise(SIGABRT), and assert() must call abort().)

If your signal occurs as a result of an abort() or raise(), then your code in your signal handler is just as safe as normal C code is…

If your signal occurs in any other way, the only interactions with the outside world guaranteed to be safe inside your signal handler are writes to objects of type volatile sig_atomic_t and calls to abort(), _Exit(), or signal() (for the signal currently being handled).

This pretty much means the only “guaranteed safe” thing you can do is change the exit code. However, most implementations let you get away with quite a bit more.
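As a minimal sketch of the strictly conforming approach described above, the handler below does nothing except record the signal number in a volatile sig_atomic_t object, and the normal program flow checks it later. The names last_signal, flag_only_handler, install_flag_handler, and take_last_signal are hypothetical and only there for illustration.

```c
#include <signal.h>

/* The only kind of object a strictly conforming handler may write to. */
static volatile sig_atomic_t last_signal = 0;

/* Record which signal arrived and return: no I/O, no library calls. */
static void flag_only_handler(int sig)
{
  last_signal = sig;
}

/* Returns 0 on success, -1 if signal() refused the handler. */
int install_flag_handler(int sig)
{
  return (signal(sig, flag_only_handler) == SIG_ERR) ? -1 : 0;
}

/* Returns the last signal seen (0 if none) and clears the flag. */
int take_last_signal(void)
{
  int sig = (int)last_signal;
  last_signal = 0;
  return sig;
}
```

This only suits signals the program can survive, such as SIGINT or SIGTERM. Returning from a handler for a hardware-generated SIGSEGV, SIGFPE, or SIGILL is undefined behavior, so those handlers still need to end with _Exit().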

Things to Note

Even though SIGFPE stands for “Signal: Floating Point Exception,” this signal is generated on pretty much any arithmetic exception (floating point or integer).

Don’t expect to be able to catch stack overflows. Often the signal handlers are invoked on the same stack that caused the signal to occur. So when a stack overflow occurs, your signal handler is called immediately, causing another stack overflow, and your program just dies with a segfault. On Posix systems there is a way around this.

For more details, see section 7.14 (page 246) of the C99 standard.

Catching Exceptions in Posix

One of the nice things about the Posix signal handler is that we can define an alternate signal stack that our signal handler will use. This allows us to catch and handle things like a stack overflow that would normally kill our handler instantly.

Unfortunately, we have to disable the signal stack on OS X, or backtrace() won’t work. Apparently Linux does some magic, which OS X lacks, that lets backtrace() still look at the right stack when called from a signal handler running on the alternate stack. Turning off the signal stack lets us get our stack traces, but it means our signal handler will fail for stack overflows; we’ll just get a normal segfault. I don’t know how to fix this, so if you know a way around it, please let me know. Most other errors should still be caught just fine, though.

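#include <stdio.h>
#include <signal.h>
#include <stdlib.h>
#include <stdint.h> /* for uint8_t */
#include <err.h>

void posix_print_stack_trace(void); /* defined in the next section */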

void posix_signal_handler(int sig, siginfo_t *siginfo, void *context)
{
  (void)context;
  switch(sig)
  {
    case SIGSEGV:
      fputs("Caught SIGSEGV: Segmentation Fault\n", stderr);
      break;
    case SIGINT:
      fputs("Caught SIGINT: Interactive attention signal, (usually ctrl+c)\n",
            stderr);
      break;
    case SIGFPE:
      switch(siginfo->si_code)
      {
        case FPE_INTDIV:
          fputs("Caught SIGFPE: (integer divide by zero)\n", stderr);
          break;
        case FPE_INTOVF:
          fputs("Caught SIGFPE: (integer overflow)\n", stderr);
          break;
        case FPE_FLTDIV:
          fputs("Caught SIGFPE: (floating-point divide by zero)\n", stderr);
          break;
        case FPE_FLTOVF:
          fputs("Caught SIGFPE: (floating-point overflow)\n", stderr);
          break;
        case FPE_FLTUND:
          fputs("Caught SIGFPE: (floating-point underflow)\n", stderr);
          break;
        case FPE_FLTRES:
          fputs("Caught SIGFPE: (floating-point inexact result)\n", stderr);
          break;
        case FPE_FLTINV:
          fputs("Caught SIGFPE: (floating-point invalid operation)\n", stderr);
          break;
        case FPE_FLTSUB:
          fputs("Caught SIGFPE: (subscript out of range)\n", stderr);
          break;
        default:
          fputs("Caught SIGFPE: Arithmetic Exception\n", stderr);
          break;
      }
      break;
    case SIGILL:
      switch(siginfo->si_code)
      {
        case ILL_ILLOPC:
          fputs("Caught SIGILL: (illegal opcode)\n", stderr);
          break;
        case ILL_ILLOPN:
          fputs("Caught SIGILL: (illegal operand)\n", stderr);
          break;
        case ILL_ILLADR:
          fputs("Caught SIGILL: (illegal addressing mode)\n", stderr);
          break;
        case ILL_ILLTRP:
          fputs("Caught SIGILL: (illegal trap)\n", stderr);
          break;
        case ILL_PRVOPC:
          fputs("Caught SIGILL: (privileged opcode)\n", stderr);
          break;
        case ILL_PRVREG:
          fputs("Caught SIGILL: (privileged register)\n", stderr);
          break;
        case ILL_COPROC:
          fputs("Caught SIGILL: (coprocessor error)\n", stderr);
          break;
        case ILL_BADSTK:
          fputs("Caught SIGILL: (internal stack error)\n", stderr);
          break;
        default:
          fputs("Caught SIGILL: Illegal Instruction\n", stderr);
          break;
      }
      break;
    case SIGTERM:
      fputs("Caught SIGTERM: a termination request was sent to the program\n",
            stderr);
      break;
    case SIGABRT:
      fputs("Caught SIGABRT: usually caused by an abort() or assert()\n", stderr);
      break;
    default:
      break;
  }
  posix_print_stack_trace();
  _Exit(1);
}

static uint8_t alternate_stack[SIGSTKSZ];
void set_signal_handler()
{
  /* setup alternate stack */
  {
    stack_t ss = {};
    /* malloc is usually used here, I'm not 100% sure my static allocation
       is valid but it seems to work just fine. */
    ss.ss_sp = (void*)alternate_stack;
    ss.ss_size = SIGSTKSZ;
    ss.ss_flags = 0;

    if (sigaltstack(&ss, NULL) != 0) { err(1, "sigaltstack"); }
  }

  /* register our signal handlers */
  {
    struct sigaction sig_action = {};
    sig_action.sa_sigaction = posix_signal_handler;
    sigemptyset(&sig_action.sa_mask);

    #ifdef __APPLE__
        /* for some reason backtrace() doesn't work on OS X
           when we use an alternate stack */
        sig_action.sa_flags = SA_SIGINFO;
    #else
        sig_action.sa_flags = SA_SIGINFO | SA_ONSTACK;
    #endif

    if (sigaction(SIGSEGV, &sig_action, NULL) != 0) { err(1, "sigaction"); }
    if (sigaction(SIGFPE,  &sig_action, NULL) != 0) { err(1, "sigaction"); }
    if (sigaction(SIGINT,  &sig_action, NULL) != 0) { err(1, "sigaction"); }
    if (sigaction(SIGILL,  &sig_action, NULL) != 0) { err(1, "sigaction"); }
    if (sigaction(SIGTERM, &sig_action, NULL) != 0) { err(1, "sigaction"); }
    if (sigaction(SIGABRT, &sig_action, NULL) != 0) { err(1, "sigaction"); }
  }
}

Stack Traces in Posix

The backtrace() function is the preferred method of getting a stack trace on Posix systems. However, as you’ll see in a second, we’ll need a little extra processing to get to line numbers.

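#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h> /* for free() */

/* defined in the next section */
int addr2line(char const * const program_name, void const * const addr);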

#define MAX_STACK_FRAMES 64
static void *stack_traces[MAX_STACK_FRAMES];
void posix_print_stack_trace()
{
  int i, trace_size = 0;
  char **messages = (char **)NULL;

  trace_size = backtrace(stack_traces, MAX_STACK_FRAMES);
  messages = backtrace_symbols(stack_traces, trace_size);

  /* skip the first couple stack frames (as they are this function and
     our handler) and also skip the last frame as it's (always?) junk. */
  // for (i = 3; i < (trace_size - 1); ++i)
  // we'll use this for now so you can see what's going on
  for (i = 0; i < trace_size; ++i)
  {
    if (addr2line(icky_global_program_name, stack_traces[i]) != 0)
    {
      printf("  error determining line # for: %s\n", messages[i]);
    }

  }
  if (messages) { free(messages); } 
}
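One thing to keep in mind with the listing above is that backtrace_symbols() allocates its result with malloc(), which is not something you can rely on inside a signal handler. As a hedged alternative, backtrace_symbols_fd() from the same execinfo.h header writes the symbol strings straight to a file descriptor instead. The function name print_stack_trace_fd and the FD_TRACE_MAX_FRAMES constant below are hypothetical.

```c
#include <execinfo.h>
#include <unistd.h>

#define FD_TRACE_MAX_FRAMES 64

/* Write the current call stack to the given file descriptor and return
   the number of frames that were captured. */
int print_stack_trace_fd(int fd)
{
  void *frames[FD_TRACE_MAX_FRAMES];
  int count = backtrace(frames, FD_TRACE_MAX_FRAMES);

  /* Unlike backtrace_symbols(), this does not return a malloc'd array. */
  backtrace_symbols_fd(frames, count, fd);
  return count;
}
```

Calling print_stack_trace_fd(STDERR_FILENO) from the handler prints the same kind of function-plus-offset lines as backtrace_symbols(), so it still needs the addr2line step to get source lines.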

Use addr2line to Get Location in Source

Using backtrace() and backtrace_symbols() is, at best, just going to give you the name of the functions in your stack trace (with some address offsets). However, the very nice addr2line utility (or atos on OS X) can give you the source file and line numbers if you compiled your executable with debug symbols.

Something like the following works nicely:

#include <stdlib.h>
#include <stdio.h>

/* Resolve symbol name and source location given the path to the executable 
   and an address */
int addr2line(char const * const program_name, void const * const addr)
{
  char addr2line_cmd[512] = {0};

  /* have addr2line map the address to the relevant line in the code */
  #ifdef __APPLE__
    /* apple does things differently... */
    sprintf(addr2line_cmd,"atos -o %.256s %p", program_name, addr); 
  #else
    sprintf(addr2line_cmd,"addr2line -f -p -e %.256s %p", program_name, addr); 
  #endif

  /* This will print a nicely formatted string specifying the
     function and source line of the address */
  return system(addr2line_cmd);
}
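The listing above relies on %.256s to keep the command inside its 512-byte buffer. A slightly more defensive sketch uses snprintf() and reports truncation, so a cut-off command is never handed to system(). The helper name build_addr2line_cmd is hypothetical, and it only covers the addr2line (non-Apple) branch.

```c
#include <stdio.h>

/* Format the addr2line command into buf. Returns 0 on success, or -1 if
   the command did not fit or formatting failed, in which case buf should
   not be passed to system(). */
int build_addr2line_cmd(char *buf, size_t buf_size,
                        char const *program_name, void const *addr)
{
  int written = snprintf(buf, buf_size, "addr2line -f -p -e %s %p",
                         program_name, addr);
  return (written >= 0 && (size_t)written < buf_size) ? 0 : -1;
}
```

Note that a program path containing spaces or shell metacharacters would still need quoting before being passed to system(); this sketch does not attempt that.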

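You can compile all this with gcc -g main.c on Linux, but you’ll need gcc -g -fno-pie main.c on OS X. (PIE, short for position-independent executable, lets the OS randomize load addresses to make code injection attacks harder. It also happens to prevent us from resolving addresses.) If your Linux toolchain builds PIE binaries by default, the same caveat applies there.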

Note that the addresses from stack traces are the return addresses. These lines are actually where the functions are going to return to — not where they are called from! Often it’s just the line after where they are called.
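A common workaround for the return-address issue, which the code in this post does not do, is to subtract one from each return address before looking it up, so that the address falls inside the call instruction rather than after it. The sketch below shows the idea; call_site_address is a hypothetical helper name.

```c
#include <stdint.h>

/* Given a return address from a stack trace, produce an address that lies
   inside the preceding call instruction, so a line lookup reports the line
   of the call rather than the line after it. A null address is returned
   unchanged. */
void const *call_site_address(void const *return_addr)
{
  uintptr_t addr = (uintptr_t)return_addr;
  return (addr == 0) ? return_addr : (void const *)(addr - 1);
}
```

This adjustment should only be applied to frames that hold return addresses. The address where the fault itself happened already points at the offending instruction and should be looked up as-is.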

Catching Exceptions in Windows (MinGW)

As usual, there is the way everyone else does things, and the way Windows does things. In Windows, you can catch exceptions using the SetUnhandledExceptionFilter() function like so:

#include <windows.h>
#include <stdio.h>

void windows_print_stacktrace(CONTEXT* context); /* defined in the next section */
LONG WINAPI windows_exception_handler(EXCEPTION_POINTERS * ExceptionInfo)
{
  switch(ExceptionInfo->ExceptionRecord->ExceptionCode)
  {
    case EXCEPTION_ACCESS_VIOLATION:
      fputs("Error: EXCEPTION_ACCESS_VIOLATION\n", stderr);
      break;
    case EXCEPTION_ARRAY_BOUNDS_EXCEEDED:
      fputs("Error: EXCEPTION_ARRAY_BOUNDS_EXCEEDED\n", stderr);
      break;
    case EXCEPTION_BREAKPOINT:
      fputs("Error: EXCEPTION_BREAKPOINT\n", stderr);
      break;
    case EXCEPTION_DATATYPE_MISALIGNMENT:
      fputs("Error: EXCEPTION_DATATYPE_MISALIGNMENT\n", stderr);
      break;
    case EXCEPTION_FLT_DENORMAL_OPERAND:
      fputs("Error: EXCEPTION_FLT_DENORMAL_OPERAND\n", stderr);
      break;
    case EXCEPTION_FLT_DIVIDE_BY_ZERO:
      fputs("Error: EXCEPTION_FLT_DIVIDE_BY_ZERO\n", stderr);
      break;
    case EXCEPTION_FLT_INEXACT_RESULT:
      fputs("Error: EXCEPTION_FLT_INEXACT_RESULT\n", stderr);
      break;
    case EXCEPTION_FLT_INVALID_OPERATION:
      fputs("Error: EXCEPTION_FLT_INVALID_OPERATION\n", stderr);
      break;
    case EXCEPTION_FLT_OVERFLOW:
      fputs("Error: EXCEPTION_FLT_OVERFLOW\n", stderr);
      break;
    case EXCEPTION_FLT_STACK_CHECK:
      fputs("Error: EXCEPTION_FLT_STACK_CHECK\n", stderr);
      break;
    case EXCEPTION_FLT_UNDERFLOW:
      fputs("Error: EXCEPTION_FLT_UNDERFLOW\n", stderr);
      break;
    case EXCEPTION_ILLEGAL_INSTRUCTION:
      fputs("Error: EXCEPTION_ILLEGAL_INSTRUCTION\n", stderr);
      break;
    case EXCEPTION_IN_PAGE_ERROR:
      fputs("Error: EXCEPTION_IN_PAGE_ERROR\n", stderr);
      break;
    case EXCEPTION_INT_DIVIDE_BY_ZERO:
      fputs("Error: EXCEPTION_INT_DIVIDE_BY_ZERO\n", stderr);
      break;
    case EXCEPTION_INT_OVERFLOW:
      fputs("Error: EXCEPTION_INT_OVERFLOW\n", stderr);
      break;
    case EXCEPTION_INVALID_DISPOSITION:
      fputs("Error: EXCEPTION_INVALID_DISPOSITION\n", stderr);
      break;
    case EXCEPTION_NONCONTINUABLE_EXCEPTION:
      fputs("Error: EXCEPTION_NONCONTINUABLE_EXCEPTION\n", stderr);
      break;
    case EXCEPTION_PRIV_INSTRUCTION:
      fputs("Error: EXCEPTION_PRIV_INSTRUCTION\n", stderr);
      break;
    case EXCEPTION_SINGLE_STEP:
      fputs("Error: EXCEPTION_SINGLE_STEP\n", stderr);
      break;
    case EXCEPTION_STACK_OVERFLOW:
      fputs("Error: EXCEPTION_STACK_OVERFLOW\n", stderr);
      break;
    default:
      fputs("Error: Unrecognized Exception\n", stderr);
      break;
  }
  fflush(stderr);
  /* If this is a stack overflow then we can't walk the stack, so just show
    where the error happened */
  if (EXCEPTION_STACK_OVERFLOW != ExceptionInfo->ExceptionRecord->ExceptionCode)
  {
      windows_print_stacktrace(ExceptionInfo->ContextRecord);
  }
  else
  {
      addr2line(icky_global_program_name, (void*)ExceptionInfo->ContextRecord->Eip);
  }

  return EXCEPTION_EXECUTE_HANDLER;
}

void set_signal_handler()
{
  SetUnhandledExceptionFilter(windows_exception_handler);
}

I found this article on “Win32 Structured Exception Handling” to be particularly helpful.

The windows_print_stacktrace() is defined in the next section.

Stack Traces in Windows (MinGW)

With MinGW, we can use the same addr2line() function that we defined earlier once we have an address:

#include <windows.h>
#include <imagehlp.h>
void windows_print_stacktrace(CONTEXT* context)
{
  SymInitialize(GetCurrentProcess(), 0, TRUE);

  STACKFRAME frame = { 0 };

  /* setup initial stack frame */
  frame.AddrPC.Offset         = context->Eip;
  frame.AddrPC.Mode           = AddrModeFlat;
  frame.AddrStack.Offset      = context->Esp;
  frame.AddrStack.Mode        = AddrModeFlat;
  frame.AddrFrame.Offset      = context->Ebp;
  frame.AddrFrame.Mode        = AddrModeFlat;

  while (StackWalk(IMAGE_FILE_MACHINE_I386 ,
                   GetCurrentProcess(),
                   GetCurrentThread(),
                   &frame,
                   context,
                   0,
                   SymFunctionTableAccess,
                   SymGetModuleBase,
                   0 ) )
  {
    addr2line(icky_global_program_name, (void*)frame.AddrPC.Offset);
  }

  SymCleanup( GetCurrentProcess() );
}

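You can compile the Windows version with gcc -g main.c -limagehlp, and you can use the same addr2line() as with the Posix examples. Yay MinGW! Note that the code as written assumes a 32-bit x86 build, since it reads Eip, Esp, and Ebp from the context and passes IMAGE_FILE_MACHINE_I386 to StackWalk().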

You can also get this to work with the Visual C compiler if you use the SymGetSymFromAddr64() function instead of the addr2line utility as demonstrated here. That post is about trying to get stack traces to work in MinGW, but the code should work in Visual Studio.

If you want to print out a stack trace without having an exception, you’ll have to get the local context with the RtlCaptureContext() function.

I also imagine that if you’re using Cygwin, you can just stick with the Posix versions of everything, and things should just work. But I haven’t tried it.

Does it work!?

Sample output on Windows when I uncomment cause_segfault():
 
Error: EXCEPTION_ACCESS_VIOLATION
cause_segfault at Z:\Projects\stack_traces/c_signal.c:428
cause_calamity at Z:\Projects\\stack_traces/c_signal.c:414
main at Z:\Projects\stack_traces/c_signal.c:398
?? at crt1.c:0
??
??:0
??
??:0

A complete collection of all the variants in a single .c file that can be built on Windows, Linux and OS X is here.

I hope this saves someone else a few hours of time. Happy hacking :)

 

Conversation
  • Bram says:

    I’m having trouble using this on OSX.
    The man page of atos says I need a load address.
    Your example does not provide a load address, which is why, I think, it does not resolve the symbols for me.

  • charlotte says:

    Thank you for this interesting article. In the listing of posix_signal_handler() I think you need a break statement at line 49 before case SIGILL. Also, it might be wise to have a default case with fputs(“Error: Unrecognized Exception\n”, stderr); like you do with the Windows signal handler.

  • Usama says:

    You have explained how to trap exceptions way more simply than others. I am on windows and couldn’t make it trap stackoverflow exception. Applications is really big with number of threads and about 20 modules.

  • Abhijeet Bhilare says:

    I am using this code on windows 10 with Qt 5.4.0 mingw. I am getting few errors while compiling this code.
    error: undefined reference to `_imp__SymInitialize@12′
    error: undefined reference to `_imp__SymGetModuleBase@8′
    error: undefined reference to `_imp__SymFunctionTableAccess@8′
    error: undefined reference to `_imp__StackWalk@36′
    error: undefined reference to `_imp__SymCleanup@4′

    How to resolve this issues.

    • Braden Steffaniak says:

      You need to include the imagehlp library e.g. “gcc -g main.c -limagehlp”
