Things About C that Make Me Say, “WHAT?!?”

It’s hard to work with C code on a daily basis without developing a love-hate relationship with it. I’ve been working with C for several years now, and most days I truly enjoy using it. There are other days, however, when it makes me want to pull my hair out.

I have found several aspects of C that really tripped me up or left me staring at the screen with an expression of disbelief on my face. I’d like to share a few of them with you so hopefully you can avoid them!  

Uninitialized Booleans

The C specification says that reading an uninitialized variable can cause “undefined behavior.” In some cases, compilers seem to take that to an extreme. For example, what do you think would happen if you ran the code below?

bool i_love_c;
if(i_love_c) printf("I love writing in C. ");
if(!i_love_c) printf("I hate writing in C.");

Logically, it would seem like only one or the other could be true. But sure enough, the output on my machine is, “I love writing in C. I hate writing in C.” So, what happened? When my compiler NOTs a Boolean, it is supposed to do a logical negation, but it takes a shortcut and toggles only the least significant bit.

An initialized Boolean always has a value of either 1 or 0, which is fine under normal circumstances. However, if left uninitialized, a Boolean could be any value. For example, take the number 119. When the compiler performs !119, it just toggles the least-significant bit, resulting in the number 118. Now it makes sense: 118 and 119 are both non-zero values, so the if-statement passes for both of them.

But now let’s take this one step further. What if we change the code to look like this:

bool i_love_c;
if(i_love_c == true) printf("I love writing in C. ");
if(i_love_c == false) printf("I hate writing in C.");

Surely they can’t both pass this time… can they? Believe it or not, they do. The output is again, “I love writing in C. I hate writing in C.” Unfortunately, this time I can’t explain it. The value true equals 1 and false equals 0, so how can anything equal both 1 and 0? This is what they mean when they say “undefined behavior.” The thing to keep in mind is: always, always, always initialize your variables.

Using sizeof() on Arrays

I love sizeof() (strictly speaking it’s an operator that the compiler evaluates, not a function), and I use it all the time, but you have to be really careful when using it on arrays. Just keep in mind that if the declaration of the array is not visible where sizeof() is applied, for example because the array was passed to another function, it will not give you the size of the array. Let’s take a look. This time we have two functions:

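#include <stdio.h>

void print_array_size(char array[]);

int main(void)
{
  char name[] = "HELLO";
  printf("array size = %zu\n", sizeof(name));
  print_array_size(name);
  return 0;
}

void print_array_size(char array[])
{
  /* array is really a char * here, so this prints the size of a pointer */
  printf("array size = %zu\n", sizeof(array));
}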

The size of the name array is 6 bytes: one for each letter in HELLO, and one for a null character. The call to sizeof() in the main function returns the correct value because it’s in the same scope as the declaration of the array. However, if we pass the array to another function, what happens?

"array size = 6"
"array size = 4"

Fail! Inside the print_array_size() function, the declaration of the name array is no longer in scope, so sizeof() cannot determine the correct length of the array. Where does the value of 4 come from? It’s the size of a char * on my 32-bit machine (on a 64-bit machine you would typically see 8). To avoid this pitfall, just remember that in C, when you pass an array to a function, you are actually passing a pointer to the first element of the array. sizeof() cannot determine the length of an array when all it has is an address.

Initializing Character Arrays

C does not have a native string type. Instead, we are forced to use character arrays. Maybe it’s just because I’ve gotten used to it, but I really don’t mind. Still, you can get into trouble if you’re not careful. For example, do you see anything wrong with the code below?

char my_name[6] = "JORDAN";
printf("Hello my name is %s", my_name);

The output produced on my machine is: “Hello my name is JORDANΓöÇ[EwL0@”. Wow, I’d like to hear you try to pronounce that! If it’s not immediately obvious, this happens because strings are expected to be null-terminated in C. We declared an array that is 6 bytes long and filled it with 6 letters, but we didn’t leave room for the null terminator at the end, and the compiler doesn’t put one there. The %s format specifier tells printf to print characters starting at my_name and to stop when it reaches a zero byte. Because our string isn’t null-terminated, a run of garbage characters (the next values in memory following our name) is printed until a zero is finally reached.

For the reason just stated, if we use the strlen() function on the my_name array, it returns an incorrect value of 10. However, if we use sizeof(), we see the correct value, 6. To be safe, if you want to declare a character array and initialize it to a string, just use empty square brackets, as in char my_name[] = "JORDAN". In this case, the compiler will automatically allocate enough bytes for your string and the null terminator.

C also allows you to declare character arrays using char pointers and string literals. Doing so can also get you into a lot of trouble. Do you see anything wrong with the code below?

char * p_your_name = "JOHNNY";
memcpy(p_your_name, "GEORGE", 6);

The problem is that p_your_name is a pointer to a string literal, and string literals are not intended to be modified. For example, when I try to run the above code on my Windows machine, the program crashes, and I get a pop-up box that says, “The program stopped working…” Not desirable. This happens because the compiler on my computer stores the string literal in read-only memory. Many embedded devices may not support read-only memory and will allow you to do this without any problems. Just keep in mind that you are venturing into undefined behavior land, and you might get screwed in the end.

Initialization of Structures and Enumerations

To all you C-newbies out there, be very careful and thoughtful when it comes to initializing your structs. Remember that the data contained in your struct is important. If it wasn’t, it wouldn’t be there, so give it the common courtesy of a decent initialization! This is even more important when the struct contains an enum type. Consider the following code.

#include <stdint.h>
#include <stdio.h>

typedef enum {APPLE, BANANA} FRUIT_T;

typedef struct{
    FRUIT_T f;
    uint8_t fruit_count;
} FRUIT_BASKET_T;

int main(void)
{
    FRUIT_BASKET_T fruit_basket;   /* deliberately left uninitialized */
    switch (fruit_basket.f)
    {
        case APPLE:
            break;
        case BANANA:
            break;
        default:
            printf("Never get here. Crash 'N Burn.");
    }
    return 0;
}

What do you think happens when this code runs? It outputs: “Never get here. Crash ‘N Burn.” This happens because enums are actually represented as integers. If you leave an enum uninitialized, it can be equal to any integer value, which means there’s a very good chance that it will not be equal to one of the valid values of your enumeration. Also, imagine if you explicitly define an enumeration so that the first value is equal to 1. If a structure that contains that enum type is initialized to all zeros, the same problem occurs.

What About You?

These are some of the unusual aspects of C that I have found. What about you? What parts of the language make you want to pull your hair out?