Open Mind Tree: C Programming Lecture 18 (Last Lecture)

Common Practices

With its extensive use, a number of common practices and conventions have evolved to help avoid errors in C programs. These are simultaneously a demonstration of the application of good software engineering principles to a language and an indication of the limitations of C. Although few are used universally, and some are controversial, each of these enjoys wide use.

Dynamic multidimensional arrays

Although onedimensional arrays are easy to create dynamically using malloc, and fixedsize

multidimensional arrays are easy to create using the builtin language feature, dynamic

multidimensional arrays are trickier. There are a number of different ways to create them, each with different tradeoffs. The two most popular ways to create them are:

• They can be allocated as a single block of memory, just like static multidimensional arrays.
      This requires that the array be rectangular (i.e. subarrays of lower dimensions are static and
      have the same size). The disadvantage is that the syntax of declaration the pointer is a little
      tricky for programmers at first. For example, if one wanted to create an array of ints of 3
      columns and rows rows, one would do

int (*multi_array)[3] = malloc(rows * sizeof(int[3]));

(Note that here multi_array is a pointer to an array of 3 ints.)

Because of arraypointer interchangeability, you can index this just like static

multidimensional arrays, i.e. multi_array[5][2] is the element at the 6th row and 3rd
column.

• They can be allocated by first allocating an array of pointers, and then allocating subarrays
      and storing their addresses in the array of pointers (this approach is also known as an Iliffe
      vector). The syntax for accessing elements is the same as for multidimensional arrays
      described above (even though they are stored very differently). This approach has the
      advantage of the ability to make ragged arrays (i.e. with subarrays of different sizes).
      However, it also uses more space and requires more levels of indirection to index into, and
      can have worse cache performance. It also requires many dynamic allocations, each of which
      can be expensive.

For more information, see the comp.lang.c FAQ, question 6.16.

In some cases, the use of multidimensional arrays can best be addressed as an array of structures. Before userdefined data structures were available, a common technique was to define a multi
dimensional array, where each column contained different information about the row. This approach is also frequently used by beginner programmers. For example, columns of a twodimensional
character array might contain last name, first name, address, etc.

In cases like this, it is better to define a structure that contains the information that was stored in the columns, and then create an array of pointers to that structure. This is especially true when the
number of data points for a given record might vary, such as the tracks on an album. In these cases,
it is better to create a structure for the album that contains information about the album, along with a dynamic array for the list of songs on the album. Then an array of pointers to the album structure
can be used to store the collection.

Constructors and destructors

In most objectoriented languages, objects cannot be created directly by a client that wishes to use them. Instead, the client must ask the class to build an instance of the object using a special routine called a constructor. Constructors are important because they allow an object to enforce invariants about its internal state throughout its lifetime. Destructors, called at the end of an object's lifetime, are important in systems where an object holds exclusive access to some resource, and it is desirable to ensure that it releases these resources for use by other objects.

Since C is not an objectoriented language, it has no builtin support for constructors or destructors.

It is not uncommon for clients to explicitly allocate and initialize records and other objects.
However, this leads to a potential for errors, since operations on the object may fail or behave
unpredictably if the object is not properly initialized. A better approach is to have a function that
creates an instance of the object, possibly taking initialization parameters, as in this example:

struct string {

size_t size;
char *data; };

struct string *create_string(const char *initial) {
assert (initial != NULL);

struct string *new_string = malloc(sizeof(*new_string)); if (new_string != NULL) {

new_string>size = strlen(initial);
new_string>data = strdup(initial); }

return new_string;

}

Similarly, if it is left to the client to destroy objects correctly, they may fail to do so, causing

resource leaks. It is better to have an explicit destructor which is always used, such as this one:

void free_string(struct string *s) {

assert (s != NULL);

free(s>data); /* free memory held by the structure */

free(s); /* free the structure itself */

}

It is often useful to combine destructors with #Nulling freed pointers.

Sometimes it is useful to hide the definition of the object to ensure that the client does not allocate it manually. To do this, the structure is defined in the source file (or a private header file not available to users) instead of the header file, and a forward declaration is put in the header file:

struct string;

struct string *create_string(const char *initial); void free_string(struct string *s);

Nulling freed pointers

As discussed earlier, after free() has been called on a pointer, it becomes a dangling pointer.
Worse still, most modern platforms cannot detect when such a pointer is used before being
reassigned.

One simple solution to this is to ensure that any pointer is set to a null pointer immediately after being freed:

free(p);
p = NULL;

Unlike dangling pointers, a hardware exception will arise on many modern architectures when a null pointer is dereferenced. Also, programs can include error checks for the null value, but not for a dangling pointer value. To ensure it is done at all locations, a macro can be used:

#define FREE(p) do { free(p); (p) = NULL; } while(0)

(To see why the macro is written this way, see #Macro conventions.) Also, when this technique is used, destructors should zero out the pointer that they are passed, and their argument must be passed by reference to allow this. For example, here's the destructor from #Constructors and destructors

updated:

void free_string(struct string **s) {

assert(s != NULL && *s != NULL);

FREE((*s)>data); /* free memory held by the structure */

FREE(*s); /* free the structure itself */

}

Unfortunately, this idiom will not do anything to any other pointers that may be pointing to the freed memory. For this reason, some C experts regard this idiom as dangerous due to creating a false
sense of security.

Macro conventions

Because preprocessor macros in C work using simple token replacement, they are prone to a number of confusing errors, some of which can be avoided by following a simple set of conventions:

1. Placing parentheses around macro arguments wherever possible. This ensures that, if they
are expressions, the order of operations does not affect the behavior of the expression. For
example:

• Wrong: #define square(x) x*x

2. Better: #define square(x) (x)*(x)

3. Placing parentheses around the entire expression if it is a single expression. Again, this
avoids changes in meaning due to the order of operations.

• Wrong: #define square(x) (x)*(x)

4. Better: #define square(x) ((x)*(x))

5. If a macro produces multiple statements, or declares variables, it can be wrapped in a do { ...
      } while(0) loop, with no terminating semicolon. This allows the macro to be used like a
      single statement in any location, such as the body of an if statement, while still allowing a
      semicolon to be placed after the macro invocation without creating a null statement. Care
      must be taken that any new variables do not potentially mask portions of the macro's
      arguments.

• Wrong: #define FREE(p) free(p); p = NULL;

6. Better: #define FREE(p) do { free(p); p = NULL; } while(0)

7. Avoiding using a macro argument twice or more inside a macro, if possible; this causes
problems with macro arguments that contain side effects, such as assignments.