Preprocessor#

Let’s recap what we have learned so far about the C preprocessor:

  • It is a preliminary compilation step, happening before the compilation proper (phase 7)

  • Its input is a stream of tokens

This means that the preprocessor manipulates text, not values:

  • It cannot use the result of expressions like 1 + 3, sizeof(int), or strlen("Hello")[1] that are evaluated during phase 7.

  • What it can do is more akin to string manipulation than math: it is meant to modify / generate code, not to do computation
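A minimal sketch of what "text, not values" implies in practice. The macro names NAIVE_DOUBLE and SAFE_DOUBLE are hypothetical, chosen only for this illustration; static_assert assumes a C11 (or later) compiler.

```c
#include <assert.h>

/* Hypothetical macros, for illustration only. */
#define NAIVE_DOUBLE(x) x * 2      /* pastes the argument text as-is */
#define SAFE_DOUBLE(x)  ((x) * 2)  /* parentheses keep the argument grouped */

/* NAIVE_DOUBLE(1 + 3) becomes the text `1 + 3 * 2`; only later does the
   compiler evaluate it, as 1 + (3 * 2). The preprocessor never saw a 4. */
static_assert(NAIVE_DOUBLE(1 + 3) == 7, "text substitution, not 4 * 2");
static_assert(SAFE_DOUBLE(1 + 3) == 8, "parenthesized argument");
```

The preprocessor never computes 1 + 3: it only moves the three tokens around, which is why the parentheses in SAFE_DOUBLE matter.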

Interacting with the preprocessor is done by starting a line with the # character, followed by a preprocessing directive. (Any number of spaces can be present before and after the # character)

Directives#

File inclusion#

#include <filename>

Look for a file called filename in folders provided to the preprocessor[2] with the -I flag, and in standard folders configured at compiler installation. Once the file is found, its content is pasted verbatim in place of the #include line

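#include "filename"

Same as above, but first look in the directory of the file that contains the #include line (which is not necessarily the directory the compiler was launched from)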

Source : cppreference

Note

No assumption is made about the content of the included file, it technically doesn’t have to be valid C, or even code at all…

Which directories does my compiler look into?

To list the folders where your compiler’s preprocessor looks for files, you can execute the following command:

$(cc -print-prog-name=cpp) -v < /dev/null
Possible output#
$ `cc -print-prog-name=cpp` -v < /dev/null
 ...
 #include "..." search starts here:
 #include <...> search starts here:
  /usr/lib/gcc/x86_64-linux-gnu/11/include
  /usr/local/include
  /usr/include/x86_64-linux-gnu
  /usr/include
 ...
$ `cc -print-prog-name=cpp` -I ~/mylib/include -iquote ./include -v < /dev/null
 ...
 #include "..." search starts here:
  ./include
 #include <...> search starts here:
  /home/user/mylib/include
  /usr/lib/gcc/x86_64-linux-gnu/11/include
  /usr/local/include
  /usr/include/x86_64-linux-gnu
  /usr/include
 ...

Source: Stack Overflow

Macros#

Object-like#

#define identifier replacement

After this line, anytime identifier appears in the source code, it will be replaced by replacement
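As a rough illustration, the sketch below uses three hypothetical macros (WIDTH, HEIGHT and CELLS) to size an array. It assumes a C11 compiler for static_assert.

```c
#include <assert.h>

/* Hypothetical names, for illustration only. */
#define WIDTH  4
#define HEIGHT 3
#define CELLS  (WIDTH * HEIGHT)   /* the replacement may itself contain macros */

/* After preprocessing, this line reads: static int grid[(4 * 3)]; */
static int grid[CELLS];

/* The multiplication is done by the compiler, not by the preprocessor. */
static_assert(sizeof grid / sizeof grid[0] == 12, "grid has 4 * 3 cells");
```

The parentheses around the replacement of CELLS are a common precaution: they keep the pasted text grouped when the macro is used inside a larger expression.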

#define identifier

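Defines identifier with an empty replacement: after this line, identifier counts as defined, and every occurrence of it in the source code is removed. (It is the command-line option -D identifier that is equivalent to #define identifier 1.)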

Function-like#

#define identifier(parameters) replacement

After this line, anytime identifier(values) appears in the source code, it will be replaced by replacement, replacing any occurrence of a parameter by the corresponding argument provided at the point of use.

#define identifier(parameters, ...) replacement

Similar to the previous definition, but zero or more extra arguments can be supplied. The identifier __VA_ARGS__ will be replaced by those extra arguments. Additionally, since C23, __VA_OPT__(x) will be replaced by nothing if no extra argument was supplied, or by x if at least one was.
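A small sketch of both forms. MAX and COUNT_ARGS are hypothetical names; COUNT_ARGS relies on a C99 compound literal and only works for one or more int arguments. __VA_OPT__ is left out on purpose, since it needs a compiler with C23 support.

```c
#include <assert.h>

/* Hypothetical macros, for illustration only. */

/* Fixed parameter list: every parameter is wrapped in parentheses because
   the arguments are pasted as text. Note that an argument may be evaluated twice. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Variadic: __VA_ARGS__ is replaced by all the arguments, commas included.
   Here they initialize an unnamed int array whose size gives the count. */
#define COUNT_ARGS(...) (sizeof((int[]){ __VA_ARGS__ }) / sizeof(int))

static_assert(MAX(2, 5) == 5, "MAX(2, 5) expands to ((2) > (5) ? (2) : (5))");
```

With these definitions, COUNT_ARGS(10, 20, 30) expands to (sizeof((int[]){ 10, 20, 30 }) / sizeof(int)), which the compiler evaluates to 3.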

Source: cppreference

Conditional inclusion#

#if condition
A
#else
B
#endif

Evaluates condition (so at preprocessor-time), then replaces the whole #if … #endif block with A or B depending on the result.

#ifdef MACRO

Equivalent to #if defined(MACRO)

#ifndef MACRO

Equivalent to #if !defined(MACRO)

#elif condition2
B
#endif

Convenient way to chain multiple conditions without nesting, equivalent to:

#else
#  if condition2
B
#  endif
#endif

#elifdef MACRO

Added in C23 for consistency, equivalent to #elif defined(MACRO)

#elifndef MACRO

Added in C23 for consistency, equivalent to #elif !defined(MACRO)
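The directives above can be combined as in the following sketch. LOG_LEVEL, LOG_LEVEL_NAME and log_level_name are hypothetical names, not part of any library. The C23-only #elifdef and #elifndef are avoided so that the example also works with older compilers; static_assert assumes C11.

```c
#include <assert.h>
#include <string.h>

/* LOG_LEVEL is a hypothetical configuration macro. If the build did not
   define it (for example with -DLOG_LEVEL=2), fall back to 1. */
#ifndef LOG_LEVEL
#  define LOG_LEVEL 1
#endif

/* Only one of the three branches survives preprocessing. */
#if LOG_LEVEL == 0
#  define LOG_LEVEL_NAME "quiet"
#elif LOG_LEVEL == 1
#  define LOG_LEVEL_NAME "info"
#else
#  define LOG_LEVEL_NAME "debug"
#endif

static_assert(LOG_LEVEL >= 0, "LOG_LEVEL must not be negative");

/* Returns the name selected at preprocessor-time. */
static const char *log_level_name(void) { return LOG_LEVEL_NAME; }
```

Built without any -D option, log_level_name() returns "info"; the lines of the other two branches never reach the compiler.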

Source: cppreference

How is the condition evaluated?

The #if block needs to be resolved at preprocessor-time, so its condition is evaluated with limited capabilities:

  • only integer constants (including character constants), the usual arithmetic, comparison and logical operators, the defined operator, and macros that expand to those can be used in the condition

  • all identifiers unknown to the preprocessor are replaced with 0

Said differently: anything that is not a macro is replaced by 0, even if it has a value known at compile time (e.g. comparing to an enumerator is actually comparing to 0).

Danger

This means that a typo in a macro name is silently replaced by 0. (GCC and Clang can warn about it with the -Wundef flag.)
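Both pitfalls can be reproduced with the sketch below. All the names in it (mode, MODE_FAST, ENABLE_CACHE, PP_SEES_MODE_FAST, CACHE_ENABLED) are hypothetical, and static_assert assumes a C11 compiler.

```c
#include <assert.h>

/* Hypothetical names, for illustration only. */
enum mode { MODE_SLOW = 1, MODE_FAST = 2 };

#define ENABLE_CACHE 1

/* MODE_FAST is an enumerator, not a macro: the preprocessor replaces it
   with 0, so this condition is really `0 == 2`. */
#if MODE_FAST == 2
#  define PP_SEES_MODE_FAST 2
#else
#  define PP_SEES_MODE_FAST 0
#endif

/* Typo (ENABLE_CAHCE instead of ENABLE_CACHE): the unknown identifier is
   replaced by 0 and the #else branch is taken, without any error. */
#if ENABLE_CAHCE
#  define CACHE_ENABLED 1
#else
#  define CACHE_ENABLED 0
#endif

/* The compiler knows MODE_FAST is 2, the preprocessor treated it as 0. */
static_assert(MODE_FAST == 2 && PP_SEES_MODE_FAST == 0, "enumerator seen as 0");
static_assert(CACHE_ENABLED == 0, "typo silently disabled the cache");
```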

The operators#

We have seen previously that the input of the preprocessor is a stream of tokens, each with a type.

It should be of no surprise then, that the 2 preprocessor operators are about manipulating tokens.

#

Set token type to string literal

#name → "name"

##

Concatenate 2 tokens

some ## thing → something

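The # operator can only be applied to a parameter of a function-like macro. The ## operator can appear in the replacement of any macro, but it is mostly useful when at least one of its operands is a parameter.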
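A short sketch of both operators. STRINGIFY, CONCAT and expression_text are hypothetical names introduced for this example; static_assert assumes a C11 compiler.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical macros, for illustration only. */
#define STRINGIFY(x) #x        /* turns the argument's tokens into a string literal */
#define CONCAT(a, b) a ## b    /* glues two tokens into a single one */

/* CONCAT(some, thing) becomes the identifier `something`. */
static const int CONCAT(some, thing) = 42;

/* The argument is not evaluated: the string contains the text "1 + 3". */
static const char *expression_text(void) { return STRINGIFY(1 + 3); }

/* STRINGIFY(name) is the literal "name": 4 characters plus the terminator. */
static_assert(sizeof STRINGIFY(name) == 5, "\"name\" occupies 5 bytes");
```

Note that STRINGIFY(1 + 3) gives "1 + 3" and not "4", which is one more consequence of the preprocessor working on tokens rather than on values.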

Source: cppreference