The Preprocessor
The C preprocessor runs before the compiler ever sees your code. It operates on text, not on C syntax. It includes files, expands macros, and conditionally compiles sections of code. Every line that starts with # is a preprocessor directive. Understanding what the preprocessor does and does not do is essential for writing correct C.
File Inclusion
The #include directive copies the entire contents of another file into the current file. There are two forms.
#include <stdio.h> /* Search system include paths */
#include "myheader.h" /* Search the including file's directory first, then system paths */
The angle bracket form searches the standard system directories. The quoted form searches the directory of the current file first. This distinction matters when you have a local header with the same name as a system header.
What Actually Happens
When the preprocessor encounters #include "utils.h", it literally pastes the contents of utils.h into the current file. If utils.h itself contains #include directives, those are expanded too. A single .c file can expand to tens of thousands of lines after preprocessing. You can see the result with:
gcc -E main.c -o main.i
The -E flag tells GCC to stop after preprocessing and output the result.
Include Guards
If header.h is included by both a.h and b.h, and your .c file includes both a.h and b.h, the contents of header.h appear twice. This causes duplicate definition errors. Include guards prevent this.
/* myheader.h */
#ifndef MYHEADER_H
#define MYHEADER_H
struct Point {
int x;
int y;
};
void draw_point(struct Point p);
#endif /* MYHEADER_H */
The first time the preprocessor encounters this file, MYHEADER_H is not defined, so the contents are included and MYHEADER_H is defined. The second time, MYHEADER_H is already defined, so everything between #ifndef and #endif is skipped.
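To watch the guard work without juggling several files, the sketch below pastes the same guarded body twice into one translation unit, which is roughly what two #include directives for the same header would produce. The POINT_H guard name and the point_sum helper are made up for this illustration.

```c
/* First copy: POINT_H is not yet defined, so the body is compiled. */
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif /* POINT_H */

/* Second copy: POINT_H is now defined, so the preprocessor skips the body.
   Without the guard, this would be a redefinition of struct Point. */
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif /* POINT_H */

/* Hypothetical helper that uses the single surviving definition. */
int point_sum(void)
{
    struct Point p = { 3, 4 };
    return p.x + p.y;
}
```

Removing the #ifndef/#endif pair around the second copy should make the compiler reject the file with a redefinition error.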
#pragma once
Many compilers support a non-standard alternative:
#pragma once
struct Point {
int x;
int y;
};
This tells the compiler to include the file only once. It is simpler and avoids the risk of guard name collisions. However, it is not part of the C standard. In practice, GCC, Clang, and MSVC all support it, and many codebases rely on it. The Linux kernel uses traditional include guards.
Macros with #define
The #define directive creates a macro. The simplest form defines a constant.
#define MAX_BUFFER_SIZE 1024
#define PI 3.14159265358979323846
Wherever MAX_BUFFER_SIZE appears in the code, the preprocessor replaces it with 1024. This is textual substitution, not a variable assignment.
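Because the replacement happens before compilation, the compiler only ever sees the literal 1024. A small sketch of why that matters, assuming a hypothetical file-scope buffer: the macro can size the array, whereas a const int variable is not a constant expression in C and could not be used here.

```c
#include <stddef.h>

#define MAX_BUFFER_SIZE 1024

/* After preprocessing, this line reads: static char buffer[1024]; */
static char buffer[MAX_BUFFER_SIZE];

/* Hypothetical helper: reports the capacity the compiler actually saw. */
size_t buffer_capacity(void)
{
    return sizeof(buffer);
}
```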
Function-Like Macros
Macros can take parameters.
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define SQUARE(x) ((x) * (x))
The parentheses around every parameter and around the entire expression are critical. Without them, operator precedence can produce wrong results.
/* Without inner parentheses: SQUARE(1 + 2) expands to (1 + 2 * 1 + 2) = 5 */
/* With parentheses: SQUARE(1 + 2) expands to ((1 + 2) * (1 + 2)) = 9 */
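The two expansions in the comments above can be checked directly. The sketch below defines a deliberately broken variant next to the safe one. SQUARE_UNSAFE and the two helper functions are names invented for the comparison.

```c
/* Deliberately broken: no parentheses around the parameter. */
#define SQUARE_UNSAFE(x) (x * x)

/* The form recommended in the text. */
#define SQUARE(x) ((x) * (x))

int unsafe_result(void)
{
    return SQUARE_UNSAFE(1 + 2); /* expands to (1 + 2 * 1 + 2), which is 5 */
}

int safe_result(void)
{
    return SQUARE(1 + 2); /* expands to ((1 + 2) * (1 + 2)), which is 9 */
}
```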
Stringification with #
The # operator converts a macro parameter to a string literal.
#define STRINGIFY(x) #x
#define LOG_VAR(var) printf(#var " = %d\n", var)
int count = 42;
LOG_VAR(count); /* Expands to: printf("count" " = %d\n", count); */
count = 42
Adjacent string literals in C are concatenated by the compiler, so "count" " = %d\n" becomes "count = %d\n".
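One subtlety is that # stringifies the argument exactly as written, without first expanding any macro it names. A common workaround is a second macro layer, sketched below. EXPAND_AND_STRINGIFY and the two helper functions are hypothetical names.

```c
#define STRINGIFY(x) #x

/* The extra layer lets the argument be macro-expanded before # sees it. */
#define EXPAND_AND_STRINGIFY(x) STRINGIFY(x)

#define MAX_BUFFER_SIZE 1024

const char *name_text(void)
{
    return STRINGIFY(MAX_BUFFER_SIZE); /* yields "MAX_BUFFER_SIZE" */
}

const char *value_text(void)
{
    return EXPAND_AND_STRINGIFY(MAX_BUFFER_SIZE); /* yields "1024" */
}
```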
Token Pasting with ##
The ## operator concatenates two tokens into one.
#define MAKE_GETTER(type, field) \
type get_##field(const struct Config *c) { \
return c->field; \
}
MAKE_GETTER(int, width)
MAKE_GETTER(int, height)
This generates get_width and get_height functions. Token pasting is used in code generation patterns, particularly in libraries that need to create families of similar functions.
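The snippet above leaves out struct Config. A complete version might look like the following sketch, where the two-field layout of struct Config is an assumption made only so the getters have something to read.

```c
/* Assumed layout: the original text does not show struct Config. */
struct Config {
    int width;
    int height;
};

#define MAKE_GETTER(type, field)                 \
    type get_##field(const struct Config *c)     \
    {                                            \
        return c->field;                         \
    }

/* Each line expands to a full function definition. */
MAKE_GETTER(int, width)   /* defines int get_width(const struct Config *c) */
MAKE_GETTER(int, height)  /* defines int get_height(const struct Config *c) */
```

With a struct Config initialized as { 640, 480 }, get_width returns 640 and get_height returns 480.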
Conditional Compilation
The preprocessor can include or exclude code based on conditions.
#ifdef DEBUG
printf("Debug: x = %d\n", x);
#endif
#ifndef NDEBUG
assert(ptr != NULL);
#endif
#if defined(_WIN32)
#include <windows.h>
#elif defined(__linux__)
#include <unistd.h>
#elif defined(__APPLE__)
#include <mach/mach.h>
#endif
This is how C code handles platform differences. The same source file compiles on Windows, Linux, and macOS by including different headers and calling different system APIs based on preprocessor conditionals.
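The same conditionals can select code inside a function, not just headers. A minimal sketch, assuming a hypothetical platform_name helper and adding a fallback branch for platforms the three checks do not cover:

```c
/* Hypothetical helper: exactly one return statement survives preprocessing. */
const char *platform_name(void)
{
#if defined(_WIN32)
    return "Windows";
#elif defined(__linux__)
    return "Linux";
#elif defined(__APPLE__)
    return "macOS";
#else
    return "unknown";
#endif
}
```

Running gcc -E on this file shows a function body with a single return; the other branches never reach the compiler.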
Debug Logging Pattern
A common pattern uses macros for debug output that compiles to nothing in release builds.
#ifdef DEBUG
#define DBG(fmt, ...) fprintf(stderr, "DEBUG %s:%d: " fmt "\n", \
__FILE__, __LINE__, ##__VA_ARGS__)
#else
#define DBG(fmt, ...) ((void)0)
#endif
DBG("processing item %d", item_id);
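In debug builds, this prints the file name, line number, and message. In release builds, it expands to ((void)0), so the call and its arguments vanish entirely with zero runtime cost. Note that the ", ##__VA_ARGS__" form, which drops the comma when no extra arguments are passed, is a GNU extension supported by GCC and Clang rather than standard C.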
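A variant that stays within standard C99 folds the format string into the variadic part, so no trailing comma can arise, and hands the work to an ordinary variadic function. This is a sketch, not the pattern the text above uses: dbg_log is a hypothetical helper, and DEBUG is defined in the file only so the debug branch is exercised.

```c
#include <stdarg.h>
#include <stdio.h>

/* Normally supplied on the command line (for example -DDEBUG);
   defined here only so the sketch exercises the debug branch. */
#define DEBUG 1

#ifdef DEBUG
/* Hypothetical helper: writes "DEBUG file:line: message" to stderr
   and returns the number of characters written. */
static int dbg_log(const char *file, int line, const char *fmt, ...)
{
    va_list args;
    int written = fprintf(stderr, "DEBUG %s:%d: ", file, line);

    va_start(args, fmt);
    written += vfprintf(stderr, fmt, args);
    va_end(args);

    written += fprintf(stderr, "\n");
    return written;
}

/* The format string is part of __VA_ARGS__, so no trailing comma can arise. */
#define DBG(...) dbg_log(__FILE__, __LINE__, __VA_ARGS__)
#else
#define DBG(...) ((void)0)
#endif
```

Both DBG("processing item %d", item_id) and DBG("starting up") compile with this form. The trade-off is that the file and line prefix is printed by a separate call instead of being concatenated into the format string.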
When Macros Are Appropriate
Macros are the right tool for:
- Constants that need to be compile-time values: #define ARRAY_SIZE 256
- Conditional compilation for platform-specific code or debug/release builds
- Debug logging where you need __FILE__ and __LINE__
- Header include guards
- Compile-time assertions in pre-C11 code
#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))
int scores[10];
for (size_t i = 0; i < ARRAY_LEN(scores); i++) {
scores[i] = 0;
}
The ARRAY_LEN macro works because sizeof is evaluated at compile time. No function can replicate this behavior because arrays decay to pointers when passed to functions.
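The decay mentioned above can be observed directly. In the sketch below, the macro sees the real array, while a function receiving the same array sees only a pointer. The helper names are invented for the illustration.

```c
#include <stddef.h>

#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))

static int scores[10];

/* The macro expands where the array type is still visible. */
size_t scores_len(void)
{
    return ARRAY_LEN(scores); /* 10 */
}

/* The parameter is a pointer, so sizeof reports the pointer's size,
   not the size of the caller's array. */
size_t size_seen_by_callee(int *values)
{
    return sizeof(values); /* same as sizeof(int *) */
}
```

Applying ARRAY_LEN to the values parameter would silently compute sizeof(int *) / sizeof(int), which is why the macro is only safe on true arrays.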
When Macros Are Not Appropriate
Macros should not replace functions when a function would work.
/* Bad: macro with side effects */
#define MAX(a, b) ((a) > (b) ? (a) : (b))
int x = MAX(i++, j++); /* One of i or j gets incremented twice */
/* Good: inline function with type safety */
static inline int max_int(int a, int b) {
return a > b ? a : b;
}
The MAX macro pastes each argument into its expansion twice, so the argument that wins the comparison is evaluated a second time. If that argument has side effects, they happen twice. An inline function evaluates each argument exactly once and provides type checking.
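The difference shows up in the final values of the variables. The following sketch runs both versions from the same starting point. struct Result and the two wrapper functions are hypothetical scaffolding.

```c
#define MAX(a, b) ((a) > (b) ? (a) : (b))

static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

/* Hypothetical record of what each approach leaves behind. */
struct Result {
    int max;
    int i;
    int j;
};

struct Result with_macro(void)
{
    int i = 1, j = 5;
    /* Expands to ((i++) > (j++) ? (i++) : (j++)): j is incremented twice. */
    int m = MAX(i++, j++);
    struct Result r = { m, i, j }; /* max 6, i 2, j 7 */
    return r;
}

struct Result with_function(void)
{
    int i = 1, j = 5;
    /* Each argument is evaluated exactly once before the call. */
    int m = max_int(i++, j++);
    struct Result r = { m, i, j }; /* max 5, i 2, j 6 */
    return r;
}
```

The macro version also returns 6 instead of 5, because the value it yields comes from the second evaluation of j++.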
The Problem with Macro Type Safety
Macros perform textual substitution. They do not check types.
#define IS_NEGATIVE(x) ((x) < 0)
/* This compiles but makes no sense */
char *name = "hello";
int negative = IS_NEGATIVE(name); /* Compiles without warning on some compilers */
An inline function taking an int would draw a diagnostic for the char * argument at compile time. Use macros only when you need something a function cannot provide: access to __FILE__, __LINE__, sizeof on the caller's variable, or conditional compilation.
Predefined Macros
The compiler provides several built-in macros.
printf("File: %s\n", __FILE__);
printf("Line: %d\n", __LINE__);
printf("Date: %s\n", __DATE__);
printf("Time: %s\n", __TIME__);
printf("Function: %s\n", __func__); /* C99, technically not a macro */
File: main.c
Line: 12
Date: Apr 18 2026
Time: 14:30:00
Function: main
These are invaluable for error reporting and logging. __FILE__ and __LINE__ are the reason debug logging macros exist: no function can report the caller's file and line without them.
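Because these names expand at the point of use, wrapping them in a macro captures the caller's location, not the location of some shared helper. A minimal sketch using a C99 compound literal follows. struct CallSite, HERE, and where_am_i are hypothetical names.

```c
/* Hypothetical record of a source location. */
struct CallSite {
    const char *file;
    int line;
    const char *func;
};

/* Expands at the point of use, so it records that location. */
#define HERE() ((struct CallSite){ __FILE__, __LINE__, __func__ })

/* Hypothetical caller used to show what HERE() captures. */
struct CallSite where_am_i(void)
{
    return HERE(); /* func is "where_am_i"; line is this line's number */
}
```

Had HERE been written as a function, __LINE__ and __func__ would describe the inside of that function on every call.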
Common Pitfalls
- Missing parentheses in macros: #define DOUBLE(x) x * 2 breaks with DOUBLE(1 + 3), which expands to 1 + 3 * 2 = 7 instead of 8. Always parenthesize parameters and the entire expression.
- Multiple evaluation: MAX(expensive_function(), other_function()) calls whichever function returns the larger value a second time. Use inline functions for anything expensive or with side effects.
- Macro name collisions: A macro named ERROR will replace every occurrence of ERROR in your code, including enum values and struct fields. Use prefixed names like MYLIB_ERROR.
- Debugging expanded macros: When a macro produces a compiler error, the error message refers to the expanded code, not the macro definition. Use gcc -E to see what the preprocessor actually produced.
- Forgetting include guards: Without guards, a header included twice causes redefinition errors. Every header file needs either include guards or #pragma once.
- Using macros for complex logic: Multi-line macros with control flow are hard to read, hard to debug, and provide no type safety. If it looks like a function, make it a function.
Key Takeaways
- The preprocessor operates on text before the compiler runs. It handles #include, #define, and conditional compilation.
- Include guards (#ifndef/#define/#endif) or #pragma once prevent duplicate inclusion of header files.
- Function-like macros must parenthesize every parameter and the full expression to avoid precedence bugs.
- Stringification (#) and token pasting (##) enable code generation patterns, but use them sparingly.
- Macros are appropriate for constants, conditional compilation, debug logging, and sizeof-based tricks. For everything else, prefer inline functions.
- Macros have no type safety and can evaluate arguments multiple times. These are the two dangers that make inline functions the better default.