Preprocessor & Code Quality
What happens BEFORE compilation — #define macros, #include, #ifdef guards, and writing quality C code.
A program that rewrites your program before compiling
Imagine you write a document with placeholders like [YOUR NAME] and [DATE]. Before printing, an assistant finds every placeholder and replaces it with the real text. The preprocessor does exactly this for your C source file — it performs text substitution before the compiler ever sees the code. It is not a C compiler; it manipulates text.
Compilation in C happens in stages. The preprocessor runs first, transforming your source file. It handles all lines that begin with #. The output is a new, expanded C file that the actual compiler then compiles. You never see this intermediate file unless you ask for it with gcc -E.
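To make the stages concrete, here is a minimal sketch. The file name expand_demo.c and the names TWO, TWICE, and twice_plus_one are invented for this illustration only.

```c
/* expand_demo.c: hypothetical file name, used only for this sketch. */
#define TWO 2
#define TWICE(x) ((x) * TWO)

int twice_plus_one(int n) {
    return TWICE(n) + 1;
}

/* Running `gcc -E expand_demo.c` prints the preprocessed text. The body of
 * twice_plus_one would appear roughly as:
 *     return ((n) * 2) + 1;
 * Exact whitespace and extra line-marker output vary by compiler.
 */
```

The compiler proper only ever sees the expanded form, so twice_plus_one(4) evaluates ((4) * 2) + 1.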
Python/Java comparison: Python has no preprocessor, so there is no build-time conditional compilation or text substitution. Java also lacks a preprocessor (annotations and generics work differently). C's preprocessor (shared with C++) is powerful, but it has sharp edges: macros have no type safety and no scope. They are pure text replacement from the #define to the end of the file, or until a matching #undef.
Three main uses of the preprocessor:
1. Inclusion: #include pastes the entire contents of a header file at that point in the source. #include <stdio.h> is literally copy-pasted by the preprocessor before the compiler sees it.
2. Constants and macros: #define PI 3.14159 makes the preprocessor replace every later occurrence of the token PI with 3.14159 (string literals and longer identifiers such as PIXEL are left alone): no type, no memory, no runtime cost. #define MAX(a,b) ((a)>(b)?(a):(b)) creates a macro that works like a function but expands inline.
3. Conditional compilation: #ifdef DEBUG includes or excludes blocks of code at compile time — useful for debug builds, platform-specific code, and preventing double-inclusion of headers.
# Python uses constants differently
PI = 3.14159 # runtime variable
MAX_SIZE = 100 # runtime variable
# No conditional compilation —
# Python checks at runtime:
import sys
if sys.platform == 'win32':
    # windows code
    pass
# No include guards needed —
# Python modules handle this
/* Preprocessor: text substitution */
#define PI 3.14159 /* no type! */
#define MAX_SIZE 100
/* Macro with parameters */
#define MAX(a,b) ((a)>(b)?(a):(b))
/* Conditional compilation */
#ifdef _WIN32
/* windows-only code */
#endif
/* Include guard */
#ifndef MYHEADER_H
#define MYHEADER_H
/* header contents */
#endif
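As a rough sketch of how the conditional-compilation pattern above might sit inside a real function: platform_name and its return strings are hypothetical names chosen for this example, while _WIN32 is the macro that Windows compilers predefine.

```c
/* Only one of the two return statements survives preprocessing;
 * the other never reaches the compiler. */
const char *platform_name(void) {
#ifdef _WIN32
    return "windows";
#else
    return "not windows";
#endif
}
```

Unlike the Python version, the choice is made at build time, so the resulting binary contains only the branch for the platform it was compiled on.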
#define SQUARE(x) x*x looks like a function, but SQUARE(1+2) expands to 1+2*1+2, which is 5, not 9. Always wrap each macro parameter and the whole expression in parentheses: #define SQUARE(x) ((x)*(x)). Even then, SQUARE(i++) expands to ((i++)*(i++)), which modifies i twice in one expression; in C that is undefined behavior, not merely a double increment. Use static inline functions instead of macros when possible.
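The difference between a macro and a static inline function can be made visible by counting how often the argument is evaluated. This is a minimal sketch; SQUARE_MACRO, square, next_value, and the calls counter are hypothetical names introduced for the demonstration.

```c
#define SQUARE_MACRO(x) ((x) * (x))   /* fully parenthesised, yet x is pasted twice */

/* Function alternative: the argument is evaluated once and is type-checked. */
static inline int square(int x) {
    return x * x;
}

/* Hypothetical helper with a visible side effect: counts how often it runs. */
static int calls = 0;
static int next_value(void) {
    calls++;
    return 3;
}

/* Returns how many times next_value() ran when passed to the macro. */
int calls_with_macro(void) {
    calls = 0;
    int result = SQUARE_MACRO(next_value());   /* ((next_value()) * (next_value())) */
    (void)result;
    return calls;
}

/* Returns how many times next_value() ran when passed to the function. */
int calls_with_function(void) {
    calls = 0;
    int result = square(next_value());
    (void)result;
    return calls;
}
```

A function call is used as the argument here rather than i++ so that the sketch itself stays well-defined while still showing the double evaluation.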
Give magic numbers a name: write #define MAX_STUDENTS 42 rather than a bare 42 in the code. Use include guards in every header. Write comments that explain why, not what: the code shows what, the comment explains intent. Use assert() to document and enforce preconditions during development.
All preprocessor directives with annotations
/* ── #include ── paste file contents here ── */
#include <stdio.h>      // angle brackets: search system include paths
#include "myheader.h"   // quotes: search current directory first
/* ── #define constant ── no type, no semicolon! ── */
#define PI 3.14159
#define MAX_SIZE 100
// Every PI in code below becomes 3.14159 (text substitution)
/* ── #define macro with parameters ── */
#define MAX(a,b) ((a)>(b)?(a):(b))
#define SQUARE(x) ((x)*(x))
// ALL parentheses required: prevents operator precedence bugs
/* ── Include guard: prevents double inclusion ── */
#ifndef MYHEADER_H   // "if not defined"
#define MYHEADER_H   // now define it; subsequent includes are skipped
/* header content here */
#endif               // end of guard
/* ── Conditional compilation ── */
#ifdef DEBUG
printf("debug: x = %d\n", x);   // only in debug builds
#endif
/* #pragma once: modern alternative to include guard */
#pragma once   // tells compiler: include this file only once
#define lines have NO semicolon at the end — adding one makes the semicolon part of the replacement text, which causes confusing compilation errors.
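A short sketch of what goes wrong, with the broken variant kept inside a comment so the example still compiles. LIMIT and below_limit are illustrative names, not part of the course code.

```c
#define LIMIT 10          /* correct: no trailing semicolon */

/* Broken variant, shown only as a comment:
 *     #define LIMIT 10;
 * With it, (n < LIMIT) in the function below would expand to (n < 10;),
 * and the compiler would report a syntax error at the use site,
 * not at the #define line where the mistake was made.
 */
int below_limit(int n) {
    return (n < LIMIT) ? 1 : 0;   /* expands to: return (n < 10) ? 1 : 0; */
}
```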
assert() — defensive programming
#include <assert.h>
int divide(int a, int b) {
    assert(b != 0);   // aborts with a message if b == 0 (unless NDEBUG is defined)
    return a / b;
}
// assert is REMOVED when compiling with -DNDEBUG (release builds)
// It documents preconditions AND enforces them during development
Use assert() for things that should never happen (programmer errors). Use if + error return for things that might happen (user input, file not found). Asserts catch bugs during development; they are compiled out when NDEBUG is defined, as is typical for release builds.
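One way to picture this split is a single function that uses both techniques. This is a sketch under assumptions: average, its parameters, and the -1 error code are hypothetical choices for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Computes the mean of `count` integers into *out.
 * Returns 0 on success, -1 if there is nothing to average. */
int average(const int *values, size_t count, double *out) {
    /* Programmer errors: a caller passing NULL is a bug, so assert. */
    assert(values != NULL);
    assert(out != NULL);

    /* Legitimate runtime condition (e.g. empty input): report it, do not crash. */
    if (count == 0) {
        return -1;
    }

    long sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += values[i];
    }
    *out = (double)sum / (double)count;
    return 0;
}
```

The runtime check stays in every build, while the asserts disappear under -DNDEBUG, which is why assert must never guard against conditions that real input can trigger.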
#define constant vs const variable
| Feature | #define PI 3.14159 | const double PI = 3.14159; |
|---|---|---|
| Has a type? | No — pure text | Yes — double |
| Occupies memory? | No | Yes (can be optimized away) |
| Visible to debugger? | No | Yes |
| Has scope? | No — file-wide after define | Yes — block/file scope |
| Type-checked by compiler? | No | Yes |
| Usable as an array size? | Yes (constant expression) | Only as a VLA inside a function (C99; optional since C11), not at file scope |
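The "Has scope?" row is easy to miss in practice, so here is a small sketch of it. The names first, second, SCOPE_LIMIT, and local_limit are made up for this example.

```c
int first(void) {
#define SCOPE_LIMIT 5              /* written inside a function, but macros ignore braces */
    const int local_limit = 7;     /* block scope: exists only inside first() */
    return SCOPE_LIMIT + local_limit;
}

int second(void) {
    /* return local_limit;   would not compile: local_limit is out of scope here */
    return SCOPE_LIMIT;            /* still visible: the macro lasts to the end of the file */
}
```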
Complete programs you can compile and run
#include <stdio.h>
/* Constants — no type, no semicolon, ALL_CAPS by convention */
#define PI 3.14159265
#define MAX_ITEMS 10
#define GREETING "Hello, COMP2017!"
/* Parameterized macros — ALL parameters wrapped in parens */
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define SQUARE(x) ((x) * (x))
#define ABS(x) ((x) >= 0 ? (x) : -(x))
int main(void) {
    printf("%s\n", GREETING);
    printf("PI = %.5f\n", PI);
    printf("MAX_ITEMS = %d\n", MAX_ITEMS);
    int a = 3, b = 7;
    printf("MAX(%d,%d) = %d\n", a, b, MAX(a, b)); /* 7 */
    printf("MIN(%d,%d) = %d\n", a, b, MIN(a, b)); /* 3 */
    printf("SQUARE(5) = %d\n", SQUARE(5)); /* 25 */
    printf("ABS(-4) = %d\n", ABS(-4)); /* 4 */
    /* Arrays can use #define for size */
    int arr[MAX_ITEMS];
    for (int i = 0; i < MAX_ITEMS; i++) arr[i] = i;
    printf("arr[9] = %d\n", arr[9]); /* 9 */
    return 0;
}
Hello, COMP2017!
PI = 3.14159
MAX_ITEMS = 10
MAX(3,7) = 7
MIN(3,7) = 3
SQUARE(5) = 25
ABS(-4) = 4
arr[9] = 9
// Predefined macros — filled in by preprocessor automatically
printf("Error at %s line %d in %s()\n", __FILE__, __LINE__, __func__);
// __FILE__ → "myprogram.c" (source filename string literal)
// __LINE__ → 42 (current line number as integer)
// __func__ → "main" (current function name, C99; strictly a predefined identifier supplied by the compiler, not a macro)
// __DATE__ → "Jun 11 2026" (compilation date)
// __TIME__ → "14:32:01" (compilation time)
// Practical DEBUG macro:
#define DEBUG(msg) fprintf(stderr, "[%s:%d %s] %s\n", \
__FILE__, __LINE__, __func__, msg)
Instead of hardcoding filenames and line numbers in error messages (which go stale whenever you move code), let the preprocessor fill them in automatically. The DEBUG macro above gives you free location info in every error message.
/* ===== mymath.h ===== */
#ifndef MYMATH_H /* Guard: skip if already included */
#define MYMATH_H
#define MY_PI 3.14159265
/* Function declarations (prototypes) */
double circle_area(double r);
int factorial(int n);
#endif /* MYMATH_H */
/* ===== mymath.c ===== */
#include "mymath.h" /* quotes = look in current dir first */
#include <stdio.h>
double circle_area(double r) {
    return MY_PI * r * r;
}
int factorial(int n) {
    if (n <= 1) return 1;
    return n * factorial(n - 1);
}
/* ===== main.c ===== */
#include <stdio.h>  /* needed for printf */
#include "mymath.h" /* safe to include twice: the guard prevents re-processing */
#include "mymath.h" /* would cause errors WITHOUT the guard */
int main(void) {
    printf("Area r=3: %.2f\n", circle_area(3.0)); /* 28.27 */
    printf("5! = %d\n", factorial(5)); /* 120 */
    return 0;
}
Area r=3: 28.27
5! = 120
#include <stdio.h>
#include <assert.h>
/* Define DEBUG to enable debug output */
/* gcc -DDEBUG main.c — pass on command line */
/* gcc main.c — no debug output */
#ifdef DEBUG
/* ##__VA_ARGS__ is a GNU extension (GCC, Clang): it drops the comma when no extra arguments are given */
#define LOG(fmt, ...) printf("[DEBUG] " fmt "\n", ##__VA_ARGS__)
#else
#define LOG(fmt, ...) /* nothing: expands to empty */
#endif
int divide(int a, int b) {
    assert(b != 0); /* aborts if b == 0 unless NDEBUG is defined (independent of DEBUG) */
    LOG("divide called: a=%d b=%d", a, b);
    return a / b;
}
int main(void) {
    LOG("program starting");
    int result = divide(10, 2);
    printf("10 / 2 = %d\n", result);
    LOG("done");
    return 0;
}
[DEBUG] program starting
[DEBUG] divide called: a=10 b=2
10 / 2 = 5
[DEBUG] done
#include <stdio.h>
#include <stdlib.h>
/* Predefined macros — automatically set by the preprocessor:
__FILE__ expands to the current source filename (string literal)
__LINE__ expands to the current line number (integer)
__func__ names the current function (string, C99; strictly a predefined identifier supplied by the compiler, not a macro)
__DATE__ expands to compilation date, e.g. "Jun 11 2026"
__TIME__ expands to compilation time, e.g. "14:23:01" */
/* Practical DEBUG macro — never goes stale because the
preprocessor fills in file/line/function automatically */
#ifdef DEBUG
#define DBG(msg) \
fprintf(stderr, "[%s:%d %s] %s\n", __FILE__, __LINE__, __func__, msg)
#else
#define DBG(msg) do {} while(0) /* expands to an empty statement when DEBUG is not defined */
#endif
void check_pointer(int *ptr) {
    if (ptr == NULL) {
        /* Without predefined macros you'd hardcode "check_pointer"
           and the line number, and both go stale when code moves */
        fprintf(stderr, "Error at %s:%d in %s()\n",
                __FILE__, __LINE__, __func__);
        exit(1);
    }
    DBG("pointer is valid");
}
int main(void) {
    printf("Compiled from: %s\n", __FILE__);
    printf("This is line: %d\n", __LINE__);
    printf("Compiled on: %s at %s\n", __DATE__, __TIME__);
    int x = 42;
    check_pointer(&x);
    DBG("all good");
    return 0;
}
This is line: 27
Compiled on: Jun 11 2026 at 14:23:01
Hardcoding file names or line numbers in error messages is fragile — they go stale the moment code moves. __FILE__, __LINE__, and __func__ let the preprocessor fill them in automatically at compile time, always matching the actual location in the source. The standard assert() macro uses exactly this technique internally.
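The following is a simplified, hypothetical CHECK macro in the same spirit. It is not how any particular C library implements assert(); it reports the failure and yields 0 instead of aborting. It also relies on the # stringizing operator, not covered elsewhere in this section, which turns the macro argument into a string literal. The function is_valid_percentage is an invented caller.

```c
#include <stdio.h>

/* Yields 1 if expr is true; otherwise prints the location and the
 * expression text to stderr and yields 0. Illustrative only. */
#define CHECK(expr) \
    ((expr) ? 1 \
            : (fprintf(stderr, "%s:%d: %s(): check failed: %s\n", \
                       __FILE__, __LINE__, __func__, #expr), 0))

/* Hypothetical caller: each failing CHECK reports its own line. */
int is_valid_percentage(int value) {
    return CHECK(value >= 0) && CHECK(value <= 100);
}
```

Because the location is captured where the macro is expanded, the message points at the caller's line, which a plain function could not do without extra parameters.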
#include <stdio.h>
/* BAD macro — missing parentheses */
#define BAD_SQUARE(x) x * x
/* GOOD macro — all parentheses present */
#define GOOD_SQUARE(x) ((x) * (x))
/* BAD macro — side effect with ++ */
#define BAD_MAX(a,b) ((a)>(b)?(a):(b))
int main(void) {
    /* Precedence bug */
    printf("BAD_SQUARE(1+2) = %d\n", BAD_SQUARE(1+2));
    /* Expands to: 1+2*1+2 = 1+2+2 = 5 (WRONG: expected 9) */
    printf("GOOD_SQUARE(1+2) = %d\n", GOOD_SQUARE(1+2));
    /* Expands to: ((1+2)*(1+2)) = 9 (correct) */
    /* Side-effect bug */
    int i = 3;
    int r = BAD_MAX(i++, 2);
    /* Expands to: ((i++)>(2)?(i++):(2)): i is incremented TWICE */
    (void)r; /* r is 4; only i matters here */
    printf("i after BAD_MAX(i++,2): expected 4, got %d\n", i);
    /* i is now 5, not 4! */
    return 0;
}
BAD_SQUARE(1+2) = 5
GOOD_SQUARE(1+2) = 9
i after BAD_MAX(i++,2): expected 4, got 5
// BROKEN — multi-statement macro without do{}while(0)
#define SWAP(a, b) int tmp = a; a = b; b = tmp;
if (x > 0)
SWAP(x, y); // expands to 3 statements — only first is in the if!
else
doSomething(); // parse error: "else without if"
// CORRECT — wrap in do { } while(0)
#define SWAP(a, b) do { int tmp = (a); (a) = (b); (b) = tmp; } while(0)
if (x > 0)
SWAP(x, y); // now a single statement — works perfectly
else
doSomething(); // else matches the if correctly
The while(0) is not a loop: the body runs exactly once. The point is that do { ... } while(0) needs a trailing semicolon to become a complete statement, so SWAP(x, y); is exactly one statement and works correctly after if, else, for, etc. Plain braces { ... } are also a single (compound) statement, but the semicolon the caller writes after the macro call then becomes a separate empty statement, which detaches a following else from its if. Every multi-statement macro you write should use this pattern.
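Here is the pattern inside a complete function, as a minimal sketch. SWAP_INT and order_pair are hypothetical names; the macro is restricted to int because it declares an int temporary.

```c
/* Multi-statement macro wrapped so that `SWAP_INT(a, b);` is one statement. */
#define SWAP_INT(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Puts the smaller value in *lo. Returns 1 if a swap happened, 0 otherwise. */
int order_pair(int *lo, int *hi) {
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);   /* single statement, so the else still pairs with this if */
    else
        return 0;
    return 1;
}
```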
Practice problems with solutions
The macro below produces wrong results for some inputs. Identify the problem and write the corrected version. Show what AREA(2+1) expands to with the broken version.
#define AREA(r) r * r * 3
/* Broken expansion: AREA(2+1) → 2+1 * 2+1 * 3 = 2 + 2 + 3 = 7 (WRONG) */
/* Expected: AREA(2+1) → (2+1)*(2+1)*3 = 9*3 = 27 */
/* Fixed version — wrap parameter AND whole expression */
#define AREA(r) ((r) * (r) * 3)
/* Now: AREA(2+1) → ((2+1)*(2+1)*3) = 27 (correct) */
The following header file is missing include guards, causing "redefinition" errors when included by multiple source files. Add the correct #ifndef / #define / #endif guard.
/* point.h — missing include guard */
typedef struct {
    double x;
    double y;
} Point;
double distance(Point a, Point b);
/* point.h — with include guard */
#ifndef POINT_H
#define POINT_H
typedef struct {
    double x;
    double y;
} Point;
double distance(Point a, Point b);
#endif /* POINT_H */
By convention the guard name mirrors the file name: point.h → POINT_H. Some teams add a project prefix: MYPROJECT_POINT_H. The comment on #endif is optional but helpful when the file is long. #pragma once is a simpler, widely supported alternative, but it is not in the C standard.
What does this code print? Trace through each macro expansion step by step.
#include <stdio.h>
#define DOUBLE(x) (x) + (x)
#define TRIPLE(x) (x) + (x) + (x)
int main(void) {
    int a = 3;
    printf("%d\n", 2 * DOUBLE(a));
    printf("%d\n", TRIPLE(a + 1));
    return 0;
}
/* Line 1: 2 * DOUBLE(a)
Expands to: 2 * (a) + (a)
= 2 * 3 + 3 = 6 + 3 = 9
(NOT 12 — * has higher precedence than +) */
/* Line 2: TRIPLE(a + 1)
Expands to: (a + 1) + (a + 1) + (a + 1)
= (3+1) + (3+1) + (3+1) = 4 + 4 + 4 = 12 */
Output:
9
12
DOUBLE(x) expands to (x) + (x) — the outer expression is 2 * (a) + (a). Multiplication binds tighter than addition, so it's (2*3) + 3 = 9, not 2*(3+3) = 12. The fix would be to wrap the whole macro: #define DOUBLE(x) ((x) + (x)).
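With the outer parentheses added, the same two expressions can be checked again. The wrapper functions scaled_double and triple_next are invented here just to give the expansions somewhere to live.

```c
#define DOUBLE(x) ((x) + (x))
#define TRIPLE(x) ((x) + (x) + (x))

/* 2 * DOUBLE(a) now expands to 2 * ((a) + (a)) */
int scaled_double(int a) {
    return 2 * DOUBLE(a);
}

/* TRIPLE(a + 1) expands to ((a + 1) + (a + 1) + (a + 1)) */
int triple_next(int a) {
    return TRIPLE(a + 1);
}
```

For a = 3, the first expression is now 2 * (3 + 3), so the multiplication applies to the whole macro result as the name suggests.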
A student writes the following. Identify one advantage and one disadvantage of using #define vs const int for BUFFER_SIZE, and explain which is preferred in modern C.
#define BUFFER_SIZE 1024 /* version A */
const int buffer_size = 1024; /* version B */
char buf[BUFFER_SIZE]; /* works with version A */
char buf2[buffer_size]; /* C99: a VLA, so valid only inside a function; an error at file scope and in C89 */
#define advantage: It is a true compile-time constant, so it works for array sizes in every C version (version A) and in preprocessor conditionals.
#define disadvantage: No type safety, no scope, invisible to the debugger. The preprocessor accepts #define BUFFER_SIZE -1 without complaint; the mistake only surfaces later, wherever the macro is used.
const int advantage: Type-checked, visible in the debugger, has scope (can be local to a function).
Modern preference: Use const int (or const size_t) when you need type safety and debugger visibility. Use #define when you genuinely need a compile-time constant expression (array sizes in C89, preprocessor conditionals).
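The two cases where a macro is genuinely needed can be sketched together. BUFFER_KIND, small_limit, and the three functions below are hypothetical names added for this example; BUFFER_SIZE matches version A above.

```c
#include <stddef.h>

#define BUFFER_SIZE 1024                 /* macro: visible to the preprocessor */
static const size_t small_limit = 16;    /* typed constant: visible to compiler and debugger */

/* #if is evaluated by the preprocessor, so it can test a macro
 * but knows nothing about const variables. */
#if BUFFER_SIZE >= 512
#define BUFFER_KIND "large"
#else
#define BUFFER_KIND "small"
#endif

static char buf[BUFFER_SIZE];            /* file-scope array: needs a constant expression */

const char *buffer_kind(void) { return BUFFER_KIND; }
size_t buffer_bytes(void) { return sizeof buf; }

/* Ordinary runtime comparison: a typed const is the better fit here. */
int fits_small(size_t n) { return n <= small_limit; }
```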
Write a macro CLAMP(x, lo, hi) that clamps value x to the range [lo, hi]. Ensure all parentheses are correct. Then write why CLAMP(i++, 0, 10) is dangerous and what the safe alternative is.
/* Safe macro version — correct parentheses */
#define CLAMP(x, lo, hi) \
((x) < (lo) ? (lo) : ((x) > (hi) ? (hi) : (x)))
/* Danger: CLAMP(i++, 0, 10) expands x three times */
/* i++ can execute 1, 2, or 3 times depending on path */
/* Safe alternative — static inline function */
static inline int clamp(int x, int lo, int hi) {
    if (x < lo) return lo;
    if (x > hi) return hi;
    return x;
}
/* clamp(i++, 0, 10) is safe — i++ executes exactly once */
CLAMP(i++, 0, 10) expands x three times, so i++ runs up to three times. A static inline function evaluates its arguments exactly once (like a normal function call) but gets inlined by the compiler — giving you function safety with macro performance.
Key concepts to memorize
Test your understanding
What does #define MAX(a,b) a>b?a:b expand to when called as MAX(1+2, 3)? (LO1)
True or false: a #define constant has a type and is visible to the debugger. (LO1)
What is the purpose of an include guard (#ifndef HEADER_H / #define HEADER_H / #endif)? (LO1)
Fill in the blank: #___ DEBUG. What directive checks if DEBUG is defined? (LO1)
Why is SQUARE(i++) dangerous when SQUARE is a macro? (LO1)
What is the difference between #include <stdio.h> and #include "myfile.h"? (LO1)