Recursion and aliasing in plain English

Analogy — Russian Dolls (Recursion)

Imagine a set of Russian nesting dolls (Matryoshka). To find the smallest doll inside, you open the current doll, and if there is another doll inside you open that one too — repeating the same action. You stop when you reach a doll that is solid (no more dolls inside). That stopping condition is the base case. Each "open this doll" step is the recursive case. If you never hit a solid doll, you open forever — a stack overflow in code terms.

A recursive function is a function that calls itself as part of its own definition. Every recursive function must have two distinct parts:

  • Base case — the condition under which the function stops calling itself and returns directly. Without a base case, recursion runs forever and crashes the program with a stack overflow.
  • Recursive case — the part where the function calls itself with a simpler version of the problem. Each recursive call must move the problem closer to the base case.

Recursion is most natural for problems that have a self-similar structure: tree traversal, directory listing, mathematical sequences, and divide-and-conquer algorithms. Use iteration (loops) when the algorithm is naturally sequential; use recursion when the data structure or problem is inherently hierarchical.

The call stack grows with every recursive call

Every function call in C allocates a stack frame — a block of memory holding local variables, the return address, and function arguments. Recursive calls stack these frames one on top of another. If recursion goes too deep (typically thousands of frames), the stack exhausts available memory and the OS kills the program with a segfault or "stack overflow" error. Always ensure your base case will be reached with bounded input.

Tail recursion is a special form where the recursive call is the very last thing the function does — there is no work remaining after the call returns. Some compilers (with -O2 or -O3) can convert tail-recursive functions into loops automatically (tail-call optimisation, TCO), eliminating the stack growth entirely. GCC performs TCO on eligible C functions. To write a tail-recursive version you usually need an accumulator parameter to carry intermediate results.

Mutual recursion is where function A calls function B, and function B calls function A. For example, a parser for a grammar may have parse_expr call parse_term, which calls parse_expr for sub-expressions. Mutual recursion is perfectly valid in C as long as one of the functions is forward-declared before the other is defined.

Analogy — Aliasing: Two Names for the Same Person

Imagine "Bob Smith" and "Robert Smith" are the same person. If you write a note to Bob saying "get a haircut" and simultaneously write a note to Robert saying "grow your hair long", both notes apply to the same person — the instructions conflict and the result is unpredictable. Pointer aliasing works the same way: two pointers that hold the same address are two names for the same piece of memory. Writing through one affects what the other sees.

Pointer aliasing occurs when two or more pointers in a program refer to the same memory location. This is perfectly legal in C and often intentional (a linked-list node pointing back to itself, a function returning a pointer into its argument). However, aliasing causes serious problems in two scenarios:

  • Undefined behaviour with overlapping buffers: memcpy assumes source and destination do not overlap. If they do, the result is undefined. Use memmove instead — it handles overlapping regions correctly.
  • Missed compiler optimisations: The compiler cannot reorder memory reads and writes if it suspects aliasing. If it cannot prove two pointers are distinct, it must be conservative and reload from memory on every access, missing vectorisation and other optimisations.

The restrict keyword (C99 and later) is a promise you make to the compiler: "I guarantee that for the lifetime of this pointer, no other pointer in scope accesses the same memory." With this guarantee, the compiler can freely reorder accesses, keep values in registers, and apply SIMD optimisations. If you lie — if aliasing actually does occur with a restrict-qualified pointer — the behaviour is undefined. restrict appears frequently in the C standard library signatures, for example memcpy(void * restrict dest, const void * restrict src, size_t n).

When does aliasing matter most?

Aliasing matters most in tight numeric loops that process arrays — the kind of code that benefits from auto-vectorisation (SSE/AVX). Without restrict, the compiler inserts memory fence instructions to be safe. With restrict, it can load once, compute in registers, and write once. For most application code the difference is invisible; for HPC and DSP kernels it can be a 4–8x speedup.

Annotated C code: recursion and aliasing constructs

Recursive function structure

/* General template for a recursive function */
int recurse(int n) {
/*  ^    ^        ^
    |    |        └── parameter that changes on each call (gets closer to base case)
    |    └─────────── function name — this same name appears in the body
    └──────────────── return type must match the value returned by base case AND recursive case */

    if (n == 0) {
        return 1;   /* BASE CASE — stop here, no recursive call */
    }
/*  ^
    └── must be FIRST (or at least checked before the recursive call)
        must be reachable for every valid input
        must not call recurse() again */

    return n * recurse(n - 1);
/*          ^   ^          ^
            |   |          └── argument is SMALLER than n — progress toward base case
            |   └──────────── recursive call to itself
            └──────────────── combines result of sub-problem with current value */
}
Key rule: Every execution path through the function must either hit the base case or make a recursive call with a strictly smaller (or simpler) argument. If any path can skip both, the function either loops forever or returns garbage.

Tail-recursive form with accumulator

/* Non-tail-recursive factorial — stack frame kept alive waiting for result */
int fact(int n) {
    if (n == 0) return 1;
    return n * fact(n-1);  /* must wait for fact(n-1) before multiplying */
}

/* Tail-recursive version — accumulator carries the running product */
int fact_tail(int n, int acc) {
/*                       ^^^
                         accumulator holds partial result — starts at 1 */
    if (n == 0) return acc;           /* base case returns accumulated result */
    return fact_tail(n-1, n*acc);  /* TAIL CALL — last operation, nothing left to do */
}
/* Call as: fact_tail(5, 1)  — with GCC -O2, this compiles to a loop */
Tail call: The recursive call is the absolute last thing done — no pending n * ... multiplication waiting. GCC/Clang can replace the call+return with a jump back to the function start, using constant stack space.

Pointer aliasing example

int x = 10;
int *p = &x;  /* p points to x */
int *q = &x;  /* q ALSO points to x — this is aliasing */
/*          ^
            both p and q hold the same address
            writing through p changes what q reads */

*p = 42;
/* Now *q is also 42 even though we never wrote to *q */
printf("%d\n", *q);  /* prints 42 */

/* Aliasing with overlapping buffers — DANGEROUS with memcpy */
char buf[10] = "hello";
memcpy(buf+1, buf, 5);  /* UNDEFINED — src and dst overlap! */
memmove(buf+1, buf, 5); /* SAFE — memmove handles overlap */
Rule: When source and destination memory regions overlap, always use memmove instead of memcpy. memcpy has undefined behaviour when regions alias (the C standard says it requires non-overlapping regions).

The restrict keyword

/* WITHOUT restrict — compiler must assume p and q may alias */
void add_arrays(int *dst, const int *src, int n) {
    for (int i = 0; i < n; i++)
        dst[i] += src[i];  /* if dst == src+1, every iteration sees a modified src! */
}

/* WITH restrict — compiler may vectorise freely */
void add_arrays_r(int * restrict dst,
/*                      ^^^^^^^
                        promise: no other pointer in this scope aliases dst */
                   const int * restrict src,
/*                               ^^^^^^^
                                 promise: src does not overlap dst */
                   int n) {
    for (int i = 0; i < n; i++)
        dst[i] += src[i/* compiler can vectorise: load 8 ints at once with SSE */
}

/* Standard library uses restrict: */
/* void *memcpy(void * restrict dest, const void * restrict src, size_t n); */
/* char *strcpy(char * restrict dst,  const char * restrict src);            */
Syntax: int * restrict p — the keyword goes between the * and the variable name. It is a qualifier on the pointer, not on the pointed-to type. Only applies for the duration of the pointer's scope (usually a function parameter's lifetime).

Complete C programs with stack traces and output

Example 1 — Factorial: recursive implementation with full call-stack trace Classic recursion
#include <stdio.h>

/* Compute n! (n factorial) recursively.
   Base case:      fact(0) = 1
   Recursive case: fact(n) = n * fact(n-1)  for n > 0 */
int fact(int n) {
    printf("  fact(%d) called\n", n);   /* trace the call */

    if (n == 0) {                        /* BASE CASE */
        printf("  fact(0) = 1  [base case]\n");
        return 1;
    }

    int sub = fact(n - 1);              /* RECURSIVE CASE: call with n-1 */
    int result = n * sub;
    printf("  fact(%d) = %d * %d = %d  [returning]\n", n, n, sub, result);
    return result;
}

int main(void) {
    printf("Computing fact(4):\n");
    int answer = fact(4);
    printf("Result: %d\n", answer);     /* 24 */
    return 0;
}
Output — shows the call stack growing down, then unwinding
Computing fact(4):
  fact(4) called
  fact(3) called
  fact(2) called
  fact(1) called
  fact(0) called
  fact(0) = 1 [base case]
  fact(1) = 1 * 1 = 1 [returning]
  fact(2) = 2 * 1 = 2 [returning]
  fact(3) = 3 * 2 = 6 [returning]
  fact(4) = 4 * 6 = 24 [returning]
Result: 24

The call stack at its deepest point (just before fact(0) returns) contains five live frames:

sp+0fact(0)— top of stack, n=0, about to return 1
sp+1fact(1)— waiting for fact(0) to return so it can compute 1*result
sp+2fact(2)— waiting for fact(1)
sp+3fact(3)— waiting for fact(2)
sp+4fact(4)— waiting for fact(3)
sp+5main()— called fact(4), waiting at "int answer = ..."
Example 2 — Fibonacci: naive recursion vs. tail recursion, plus aliasing bug with memcpy Recursion + aliasing
#include <stdio.h>
#include <string.h>

/* ──── Part A: Fibonacci — naive recursive (exponential calls!) ──── */
int fib_naive(int n) {
    if (n <= 1) return n;           /* base cases: fib(0)=0, fib(1)=1 */
    return fib_naive(n-1) + fib_naive(n-2);  /* TWO recursive calls */
    /* Problem: fib_naive(5) calls fib_naive(4) AND fib_naive(3).
       fib_naive(4) ALSO calls fib_naive(3). fib(3) is computed TWICE.
       For fib(40) this recalculates the same values ~100 million times. */
}

/* ──── Part B: Fibonacci — tail-recursive with accumulator ──── */
/* fib_tail(n, a, b): a = fib(k), b = fib(k+1), working toward n */
int fib_tail(int n, int a, int b) {
    if (n == 0) return a;           /* base case: accumulated result */
    return fib_tail(n-1, b, a+b);  /* tail call — no work after return */
}
/* Call as: fib_tail(10, 0, 1) */

/* ──── Part C: aliasing bug with memcpy ──── */
void demonstrate_aliasing(void) {
    char s[20] = "hello";

    /* Bug: src (s) and dst (s+1) overlap — undefined behaviour */
    /* memcpy(s+1, s, 5); */  /* DO NOT USE — overlapping regions */

    /* Fix: use memmove — it copies to a temp buffer first */
    memmove(s+1, s, 5);        /* result: "hhello" (safely shifted) */
    s[6] = '\0';
    printf("After memmove shift: %s\n", s);  /* hhello */

    /* Another aliasing example: two pointers to the same int */
    int val = 99;
    int *p = &val;
    int *q = &val;  /* p and q ALIAS the same location */

    *p = 10;
    printf("*q = %d  (we wrote to *p, but *q changed too)\n", *q);  /* 10 */
}

int main(void) {
    printf("fib_naive(10) = %d\n", fib_naive(10));       /* 55 */
    printf("fib_tail(10)  = %d\n", fib_tail(10, 0, 1)); /* 55 */
    demonstrate_aliasing();
    return 0;
}
Output
fib_naive(10) = 55
fib_tail(10) = 55
After memmove shift: hhello
*q = 10 (we wrote to *p, but *q changed too)
Example 3 — Mutual recursion (is_even/is_odd) and restrict in a vector add Mutual recursion + restrict
#include <stdio.h>

/* Forward declaration needed because is_even calls is_odd
   which is defined AFTER is_even */
int is_odd(int n);

int is_even(int n) {
    if (n == 0) return 1;       /* base case: 0 is even */
    return is_odd(n - 1);       /* delegates to is_odd — mutual recursion */
}

int is_odd(int n) {
    if (n == 0) return 0;       /* base case: 0 is not odd */
    return is_even(n - 1);      /* delegates back to is_even */
}

/* Demonstrate restrict — no-alias promise enables vectorisation */
void vec_add(float * restrict dst,
             const float * restrict a,
             const float * restrict b,
             int n) {
    /* Compiler can now generate SIMD (SSE/AVX) instructions here because
       it knows dst, a, and b do not overlap.
       Without restrict, it must reload a[i] and b[i] each iteration
       in case dst == a or dst == b. */
    for (int i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}

int main(void) {
    printf("is_even(4) = %d\n", is_even(4));  /* 1 (true) */
    printf("is_odd(7)  = %d\n", is_odd(7));   /* 1 (true) */
    printf("is_even(3) = %d\n", is_even(3));  /* 0 (false) */

    float x[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float y[4] = {10.0f, 20.0f, 30.0f, 40.0f};
    float z[4];
    vec_add(z, x, y, 4);
    printf("z = [%.0f, %.0f, %.0f, %.0f]\n", z[0], z[1], z[2], z[3]);
    return 0;
}
Output
is_even(4) = 1
is_odd(7) = 1
is_even(3) = 0
z = [11, 22, 33, 44]

Practice problems with solutions

P1 — Trace a recursive call: what does sum_digits(523) return? LO1 — recursion tracing

Without running the code, trace every call made by sum_digits(523) and determine the final return value. Show each call and its return value step by step.

int sum_digits(int n) {
    if (n == 0) return 0;
    return (n % 10) + sum_digits(n / 10);
}
sum_digits(523)
  = (523 % 10) + sum_digits(523 / 10)
  = 3           + sum_digits(52)

sum_digits(52)
  = (52 % 10) + sum_digits(52 / 10)
  = 2          + sum_digits(5)

sum_digits(5)
  = (5 % 10) + sum_digits(5 / 10)
  = 5         + sum_digits(0)

sum_digits(0)
  = 0          [base case]

Unwinding:
  sum_digits(5)   = 5 + 0 = 5
  sum_digits(52)  = 2 + 5 = 7
  sum_digits(523) = 3 + 7 = 10

Final answer: 10
Key insight: n % 10 extracts the last digit, n / 10 removes it (integer division). The function sums all digits: 5 + 2 + 3 = 10. The base case is n == 0 — when no digits remain. Note: this function gives incorrect results for negative numbers.
P2 — Identify the aliasing bug and fix it LO1 — pointer aliasing / undefined behaviour

The following function is supposed to shift every character in buf one position to the right (inserting an 'X' at the start). It has an aliasing bug that causes undefined behaviour. Identify the bug and fix it.

#include <string.h>
#include <stdio.h>

void prepend_X(char *buf, size_t len) {
    memcpy(buf + 1, buf, len);   /* shift contents right by 1 */
    buf[0] = 'X';
}

int main(void) {
    char s[20] = "hello";
    prepend_X(s, 5);
    printf("%s\n", s);   /* intended: "Xhello" */
    return 0;
}
#include <string.h>
#include <stdio.h>

void prepend_X(char *buf, size_t len) {
    memmove(buf + 1, buf, len);  /* FIX: memmove handles overlapping regions */
    buf[0] = 'X';
}

int main(void) {
    char s[20] = "hello";
    prepend_X(s, 5);
    printf("%s\n", s);   /* "Xhello" */
    s[6] = '\0';         /* null-terminate the longer string */
    return 0;
}
The bug: memcpy(buf+1, buf, 5) — the source region [buf, buf+5) and destination region [buf+1, buf+6) overlap by 4 bytes. The C standard (C11 7.1.4p1 and the memcpy spec) explicitly states that memcpy's regions must not overlap. The behaviour is undefined — on some implementations it "works" by accident; on others it corrupts the data.

The fix: Replace memcpy with memmove. memmove detects overlap and either copies via a temporary buffer or reverses the copy direction to handle overlapping safely. Also need to null-terminate the result since we now have 6 characters.
P3 — Convert an iterative loop to a recursive function LO1 — recursion design

The following iterative function computes the sum of all integers from 1 to n. Rewrite it as a recursive function sum_recursive(int n). Identify the base case and recursive case clearly in comments.

int sum_iterative(int n) {
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
int sum_recursive(int n) {
    /* BASE CASE: sum of integers from 1 to 0 is 0 */
    if (n <= 0) return 0;

    /* RECURSIVE CASE: sum(1..n) = n + sum(1..n-1)
       The argument n-1 is strictly smaller than n,
       so we make progress toward the base case on every call. */
    return n + sum_recursive(n - 1);
}

/* Alternatively, a tail-recursive version with accumulator:
int sum_tail(int n, int acc) {
    if (n <= 0) return acc;
    return sum_tail(n - 1, acc + n);
}
// Call as: sum_tail(100, 0)
*/
Reasoning: The loop accumulates total += i for i from 1 to n. Recursively: summing 1 to n equals n plus summing 1 to n-1. The base case is when n is 0 (or negative), where the answer is 0. The function makes progress because each call reduces n by 1. For large n (e.g. n=100000), the non-tail version will overflow the stack — the tail-recursive version avoids that with GCC -O2.
P4 — Add restrict correctly to a function signature LO1 — restrict keyword

The following function scales an array by a constant. Add restrict qualifiers where appropriate so the compiler can optimise the loop. Explain why each one is valid (or why you chose not to add one).

void scale(float *dst, const float *src, float factor, int n) {
    for (int i = 0; i < n; i++) {
        dst[i] = src[i] * factor;
    }
}
void scale(float * restrict dst,
           const float * restrict src,
           float factor,
           int n) {
    for (int i = 0; i < n; i++) {
        dst[i] = src[i] * factor;
    }
}

/* Why restrict on dst?
   We promise: no other pointer passed to this function aliases dst.
   So the compiler knows that writing dst[i] does not invalidate src[j].
   Without restrict, after writing dst[0], the compiler would have to
   reload src[1] from memory in case dst == src+0 made them alias.

   Why restrict on src?
   Same reason from the src side. Together they allow full vectorisation.

   Why NOT restrict on factor (a float, not a pointer)?
   restrict only applies to pointers. Scalar values are passed by value
   and cannot alias. */
Correctness requirement: You must only add restrict when you can guarantee the caller never passes overlapping regions. If a caller does pass overlapping arrays, the behaviour becomes undefined — the compiler may produce wrong results silently. Always document the precondition in a comment above the function.
P5 — Write a recursive binary search LO1 — recursive algorithm design

Write a recursive function int bsearch_r(int *arr, int lo, int hi, int target) that performs binary search on a sorted array. It should return the index of target if found, or -1 if not found. The base case occurs when lo > hi (empty range).

#include <stdio.h>

/* Binary search: search arr[lo..hi] for target.
   Precondition: arr is sorted in ascending order.
   Returns: index of target, or -1 if not found. */
int bsearch_r(int *arr, int lo, int hi, int target) {
    /* BASE CASE 1: empty range — target not in array */
    if (lo > hi) return -1;

    int mid = lo + (hi - lo) / 2;  /* avoid integer overflow vs. (lo+hi)/2 */

    /* BASE CASE 2: found the target */
    if (arr[mid] == target) return mid;

    /* RECURSIVE CASES: search the appropriate half */
    if (target < arr[mid]) {
        return bsearch_r(arr, lo, mid - 1, target);  /* search left half */
    } else {
        return bsearch_r(arr, mid + 1, hi, target);  /* search right half */
    }
}

int main(void) {
    int sorted[] = {2, 5, 8, 12, 16, 23, 38, 56, 72, 91};
    int n = 10;

    printf("Index of 23: %d\n", bsearch_r(sorted, 0, n-1, 23));  /* 5 */
    printf("Index of 99: %d\n", bsearch_r(sorted, 0, n-1, 99));  /* -1 */
    printf("Index of  2: %d\n", bsearch_r(sorted, 0, n-1, 2));   /* 0 */
    return 0;
}
Key design choices: Two base cases — empty range (not found) and direct match. Each recursive call halves the search space, so the depth is O(log n) — at most about 30 frames for any 32-bit array size, making stack overflow impossible in practice. The midpoint formula lo + (hi - lo) / 2 avoids integer overflow that (lo + hi) / 2 could cause with large indices.
P6 — Spot the bug: missing or incorrect base case LO1 — recursion correctness

The code below is intended to count how many times digit d appears in the non-negative integer n. It has a bug that causes infinite recursion for certain inputs. Find the bug and fix it.

int count_digit(int n, int d) {
    if (n == 0) return 0;                 /* base case */
    return (n % 10 == d) + count_digit(n / 10, d);
}

int main(void) {
    printf("%d\n", count_digit(303, 3));  /* should print 2 */
    printf("%d\n", count_digit(0, 0));    /* should print 1, but crashes! */
}
/* Bug: count_digit(0, 0) returns 0 immediately (hits base case n==0)
   but the digit 0 IS present in the number 0 — the answer should be 1.
   Worse: count_digit(-1, d) causes infinite recursion because -1/10 == 0
   and -1 % 10 == -1 on most platforms, but n never reaches 0 for
   large negatives.

   Fix: use a sentinel to distinguish "called with 0 as a number" from
   "exhausted all digits". One standard approach: */

int count_digit_fixed(int n, int d) {
    /* Process at least one digit even when n == 0 */
    int count = (n % 10 == d) ? 1 : 0;
    if (n / 10 == 0) return count;          /* base case: no more digits */
    return count + count_digit_fixed(n / 10, d);
}

/* Or handle the special case of n==0 separately: */
int count_digit_v2(int n, int d) {
    if (n < 10) return (n == d) ? 1 : 0;   /* single-digit base case */
    return (n % 10 == d) + count_digit_v2(n / 10, d);
}
Root cause: The base case n == 0 fires before checking whether the digit 0 is in the number. When the user asks "how many 0s are in 0?", the function returns 0 without examining the number. The fix is to check the last digit before deciding to recurse, and use n / 10 == 0 (no more digits to examine) as the stopping condition instead.

Key concepts to memorize

Card 1 of 10
Question — click to flip
Answer
Click card to flip • Use buttons to navigate

Test your understanding

Topic 31 Quiz — Recursion & Aliasing Score: 0 / 6
1
What is the base case in a recursive function?LO1
multiple choice
2
True or False: memcpy is safe to use when the source and destination memory regions overlap.LO1
true / false
3
What does the restrict keyword tell the compiler?LO1
multiple choice
4
Fill in the blank: A recursive call is tail-recursive if it is the ________ thing the function does before returning. (One word answer.)LO1
fill in the blank
5
Spot the bug: what is the problem with this recursive function?LO1
int countdown(int n) {
    printf("%d\n", n);
    return countdown(n - 1);
}
spot the bug — multiple choice
6
Two pointers p and q in the same function both point to the same int variable. You write *p = 5;. What is the value of *q immediately after?LO1
multiple choice
0/6
Quiz complete!