Topic 21 — Processes (fork, exec, wait)

Foundational Concept

Virtual Memory Model

Real-World Analogy

Each process thinks it owns all of RAM — like each apartment in a building has its own numbered rooms. The MMU (Memory Management Unit) is the building manager who maps each apartment's room numbers to actual physical rooms. Two apartments can both say "room 42" but the manager routes them to completely different physical rooms. In Python or Java, the runtime hides all of this — you never see addresses. In C, you work directly with these virtual room numbers, so understanding what they really mean is essential.

Every process runs in its own virtual address space. The addresses your code uses (e.g., what printf("%p", ptr) prints) are virtual addresses — not the real location in physical RAM. The MMU, using a data structure called the page table, translates each virtual address to a physical address on every memory access. Because each process has its own page table, process A at virtual address 0x1000 and process B at virtual address 0x1000 are accessing completely different physical memory locations.

Why does this matter for processes? When you call fork(), the child gets an independent copy of the parent's entire virtual address space — same layout, same virtual addresses, but backed by separate physical pages. They can no longer corrupt each other's memory directly. This isolation is enforced by the MMU, completely transparently to the C programmer.

Virtual vs Physical Address

A virtual address is what your program sees — a number like 0x7fff5a00. A physical address is the actual hardware DRAM row/column. The kernel controls the page table; your process can only read and write through virtual addresses it has been granted. Attempting to access an unmapped virtual address causes a segmentation fault — the OS kills the process rather than allowing it to corrupt physical memory it does not own.

Process Memory Layout (virtual address space)

Every C process's virtual address space is divided into fixed regions. From low address (bottom) to high address (top):

Kernel Space protected — process cannot touch

Stack local variables, return addresses ↓ grows down

gap — unmapped virtual pages (access = segfault)

Heap malloc / calloc / realloc ↑ grows up

.bss uninitialised globals & statics (zero-filled by OS)

.data initialised globals & statics (e.g. int x = 5;)

.text compiled machine instructions (read-only)

0x0000… (low address) High address ↑

Key Concepts Table

Concept	What it is	In practice
Virtual address	The address your C code uses — a number in the process's own address space, e.g. `0x7fff5a00`	`int *p = malloc(4);` — `p` holds a virtual address
Physical address	The real hardware DRAM location — invisible to user code; managed exclusively by the kernel	Never directly visible in C; the MMU handles translation silently
MMU	Memory Management Unit — hardware chip on the CPU that performs virtual→physical translation on every memory access using the page table	Transparent to the programmer; causes a page fault if a page is unmapped
Page / page table	A page is the OS unit of virtual memory (typically 4 KB). The page table is a kernel data structure mapping virtual pages to physical frames for one process	Each process has its own page table; `fork()` creates a copy of the parent's page table for the child
Page fault	The MMU raises this when a virtual address has no valid physical mapping. The OS either allocates a new frame (valid but not yet mapped), or sends SIGSEGV (invalid access)	Happens invisibly when the heap or stack grows; becomes a crash (segfault) if you access freed or out-of-bounds memory
Kernel space	The upper portion of every process's virtual address space reserved for the OS kernel. Mapped in every process's page table but protected — any user-mode access causes a fault	System calls cross into kernel space; regular code runs in user space
User space	The lower portion of the virtual address space accessible by the process: .text, .data, .bss, heap, and stack. Your C program lives here	All `malloc`, local variables, global variables, and code addresses are in user space
Stack direction	Grows downward (toward lower addresses) on x86/x86-64 and ARM — the stack pointer decrements as frames are pushed	Each function call pushes a new frame; deep recursion can exhaust the stack and cause a stack overflow
Heap direction	Grows upward (toward higher addresses) as `malloc` requests more memory from the OS via `brk()` / `mmap()`	`malloc` can fail (return NULL) if virtual address space or RAM is exhausted — always check the return value

Worked Example — Heap and Stack in action

malloc asks the OS for heap pages; the stack grows automatically on function calls Week 7 — memory regions

/*
 * Demonstrates which region of virtual memory each variable lives in.
 *
 * Compile: gcc -o regions regions.c
 * Run:     ./regions
 */
#include <stdio.h>
#include <stdlib.h>

int global_init   = 42;         /* .data  — initialised global              */
int global_uninit;              /* .bss   — uninitialised global (zero-filled)*/

void inner(void) {
    int local = 99;             /* stack  — new frame pushed here           */
    printf("  inner() local var  (STACK):  %p\n", (void *)&local);
}

int main(void) {
    /* ── Stack variables ── */
    int x = 1;
    printf("main() local var      (STACK):  %p\n", (void *)&x);

    /* ── Calling inner() pushes another stack frame (lower address) ── */
    inner();   /* stack grows downward — inner's frame is at a lower address */

    /* ── Heap allocation: malloc calls brk()/mmap() to get OS pages ──
     *   The OS does NOT immediately commit physical RAM — it marks the
     *   pages as "demand-zero". The first write causes a page fault and
     *   the OS silently allocates a physical frame. This is why malloc
     *   returns almost instantly even for huge sizes.
     */
    int *heap_arr = malloc(10 * sizeof(int));  /* heap grows upward */
    if (!heap_arr) { perror("malloc"); return 1; }
    heap_arr[0] = 7;   /* first write → page fault → OS allocates frame */
    printf("heap_arr              (HEAP):   %p\n", (void *)heap_arr);

    /* ── Static/global regions ── */
    printf("global_init           (.data):  %p\n", (void *)&global_init);
    printf("global_uninit         (.bss):   %p\n", (void *)&global_uninit);

    /* ── Code is in .text (lower than everything else) ── */
    printf("main() function       (.text):  %p\n", (void *)main);

    free(heap_arr);
    return 0;
}

Sample output — exact addresses vary per run (ASLR)

main() local var (STACK): 0x7ffd8a3c14ac inner() local var (STACK): 0x7ffd8a3c1480 ← lower address (grew down) heap_arr (HEAP): 0x55a3f0d8b2a0 ← higher than .text global_init (.data): 0x55a3efb01020 global_uninit (.bss): 0x55a3efb01024 main() function (.text): 0x55a3efb0068a ← lowest (code segment)

What happens when fork() copies the address space?

After fork(), parent and child have identical virtual addresses — heap_arr is 0x55a3...2a0 in both. But they are backed by separate physical pages (Copy-On-Write: the OS shares physical frames until one process writes, then makes a private copy). That is why writing to a global in the child does not affect the parent's copy.

Concept Explanation

What is a process?

Real-World Analogy

Think of a program as a recipe card. A process is a chef actually cooking that recipe — the recipe in action, with all the ingredients already assembled and the stove running. Your computer can run the same recipe card in many kitchens simultaneously (multiple processes of the same program). Each kitchen (process) has its own workspace: its own memory, its own stove timer (CPU registers), its own stack of notes (call stack). The operating system is the restaurant manager scheduling which kitchen gets to use the shared equipment (CPU) at any given moment.

Every process has a process ID (PID), a unique integer assigned by the OS. Every process also knows its parent's ID (PPID). You can query these at runtime with getpid() and getppid() from <unistd.h>.

How does a new process come to life? In Unix, the only way to create a new process is to clone an existing one using the fork() system call. The OS copies the entire memory image of the calling process (the parent) into a new process (the child). Both then continue running from the same point in the code — the line just after fork() returns. The only difference is the return value of fork():

Child process: fork() returns 0
Parent process: fork() returns the child's PID (a positive integer)
Error: fork() returns -1 (check errno)

This is the key insight: both parent and child run the same code, so you distinguish them purely by checking fork's return value.

What does exec do? After forking, the child process often wants to run a completely different program — not just a different code path in the same source file. The exec family of functions achieves this: they replace the current process's entire memory image (code, data, stack, heap) with the code from a new executable file. Execution of the new program begins from its main(). If exec succeeds, it never returns — because the calling code no longer exists in memory.

What does wait do? When a child process finishes (calls exit() or returns from main()), it leaves behind an exit status that the parent needs to collect. Until the parent collects it, the child becomes a zombie process — it no longer runs but still occupies a slot in the process table. wait() and waitpid() allow the parent to collect ("reap") this status, freeing the zombie. If the parent dies first without waiting, the child becomes an orphan — the OS reassigns orphans to the init process (PID 1) which periodically reaps them.

Zombie vs Orphan

Zombie: child has exited but parent hasn't called wait() yet. Takes a table slot. Fix: always call wait() or waitpid() in the parent.
Orphan: parent exited before child. OS reassigns orphan's parent to init (PID 1). Not harmful, but poor design.

Fork Bomb Warning

A fork bomb calls fork() in an infinite loop: each child also calls fork, doubling processes exponentially. Within seconds this exhausts the process table and freezes the system. The code :(){ :|:& };: is a bash fork bomb. Never run fork in a loop without a termination condition. On exam: how many processes does a loop calling fork n times create? Answer: 2ⁿ processes total (including the original).

Process tree visualization

init (PID 1) └── bash (PID 100) ├── your_program (PID 200) <-- fork() called here │ └── child (PID 201) <-- child gets PID 201 └── other_process (PID 202)

The fork + exec + wait pattern

The canonical Unix pattern for running another program:
1. fork() — create a child clone
2. In the child: call exec() to replace itself with the target program
3. In the parent: call wait() to reap the child when it finishes
This is exactly what your shell does every time you type a command.

Syntax Breakdown

fork, exec, wait — annotated signatures

fork()

/* Headers needed */
#include <sys/types.h>
#include <unistd.h>

pid_t fork(void);
  ^     ^
  │     └── no arguments — clones the calling process as-is
  └── returns pid_t: typedef for int on most platforms

/* Return value — same call, different values in parent vs child */
pid_t pid = fork();
if (pid == -1) { /* fork failed — check errno */        }
if (pid ==  0) { /* we are in the CHILD process */        }
if (pid  >  0) { /* we are in the PARENT; pid = child's PID */ }

Key fact: After fork(), both processes run concurrently from the same point. The only way to tell which is which is the return value. The child gets a perfect copy of the parent's memory — but they are separate copies (Copy-On-Write in practice, but logically independent).

exec family

/* execv — execute file, pass argv as an array */
int execv(const char *path, char *const argv[]);
           ^                   ^
           │                   └── NULL-terminated array: argv[0]=program name, rest=args
           └── absolute path to executable e.g. "/bin/ls"

/* execvp — like execv but searches PATH for the program */
int execvp(const char *file, char *const argv[]);
           ^                                          
           └── just the program name e.g. "ls" (PATH is searched)

/* execve — exec with explicit environment */
int execve(const char *path, char *const argv[], char *const envp[]);
                                                             ^
                                                             └── environment variables array

/* ALL exec variants: on success, NEVER return.
   On failure, return -1 and set errno. */
char *args[] = {"ls", "-l", NULL};  // argv[0] = program, last = NULL
execvp("ls", args);
/* if we reach here, exec failed */
perror("execvp");

Mnemonic for exec variants: l=list args, v=vector (array), p=search PATH, e=explicit environment. So execvp = vector + PATH search. execle = list + explicit env. The most common are execv, execvp, execve.

wait() and waitpid()

#include <sys/types.h>
#include <sys/wait.h>

pid_t wait(int *status);
  ^          ^
  │          └── pointer to int; OS writes exit info here. Pass NULL to ignore.
  └── returns PID of child that exited, or -1 on error

pid_t waitpid(pid_t pid, int *status, int options);
              ^                         ^
              │                         └── 0 = block; WNOHANG = don't block
              └── specific child PID, or -1 for any child

/* Extracting exit status after wait: */
int status;
waitpid(child_pid, &status, 0);
if (WIFEXITED(status)) {
    int code = WEXITSTATUS(status);  // 0 = success, non-0 = error
}

Important macros: WIFEXITED(status) — true if child exited normally. WEXITSTATUS(status) — extract the exit code (0–255). WIFSIGNALED(status) — true if child was killed by a signal. WTERMSIG(status) — which signal killed it.

getpid() and getppid()

#include <unistd.h>

pid_t getpid(void);   /* returns this process's PID */
pid_t getppid(void);  /* returns parent process's PID */

printf("I am PID %d, my parent is %d\n", getpid(), getppid());

Both functions always succeed and never return an error. They are simple syscalls with no side effects — safe to call anywhere.

Exec variant cheat-sheet

Function	Path search	Args as	Custom env
`execl`	No (full path)	Variadic list	No (inherit)
`execv`	No (full path)	Array (vector)	No (inherit)
`execlp`	Yes (PATH)	Variadic list	No (inherit)
`execvp`	Yes (PATH)	Array (vector)	No (inherit)
`execle`	No (full path)	Variadic list	Yes (envp)
`execve`	No (full path)	Array (vector)	Yes (envp)

popen() / pclose() — higher-level shell command execution

popen() is a higher-level alternative to the fork + exec pattern. It launches a shell command and returns a FILE * pipe so you can read the command's stdout (or write to its stdin) directly with standard fgets/fprintf calls. Under the hood it still forks and execs — it just hides the plumbing. pclose() closes the pipe and waits for the child, returning the exit status (same encoding as waitpid).

#include <stdio.h>

FILE *popen(const char *command, const char *mode);
             ^                      ^
             │                      └── "r" = read child's stdout
             │                          "w" = write to child's stdin
             └── shell command string, e.g. "ls -la" or "wc -l < file"

int   pclose(FILE *stream);
  ^
  └── like waitpid — blocks until child exits, returns its exit status

/* Example: capture output of `ls -la` line by line */
FILE *fp = popen("ls -la", "r");
if (fp == NULL) { perror("popen"); exit(1); }
char line[256];
while (fgets(line, sizeof(line), fp) != NULL) {
    printf("%s", line);
}
int status = pclose(fp);   // returns exit status — use WEXITSTATUS(status)

popen uses /bin/sh: NEVER pass user-controlled input as the command string. Because popen executes via /bin/sh -c, an attacker can inject shell metacharacters (;, |, $()) to run arbitrary commands. Use fork + exec when you need to pass untrusted arguments safely — exec does not invoke a shell.

Shell Injection Vulnerability

If user_input comes from outside your program, never do: popen(user_input, "r"). A value like "ls; rm -rf /" would execute both commands. This is the same class of vulnerability as SQL injection — only use popen with fully controlled, hard-coded command strings.

posix_spawn() — spawn a child process without fork

posix_spawn() (from #include <spawn.h>) is a POSIX-standard alternative to the fork + exec pattern. It is particularly useful on platforms where fork() is expensive — for example, in multi-threaded processes, fork() only copies the calling thread and can leave mutex state in an inconsistent state; posix_spawn avoids these hazards. On Linux it typically still uses fork internally, but on systems with copy-heavy virtual memory (e.g., VxWorks, some embedded OSes), it may be implemented with a lighter-weight mechanism.

#include <spawn.h>
#include <unistd.h>   // for environ

int posix_spawn(pid_t *pid,
               const char                       *path,
               const posix_spawn_file_actions_t *file_actions,
               const posix_spawnattr_t          *attrp,
               char *const                       argv[],
               char *const                       envp[]);
  ^    ^        ^                                ^       ^
  │    │        │                                │       └── environment (pass environ for current env)
  │    │        │                                └── NULL-terminated argv array (argv[0] = program name)
  │    │        └── attrp: NULL = default attributes (signal mask, flags)
  │    └── file_actions: NULL = inherit all FDs; used to redirect stdin/stdout
  └── returns 0 on success (error code, not -1); *pid is set to child's PID

/* Typical usage — spawn /bin/ls and wait for it */
extern char **environ;
pid_t pid;
char *args[] = { "ls", "-la", NULL };
int rc = posix_spawn(&pid, "/bin/ls", NULL, NULL, args, environ);
if (rc != 0) { fprintf(stderr, "posix_spawn: %s\n", strerror(rc)); }
waitpid(pid, NULL, 0);

Key difference from fork+exec: posix_spawn does not use a shell — you pass the executable path and argv directly, the same way execv does. This makes it safe to use with untrusted arguments (no shell injection risk). Error handling differs too: it returns an errno value on failure (not -1), so check rc != 0 rather than rc == -1.

fork+exec vs popen vs posix_spawn — when to use which

fork+exec: maximum control; use when you need pipes, custom FD setup, or fine-grained child management.
popen: simplest for capturing command output; use only with fully trusted, hard-coded command strings.
posix_spawn: cleaner than fork+exec for the simple "just run a program" case; preferred in multi-threaded code where fork is risky; available on all POSIX systems via <spawn.h>.

Worked Examples

Complete programs you can compile and run

Example 1 — Basic fork: parent and child execute different branches Week 7 Lecture

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void) {
    printf("Before fork: PID=%d\n", getpid());

    pid_t pid = fork();   /* Clone this process */

    if (pid == -1) {
        /* fork failed — print error and exit */
        perror("fork");
        exit(1);
    }

    if (pid == 0) {
        /*
         * We are in the CHILD process.
         * fork() returned 0 only to the child.
         */
        printf("Child: PID=%d, PPID=%d\n", getpid(), getppid());
        printf("Child: doing child-specific work...\n");
        exit(0);   /* Child exits with success code */
    }

    /*
     * We are in the PARENT process.
     * pid holds the child's PID (a positive number).
     */
    printf("Parent: PID=%d, child PID=%d\n", getpid(), pid);

    /* Wait for child to finish — reap the zombie */
    int status;
    pid_t waited = waitpid(pid, &status, 0);  /* blocks until child exits */

    if (waited == -1) {
        perror("waitpid");
        exit(1);
    }

    if (WIFEXITED(status)) {
        printf("Parent: child exited with code %d\n", WEXITSTATUS(status));
    }

    printf("Parent: done.\n");
    return 0;
}

Sample output (order may vary)

Before fork: PID=1234 Parent: PID=1234, child PID=1235 Child: PID=1235, PPID=1234 Child: doing child-specific work... Parent: child exited with code 0 Parent: done.

Example 2 — fork + execvp + wait: run /bin/ls as a child process Week 7 Lecture — classic shell pattern

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void) {
    printf("Parent (PID %d): about to run 'ls -l'\n", getpid());

    pid_t pid = fork();
    if (pid == -1) { perror("fork"); exit(1); }

    if (pid == 0) {
        /*
         * CHILD: replace this process image with 'ls -l'
         * argv must be NULL-terminated. argv[0] = program name by convention.
         */
        char *argv[] = { "ls", "-l", NULL };
        execvp("ls", argv);

        /*
         * If execvp returns, it FAILED.
         * The code below only runs on exec failure.
         */
        perror("execvp failed");
        exit(127);   /* 127 = "command not found" by convention */
    }

    /*
     * PARENT: wait for the child (ls) to finish.
     * During this time, ls is running and printing to stdout.
     */
    int status;
    waitpid(pid, &status, 0);

    if (WIFEXITED(status)) {
        printf("\nParent: ls finished with exit code %d\n", WEXITSTATUS(status));
    } else if (WIFSIGNALED(status)) {
        printf("\nParent: ls was killed by signal %d\n", WTERMSIG(status));
    }

    return 0;
}

Sample output

Parent (PID 1234): about to run 'ls -l' total 24 -rwxr-xr-x 1 user user 16832 Jun 10 15:00 a.out -rw-r--r-- 1 user user 612 Jun 10 14:59 fork_exec.c Parent: ls finished with exit code 0

Example 3 — "How many times does X print?" — counting fork processes Week 7 Lecture — exam favourite

/* QUESTION: How many times is "Hello!" printed?
 *
 * Trace:
 *   fork #1 → 2 processes (original + child_A)
 *   fork #2 → each of the 2 forks again → 4 processes total
 *   fork #3 → each of the 4 forks again → 8 processes total
 *   All 8 processes reach printf → 8 prints
 *
 * General rule: n forks in sequence → 2^n processes
 */
#include <stdio.h>
#include <unistd.h>

int main(void) {
    fork();   /* After: 2 processes */
    fork();   /* After: 4 processes */
    fork();   /* After: 8 processes */
    printf("Hello!\n");   /* Each of the 8 prints once */
    return 0;
}
/* Answer: 8 times (2^3) */

/* ============================================================
 * VARIANT: conditional fork — careful!
 * ============================================================ */
#include <stdio.h>
#include <unistd.h>

int variant(void) {
    pid_t p1 = fork();   /* fork 1: parent gets child PID, child gets 0 */
    pid_t p2 = fork();   /* fork 2: BOTH parent and child from above fork again */

    /*
     * After two forks, there are 4 processes:
     * Process A (original): p1=child_PID  p2=grandchild_PID   → p1>0, p2>0
     * Process B (child of A): p1=0        p2=great_child_PID  → p1==0, p2>0
     * Process C (child of A): p1=child_PID p2=0               → p1>0, p2==0
     * Process D (child of B): p1=0         p2=0               → p1==0, p2==0
     */
    if (p1 == 0 && p2 == 0) {
        printf("Both forks: I am the grandchild\n"); /* 1 print */
    }
    if (p1 != 0) {
        printf("Original-line process\n"); /* 2 prints: A and C */
    }
    return 0;
    /* Total: 3 prints */
}

main(): 8 "Hello!" lines (order non-deterministic)

Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello!

Problems

Practice problems with solutions

P1 — How many times does "Alive!" print? Exam style — Week 7

Without running the code, determine exactly how many times "Alive!" is printed.

#include <stdio.h>
#include <unistd.h>
int main(void) {
    for (int i = 0; i < 4; i++) {
        fork();
    }
    printf("Alive!\n");
    return 0;
}

Answer: 16 times (2⁴ = 16).

Each iteration of the loop calls fork() once. After each fork, the number of processes doubles — every existing process calls fork again in the next iteration. The table below tracks it:

i=0: 1 process → fork → 2 processes
i=1: 2 processes → fork → 4 processes
i=2: 4 processes → fork → 8 processes
i=3: 8 processes → fork → 16 processes

All 16 processes complete the loop and call printf once each → 16 prints. General rule: n sequential forks inside a loop = 2ⁿ processes.

P2 — Spot the bug: zombie creation Exam 2024 style

The code below creates a zombie process. Identify the bug and explain how to fix it.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {
        printf("Child: working...\n");
        exit(0);
    }
    /* Parent does other work and exits without waiting */
    printf("Parent: I'm done, bye!\n");
    return 0;
}

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {
        printf("Child: working...\n");
        exit(0);
    }
    /* FIX: parent must wait() to reap the child */
    int status;
    waitpid(pid, &status, 0);
    printf("Parent: child exited with %d. I'm done.\n", WEXITSTATUS(status));
    return 0;
}

Bug: The parent exits without calling wait() or waitpid(). When the child calls exit(0), it sends a SIGCHLD signal to the parent and enters zombie state — its PID and exit status are held in the process table waiting to be collected. Since the parent never collects it (exits immediately), the zombie persists until the parent itself is reaped (at which point init cleans up).

Fix: Add waitpid(pid, &status, 0) in the parent after the child-branch check. This blocks until the child exits and frees its process table entry.

P3 — Write a mini-shell: run a command as a child process Week 7 Tutorial extension

Write a program that: forks a child, uses execvp in the child to run the command echo hello world, waits in the parent, and prints the child's exit code. Do not use system().

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void) {
    pid_t pid = fork();
    if (pid == -1) { perror("fork"); exit(1); }

    if (pid == 0) {
        /* Child: exec echo */
        char *args[] = { "echo", "hello", "world", NULL };
        execvp("echo", args);
        /* Only reached if exec fails */
        perror("execvp");
        exit(127);
    }

    /* Parent: wait for child */
    int status;
    if (waitpid(pid, &status, 0) == -1) {
        perror("waitpid"); exit(1);
    }

    if (WIFEXITED(status)) {
        printf("Command exited with code: %d\n", WEXITSTATUS(status));
    } else {
        printf("Command terminated abnormally.\n");
    }
    return 0;
}

Key points: argv[0] is always the program name by convention. The array must be NULL-terminated. execvp searches PATH so you don't need the full path /bin/echo. The exit code 127 by convention means "command not found" — this only runs if exec fails.

P4 — True/False: exec, fork, and zombie concepts Exam 2024 style — LO2, LO3

State whether each claim is TRUE or FALSE, and explain your reasoning.

After a successful execvp(), the next line of code in the calling process runs.
fork() returns the same value in both parent and child.
A zombie process uses CPU resources.
If you call fork() twice sequentially without branching on the return values, you get 4 processes.
After fork(), both parent and child share the exact same memory (not copies).

1. FALSE — On success, exec replaces the entire process image and never returns. The next line only runs if exec fails.

2. FALSE — fork() returns 0 in the child and the child's PID in the parent. This is how they distinguish themselves.

3. FALSE — A zombie has already exited. It uses no CPU and minimal memory (only a process table entry for the exit status). The problem with zombies is exhausting the PID table if thousands accumulate.

4. TRUE — First fork: 2 processes. Both processes run the second fork: 4 processes total. 2² = 4.

5. FALSE — After fork(), parent and child have logically separate memory (independent copies). In practice the kernel uses Copy-On-Write (the physical pages are shared until one process writes to them), but from a programmer's perspective they are independent — writes in one do not affect the other.

P5 — What is printed? Trace this fork code carefully. COMP2017 Exam 2024 style

Trace this program carefully. How many lines are printed, and what are they? Assume fork always succeeds and getpid() returns sequential values (parent=100, first child=101, second child=102, third child=103).

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
int main(void) {
    pid_t a = fork();
    pid_t b = fork();
    if (a == 0) printf("X\n");
    if (b == 0) printf("Y\n");
    return 0;
}

After 2 forks there are 4 processes. Let's label them:

P0 (original): a=101, b=102 → a≠0, b≠0 → prints nothing
P1 (child of first fork): a=0, b=103 → a==0 prints X, b≠0 → prints X only
P2 (child of second fork from P0): a=101, b=0 → a≠0, b==0 → prints Y only
P3 (child of second fork from P1): a=0, b=0 → a==0 prints X, b==0 prints Y → prints X and Y

Total output: 3 X's and 2 Y's (5 lines total). Wait — let's recount: P1 prints X (1), P3 prints X and Y (2), P2 prints Y (1) → 2 X's, 2 Y's = 4 lines. P0 prints nothing. Answer: 4 lines: two "X", two "Y" (in non-deterministic order).

P6 — 2024 Final Exam Q8 (35 marks, COMP9017): Binary Tree of Processes + Signals + mmap From: 2024 Final Exam

A program creates a complete binary tree of concurrent processes. The root process is the root of the tree. Each process has a worker ID (wID) assigned by level-order traversal: root = wID 1, left child of wID k = 2k, right child = 2k+1. A 3-level tree has 7 processes (wIDs 1–7). Leaf processes are wID 4,5,6,7 and non-leaf processes are wID 1,2,3.

Part 1 (15 marks): Write int initialise(int n) to create a complete binary tree of n levels. Each non-leaf parent must maintain references (PIDs) to both children. Minimise resources per process. Return 0 on success. Valid values: 3 ≤ n ≤ 7 (max 127 processes).

Part 2 (20 marks): Design an encoding and protocol for reliably communicating to leaf processes from the root using only signals (SIGUSR1 down the tree, SIGUSR2 up the tree). The root calls:

void issue_command(enum command cmd, uint8_t leaf_wID);

Rules: only SIGUSR1 is sent to children (downward); only SIGUSR2 is sent to parents (upward). Each signal is sent after a specially encoded 32-bit integer has been written to shared memory:

static uint32_t *shared;   /* already set up with mmap() */

enum command { RESUME, CHANGE, STOP };
/* Commands:
 *   RESUME rval  — rval is 4 bits
 *   CHANGE cval  — cval is 8 bits
 *   STOP         — no parameter
 */

Assumptions: 3 ≤ n ≤ 7 (max 127 processes). shared is already set up and usable. Signals cannot be queued, out of order, or missed. Describe how you handle (or acknowledge) reliability concerns.

/*
 * =====================================================================
 * PART 1: initialise(n) — build a complete binary tree of processes
 * =====================================================================
 *
 * Strategy: use a recursive fork pattern.
 * Each process knows its own wID. It forks two children (wID*2, wID*2+1)
 * if it is not at the leaf level. The root calls initialise(n) with wID=1.
 *
 * Resources minimised: each process only stores the PIDs of its two
 * direct children (not the whole tree).
 */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/wait.h>

enum command { RESUME, CHANGE, STOP };

/* Shared memory word — written before each signal */
static uint32_t *shared;

/* Per-process state */
static int      my_wID;
static int      tree_levels;
static pid_t    child_left_pid  = -1;
static pid_t    child_right_pid = -1;
static pid_t    parent_pid;

/*
 * shared[0] is the command word. Format (32 bits):
 *
 *  [31:24]  target_wID  (8 bits — the leaf wID this command is for)
 *  [23:16]  command     (8 bits — RESUME=0, CHANGE=1, STOP=2)
 *  [15:8]   param       (8 bits — rval for RESUME (low 4 bits), cval for CHANGE)
 *  [7:0]    hop_path    (8 bits — bitmask of the path from root to leaf,
 *                                 used by intermediate nodes for routing)
 *
 * hop_path encoding:
 *   To reach a leaf wID from the root, extract the binary representation
 *   of leaf_wID. The bits after the leading 1-bit encode left(0)/right(1)
 *   turns at each level. For a 3-level tree (7 nodes), leaf wIDs 4-7:
 *     wID 4 = 100b → path bits: 00 (left, left)
 *     wID 5 = 101b → path bits: 01 (left, right)
 *     wID 6 = 110b → path bits: 10 (right, left)
 *     wID 7 = 111b → path bits: 11 (right, right)
 *   For n levels, the path is the lower (n-1) bits of leaf_wID
 *   after the leading 1.
 *
 * Each non-leaf node on receiving SIGUSR1:
 *   1. Read shared[0].
 *   2. Determine if target_wID is in its left or right subtree.
 *   3. Write shared[0] (unchanged — already set by parent/root).
 *   4. Send SIGUSR1 to the appropriate child.
 *   5. Wait for SIGUSR2 acknowledgement from that child.
 *   6. Forward SIGUSR2 to its own parent (if not root).
 *
 * Leaf node on receiving SIGUSR1:
 *   1. Read shared[0].
 *   2. Check that target_wID == my_wID.
 *   3. Execute the command.
 *   4. Send SIGUSR2 to parent as acknowledgement.
 */

/* ── Shared memory packet packing/unpacking ── */
static uint32_t pack_command(uint8_t target_wID, enum command cmd,
                              uint8_t param) {
    return ((uint32_t)target_wID << 24)
         | ((uint32_t)cmd        << 16)
         | ((uint32_t)param      <<  8);
}

static void unpack_command(uint32_t word, uint8_t *target_wID,
                            enum command *cmd, uint8_t *param) {
    *target_wID = (word >> 24) & 0xFF;
    *cmd        = (enum command)((word >> 16) & 0xFF);
    *param      = (word >>  8) & 0xFF;
}

/* Returns 1 if wID 'target' is in the left subtree of 'me', 0 for right */
static int goes_left(int me_wID, int target_wID) {
    /* Repeatedly halve target until its parent is me_wID */
    int t = target_wID;
    while (t / 2 != me_wID) t = t / 2;
    return (t == me_wID * 2);   /* left child = 2*me_wID */
}

/* ── Signal handlers ── */
static volatile sig_atomic_t got_sigusr1 = 0;
static volatile sig_atomic_t got_sigusr2 = 0;

static void handler_usr1(int sig) { (void)sig; got_sigusr1 = 1; }
static void handler_usr2(int sig) { (void)sig; got_sigusr2 = 1; }

static void install_handlers(void) {
    struct sigaction sa;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sa.sa_handler = handler_usr1;
    sigaction(SIGUSR1, &sa, NULL);
    sa.sa_handler = handler_usr2;
    sigaction(SIGUSR2, &sa, NULL);
}

/* ── Process event loop (non-root processes run this after init) ── */
static void process_event_loop(void) {
    install_handlers();
    int is_leaf = (my_wID >= (1 << (tree_levels - 1)));

    while (1) {
        /* Block until SIGUSR1 arrives */
        while (!got_sigusr1) pause();
        got_sigusr1 = 0;

        /* Read the command word from shared memory */
        uint32_t word = *shared;
        uint8_t target_wID, param;
        enum command cmd;
        unpack_command(word, &target_wID, &cmd, &param);

        if (is_leaf) {
            /* Verify we are the intended target */
            if (target_wID == (uint8_t)my_wID) {
                /* Execute command */
                switch (cmd) {
                    case RESUME:
                        /* rval = param & 0x0F */
                        break;
                    case CHANGE:
                        /* cval = param */
                        break;
                    case STOP:
                        kill(parent_pid, SIGUSR2);
                        _exit(0);
                }
            }
            /* Send SIGUSR2 acknowledgement up */
            kill(parent_pid, SIGUSR2);
        } else {
            /* Non-leaf: route to correct child */
            pid_t target_child;
            if (goes_left(my_wID, (int)target_wID))
                target_child = child_left_pid;
            else
                target_child = child_right_pid;

            /* shared[0] already written by root — just signal */
            kill(target_child, SIGUSR1);

            /* Wait for SIGUSR2 acknowledgement from child */
            while (!got_sigusr2) pause();
            got_sigusr2 = 0;

            /* Forward ack upward (unless we are root) */
            if (my_wID != 1)
                kill(parent_pid, SIGUSR2);
        }
    }
}

/* ── Part 1: initialise ── */
int initialise(int n) {
    tree_levels = n;
    shared = mmap(NULL, sizeof(uint32_t),
                  PROT_READ | PROT_WRITE,
                  MAP_SHARED | MAP_ANONYMOUS,
                  -1, 0);
    if (shared == MAP_FAILED) return -1;

    /* Root starts with wID = 1 */
    my_wID    = 1;
    parent_pid = getpid();   /* root has no parent — set to self */

    /* Recursive build: build_tree(wID, current_level, n) */
    /* Inline iterative BFS-style: fork children if not at leaf level */
    /* Each process forks two children, records their PIDs, then loops */

    void build_subtree(int wid, int level, pid_t ppid);
    build_subtree(1, 1, getpid());
    return 0;
}

void build_subtree(int wid, int level, pid_t ppid) {
    my_wID     = wid;
    parent_pid = ppid;
    install_handlers();

    if (level == tree_levels) {
        /* Leaf: just run the event loop */
        process_event_loop();
        _exit(0);
    }

    /* Non-leaf: fork left and right children */
    pid_t my_pid = getpid();

    pid_t left = fork();
    if (left == 0) {
        build_subtree(wid * 2, level + 1, my_pid);
        _exit(0);
    }
    child_left_pid = left;

    pid_t right = fork();
    if (right == 0) {
        build_subtree(wid * 2 + 1, level + 1, my_pid);
        _exit(0);
    }
    child_right_pid = right;

    /* Root (wid==1) returns to caller; all others run event loop */
    if (wid != 1) {
        process_event_loop();
        _exit(0);
    }
    /* Root returns 0 to initialise() caller */
}

/* ── Part 2: issue_command (called only by root) ── */
void issue_command(enum command cmd, uint8_t leaf_wID) {
    uint8_t param = 0;
    /* param encoding: for RESUME, caller sets rval in low 4 bits;
       for CHANGE, caller sets cval in low 8 bits. Here we accept
       a variant with an extra param argument — simplified to 0. */

    /* Write the command word to shared memory BEFORE signalling */
    *shared = pack_command(leaf_wID, cmd, param);

    /* Determine which child of root to signal */
    pid_t target_child;
    if (goes_left(1, (int)leaf_wID))
        target_child = child_left_pid;
    else
        target_child = child_right_pid;

    /* Signal child downward */
    kill(target_child, SIGUSR1);

    /* Wait for SIGUSR2 acknowledgement (root waits) */
    while (!got_sigusr2) pause();
    got_sigusr2 = 0;
    /* Command delivered and acknowledged */
}

Part 1 — Binary tree construction: Each non-leaf process forks exactly two children (left = 2*wID, right = 2*wID+1) and records their PIDs. This is done recursively: after forking, each child immediately calls build_subtree with its own wID. Only the root returns to the caller; all other processes run their event loops forever. Resource minimisation: each process stores only child_left_pid, child_right_pid, and parent_pid — never the whole tree.

Part 2 — Shared memory packet encoding: A single uint32_t word encodes: target wID (bits 31–24), command (bits 23–16), parameter (bits 15–8). This is written to shared[0] before every SIGUSR1. Because the exam states signals cannot be missed or reordered, the write-then-signal pattern is reliable.

Routing algorithm: Given target wID t and current node wID m, to determine left vs right: repeatedly halve t until its parent is m, then check whether it equals 2*m (left) or 2*m+1 (right). This works for any tree depth within the 7-level maximum.

Acknowledgement protocol: Leaf sends SIGUSR2 to parent after executing the command. Each non-leaf forwards SIGUSR2 up after receiving it from the child it signalled. Root waits for SIGUSR2 before returning from issue_command(). This ensures the command was received and processed exactly once.

SIGUSR1/SIGUSR2 only down/up: SIGUSR1 travels exclusively from parent to child (down). SIGUSR2 travels exclusively from child to parent (up). No node ever sends SIGUSR1 upward or SIGUSR2 downward — this is enforced by the protocol.

Reliability limitations: The exam states signals cannot be queued, missed, or reordered — so under those assumptions the protocol is reliable. In a real system, signal handlers would need to be async-signal-safe; the pause()-based spin loop is susceptible to lost signals between the flag check and pause() — a production implementation would use sigsuspend() with a masked signal set instead.

Flashcards

Key concepts to memorize

Card 1 of 11

Question — click to flip

Answer

Click card to flip • Use buttons to navigate

Quiz

Test your understanding

Topic 21 Quiz — Processes Score: 0 / 7

1

What does fork() return in the child process?LO2

multiple choice

2

True or False: If execvp() succeeds, it returns 0 to the caller.LO2

true / false

3

How many times is "Hi" printed? fork(); fork(); fork(); printf("Hi\n");LO3

multiple choice

4

Fill in the blank: A child process that has exited but whose parent has not yet called wait() is called a ___ process. (one word)LO2

fill in the blank

5

Spot the bug: what is wrong with this exec call?LO3

char *args[] = { "-l", "/tmp", NULL };
execvp("ls", args);

spot the bug — multiple choice

6

Which macro extracts the numeric exit code from the status value returned by waitpid()?LO2

multiple choice

7

Why is it dangerous to pass user-controlled input directly to popen()?LO3

multiple choice

0/7

Quiz complete!

Processes — fork, exec, wait

Virtual Memory Model

Process Memory Layout (virtual address space)

Key Concepts Table

Worked Example — Heap and Stack in action

What is a process?

Process tree visualization

fork, exec, wait — annotated signatures

fork()

exec family

wait() and waitpid()

getpid() and getppid()

Exec variant cheat-sheet

popen() / pclose() — higher-level shell command execution

posix_spawn() — spawn a child process without fork

Complete programs you can compile and run

Practice problems with solutions

Key concepts to memorize

Test your understanding