Linking & Libraries
How C programs are assembled from multiple files — the compilation pipeline, object files, static and dynamic libraries, and how the linker resolves symbols.
From source code to executable: the full pipeline
Think of building a house. You have architects who draw blueprints (source files), contractors who build individual rooms from those blueprints (object files), and a general contractor who connects all the rooms into a coherent house (linker). Some rooms come pre-built from a catalogue — you just slot them in. That catalogue is a library. Static libraries are like buying a prefabricated room that gets permanently embedded in your house. Dynamic libraries are like hiring a plumber who shows up only when you need the bathroom — the code is loaded on demand at runtime.
When you run gcc hello.c -o hello, four separate stages run behind the scenes: preprocessing, compilation, assembly, and linking. You rarely invoke them individually, but understanding each step is essential for debugging build errors, working with multiple files, and creating your own libraries.
Step 1 — Preprocessor: Text substitution only. Expands #include (pastes header files in), #define macros, and #ifdef conditionals. Produces a pure C file with no preprocessor directives left. Run gcc -E hello.c to see this output.
Step 2 — Compiler: Translates preprocessed C into assembly language (human-readable CPU instructions). Run gcc -S hello.c to produce hello.s.
Step 3 — Assembler: Converts assembly language into binary machine code, producing an object file (.o). Object files contain the compiled code but with placeholder references for symbols defined in other files (e.g., a call to printf is just a placeholder until the linker fills it in). Run gcc -c hello.c to stop at this stage.
Step 4 — Linker: Combines all .o files and resolves all symbol references. It looks up each unresolved name (like printf) in the listed libraries and fills in the correct addresses. The output is a complete executable.
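To make Step 1 concrete, here is a minimal sketch of a file that gives the preprocessor something to do. The macro names SQUARE and VERBOSE and the function scaled_area are made up for illustration. Running gcc -E on a file like this should show the macro call replaced by its expansion and only one branch of the #ifdef surviving.

```c
#include <stdio.h>

/* Hypothetical macro: pure text substitution, so parenthesise everything. */
#define SQUARE(x) ((x) * (x))

/* Hypothetical helper, used only for illustration. */
int scaled_area(int side)
{
#ifdef VERBOSE
    /* Kept only when compiled with -DVERBOSE; otherwise the preprocessor
       removes this line before the compiler proper ever sees it. */
    printf("computing area for side %d\n", side);
#endif
    /* After preprocessing this reads: return ((side + 1) * (side + 1)); */
    return SQUARE(side + 1);
}
```

Without the inner parentheses in the macro, SQUARE(side + 1) would expand to side + 1 * side + 1, which is a reminder that the preprocessor works on text and knows nothing about C expressions.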
In a large project with dozens of .c files, recompiling everything every time you change one file would be slow. By compiling each file to its own .o, you only need to recompile the changed file and then re-link. This is exactly what Makefiles automate.
Multi-file projects and modules: In C, a module is a .c file paired with a .h header file. The .c file contains the definitions (the actual code). The .h file contains the declarations (the function signatures and types). Other files #include the .h to know what they can call — but the actual code only lives once, in the .c file.
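An extern declaration tells the compiler "this symbol exists somewhere else; trust me and let the linker find it." It matters most for global variables: writing int counter; in a header defines the variable in every .c file that includes it, which can lead to "multiple definition" errors at link time, whereas extern int counter; only declares it. Function declarations are extern by default.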
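The difference between declaring and defining can be sketched in a single file. The names hit_count and record_hit, and the file names mentioned in the comments, are hypothetical. In a real project the first part would sit in a header and the second part in exactly one .c file.

```c
/* Would normally live in counter.h (hypothetical file name). */
extern int hit_count;   /* declaration only: no storage is reserved here */
int record_hit(void);   /* function declarations are extern by default */

/* Would normally live in counter.c, the one file that owns the symbol. */
int hit_count = 0;      /* the single definition the linker will find */

int record_hit(void)
{
    hit_count += 1;
    return hit_count;
}
```

Any other .c file that includes the header can read hit_count or call record_hit(). The compiler accepts those uses on the strength of the declarations, and the linker later connects them to the single definition.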
Header guards prevent a header file from being included more than once in the same translation unit, which would cause duplicate type definitions. The standard pattern is #ifndef MYHEADER_H / #define MYHEADER_H / ... / #endif. Every header file you write should have one.
Static vs Dynamic libraries — the key difference: A static library (.a) is an archive of .o files. When you link against it, the linker copies the needed object code directly into your executable. Your binary is self-contained but larger. A dynamic (shared) library (.so on Linux, .dylib on macOS) is loaded at runtime when the program starts. Multiple processes can share one copy in memory. Your binary is smaller, but the .so file must be present at runtime.
The most common linker error is "undefined reference to 'foo'". This means the linker found a call to foo but couldn't find its definition in any of the .o or library files you provided. Fix: either add the missing .c file to the compile command, or add the correct -l flag for the library that provides it.
# Python imports happen at runtime
import math # standard library
import mymodule # your own file
# No separate compile step
# No header files needed
# No linking step
# Just run: python3 main.py
/* C includes happen at preprocessor time */
#include <math.h> /* standard library header */
#include "mymodule.h" /* your own header */
/* Must compile AND link:
gcc main.c mymodule.c -lm -o prog
-lm links the math library (libm.a/.so) */
Source files compile to object files. Object files get linked together. Header files are declarations, not definitions. Use -c to compile only, -L for library path, -l for library name, -I for include path. Static = baked in. Dynamic = loaded at runtime.
Commands, flags, and file formats annotated
Compiling and linking commands
# Compile a single file to object code (stop before linking)
gcc -c util.c
#   -c       compile only, produce util.o
#   util.c   source file
# Compile multiple .c files and link them
gcc main.c util.c -o myprog
#   main.c      first source file
#   util.c      second source file
#   -o myprog   output executable named 'myprog'
# Link existing object files
gcc main.o util.o -o myprog
# Compile with custom include and library paths
# (the -l flag comes after main.c, the file that needs the library)
gcc -I./include main.c -L./lib -lmylib -o myprog
#   -I./include   search ./include for header files
#   -L./lib       search ./lib for library files
#   -lmylib       link against libmylib.a or libmylib.so
The name given to -l omits the lib prefix and the .a/.so suffix; the linker adds them back when it searches. So -lm links libm.so, -lpthread links libpthread.so, -lmylib links libmylib.a or libmylib.so.
Creating static libraries with ar
# Step 1: compile source files to object code
gcc -c util.c     # produces util.o
gcc -c mylib.c    # produces mylib.o
# Step 2: pack them into an archive (static library)
ar rcs libmylib.a util.o mylib.o
#   ar               the 'ar' archiver tool
#   rcs              flags: r=insert/replace, c=create, s=write index
#   libmylib.a       output archive name (must start with 'lib', end with '.a')
#   util.o mylib.o   object files to include
# Step 3: link the static library into a program
gcc -o myprog main.c -L. -lmylib
#   -L.       look for libraries in current dir (.)
#   -lmylib   link libmylib.a (or .so)
Creating dynamic (shared) libraries
# Step 1: compile with Position Independent Code flag
gcc -c -fpic util.c    # -fpic = position-independent code, required for shared libs
gcc -c -fpic mylib.c
# Step 2: create shared object file
gcc -shared -o libmylib.so util.o mylib.o
#   -shared          produce a shared library, not an executable
#   -o libmylib.so   output shared object (convention: lib*.so)
# Step 3: link against it (same -L -l flags as static)
gcc -o myprog main.c -L. -lmylib
# At runtime: tell the loader where to find the .so
LD_LIBRARY_PATH=. ./myprog
# Or install .so to /usr/local/lib and run ldconfig
Header guards — protecting against double inclusion
/* mylib.h: every header file should look like this */
#ifndef MYLIB_H   /* if MYLIB_H is not yet defined... */
#define MYLIB_H   /* ...define it now (marks this file as "seen") */
/* All your declarations go here */
int add(int a, int b);
void print_result(int n);
#endif /* MYLIB_H */
/* Without header guards, a header that reaches the same .c file twice
   (for example once directly and once through another header) is
   processed twice, and any struct or enum definitions in it trigger
   "redefinition" errors. The guard ensures the body is processed only
   once per translation unit. */
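By convention the guard macro is the file name in upper case with the dot replaced by an underscore: mylib.h → MYLIB_H. Many compilers also support #pragma once as a simpler alternative, but #ifndef guards are portable and standard.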
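One way to see the guard at work is to simulate a double inclusion by hand. The sketch below pastes the body of a hypothetical point.h into one file twice, which is roughly what the compiler sees when a header arrives both directly and through another header. The names POINT_H, struct point, and point_sum are invented for the example.

```c
/* First "inclusion" of the hypothetical point.h. */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
int point_sum(struct point p);
#endif /* POINT_H */

/* Second "inclusion": POINT_H is already defined, so the preprocessor
   skips everything up to #endif and the struct is not defined twice. */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
int point_sum(struct point p);
#endif /* POINT_H */

/* The definition, which would normally live in point.c. */
int point_sum(struct point p)
{
    return p.x + p.y;
}
```

If the #ifndef/#define/#endif lines are deleted from both copies, the compiler should reject the file with a redefinition error for struct point. The repeated function prototype on its own would be harmless.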
Diagnostic tools: nm and ldd
| Command | What it does | Example |
|---|---|---|
| nm myprog | List all symbols (functions, global variables) in an object file or executable. Shows which are defined (T = text/code), undefined (U), or data (D). | nm util.o \| grep ' U ' (show unresolved symbols) |
| ldd myprog | List dynamic libraries (shared objects) that a program depends on at runtime, and their resolved paths. | ldd ./myprog (shows libpthread.so, libc.so, etc.) |
| ar t libmylib.a | List the object files inside a static library archive. | ar t libm.a |
| objdump -d myprog | Disassemble an object file or executable and show the machine code as assembly. | objdump -d util.o |
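To connect the nm letters to source code, here is a small sketch of a file whose symbols fall into the common classes. All identifiers are hypothetical. The letters in the comments are what nm typically prints for an object file built with gcc -c; the exact output can vary with compiler version and optimisation level.

```c
#include <stdio.h>

int request_limit = 10;    /* initialised global: nm shows D */
int requests_seen;         /* zero-initialised global: usually B (C on some older setups) */

/* Internal linkage: shown with a lowercase t, and it may vanish
   entirely if the optimiser inlines it. */
static int clamp(int n)
{
    return n > request_limit ? request_limit : n;
}

/* External function: nm shows T. */
int accept_requests(int n)
{
    requests_seen += clamp(n);
    puts("accepted");      /* defined in libc, so this file lists puts as U */
    return requests_seen;
}
```

Uppercase letters mark symbols visible to other files and lowercase letters mark symbols local to this one, which is why a static function can never satisfy an "undefined reference" in another object file.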
Complete multi-file project examples
/* ===== mathlib.h ===== */
/* Header guard — prevents double inclusion */
#ifndef MATHLIB_H
#define MATHLIB_H
/* Declaration only — no function body here */
/* extern is implicit for function declarations */
int add(int a, int b);
int multiply(int a, int b);
double power(double base, int exp);
#endif /* MATHLIB_H */
/* ===== mathlib.c ===== */
/* This file provides the DEFINITIONS (actual code) */
#include "mathlib.h" /* include our own header */
int add(int a, int b) {
    return a + b;
}
int multiply(int a, int b) {
    return a * b;
}
double power(double base, int exp) {
    double result = 1.0;
    for (int i = 0; i < exp; i++) {
        result *= base;
    }
    return result;
}
/* ===== main.c ===== */
#include <stdio.h>
#include "mathlib.h" /* get declarations so compiler checks our calls */
int main(void) {
    printf("add(3, 4) = %d\n", add(3, 4));
    printf("multiply(6, 7) = %d\n", multiply(6, 7));
    printf("power(2.0, 10) = %.0f\n", power(2.0, 10));
    return 0;
}
$ gcc -c main.c # produces main.o
$ gcc -c mathlib.c # produces mathlib.o
$ gcc main.o mathlib.o -o calc # link both .o files
$ ./calc
add(3, 4) = 7
multiply(6, 7) = 42
power(2.0, 10) = 1024
# Build object files for library components
gcc -c mathlib.c -o mathlib.o
gcc -c strutils.c -o strutils.o
# Create static library archive (convention: name starts with 'lib', ends with '.a')
ar rcs libmathlib.a mathlib.o strutils.o
# r = insert/replace members
# c = create archive if it doesn't exist
# s = write an object-file index (speeds up linking)
# Inspect the archive
ar t libmathlib.a
# Output:
# mathlib.o
# strutils.o
# Link the library into a program
# -L. = look for libraries in the current directory (.)
# -lmathlib = link against libmathlib.a (strips 'lib' prefix and '.a' suffix)
gcc -o calc main.c -L. -lmathlib
# Run the program
./calc
# View symbols inside the library
nm libmathlib.a
# T = defined symbol (Text section = code)
# U = undefined symbol (must be resolved by the linker)
# Output includes lines like:
# 0000000000000000 T add
# 0000000000000020 T multiply
# Step 1: compile with -fpic (position-independent code — required for shared libs)
gcc -c -fpic mathlib.c -o mathlib.o
# Step 2: create shared library from the PIC object
gcc -shared -o libmathlib.so mathlib.o
# Step 3: link the program against the shared library
gcc -o calc main.c -L. -lmathlib
# Step 4a: run — if libmathlib.so is NOT in a standard library path, set env var
LD_LIBRARY_PATH=. ./calc
# Step 4b: alternatively, install to system library path (needs root) and refresh cache
# sudo cp libmathlib.so /usr/local/lib/
# sudo ldconfig
# Check what shared libraries 'calc' depends on
ldd ./calc
# Output (example):
# linux-vdso.so.1 => (0x00007ffce6bfe000)
# libmathlib.so => ./libmathlib.so (0x00007f4a3c123000)
# libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4a3bf00000)
CC = gcc
CFLAGS = -Wall -Wextra -g -I./include
LDFLAGS = -L./lib
LIBS = -lmathlib
# Default target: build the executable
all: calc
# Link: combine main.o with the library
calc: main.o lib/libmathlib.a
	$(CC) $(CFLAGS) -o $@ $< $(LDFLAGS) $(LIBS)
# Compile main.c to main.o
main.o: src/main.c include/mathlib.h
	$(CC) $(CFLAGS) -c src/main.c -o $@
# Compile library source files
lib/mathlib.o: src/mathlib.c include/mathlib.h
	$(CC) $(CFLAGS) -c src/mathlib.c -o $@
# Build static library from object files
lib/libmathlib.a: lib/mathlib.o
	ar rcs $@ $^
# Clean all generated files
clean:
	rm -f main.o lib/*.o lib/libmathlib.a calc
.PHONY: all clean
$ make clean # removes all generated files
$ make calc # builds only the executable target
Practice problems with solutions
For each command, name which stage(s) of the compilation pipeline run, and what file is produced:
gcc -E hello.c
gcc -S hello.c
gcc -c hello.c
gcc hello.c -o hello
gcc -E hello.c: runs the preprocessor only. Output: preprocessed C source, written to standard output unless you pass -o.
gcc -S hello.c: runs preprocessor + compiler. Output: hello.s (assembly language). Stops before the assembler.
gcc -c hello.c: runs preprocessor + compiler + assembler. Output: hello.o (machine code object file). Stops before the linker. Symbols referencing other files remain as unresolved placeholders.
gcc hello.c -o hello: runs all four stages: preprocessor, compiler, assembler, linker. Output: executable named hello. Links against the C standard library automatically.
You have main.c, a static library libutil.a in ./lib/, and headers in ./include/. The following build command produces an "undefined reference" error. Find and fix it:
gcc -I./include main.c -o myprog -lutil
gcc -I./include main.c -L./lib -lutil -o myprog
The missing flag is -L./lib. Without it, the linker searches only the default system library paths (like /usr/lib) and never looks in ./lib/. It usually reports "cannot find -lutil"; if a different system library with the same name is found instead, you get "undefined reference" for every function your library was supposed to provide.
Also note: -l flags should generally come after the source or object files that need them. The linker processes its inputs left to right and takes from a static library only the members that satisfy symbols still undefined at that point. The fix shows the correct order: source files, then -L, then -l.
Write stack.h — a header file for a simple integer stack. It should include: header guards, a struct definition for the stack (capacity 100), and declarations for push, pop, peek, and is_empty.
/* stack.h */
#ifndef STACK_H /* header guard — start */
#define STACK_H
#define STACK_CAPACITY 100
/* Stack data structure — defined in the header so users know its layout */
typedef struct {
    int data[STACK_CAPACITY];
    int top; /* index of the top element (-1 if empty) */
} Stack;
/* Function declarations (extern is implicit for function prototypes) */
void stack_init(Stack *s);
int stack_push(Stack *s, int value); /* returns 1 on success, 0 if full */
int stack_pop(Stack *s, int *out); /* returns 1 on success, 0 if empty */
int stack_peek(Stack *s, int *out); /* returns 1 on success, 0 if empty */
int stack_is_empty(Stack *s); /* returns 1 if empty, 0 otherwise */
#endif /* STACK_H */ /* header guard — end */
Defining the struct in the header lets callers declare a Stack on the stack. Functions take a pointer (Stack *s) because C passes by value and they need to modify the caller's stack. The actual function bodies go in stack.c, which includes stack.h.
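For completeness, here is one possible stack.c to go with that header. It is a sketch rather than the only reasonable implementation. The type and macro are repeated at the top so the block compiles on its own; a real stack.c would replace them with #include "stack.h".

```c
/* Repeated from stack.h so this sketch is self-contained. */
#define STACK_CAPACITY 100
typedef struct {
    int data[STACK_CAPACITY];
    int top;                      /* index of the top element, -1 if empty */
} Stack;

void stack_init(Stack *s)
{
    s->top = -1;
}

int stack_is_empty(Stack *s)
{
    return s->top == -1;
}

int stack_push(Stack *s, int value)
{
    if (s->top >= STACK_CAPACITY - 1)
        return 0;                 /* full: nothing is stored */
    s->data[++s->top] = value;
    return 1;
}

int stack_peek(Stack *s, int *out)
{
    if (stack_is_empty(s))
        return 0;                 /* empty: *out is left untouched */
    *out = s->data[s->top];
    return 1;
}

int stack_pop(Stack *s, int *out)
{
    if (!stack_peek(s, out))
        return 0;
    s->top--;
    return 1;
}
```

A caller would declare Stack s; call stack_init(&s) once, and then check the return value of every push and pop, since a full or empty stack is reported through that value and not through the stored data.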
For each scenario, say whether you would prefer a static library (.a) or a dynamic library (.so) and explain why:
(a) A command-line tool that must run on systems where the library may not be installed.
(b) A set of utility functions shared by 20 server processes running simultaneously.
(c) A security library where you want to push bugfix updates without recompiling all consumers.
(d) A small embedded system with no dynamic linker.
(a) Static (.a): the library code is copied into the executable, so the tool runs even on systems where the library is not installed.
(b) Dynamic (.so): all 20 processes share one copy of the library in memory (the OS maps the same .so pages into each process's address space). With a static library, each process would have its own copy of the code, wasting memory.
(c) Dynamic (.so): you can replace the .so file with a patched version. All programs that dynamically load it will pick up the fix on their next launch without recompilation. With static linking, every consumer would need to be recompiled and redeployed.
(d) Static (.a): with no dynamic linker (ld.so) on the target, static linking is the only option. The linker also pulls in only the archive members the program actually uses, which helps on size-constrained devices.
You have three files. main.o has an undefined reference to foo and bar. foo.o defines foo and has an undefined reference to helper. util.o defines helper and bar.
What is the correct linker command? What happens if you only provide main.o and foo.o? What if you define bar in both foo.o and util.o?
gcc main.o foo.o util.o -o myprog
The linker resolves main.o's foo from foo.o, main.o's bar from util.o, and foo.o's helper from util.o. All symbols are satisfied.
If only main.o + foo.o: The linker finds foo (in foo.o), but bar and helper remain unresolved. Error: "undefined reference to 'bar'" and "undefined reference to 'helper'".
If bar is defined in both foo.o and util.o: Error: "multiple definition of 'bar'". A symbol with external linkage may be defined only once across all linked object files (C++ calls this the One Definition Rule). Solution: remove the duplicate definition, or put one copy in a header as static inline (but that's unusual; just fix the duplication).
Key concepts to memorize
Test your understanding
Header guards use #ifndef, #define, and ___ to prevent double inclusion.
gcc main.c -o myprog
/* main.c includes math.h and calls sqrt() */