Linked Lists
Dynamic data structures built from heap-allocated nodes connected by pointers — insert, delete, traverse, and free them correctly in C.
What is a linked list, and why do we need one?
Imagine a treasure hunt where each clue is written on a card, and each card tells you where the next card is hidden. You must start at the first card and follow the chain — you cannot jump directly to card 5 without visiting cards 1 through 4. That is a linked list. Each node holds some data and a pointer to the next node. The last node's pointer is NULL, signalling the end of the chain.
An array stores elements in contiguous memory — all elements sit side-by-side at predictable addresses. That gives you O(1) random access (arr[4] is instant) but makes insertion and deletion expensive because you may need to shift elements. A linked list stores elements anywhere on the heap; each element carries the address of the next one. You lose random access (you must walk the chain) but gain O(1) insertion and deletion at known positions — no shifting needed.
In Python you have the built-in list type which is actually a dynamic array, and the collections.deque which is a doubly-linked list under the hood. Java has LinkedList<T>. In C, there is no standard linked list — you build your own from structs and pointers. This is the defining challenge of this topic.
Every node you create with malloc must eventually be freed. C has no garbage collector. If you lose a pointer to a node before freeing it, that memory leaks forever (until the process exits). If you free a node but still dereference a pointer to it, you get a dangling pointer — undefined behaviour. These are the two most common linked list bugs on exams.
The formal definition from the COMP2017 lecture is recursive:
<Llist> ::== Nothing | element <Llist>
That means a list is either empty (nothing) or an element followed by another list. This recursive structure is mirrored in the struct definition: a node contains a pointer to another node of the same type.
Python / Java comparison
# Python list (dynamic array, not linked)
lst = [1, 2, 3]
lst.insert(0, 0) # O(n) — shifts everything
lst.append(4) # O(1) amortised
del lst[1] # O(n) — shifts
# Python has no manual memory management
# GC handles all allocations automatically
/* C singly linked list */
struct node {
int data;
struct node *next;
};
/* You must malloc every node */
struct node *n = malloc(sizeof(struct node));
n->data = 42;
n->next = NULL;
/* You must free every node */
free(n);
Linked list vs array — when to use which
| Operation | Array | Linked List |
|---|---|---|
| Access element at index i | O(1) — offset arithmetic | O(n) — must traverse chain |
| Insert / delete at front | O(n) — shift all elements | O(1) — update head pointer |
| Insert / delete at known position | O(n) — shift elements | O(1) — rewire two pointers |
| Insert at tail | O(1) amortised (with capacity) | O(n) unless tail pointer kept |
| Memory usage | Compact — just elements | Extra pointer per node (8 bytes on 64-bit) |
| Cache performance | Excellent — contiguous | Poor — nodes scattered on heap |
| Resize | Expensive realloc + copy | Free — just malloc a new node |
Use an array when you need fast random access and your size is roughly known upfront. Use a linked list when you frequently insert/delete at the front or middle, size varies widely, or you need a stack/queue with O(1) push/pop. In COMP2017, linked lists also appear inside other data structures — hash tables, thread-safe queues, and BSTs (next topic) all use pointer-based node structures.
Variants: doubly linked and circular
A singly linked list has one pointer per node (next). You can only traverse forward. A doubly linked list adds a prev pointer, allowing traversal in both directions and O(1) deletion when you already have a pointer to the node (no need to find the predecessor). A circular linked list has the last node's next pointing back to the head instead of NULL — useful for round-robin scheduling and cyclic buffers, but you must break the cycle before freeing.
Node struct, pointer operations, and key idioms
The node struct — anatomy
/* A self-referential struct: contains a pointer to itself */ struct node { int data; // the payload — can be any type struct node * next; // pointer to the NEXT node in the chain }; // ^ ^ // | └── must use pointer (not struct node) — incomplete type rule // └── must say "struct node" again — no typedef yet /* Using typedef to avoid repeating "struct node" everywhere */ typedef struct node { int data; struct node *next; // still need "struct node" here, not "Node *" } Node; /* Creating a node on the heap */ Node *n = malloc(sizeof(Node)); // ^ ^ // | └── sizeof the struct, NOT sizeof a pointer! // └── always check return — NULL if out of memory if (n == NULL) { /* handle allocation failure */ } n->data = 42; n->next = NULL; // always initialise next — garbage pointer = crash
struct node *next) must be a pointer, never a direct struct embed. A struct cannot contain itself — that would be infinite size. It can contain a pointer to itself because all pointers are the same size (8 bytes on 64-bit).
Insert at head — O(1)
/* From Week 4 Lecture: insert_front */ Node *insert_front(Node *head, int val) { Node *newp = malloc(sizeof(Node)); if (newp == NULL) return head; // allocation failed, keep old list newp->data = val; newp->next = head; // ① new node points at old head (do this FIRST) return newp; // ② new node IS the new head } // Usage: head = insert_front(head, 99); // ORDER MATTERS: if you did head = newp FIRST, you would lose the old list!
newp->next = head BEFORE reassigning head = newp. One pointer can only point at one thing at a time. If you move head first, you lose the only reference to the rest of the list — an instant memory leak.
Insert at tail — O(n)
/* Must walk to find the last node */ Node *insert_back(Node *head, int val) { Node *newp = malloc(sizeof(Node)); if (newp == NULL) return head; newp->data = val; newp->next = NULL; // new tail always points to NULL if (head == NULL) return newp; // empty list — new node IS the list Node *p = head; while (p->next != NULL) { // walk until p IS the last node p = p->next; } p->next = newp; // attach new node at the end return head; // head did not change }
Delete a node — the predecessor trick
/* Delete the FIRST node whose data equals val */ Node *delete_val(Node *head, int val) { if (head == NULL) return NULL; /* Special case: deleting the head node */ if (head->data == val) { Node *new_head = head->next; // save next BEFORE freeing free(head); // release memory return new_head; // caller must update their head variable } /* General case: find predecessor of node-to-delete */ Node *prev = head; while (prev->next != NULL && prev->next->data != val) { prev = prev->next; } if (prev->next == NULL) return head; // val not found Node *target = prev->next; prev->next = target->next; // ① bypass the target node free(target); // ② release memory AFTER rewiring return head; }
target->next in prev->next BEFORE calling free(target). After free, reading target->next is undefined behaviour — the memory may have been reused already.
Free the entire list
/* Must save next pointer BEFORE freeing current node */ void free_list(Node *head) { Node *curr = head; while (curr != NULL) { Node *next = curr->next; // ① save next BEFORE free free(curr); // ② release current curr = next; // ③ advance to saved next } } // After this, always set head = NULL in the caller to avoid dangling pointer
Complete compilable programs
#include <stdio.h>
#include <stdlib.h>
/* ---------- Node definition ---------- */
typedef struct node {
int data;
struct node *next;
} Node;
/* ---------- Create a single node ---------- */
Node *make_node(int val) {
Node *n = malloc(sizeof(Node));
if (n == NULL) {
fprintf(stderr, "malloc failed\n");
exit(1);
}
n->data = val;
n->next = NULL;
return n;
}
/* ---------- Insert at head O(1) ---------- */
Node *insert_front(Node *head, int val) {
Node *n = make_node(val);
n->next = head; /* new node points to old head */
return n; /* new node is new head */
}
/* ---------- Insert at tail O(n) ---------- */
Node *insert_back(Node *head, int val) {
Node *n = make_node(val);
if (head == NULL) return n;
Node *p = head;
while (p->next != NULL)
p = p->next; /* walk to last node */
p->next = n;
return head;
}
/* ---------- Traverse and print ---------- */
void print_list(Node *head) {
printf("List: ");
for (Node *p = head; p != NULL; p = p->next) {
printf("%d", p->data);
if (p->next != NULL) printf(" -> ");
}
printf(" -> NULL\n");
}
/* ---------- Search: return node or NULL ---------- */
Node *search(Node *head, int val) {
for (Node *p = head; p != NULL; p = p->next) {
if (p->data == val) return p;
}
return NULL;
}
/* ---------- Delete first node with given value ---------- */
Node *delete_val(Node *head, int val) {
if (head == NULL) return NULL;
/* Deleting head */
if (head->data == val) {
Node *new_head = head->next;
free(head);
return new_head;
}
/* Find predecessor */
Node *prev = head;
while (prev->next != NULL && prev->next->data != val)
prev = prev->next;
if (prev->next == NULL) return head; /* not found */
Node *target = prev->next;
prev->next = target->next; /* bypass */
free(target);
return head;
}
/* ---------- Free entire list ---------- */
void free_list(Node *head) {
Node *curr = head;
while (curr != NULL) {
Node *next = curr->next; /* save BEFORE free */
free(curr);
curr = next;
}
}
/* ---------- main ---------- */
int main(void) {
Node *head = NULL;
/* Build: insert 3 values at back */
head = insert_back(head, 10);
head = insert_back(head, 20);
head = insert_back(head, 30);
print_list(head); /* 10 -> 20 -> 30 -> NULL */
/* Insert at front */
head = insert_front(head, 5);
print_list(head); /* 5 -> 10 -> 20 -> 30 -> NULL */
/* Search */
Node *found = search(head, 20);
if (found) printf("Found: %d\n", found->data);
/* Delete middle node */
head = delete_val(head, 20);
print_list(head); /* 5 -> 10 -> 30 -> NULL */
/* Delete head */
head = delete_val(head, 5);
print_list(head); /* 10 -> 30 -> NULL */
/* Free everything */
free_list(head);
head = NULL; /* avoid dangling pointer */
return 0;
}
List: 5 -> 10 -> 20 -> 30 -> NULL
Found: 20
List: 5 -> 10 -> 30 -> NULL
List: 10 -> 30 -> NULL
#include <stdio.h>
#include <stdlib.h>
/* Doubly linked: each node has BOTH next AND prev */
typedef struct dnode {
int data;
struct dnode *next;
struct dnode *prev; /* extra pointer — can traverse backwards */
} DNode;
DNode *make_dnode(int val) {
DNode *n = malloc(sizeof(DNode));
if (!n) { perror("malloc"); exit(1); }
n->data = val;
n->next = NULL;
n->prev = NULL;
return n;
}
/* Insert at front — O(1), updates prev pointer of old head */
DNode *dlist_push_front(DNode *head, int val) {
DNode *n = make_dnode(val);
n->next = head;
if (head != NULL)
head->prev = n; /* old head's prev now points back to new node */
return n;
}
/* Delete a node when we already have a pointer to it — O(1)! */
/* This is the KEY advantage over singly linked lists */
DNode *dlist_delete(DNode *head, DNode *target) {
if (target->prev != NULL)
target->prev->next = target->next; /* bypass from left */
else
head = target->next; /* deleting head */
if (target->next != NULL)
target->next->prev = target->prev; /* bypass from right */
free(target);
return head;
}
void dlist_print(DNode *head) {
printf("DList: ");
for (DNode *p = head; p != NULL; p = p->next)
printf("[%d]%s", p->data, p->next ? " <-> " : "");
printf(" <-> NULL\n");
}
void dlist_free(DNode *head) {
DNode *curr = head;
while (curr) {
DNode *next = curr->next;
free(curr);
curr = next;
}
}
int main(void) {
DNode *head = NULL;
head = dlist_push_front(head, 30);
head = dlist_push_front(head, 20);
head = dlist_push_front(head, 10);
dlist_print(head); /* [10] <-> [20] <-> [30] <-> NULL */
/* Delete middle node directly — no traversal needed */
DNode *mid = head->next; /* points to 20 */
head = dlist_delete(head, mid);
dlist_print(head); /* [10] <-> [30] <-> NULL */
dlist_free(head);
return 0;
}
DList: [10] <-> [30] <-> NULL
Practice problems with solutions
Write a function int list_length(Node *head) that returns the number of nodes in the list. It should return 0 for an empty list (NULL). Do not use recursion.
int list_length(Node *head) {
int count = 0;
for (Node *p = head; p != NULL; p = p->next)
count++;
return count;
}
p, incrementing for each node. When p reaches NULL (past the last node), the loop ends. The for-loop idiom for (p = head; p != NULL; p = p->next) is the canonical traversal pattern you will write dozens of times.
Write a function Node *reverse(Node *head) that reverses the order of a singly linked list in-place (no new nodes, no copying data — just rewire the next pointers). Return the new head. For example, 1->2->3->NULL becomes 3->2->1->NULL.
Node *reverse(Node *head) {
Node *prev = NULL;
Node *curr = head;
while (curr != NULL) {
Node *next = curr->next; /* 1. save forward link */
curr->next = prev; /* 2. flip: point backward */
prev = curr; /* 3. advance prev */
curr = next; /* 4. advance curr */
}
return prev; /* prev is the new head (last node we visited) */
}
prev (the node behind us), curr (current node), next (save the forward link before we overwrite it). At each step: save curr->next, then set curr->next = prev to flip the arrow. Advance both pointers. When curr is NULL, prev is the last node we processed — the new head.
Write Node *sorted_insert(Node *head, int val) that inserts a new node with val into a sorted (ascending) singly linked list, maintaining sort order. If the list is 1->3->5->NULL and val is 4, result should be 1->3->4->5->NULL.
Node *sorted_insert(Node *head, int val) {
Node *n = make_node(val);
/* Insert before head if list empty or val is smallest */
if (head == NULL || val <= head->data) {
n->next = head;
return n;
}
/* Walk to find the predecessor */
Node *prev = head;
while (prev->next != NULL && prev->next->data < val)
prev = prev->next;
/* Insert after prev */
n->next = prev->next;
prev->next = n;
return head;
}
val <= head->data) — handle this first and return the new node as head. (2) Otherwise, walk until prev->next is a node with data ≥ val (or we reach the end). Then splice in: n->next = prev->next then prev->next = n. The order of these two assignments is critical — always set the new node's next before redirecting prev.
The function below has a memory leak. Find the bug and fix it.
Node *delete_head(Node *head) {
if (head == NULL) return NULL;
head = head->next; /* just move head forward */
return head;
}
Node *delete_head(Node *head) {
if (head == NULL) return NULL;
Node *old_head = head; /* save pointer to node we want to free */
head = head->next; /* move head forward */
free(old_head); /* now free the old head */
return head;
}
head pointer forward but never calls free() on the old head node. That malloc'd node is now unreachable — its memory is leaked. The fix is to save a pointer to the old head before advancing, then free it. This pattern (save, advance, free) is the correct idiom for deleting any node from a linked list.
The code below has a dangling pointer bug. What is the problem and how do you fix it?
Node *delete_val(Node *head, int val) {
if (head == NULL) return NULL;
if (head->data == val) {
free(head);
return head->next; /* BUG: accessing freed memory! */
}
/* ... rest of function */
return head;
}
Node *delete_val(Node *head, int val) {
if (head == NULL) return NULL;
if (head->data == val) {
Node *new_head = head->next; /* save next BEFORE free */
free(head); /* now free */
return new_head; /* return saved pointer */
}
/* ... rest of function */
return head;
}
free(head) releases the memory that head points to. After free, head is a dangling pointer — it still holds the same address, but that memory is now owned by the heap allocator and may have been overwritten. Accessing head->next after free is undefined behaviour. The fix: save head->next into a local variable before calling free, then return that saved value.
A circular linked list has its last node pointing back to some earlier node (creating a cycle). Your traversal loop would run forever. Write int has_cycle(Node *head) using Floyd's tortoise-and-hare algorithm: use two pointers, one moving 1 step at a time, one moving 2 steps. If they ever point to the same node, there is a cycle. Return 1 if cycle, 0 if not.
int has_cycle(Node *head) {
Node *slow = head; /* tortoise: moves 1 step */
Node *fast = head; /* hare: moves 2 steps */
while (fast != NULL && fast->next != NULL) {
slow = slow->next; /* 1 step */
fast = fast->next->next; /* 2 steps */
if (slow == fast)
return 1; /* they met — cycle detected */
}
return 0; /* fast reached NULL — no cycle */
}
NULL. The key check fast != NULL && fast->next != NULL guards against dereferencing NULL when moving the fast pointer two steps forward.
Key concepts to memorize
Test your understanding
node->next immediately after calling free(node).LO5next pointer) is ___.LO5void free_list(Node *head) {
while (head != NULL) {
free(head);
head = head->next; /* BUG */
}
}