Enums, Structs & Unions
C's tools for grouping and naming data — struct for records with separate fields, union for overlapping memory, and enum for named integer constants.
Records, overlapping memory, and named constants
Imagine a library catalogue card for a book. It has several fields: title, author, ISBN, year. Each field is a different type (strings and numbers), but they all belong to the same book. A struct in C is exactly this — a group of related variables of possibly different types, bundled together under one name. Unlike a Python class or Java object, a struct has no methods — it is pure data.
A union looks like a struct syntactically, but all its members share the same block of memory. Think of it as a hotel room with one bed. You can put "Alice" in the room (the int member), or "Bob" (the float member), but not both at the same time. The room's size is determined by the largest possible guest. Only one member is "active" at a time: reading a member other than the one you wrote last reinterprets the stored bytes as that member's type, which is rarely a meaningful value.
An enum is a way to give meaningful names to a set of related integer constants. Instead of writing if (day == 1) everywhere, you write if (day == MON). The compiler still stores an int underneath, but you get readable names and the ability to catch typos at compile time. Think of it like Python's IntEnum or Java's enum — but simpler, with no methods or attached data.
In Python, you use a class or dataclass to group related data. In Java, you use a class or record. In C, struct is the direct equivalent — but stripped down to pure data storage. There is no self, no inheritance, and no constructors. You define the layout, declare variables of that type, and access fields with a dot.
The typedef keyword lets you create an alias for any type. When applied to a struct, it removes the need to write struct before every variable declaration. This is extremely common in real C code and library headers.
Memory layout matters in C. Unlike Python or Java, where the runtime hides memory details, in C you can see exactly how many bytes a struct occupies using sizeof. The compiler may insert padding bytes between members to satisfy CPU alignment requirements: on typical platforms an int must start at an address divisible by 4, and a double at one divisible by 8. This means the sizeof a struct is often larger than the sum of its member sizes.
Comparison Table: struct vs union vs enum
| Feature | struct | union | enum |
|---|---|---|---|
| Purpose | Group related variables of different types | Store one of several types in the same memory | Give names to integer constants |
| Memory | Each member has its own space; total = sum + padding | All members share same space; size = largest member | Same as int (typically 4 bytes) |
| Members active | All members always valid | Only one member active at a time | N/A — it is a single integer value |
| Access | s.field or ptr->field | u.member or ptr->member | Direct use as integer constant |
| Python equivalent | dataclass / named tuple | No direct equivalent | IntEnum |
| Java equivalent | record / POJO class | No direct equivalent | enum (simpler version) |
Python vs C comparison
# Python dataclass ≈ C struct
from dataclasses import dataclass
@dataclass
class Point:
    x: int
    y: int
p = Point(10, 20)
print(p.x, p.y)    # 10 20
# Python IntEnum ≈ C enum
from enum import IntEnum
class Day(IntEnum):
    SUN = 0
    MON = 1
    TUE = 2
today = Day.MON
print(today)       # Day.MON (Python 3.11+ prints 1)
print(int(today))  # 1
/* C struct */
struct Point {
int x;
int y;
};
struct Point p = {10, 20};
printf("%d %d\n", p.x, p.y);
/* C enum */
enum Day {
SUN = 0,
MON = 1,
TUE = 2
};
enum Day today = MON;
printf("%d\n", today); /* 1 */
/* No union equivalent in Python */
union Data { int i; float f; };
union Data d; d.i = 42;
A C struct has no methods, no constructors, no destructors, no access control. It is a pure memory layout description. You cannot call p.toString() or p.getX(). All behavior must come from standalone functions that receive a pointer to the struct. This is the foundation of C's "data + functions are separate" philosophy.
Always use sizeof(struct MyType) rather than manually calculating the size. Padding is inserted by the compiler — the actual size may surprise you. Rule of thumb: order struct members from largest to smallest alignment to minimize padding waste.
struct: multiple named boxes side by side in memory, each holding one value. union: multiple names for the same single box, only the last write is meaningful. enum: a set of names that are secretly integers. typedef: just a type alias — no new memory involved.
enum, struct, union, typedef — annotated
2a. enum — declaration and use
enum day_name            /* keyword + tag name */
{
    SUN,                 /* = 0 (auto-assigned, starts at 0) */
    MON,                 /* = 1 */
    TUE,                 /* = 2 */
    WED,                 /* = 3 */
    THU,                 /* = 4 */
    FRI,                 /* = 5 */
    SAT,                 /* = 6 */
    DAY_UNDEF            /* = 7, sentinel value */
};
/* You can also assign explicit values: */
enum colour { RED = 0, GREEN = 10, BLUE = 20 };
/* Declaring and using an enum variable: */
enum day_name today = MON;
enum day_name tomorrow = today + 1;   /* arithmetic works: enums are ints */
printf("%d\n", today);                /* prints 1 */
Enum constants are plain int values. You can increment, compare, and print them as integers. Enums improve readability (MON vs 1) but do NOT prevent you from assigning out-of-range integers.
2b. struct — declaration, definition, initialization
/* DECLARATION: define the type layout */
struct Point {                      /* struct keyword, then the tag name (optional but recommended) */
    int x;                          /* member: type + name */
    int y;
};                                  /* semicolon after closing brace: easy to forget! */
/* VARIABLE DECLARATION (creating instances) */
struct Point p1;                    /* uninitialised: contains garbage */
struct Point p2 = {10, 20};         /* positional initialisation */
struct Point p3 = {0};              /* all members zero-initialised */
struct Point p4 = {.x=5, .y=8};     /* designated initialiser, order-independent */
/* MEMBER ACCESS with dot operator */
p2.x = 99;                          /* write to member x */
printf("%d\n", p2.y);               /* read member y */
/* MEMBER ACCESS with arrow operator (pointer to struct) */
struct Point *ptr = &p2;
ptr->x = 50;                        /* equivalent to (*ptr).x = 50; arrow = dereference pointer + access member */
Use . (dot) when you have a struct value. Use -> (arrow) when you have a pointer to a struct. ptr->x is exactly equivalent to (*ptr).x; the arrow is just syntactic sugar.
2c. typedef — removing the struct keyword
/* WITHOUT typedef: must write "struct" every time */
struct Point {
    int x;
    int y;
};
struct Point p1;            /* must say "struct Point" */
/* WITH typedef: create an alias for the type */
typedef struct {
    int x;
    int y;
} Point;                    /* the typedef keyword makes "Point" a direct type name */
Point p2;                   /* no "struct" needed, much cleaner */
/* Typedef with tag name (allows self-referential structs) */
typedef struct Node {
    int value;
    struct Node *next;      /* self-referential: must use tag here */
} Node;
Node *head = NULL;          /* linked list head pointer */
The typedef alias is not yet defined inside the struct body. You must use the tag name (struct Node *next) for the pointer member; the alias becomes available after the closing brace.
2d. union — shared memory layout
union Data {              /* union keyword: same syntax as struct, different memory rule */
    int i;                /* 4 bytes */
    float f;              /* 4 bytes */
    char c;               /* 1 byte */
};
/* sizeof(union Data) = 4 (largest member = int/float) */
/* ALL members start at the SAME address: they overlap */
union Data d;
d.i = 42;                 /* write int */
printf("%d\n", d.i);      /* 42, correct */
printf("%f\n", d.f);      /* WRONG MEMBER: reinterprets the int's bytes as a float, does not print 42.0 */
d.f = 3.14f;              /* now float is the active member */
printf("%f\n", d.f);      /* 3.140000, correct */
Union members are accessed with the same operators as struct members (. or ->).
2e. Struct memory layout and padding
2f. Bitfields — packing bits inside a struct
Bitfields allow packing multiple small values into a struct by specifying exactly how many bits each member occupies. The compiler handles all the shifts and masks for you. They are commonly used for hardware device registers, network protocol headers, and compact flag storage.
/* Device control register: 16 bits declared in total */
struct device_register {
    unsigned status   : 3;   /* bits 0-2: device status (0-7) */
    unsigned mode     : 2;   /* bits 3-4: operating mode (0-3) */
    unsigned priority : 4;   /* bits 5-8: priority level (0-15) */
    unsigned error    : 1;   /* bit 9: error flag */
    unsigned          : 6;   /* bits 10-15: padding (unnamed, reserved) */
};
/* Usage: */
struct device_register reg = {0};
reg.status = 5;                      /* set status to 5 */
reg.error = 1;                       /* set error flag */
printf("mode = %u\n", reg.mode);     /* read mode field */
/* Comparison: doing the same with manual bit manipulation (uint16_t needs <stdint.h>): */
uint16_t reg_raw = 0;
reg_raw |= (5 & 0x7);                /* set status bits 0-2 */
reg_raw |= (1 << 9);                 /* set error bit 9 */
/* Bitfields are cleaner: the compiler handles the shifts/masks */
The syntax is type member : width; where the colon followed by a number specifies how many bits to use. An unnamed bitfield (unsigned : 6;) inserts padding bits without creating an accessible member.
Bitfield bit ordering (MSB vs LSB first) is implementation-defined. Do NOT use bitfields for data sent over a network or written to binary files that must be cross-platform. Use manual bit manipulation with shifts and masks for portable code.
Bitfield syntax reference
| Syntax | Meaning |
|---|---|
| unsigned x : 3; | x uses 3 bits, values 0–7 |
| int y : 5; | y uses 5 bits; signed on most compilers (−16 to 15), though whether a plain int bitfield is signed is implementation-defined |
| unsigned : 4; | 4 unnamed padding bits (skip, not accessible) |
| unsigned z : 1; | single-bit flag (0 or 1) |
2g. Abstract Data Types (ADT) — opaque struct pattern
In C, you can hide a struct's implementation by forward-declaring just the name in the header file (no fields), and defining the actual fields only in the .c file. Callers can only hold a pointer to the struct — they can never access its fields directly. This is the standard C idiom for encapsulation.
/* ── stack.h (the PUBLIC interface) ── */
typedef struct Stack Stack;          /* forward declaration: fields hidden; struct Stack is declared but never defined in the header, so callers see only the name */
Stack *stack_create(void);           /* caller only sees a pointer */
void stack_push(Stack *s, int val);
int stack_pop(Stack *s);
void stack_free(Stack *s);
/* ── stack.c (the PRIVATE implementation) ── */
#include <stdlib.h>                  /* malloc, free */
#include "stack.h"
struct Stack {                       /* actual definition, ONLY visible in stack.c */
    int *data;
    int top;
    int capacity;
};
Stack *stack_create(void) {
    Stack *s = malloc(sizeof(Stack));
    if (s == NULL) return NULL;      /* allocation can fail */
    s->data = malloc(16 * sizeof(int));
    if (s->data == NULL) { free(s); return NULL; }
    s->top = 0;
    s->capacity = 16;
    return s;
}
Stack * is a valid pointer type even without seeing the fields — pointers to incomplete types are allowed. The full definition lives only in stack.c, so callers cannot write s->data — it won't compile.
Encapsulation: Users cannot accidentally access s->data directly — they must use the API functions.
Flexibility: You can change the internal representation without breaking any calling code.
Interface separation: Enforces separation of interface and implementation in C — no classes needed!
Comparison: OOP encapsulation vs C opaque struct
# Python: class with name-mangled privates
class Stack:
    def __init__(self):
        self.__data = []  # "private"
// Java: class with private fields
class Stack {
    private int[] data;  // hidden
    public void push(int v) { ... }
}
/* stack.h — only the name is public */
typedef struct Stack Stack;
Stack *stack_create(void);
void stack_push(Stack *s, int v);
/* stack.c — fields truly hidden */
struct Stack { int *data; int top; };
/* Caller code — cannot touch fields */
Stack *s = stack_create();
stack_push(s, 42); /* OK */
s->data[0]; /* compile error! */
Complete programs you can compile and run
#include <stdio.h>
/* Define Point using typedef — no need to write "struct Point" later */
typedef struct {
int x;
int y;
} Point;
/* Rectangle composed of two Points */
typedef struct {
Point top_left; /* nested struct — a struct inside a struct */
Point bottom_right;
} Rectangle;
/* Function taking a pointer to a Point — uses arrow operator */
void print_point(const Point *p) {
printf("(%d, %d)", p->x, p->y);
}
/* Compute area — no pointer needed, pass by value */
int rect_area(Rectangle r) {
int width = r.bottom_right.x - r.top_left.x;
int height = r.bottom_right.y - r.top_left.y;
return width * height;
}
int main(void) {
/* Designated initializer — order doesn't matter */
Point origin = {.x = 0, .y = 0};
Point corner = {.x = 10, .y = 5};
printf("Origin: "); print_point(&origin); printf("\n");
printf("Corner: "); print_point(&corner); printf("\n");
/* Nested struct initialization */
Rectangle r = { {0, 0}, {10, 5} };
printf("Area: %d\n", rect_area(r));
/* Modifying via pointer */
Point *ptr = &corner;
ptr->x = 20; /* same as corner.x = 20 */
ptr->y = 15; /* same as corner.y = 15 */
printf("New corner: "); print_point(&corner); printf("\n");
/* sizeof */
printf("sizeof(Point): %zu bytes\n", sizeof(Point));
printf("sizeof(Rectangle): %zu bytes\n", sizeof(Rectangle));
return 0;
}
Origin: (0, 0)
Corner: (10, 5)
Area: 50
New corner: (20, 15)
sizeof(Point): 8 bytes
sizeof(Rectangle): 16 bytes
#include <stdio.h>
#include <stdlib.h>
/* Self-referential struct: next points to another Node */
typedef struct Node {
int value;
struct Node *next; /* must use tag name here — typedef not yet complete */
} Node;
/* Allocate a new node on the heap */
Node *make_node(int val) {
Node *n = malloc(sizeof(Node));
if (!n) { perror("malloc"); exit(1); }
n->value = val;
n->next = NULL;
return n;
}
/* Print the whole list */
void print_list(Node *head) {
Node *curr = head;
while (curr != NULL) {
printf("%d", curr->value);
if (curr->next) printf(" -> ");
curr = curr->next;
}
printf("\n");
}
/* Free all nodes */
void free_list(Node *head) {
while (head) {
Node *tmp = head->next;
free(head);
head = tmp;
}
}
int main(void) {
/* Build list: 1 -> 2 -> 3 */
Node *head = make_node(1);
head->next = make_node(2);
head->next->next = make_node(3);
print_list(head); /* 1 -> 2 -> 3 */
free_list(head);
return 0;
}
#include <stdio.h>
union Data {
int i;
float f;
char bytes[4]; /* lets us inspect individual bytes */
};
int main(void) {
union Data d;
/* Write an int */
d.i = 0x41424344; /* ASCII 'A'=0x41, 'B'=0x42, 'C'=0x43, 'D'=0x44 */
printf("d.i = 0x%08x (%d)\n", d.i, d.i);
printf("d.bytes: %02x %02x %02x %02x\n",
(unsigned char)d.bytes[0],
(unsigned char)d.bytes[1],
(unsigned char)d.bytes[2],
(unsigned char)d.bytes[3]);
/* bytes[0] is lowest address — on little-endian: 44 43 42 41 */
/* Write a float — NOW i is garbage, only f is valid */
d.f = 3.14f;
printf("d.f = %f\n", d.f);
/* Size demonstration */
printf("\nsizeof(union Data) = %zu\n", sizeof(union Data));
printf("sizeof(int) = %zu\n", sizeof(int));
printf("sizeof(float) = %zu\n", sizeof(float));
printf("sizeof(char[4]) = %zu\n", sizeof(char[4]));
/* All are 4 — union size = largest member = 4 */
union { int i; double d; char s[16]; } big;
printf("\nsizeof(big union) = %zu\n", sizeof(big));
/* = 16, the largest member (char s[16]) */
return 0;
}
d.i = 0x41424344 (1094861636)
d.bytes: 44 43 42 41
d.f = 3.140000
sizeof(union Data) = 4
sizeof(int) = 4
sizeof(float) = 4
sizeof(char[4]) = 4
sizeof(big union) = 16
#include <stdio.h>
enum day_name { SUN, MON, TUE, WED, THU, FRI, SAT, DAY_UNDEF };
enum month_name {
JAN, FEB, MAR, APR, MAY, JUN,
JUL, AUG, SEP, OCT, NOV, DEC,
MONTH_UNDEF
};
struct date {
enum day_name day;
int day_num;
enum month_name month;
int year;
};
/* Array of structs */
typedef struct {
char name[32];
float gpa;
int year;
} Student;
void print_student(const Student *s) {
printf("%-12s GPA: %.2f Year: %d\n", s->name, s->gpa, s->year);
}
int main(void) {
/* enum usage */
struct date big_day = { MON, 7, JAN, 1980 };
printf("Day number: %d\n", big_day.day); /* MON = 1 */
printf("Month num: %d\n", big_day.month); /* JAN = 0 */
/* Can increment enums */
enum day_name tomorrow = big_day.day + 1; /* TUE = 2 */
printf("Tomorrow: %d\n", tomorrow);
/* Array of structs */
Student cohort[3] = {
{"Alice", 3.9, 2},
{"Bob", 3.5, 3},
{"Charlie", 3.7, 1},
};
printf("\nStudent cohort:\n");
for (int i = 0; i < 3; i++) {
print_student(&cohort[i]); /* pass pointer to each element */
}
printf("\nsizeof(Student) = %zu bytes\n", sizeof(Student));
printf("sizeof(cohort) = %zu bytes (%zu elements)\n",
sizeof(cohort), sizeof(cohort) / sizeof(cohort[0]));
return 0;
}
Day number: 1
Month num: 0
Tomorrow: 2
Student cohort:
Alice        GPA: 3.90 Year: 2
Bob          GPA: 3.50 Year: 3
Charlie      GPA: 3.70 Year: 1
sizeof(Student) = 40 bytes
sizeof(cohort) = 120 bytes (3 elements)
Practice problems with solutions
Define a struct Car with fields: make (char array, 32 bytes), year (int), price (double). Then declare two cars: one initialized at definition using a positional initializer, and one using designated initializers. Print both using printf.
#include <stdio.h>
struct Car {
char make[32];
int year;
double price;
};
int main(void) {
/* Positional initializer — order must match struct declaration */
struct Car c1 = {"Toyota", 2022, 28500.00};
/* Designated initializer — order doesn't matter, safer */
struct Car c2 = {.make = "Honda", .price = 31999.99, .year = 2023};
printf("Car 1: %s %d $%.2f\n", c1.make, c1.year, c1.price);
printf("Car 2: %s %d $%.2f\n", c2.make, c2.year, c2.price);
printf("sizeof(struct Car) = %zu bytes\n", sizeof(struct Car));
return 0;
}
c1 and c2 are values (not pointers). Positional initializers require you to know the order of fields exactly. Designated initializers (C99+) use .member = value syntax and are safer and more readable. Note that sizeof(struct Car) may be larger than 32 + 4 + 8 = 44. On a typical 64-bit platform the compiler inserts 4 bytes of padding after year so that double price starts on an 8-byte boundary, giving 48 bytes.
Given the following struct, write a function void scale_point(struct Point *p, int factor) that multiplies both x and y by factor. Call it in main and verify the result. Also write the equivalent using (*p).x syntax to show they are identical.
struct Point { int x; int y; };
#include <stdio.h>
struct Point { int x; int y; };
/* Using arrow operator — the idiomatic way */
void scale_point(struct Point *p, int factor) {
p->x *= factor; /* same as (*p).x *= factor */
p->y *= factor; /* same as (*p).y *= factor */
}
/* Equivalent using explicit dereference */
void scale_point_v2(struct Point *p, int factor) {
(*p).x *= factor;
(*p).y *= factor;
}
int main(void) {
struct Point pt = {3, 4};
printf("Before: (%d, %d)\n", pt.x, pt.y); /* (3, 4) */
scale_point(&pt, 5); /* pass address with & */
printf("After: (%d, %d)\n", pt.x, pt.y); /* (15, 20) */
return 0;
}
When you pass a struct by value (void f(struct Point p)), C makes a copy, so modifications inside the function do not affect the original. To modify the caller's struct, you must pass its address (&pt) and receive a pointer (struct Point *p). Then use p->x to access through the pointer. The arrow operator -> is strictly shorthand for (*p). and both forms compile to identical machine code.
Rewrite the following struct using typedef so you don't have to write struct Student everywhere. Then write a function float average_gpa(Student *arr, int n) that takes an array of Students and returns the average GPA.
struct Student { char name[50]; float gpa; int year; };
#include <stdio.h>
/* typedef makes "Student" a type alias — no more "struct Student" */
typedef struct {
char name[50];
float gpa;
int year;
} Student;
float average_gpa(Student *arr, int n) {
float sum = 0.0f;
for (int i = 0; i < n; i++) {
sum += arr[i].gpa; /* dot operator — arr[i] is a Student value */
}
return n > 0 ? sum / n : 0.0f;
}
int main(void) {
Student cohort[] = {
{"Alice", 3.9f, 2},
{"Bob", 3.5f, 3},
{"Charlie", 3.7f, 1},
};
int n = sizeof(cohort) / sizeof(cohort[0]);
printf("Average GPA: %.2f\n", average_gpa(cohort, n));
/* cohort decays to a pointer — same as &cohort[0] */
return 0;
}
typedef struct { ... } Name; creates an anonymous struct with alias Name. The alias is defined after the closing brace. When passing an array to a function in C, the array decays to a pointer to the first element — so Student *arr receives the address of cohort[0]. Inside the function, arr[i] accesses element i using pointer arithmetic, and arr[i].gpa uses dot because arr[i] is a Student value (not a pointer).
Without running the code, predict the output of each printf. Then explain why by drawing the memory layout with padding. Assume a 64-bit system with standard alignment rules.
#include <stdio.h>
struct X { char a; int b; char c; };
struct Y { double d; int i; short s; char c1; char c2; };
union U { int i; double d; char s[16]; };
int main(void) {
printf("%zu\n", sizeof(struct X));
printf("%zu\n", sizeof(struct Y));
printf("%zu\n", sizeof(union U));
return 0;
}
12
16
16
struct X { char a; int b; char c; }:
offset 0: char a (1 byte), then 3 bytes padding (int needs 4-byte alignment),
offset 4: int b (4 bytes),
offset 8: char c (1 byte), then 3 bytes padding (struct size must be multiple of 4).
Total: 1+3+4+1+3 = 12.
struct Y { double d; int i; short s; char c1; char c2; }:
offset 0: double d (8 bytes),
offset 8: int i (4 bytes),
offset 12: short s (2 bytes),
offset 14: char c1 (1 byte), char c2 (1 byte).
Total: 8+4+2+1+1 = 16, already a multiple of 8, so no padding is needed. Ordering the largest members first is what avoids the padding here.
union U { int i; double d; char s[16]; }:
Size = size of largest member = char s[16] = 16 bytes. All three members share the same 16 bytes starting at offset 0. The union's alignment is determined by its most-aligned member (double, 8 bytes), so the size 16 is already a multiple of 8.
The code below has four bugs related to structs and unions. Identify each bug and explain the fix.
#include <stdio.h>
struct Date {
int day;
int month;
int year;
} /* Bug 1 */
int main(void) {
struct Date d = {15, 6, 2024};
struct Date *p = &d;
printf("%d\n", p.day); /* Bug 2 */
union Data { int i; float f; } u;
u.i = 100;
printf("%f\n", u.f); /* Bug 3 */
struct Date arr[3];
arr[0].day = 1;
printf("%d\n", arr[0]->day); /* Bug 4 */
return 0;
}
#include <stdio.h>
struct Date {
int day;
int month;
int year;
}; /* Fix 1: missing semicolon after closing brace */
int main(void) {
struct Date d = {15, 6, 2024};
struct Date *p = &d;
printf("%d\n", p->day); /* Fix 2: p is a pointer, use -> not . */
union Data { int i; float f; } u;
u.i = 100;
/* Fix 3: reading wrong union member — must read the member you last wrote */
printf("%d\n", u.i); /* read u.i since that was the last write */
struct Date arr[3] = {0};
arr[0].day = 1;
printf("%d\n", arr[0].day); /* Fix 4: arr[0] is a value, not a pointer — use . */
return 0;
}
Bug 1: Missing ; after the closing }. This is the most common syntax error with structs.
Bug 2: p is a struct Date * (pointer), so you must use p->day, not p.day. The dot operator is only for struct values.
Bug 3: Reading a union member that was not the last written. After writing u.i = 100, only u.i holds a meaningful value. Reading u.f does not convert 100 to 100.0; it reinterprets the bit pattern of integer 100 as a float, which gives a meaningless, tiny number.
Bug 4: arr[0] is a struct Date value (array indexing does not give a pointer), so use arr[0].day, not arr[0]->day. Only use -> when you have a pointer.
Key concepts to memorize
Test your understanding
What is sizeof(struct X) for the following? struct X { char a; int b; char c; }; (64-bit system, standard alignment)
Given struct Person *p; and struct Date birthday; inside Person, how do you access the day field of birthday through pointer p?
typedef struct { int x; int y; } ___; What token goes in the blank to create a type alias Point so you can write Point p;?
What is wrong with the following code? union Data { int i; float f; } u; u.i = 42; printf("%f\n", u.f);
What does the enum underlying type map to in C?
struct Reg { unsigned status : 3; unsigned error : 1; unsigned : 4; }; What is the maximum value that status can hold, and what does unsigned : 4; do?