Reading Code
How to make sense of C code you didn't write
So far this course has mostly been about writing C. In real life, programmers spend much more time reading code than writing it — their own code from last week, library code, code from teammates, code from strangers on the internet whose only common language with you is C itself.
Reading code well is a skill. Like any skill, it has a method.
Read top-down, then top-down again
The most common mistake is to start at line 1 and try to understand every line in order. Don't. The author wrote the code that way because that's the order the compiler needs, not the order a reader needs.
A better order:
- What does the file do? Read the file's top-level comment (if any) and the names of the functions defined in it. Form a sentence: "This file implements a hash table."
- What's the public surface? Read the corresponding .h header to see what functions and types are exposed.
- Trace main (or the entry point) at a high level. Just the shape: "open file, parse arguments, do work, print result". Skip the details.
- Dive into one function at a time, on demand. Only when you need to know what do_work actually does do you open it.
This is a recursive zoom: stay zoomed out until forced to zoom in, then zoom back out. You're never trying to hold every detail in your head at once.
Follow the types
C is a statically typed language. The type of a value tells you an enormous amount about what's going on, often more than the variable name does. When you see an unfamiliar function call:
    Node *n = list_find(head, target);
You instantly know: head is presumably a Node * (a list), and the result is a Node * (or NULL). Half the structure of the program is encoded in the types.
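To make that concrete, here is a minimal sketch of what the declarations behind that call might look like. The Node layout (an int payload plus a next pointer) and the body of list_find are assumptions for illustration; the lesson only shows the call site.

```c
#include <stddef.h>

/* Hypothetical node type: one int payload plus a link to the next node. */
typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* Return the first node whose value equals target, or NULL if none does.
   The signature alone already says: takes a list, may hand back nothing. */
Node *list_find(Node *head, int target) {
    for (Node *cur = head; cur != NULL; cur = cur->next) {
        if (cur->value == target) {
            return cur;
        }
    }
    return NULL;
}
```

Reading only the prototype, you can already predict that every caller has to check the result against NULL before dereferencing it.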
Tools like ctags, clangd-aware editors, and IDE "go to
definition" let you jump from a use to the declaration. Use
them. Reading C without the ability to jump to definitions is
reading C with one hand tied.
Identify common patterns
C is small. The same idioms recur. Recognizing them on sight is half of fluent reading.
The traversal loop
    for (Node *cur = head; cur != NULL; cur = cur->next) {
        /* ... */
    }
Translation: "walk every node in the list."
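As one concrete instance of this shape, the sketch below counts the nodes in a list. The Node layout and the name list_length are assumptions made up for this example, not something defined earlier in the lesson.

```c
#include <stddef.h>

/* Hypothetical singly linked list node. */
typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* Walk every node once and count how many there are. */
size_t list_length(const Node *head) {
    size_t count = 0;
    for (const Node *cur = head; cur != NULL; cur = cur->next) {
        count++;
    }
    return count;
}
```

An empty list is just a NULL head, and the loop handles it without a special case: the condition fails on the first check and the function returns 0.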
The null-terminated string scan
    while (*s) {
        /* do something with *s */
        s++;
    }
Translation: "walk every character of the string."
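Here is a small sketch of the same scan doing real work: counting how often one character occurs. The function name count_char is an assumption for illustration.

```c
#include <stddef.h>

/* Count how many times ch occurs in the null-terminated string s. */
size_t count_char(const char *s, char ch) {
    size_t count = 0;
    while (*s) {            /* stop at the terminating '\0' */
        if (*s == ch) {
            count++;
        }
        s++;                /* move to the next character */
    }
    return count;
}
```

For example, count_char("banana", 'a') returns 3, and an empty string returns 0 because the loop body never runs.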
The "do thing or fail" pattern
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        perror(path);
        return 1;
    }
Translation: "open the file; if it fails, print an error and bail."
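The same pattern usually sits at the top of a helper function rather than in main. A minimal sketch, assuming a hypothetical helper count_bytes that reports failure with -1 instead of an exit code:

```c
#include <stdio.h>

/* Return the number of bytes in the file at path,
   or -1 if the file cannot be opened. */
long count_bytes(const char *path) {
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        perror(path);   /* say why the open failed */
        return -1;      /* bail before f is ever used */
    }
    long n = 0;
    while (fgetc(f) != EOF) {
        n++;
    }
    fclose(f);
    return n;
}
```

Once you recognize the check-and-bail block, you can skip it on a first read and treat everything after it as "f is valid from here on".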
The buffered read loop
    char buf[256];
    while (fgets(buf, sizeof buf, stdin) != NULL) {
        /* process one line at a time */
    }
Translation: "for each line of input..."
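One detail worth knowing when you meet this loop: fgets stops when the buffer is full, so a line longer than the buffer arrives in several pieces. The sketch below counts lines with that in mind. The name count_lines and the choice to take a FILE * parameter instead of stdin are assumptions for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Count the lines readable from f. A chunk counts as a finished line only
   when it ends in '\n'; a final line without a newline is counted too. */
size_t count_lines(FILE *f) {
    char buf[256];
    size_t lines = 0;
    int mid_line = 0;   /* nonzero while inside an unterminated line */
    while (fgets(buf, sizeof buf, f) != NULL) {
        size_t n = strlen(buf);
        if (n > 0 && buf[n - 1] == '\n') {
            lines++;
            mid_line = 0;
        } else {
            mid_line = 1;
        }
    }
    return lines + (mid_line ? 1 : 0);
}
```

Code that assumes "one fgets call equals one line" is correct only when no line exceeds the buffer, which is worth checking whenever you see a small fixed-size buf.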
The more you read, the larger your library of "I've seen this exact shape before" patterns becomes. Reading speed is roughly proportional to that library size.
A worked example
Let's read this small but real-looking function together, the way you would in practice:
    char *read_line(FILE *f) {
        size_t cap = 64;
        size_t len = 0;
        char *buf = malloc(cap);
        if (buf == NULL) return NULL;
        int c;
        while ((c = fgetc(f)) != EOF && c != '\n') {
            if (len + 1 >= cap) {
                cap *= 2;
                char *new_buf = realloc(buf, cap);
                if (new_buf == NULL) {
                    free(buf);
                    return NULL;
                }
                buf = new_buf;
            }
            buf[len++] = (char)c;
        }
        if (len == 0 && c == EOF) {
            free(buf);
            return NULL;
        }
        buf[len] = '\0';
        return buf;
    }
First pass: what does it do?
The name read_line plus the return type char * plus the
parameter FILE *f is enough to guess: "reads one line from f
and returns it as a malloc'd string. Returns NULL on EOF or error."
Second pass: shape
- Allocate an initial buffer.
- Read characters one by one.
- If the buffer is full, double it.
- Stop at newline or EOF.
- Null-terminate. Return.
That matches our guess.
Third pass: details and edge cases
- The doubling step is a "growable buffer" pattern, same as our DynArr from the dynamic memory chapter. The len + 1 >= cap test always leaves one byte free for the terminator.
- (c = fgetc(f)) != EOF && c != '\n': c is an int, not a char, so EOF stays distinguishable from every real character. The && short-circuits, so the newline test only runs when c is not EOF. We stop on either EOF or \n, and we don't store the \n.
- The if (len == 0 && c == EOF) block handles the "nothing to read" case at the end of the file: return NULL so the caller knows we're done. Otherwise an empty-but-valid line returns "".
- The error path on realloc failure carefully frees buf. This matters because realloc returning NULL does not free the original.
- Ownership: the caller must free the returned string.
Notice we read three times — once for purpose, once for shape, once for detail. We never tried to understand every line on the first pass.
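The ownership rule is easiest to see from the caller's side. In the sketch below, read_line is copied from the listing above; the caller longest_line is a hypothetical function added only to show the read, use, free rhythm.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* read_line as shown above: returns a malloc'd line without its '\n',
   or NULL on end of input or allocation failure. */
char *read_line(FILE *f) {
    size_t cap = 64;
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) return NULL;
    int c;
    while ((c = fgetc(f)) != EOF && c != '\n') {
        if (len + 1 >= cap) {
            cap *= 2;
            char *new_buf = realloc(buf, cap);
            if (new_buf == NULL) {
                free(buf);
                return NULL;
            }
            buf = new_buf;
        }
        buf[len++] = (char)c;
    }
    if (len == 0 && c == EOF) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}

/* Hypothetical caller: length of the longest line in f. */
size_t longest_line(FILE *f) {
    size_t best = 0;
    char *line;
    while ((line = read_line(f)) != NULL) {
        size_t n = strlen(line);
        if (n > best) {
            best = n;
        }
        free(line);   /* the caller owns every string read_line returns */
    }
    return best;
}
```

Note that the loop cannot tell "end of file" from "out of memory", since both come back as NULL. That is a property of the interface you would only notice on the third pass.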
Make the code obvious to yourself
When you read unfamiliar code, give yourself permission to rewrite it on paper or in comments — to make it readable, even if the author didn't.
- Add explanatory comments as you read. (Remove or keep them depending on whether you'll commit your changes.)
- Rename ambiguous local variables in a copy of the file.
- Reformat dense expressions across multiple lines.
- Draw boxes-and-arrows for any pointer-heavy code.
You are not insulting the original author by doing this. You are making the code legible to you, the reader, which is the only audience that matters right now.
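As an illustration of reformatting a dense expression, here is a classic one-line string copy next to a spelled-out version with comments. Both functions are hypothetical examples written for this section, and like strcpy neither checks that dst is large enough.

```c
/* Dense form, as you might find it in the wild:
   copies src into dst, terminator included. */
void copy_dense(char *dst, const char *src) {
    while ((*dst++ = *src++))
        ;
}

/* The same loop rewritten one step per line while reading. */
void copy_spelled_out(char *dst, const char *src) {
    for (;;) {
        char ch = *src;     /* read the current source character */
        *dst = ch;          /* store it, even if it is the '\0' */
        src++;              /* advance both pointers */
        dst++;
        if (ch == '\0') {   /* the terminator was just copied: done */
            break;
        }
    }
}
```

The second version is longer and nobody would commit it, but writing it out once is often the fastest way to convince yourself what the first one does.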
Things to be suspicious of
When reading unknown C, be alert for:
- Functions without bounds checking on input (strcpy, gets, scanf("%s", ...)): frequent sources of bugs.
- malloc without a matching free: possible leak.
- free of something that wasn't malloc'd: crash waiting to happen.
- Comparisons between signed and unsigned types: often wrong in subtle ways.
- Comments that contradict the code: believe the code; the comment is probably out of date.
Train your eye to flag these in passing. They are the C equivalent of "spelling mistakes you can't help noticing".
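The signed/unsigned item is the hardest to see, so here is a minimal sketch of how it goes wrong. Both function names are made up for this example. Some compilers warn about the first function when sign-comparison warnings are enabled, which is exactly the point.

```c
/* Looks like "is a less than b?", but a is converted to unsigned int
   before the comparison, so a negative a becomes a huge positive value. */
int naive_less(int a, unsigned int b) {
    return a < b;
}

/* Handle the negative case first, then compare as unsigned. */
int careful_less(int a, unsigned int b) {
    return a < 0 || (unsigned int)a < b;
}
```

With a = -1 and b = 1, naive_less returns 0 (false) while careful_less returns 1 (true). When you see an int compared against a size_t or the result of sizeof, ask whether the int can ever be negative.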
Practice: read this, then summarize
Before reading on, write a one-sentence summary of what this program does.
...
Done? Here's mine: "It's a tiny grep. It reads lines from
standard input, prints those that contain the word given as a
command-line argument, and exits 0 if at least one line matched."
Did you spot:
- argc < 2 is the usage check.
- strstr is "substring search".
- The exit code communicates "found anything?" to the shell.
If you spotted those without reading line by line, you're reading like a C programmer.
When you open an unfamiliar C source file, what is the best first thing to read?
- Line 1, and continue strictly in order.
- The longest function in the file.
- The header file (.h) for the file, plus the names of all functions defined in the .c, to learn what the file does and exposes before diving in.
- The bottom of the file (most files put main at the bottom).
You see this in unfamiliar code:
    char buf[64];
    scanf("%s", buf);
What should you immediately suspect?
- That buf is too big and wastes memory.
- That scanf will read multiple words by mistake.
- That an input longer than 63 characters will overflow buf; scanf("%s", ...) has no length limit.
- That buf should have been declared const.