Strings
C strings, null-termination, string literals, and the `<string.h>` toolkit
A C string is not a type. It is a convention: a contiguous
sequence of char ending in a zero byte ('\0'). Every function in
<string.h> relies on that single byte to know where the string
stops.
Null-terminated character arrays
The character '\0' is the byte with value 0, not the digit '0' (which is
0x30 in ASCII). Until a function scanning the string reaches that byte,
every byte it passes counts as "part of the string."
printf("%s", s) walks s starting at the address you pass, printing
characters until it hits '\0'. If the terminator is missing, it
will happily keep walking into unrelated memory.
String literals
"hello" in C is a string literal — an anonymous array of char
with an automatically appended '\0'. The compiler usually places
literals in read-only memory.
| Form | What it is | Mutable? |
|---|---|---|
| char s[] = "hi"; | A copy of the literal in a local array | yes |
| char *s = "hi"; | A pointer to the literal in read-only memory | no; writing through it is UB |
| const char *s = "hi"; | The right way to spell the second form | no; the compiler rejects writes |
Do not write through a string literal pointer
char *s = "hi"; s[0] = 'H'; compiles but is undefined behavior — on
many platforms it crashes because the literal lives in read-only
memory. Use const char * for literal pointers, or char s[] = "hi";
when you need to modify the contents.
The <string.h> toolkit
| Function | Purpose | Notes |
|---|---|---|
| strlen(s) | Number of bytes before '\0' | O(n) |
| strcpy(dst, src) | Copy null-terminated src to dst | No length check; unsafe |
| strncpy(dst, src, n) | Copy at most n bytes | May not null-terminate; see below |
| strcat(dst, src) | Append src to dst | No length check; unsafe |
| strcmp(a, b) | Compare two strings | negative / 0 / positive |
| strncmp(a, b, n) | Compare first n bytes | |
| strchr(s, c) | First occurrence of c | Returns pointer or NULL |
| strstr(haystack, needle) | First occurrence of substring | Returns pointer or NULL |
| memcpy(dst, src, n) | Copy n bytes (no terminator) | Buffers must not overlap |
| memmove(dst, src, n) | Like memcpy but overlap-safe | |
| memset(dst, b, n) | Fill n bytes with byte b | |
The "unsafe" functions are unsafe because they have no way to know
how big dst is. You are responsible for that.
The strncpy footgun
strncpy(dst, src, n) looks like a "safer strcpy." It is not.
- If src is shorter than n bytes, it pads dst with extra '\0's out to n bytes.
- If src is n bytes or longer, it copies exactly n bytes and does not null-terminate dst.
That second case is the bug factory. Many codebases now use
snprintf(dst, sizeof dst, "%s", src) or platform-specific
strlcpy instead.
Counting your own strlen
Writing string functions by hand is a great way to internalize null-termination.
The idiomatic version uses a moving pointer:
Buffer sizes are the API
Every C string function is part of a contract: "I will read until I
find a '\0'," or "I will write at most n bytes." Whenever you
write a new function that takes a string, document and enforce the
buffer size.
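Always cast to (unsigned char) before passing a char to toupper. The
<ctype.h> functions are only defined for values representable as
unsigned char (or EOF), and plain char is signed on many platforms, so
a byte above 0x7F would otherwise arrive as a negative int.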
Practice: implement strcmp
Implement int my_strcmp(const char *a, const char *b) that returns a negative value if a < b, zero if equal, and positive if a > b — using byte-wise comparison. The provided main prints <, =, or > for a few pairs.
Practice: reverse a string in place
Implement void reverse_str(char *s) so that the string s is reversed without using any helper buffer. The provided main calls it on "systems" and prints the result on one line.
Test your understanding
How many bytes does the C string literal "hello" occupy in memory?
4 bytes
5 bytes
6 bytes (5 characters + the trailing '\0')
It depends on the encoding; cannot be determined.
Why is the code char *s = "hello"; s[0] = 'H'; dangerous?
It overflows the buffer.
It always crashes the program in a controlled way.
String literals may live in read-only memory; writing through a non-const pointer to a literal is undefined behavior.
It corrupts the system clock.
Which is the safest way to copy a string into a fixed-size buffer?
strcpy(dst, src);
strncpy(dst, src, sizeof dst);
snprintf(dst, sizeof dst, "%s", src);
memcpy(dst, src, strlen(src));