Data Types and Memory

Integer widths, signedness, fixed-width types, alignment, and endianness in C

In Python an int can grow until memory runs out; in C every integer type occupies a fixed number of bytes. That fixed size is both C's greatest strength (predictable, no boxing, no allocations) and its most common source of overflow bugs. This page builds the mental model you need to reason about what is actually in memory.

The fundamental types

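| Type      | Typical size                          | Range (signed) or format               |
|-----------|---------------------------------------|----------------------------------------|
| char      | 1 byte                                | −128 .. 127 (or 0 .. 255 if unsigned)  |
| short     | 2 bytes                               | −32 768 .. 32 767                      |
| int       | 4 bytes                               | ≈ ±2.1 × 10⁹                           |
| long      | 4 or 8 bytes (platform dependent)     | varies                                 |
| long long | 8 bytes                               | ≈ ±9.2 × 10¹⁸                          |
| float     | 4 bytes                               | IEEE 754 single                        |
| double    | 8 bytes                               | IEEE 754 double                        |
| void *    | 4 or 8 bytes (32- or 64-bit target)   | address                                |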

The C standard only guarantees minimum sizes (e.g. int is at least 16 bits). The actual sizes depend on the target. The WASI target used by this sandbox is 32-bit: int is 4 bytes and void * is 4 bytes.


When "fixed size" turns into a bug

Because every C integer has a fixed width, arithmetic that would silently grow a Python int can silently overflow or truncate in C.


Signed overflow is UB, unsigned overflow is not

For signed integers, overflow is undefined behavior — the compiler is allowed to assume it never happens. For unsigned, overflow is well-defined modular arithmetic (it wraps). When you actually want wrap-around (e.g. a hash function), use unsigned.

Fixed-width types: <stdint.h>

Because int, long, etc. vary across platforms, any value you write to disk, send over a network, or exchange with hardware should use the fixed-width types from <stdint.h>.

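| Type                            | Width                                                | Use case                                   |
|---------------------------------|------------------------------------------------------|--------------------------------------------|
| int8_t, uint8_t                 | 8 bits                                               | bytes                                      |
| int16_t, uint16_t               | 16 bits                                              | small counters, audio samples              |
| int32_t, uint32_t               | 32 bits                                              | colors, network ints                       |
| int64_t, uint64_t               | 64 bits                                              | timestamps, large counters                 |
| size_t (from <stddef.h>)        | platform dependent, unsigned, holds any object size  | array sizes and indices                    |
| ssize_t (POSIX, <sys/types.h>)  | platform dependent, signed counterpart of size_t     | byte counts that can be negative           |
| intptr_t, uintptr_t             | wide enough to hold a pointer                        | storing a pointer value as an integer      |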

The PRI… macros from <inttypes.h> give you portable printf format specifiers no matter which underlying type (int, long, ...) int32_t happens to be on a given platform.

Signedness pitfalls

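Mixing signed and unsigned values is one of the most common sources of C bugs. When an int meets an unsigned int in a comparison or arithmetic expression, the int is converted to unsigned first, so -1 becomes a huge positive number.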


The fix is to keep types consistent: do not mix int i with size_t n = vec.size, for example.

Endianness: which byte comes first?

A 32-bit integer like 0x12345678 is four bytes: 12, 34, 56, 78. Different CPUs disagree on the order in which to store them in memory:

  • Little-endian (x86, ARM in normal mode, RISC-V, WASI): the least significant byte first — 78 56 34 12 at increasing addresses.
  • Big-endian (older PowerPC, network byte order): the most significant byte first — 12 34 56 78.

You can see the byte order at runtime by reading the bytes of an integer through a char *:


On the little-endian WASI target you should see 78 56 34 12, the "low" byte first. That order matters when you serialize integers to disk or send them over a network; protocols like TCP/IP standardize on big-endian ("network byte order"), and the sockets API provides htonl / ntohl to convert between host and network order.

Alignment

Most CPUs read memory most efficiently when an N-byte type lives at an address that is a multiple of N. A uint32_t at address 0x1004 is fine; the same value at 0x1001 may be slow or, on some CPUs, crash.

The compiler enforces alignment automatically: local variables and struct members are placed at aligned addresses, sometimes with unused padding bytes in between. We will see padding clearly when we cover structs.


Practice: count set bits

Challenge: Count the number of 1 bits in a uint32_t

Implement int popcount(uint32_t x) that returns how many bits of x are set to 1. Do not call any library popcount intrinsic — write the loop yourself. The provided main calls it for several values and prints the results, one per line.

Practice: detect endianness at runtime

Challenge: Print 'little' or 'big'

Print little if this machine is little-endian and big otherwise. Hint: place a known multi-byte integer at a known address and read its first byte through a char *.

Test your understanding

Question (select one)

What does the C standard say about the size of int?

It is always exactly 32 bits.

It is always exactly 16 bits.

It is at least 16 bits; the actual size depends on the target platform.

It is whatever the operating system kernel decides at runtime.

Question (select one)

Which type should you use when you need a guaranteed 32-bit unsigned integer for a binary file format?

int

unsigned int

size_t

uint32_t (from <stdint.h>)

Question (select one)

A 32-bit integer 0xAABBCCDD is stored in memory on a little-endian machine. What byte will you find at the integer's lowest address?

0xAA

0xDD

0xCC

It is unpredictable.
