Arrays and Tensors

How NumPy arrays represent vectors, matrices, and higher-rank tensors, and how to manipulate them without copies

The single data structure that makes scientific Python possible is the NumPy ndarray — a contiguous block of memory plus a tiny header describing how to interpret it as a multidimensional grid. This page builds your mental model of that structure and shows the operations you will reach for over and over in the rest of the course.

What an ndarray actually is

A NumPy array has three pieces:

A dtype — e.g. float64, int32, complex128 — which says how many bytes each element takes and how to interpret them.
A shape — a tuple like (3, 4) or (2, 3, 5) — which says how to index the buffer as a multidimensional grid.
A strides tuple — how many bytes to skip in the buffer to step along each axis.
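As a minimal sketch of the three header fields, the snippet below builds a small array and prints each one. The array name A and the (3, 4) shape are choices made here for illustration only.

```python
import numpy as np

# A 3x4 grid of float64 values; NumPy lays it out row-major (C order) by default.
A = np.zeros((3, 4), dtype=np.float64)

print(A.dtype)     # float64
print(A.itemsize)  # 8 bytes per element
print(A.shape)     # (3, 4)
print(A.strides)   # (32, 8): 32 bytes to the next row, 8 bytes to the next column
print(A.nbytes)    # 96 = 3 * 4 * 8 bytes in the buffer
```

The strides follow from the other two fields here: stepping one column moves one element (8 bytes), and stepping one row moves four elements (32 bytes).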

Most array operations (slicing, transposing, and in most cases reshaping) just change the header. The underlying bytes are not copied, which is a large part of what makes NumPy fast.

Take an array A and its transpose B = A.T. B has different strides but the same bytes. Reads through B jump through memory differently, sometimes faster and sometimes slower than reads through A, but the numerical content is identical.
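To make this concrete, here is a small sketch assuming A is a 3x4 float64 array and B is its transpose. The specific values are illustrative.

```python
import numpy as np

A = np.arange(12, dtype=np.float64).reshape(3, 4)
B = A.T  # transpose: a new header over the same buffer

print(A.shape, A.strides)      # (3, 4) (32, 8)
print(B.shape, B.strides)      # (4, 3) (8, 32)
print(np.shares_memory(A, B))  # True: no bytes were copied

# A is laid out row-major; B walks the same buffer column-first.
print(A.flags["C_CONTIGUOUS"], B.flags["C_CONTIGUOUS"])  # True False

# Writing through B is visible through A.
B[0, 1] = -1.0
print(A[1, 0])                 # -1.0
```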

Creating arrays

The handful of constructors you will use 99% of the time:
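The original list of constructors is not preserved on this page, so the selection below is an assumption: a sketch of constructors that are commonly used in practice, with illustrative variable names.

```python
import numpy as np

from_lists = np.array([[1.0, 2.0], [3.0, 4.0]])  # from nested Python lists
zeros = np.zeros((2, 3))                         # all zeros, float64 by default
ones = np.ones((2, 3), dtype=np.int32)           # all ones, explicit dtype
filled = np.full((2, 2), 7.0)                    # constant fill value
steps = np.arange(0, 10, 2)                      # like range(): 0, 2, 4, 6, 8
grid = np.linspace(0.0, 1.0, 5)                  # 5 evenly spaced points, endpoints included
identity = np.eye(3)                             # 3x3 identity matrix
scratch = np.empty((2, 2))                       # uninitialized; fill before reading

print(steps)           # [0 2 4 6 8]
print(ones.dtype)      # int32
print(identity.shape)  # (3, 3)
```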

For random data, prefer the modern default_rng() interface: it is reproducible when seeded, and its state lives in an explicit Generator object rather than in hidden global state.
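A short sketch of that interface follows. The seed value 42 and the array sizes are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)         # returns a Generator object

uniform = rng.random((2, 3))                 # uniform samples on [0, 1)
normal = rng.normal(loc=0.0, scale=1.0, size=5)
ints = rng.integers(low=0, high=10, size=4)  # high is exclusive by default

# A second generator with the same seed reproduces the same stream.
rng_again = np.random.default_rng(seed=42)
print(np.array_equal(uniform, rng_again.random((2, 3))))  # True
```

Passing the generator into functions that need randomness, instead of reseeding a global, keeps each experiment reproducible on its own.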

Indexing: views, fancy, and boolean

There are three indexing modes, and they behave very differently in terms of memory.

Basic indexing (slices, integers, :, None) returns a view. A view shares memory with the original; modifying the view modifies the original.
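The sketch below shows the view behavior on a small illustrative array; the names row, block, and expanded are chosen here for the example.

```python
import numpy as np

A = np.arange(12).reshape(3, 4)

row = A[1]              # integer index: a 1-D view of row 1
block = A[:2, ::2]      # slices: rows 0-1, every other column
expanded = A[:, None, :]  # None inserts a length-1 axis: shape (3, 1, 4)

print(np.shares_memory(A, row))       # True
print(np.shares_memory(A, block))     # True
print(np.shares_memory(A, expanded))  # True

# Writes through a view land in the original array.
row[3] = 99
block[0, 0] = -1
print(A[1, 3], A[0, 0])  # 99 -1
```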

Fancy indexing (an array of integer indices) returns a copy.

Boolean indexing (a boolean mask of the same shape) also returns a copy.
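By contrast, here is a sketch of the two copying modes on an illustrative array. It also shows one point worth keeping in mind: an index expression on the left-hand side of an assignment writes into the original, even when the same expression on the right-hand side would produce a copy.

```python
import numpy as np

A = np.arange(12).reshape(3, 4)

picked = A[[0, 2]]       # fancy indexing: rows 0 and 2, as a copy
picked[0, 0] = -1
print(A[0, 0])           # 0: the original is unchanged

mask = A > 5
big = A[mask]            # boolean indexing: 1-D copy of the selected elements
big[:] = 0
print(A.max())           # 11: still unchanged

# Assignment through the mask modifies A in place.
A[mask] = 0
print(A.max())           # 5
```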

Knowing whether you have a view or a copy is the difference between an in-place algorithm and a silent bug.

Reshape, ravel, and the order question

reshape re-interprets the existing bytes; it only copies if it cannot. ravel flattens to 1-D as a view when possible. flatten always copies.
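A small sketch of these three calls, using np.shares_memory to tell views from copies. The arrays are illustrative; the last lines touch the order question by reading the same data column-major.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)     # C-contiguous: [[0, 1, 2], [3, 4, 5]]

reshaped = A.reshape(3, 2)         # contiguous input: a view
raveled = A.ravel()                # contiguous input: a view
flattened = A.flatten()            # always a copy

print(np.shares_memory(A, reshaped))   # True
print(np.shares_memory(A, raveled))    # True
print(np.shares_memory(A, flattened))  # False

# The transpose is not C-contiguous, so flattening it in C order must copy.
t_flat = A.T.reshape(-1)
print(np.shares_memory(A, t_flat))     # False
print(t_flat)                          # [0 3 1 4 2 5]

# order="F" reads A column-major, which yields the same sequence.
print(A.ravel(order="F"))              # [0 3 1 4 2 5]
```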

Vector, matrix, and tensor arithmetic

All the standard arithmetic operators (+, -, *, /, **) are element-wise. Matrix-style products use @ (equivalently np.matmul; np.dot agrees for 1-D and 2-D arrays but differs for higher-rank inputs).
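A minimal sketch contrasting the two kinds of product, with small illustrative matrices:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
x = np.array([1.0, -1.0])

print(A * B)   # element-wise:   [[0. 2.] [3. 0.]]
print(A @ B)   # matrix product: [[2. 1.] [4. 3.]]
print(A @ x)   # matrix-vector:  [-1. -1.]
print(x @ x)   # inner product:  2.0
print(A + 10)  # a scalar broadcasts across every element
```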

The * versus @ distinction is among the most important pieces of vocabulary in scientific Python. Element-wise when you want arithmetic; matmul when you want linear-algebra composition.

Higher-rank tensors

A "tensor" in scientific computing usually means a NumPy array with $\ge 3$ axes. Tensors show up wherever images (height $\times$ width $\times$ channels), video (frames $\times$ height $\times$ width $\times$ channels), time-series ensembles, and PDE solution states live.

The general-purpose tool for tensor contractions is np.einsum, which lets you write Einstein-summation notation directly.

einsum is slower to read than @ but enormously more expressive: contractions, traces, outer products, transposes, diagonal extractions all fit in the same notation.
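The sketch below writes each of those operations as an einsum call and notes the more familiar equivalent in a comment. The array names, shapes, and seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 3))
B = rng.random((3, 3))
u = rng.random(3)
v = rng.random(4)

product = np.einsum("ij,jk->ik", A, B)   # contraction over j: A @ B
trace = np.einsum("ii->", A)             # sum of the diagonal: np.trace(A)
diagonal = np.einsum("ii->i", A)         # diagonal extraction: np.diag(A)
transposed = np.einsum("ij->ji", A)      # transpose: A.T
outer = np.einsum("i,j->ij", u, v)       # outer product: np.outer(u, v)

# Batched matrix product: one (2, 3) @ (3, 4) product per batch index b.
S = rng.random((5, 2, 3))
T = rng.random((5, 3, 4))
batched = np.einsum("bij,bjk->bik", S, T)
print(batched.shape)                     # (5, 2, 4)
```

The rule of thumb: an index that appears in the inputs but not after the arrow is summed over; an index that appears after the arrow is kept.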

A multi-file linear-algebra utility

A real scientific project organizes its kernels and its drivers into separate modules. Here is a small example: a routine that projects a batch of vectors onto the span of a set of basis vectors, separated into a linalg.py helper module and a main.py driver.
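The original listing is not preserved on this page, so what follows is a reconstruction sketch under stated assumptions rather than the course's own code. The function name project_onto_span, the row-per-vector layout of X, the column-per-basis-vector layout of basis, and the use of a QR factorization to orthonormalize the basis are all choices made here. The two files are shown in one block so the sketch runs as-is; comments mark where each part would live.

```python
import numpy as np

# ---- would live in linalg.py (helper module; contents are an assumption) ----
def project_onto_span(X, basis):
    """Project each row of X, shape (n, d), onto the column span of basis, shape (d, k).

    Assumes the columns of basis are linearly independent.
    """
    Q, _ = np.linalg.qr(basis)  # orthonormal columns spanning the same subspace, shape (d, k)
    return (X @ Q) @ Q.T        # coefficients (n, k), then back to coordinates (n, d)


# ---- would live in main.py (driver; it would import project_onto_span from linalg) ----
def main():
    rng = np.random.default_rng(0)
    X = rng.standard_normal((1000, 3))     # a batch of 1000 points in R^3
    basis = np.array([[1.0, 0.0],
                      [0.0, 1.0],
                      [0.0, 0.0]])         # two columns spanning the xy-plane
    P = project_onto_span(X, basis)
    print(P.shape)                          # (1000, 3)
    print(np.allclose(P[:, 2], 0.0))        # True: the z-component is removed
    print(np.allclose(P[:, :2], X[:, :2]))  # True: x and y are untouched


if __name__ == "__main__":
    main()
```

The whole batch is handled by two matrix products, so there is no Python-level loop over points.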

Two files, twelve lines of real code, a complete projection operator that runs at BLAS speed on millions of points.

Practice: build a small tensor

Implement weighted_centroid(points, weights) that takes:

points of shape (n, d): $n$ points in $\mathbb{R}^d$
weights of shape (n,): non-negative weights

and returns the weighted centroid

$$ c = \frac{\sum_i w_i \, p_i}{\sum_i w_i} \in \mathbb{R}^d $$

as a 1-D array of shape (d,). Do it without any explicit Python loops — use broadcasting and a single reduction.
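If you want something to check your answer against, here is a deliberately loop-based reference. It is not the requested solution (it uses exactly the explicit loop the exercise forbids), and the name weighted_centroid_reference and the test data are made up for this sketch.

```python
import numpy as np

def weighted_centroid_reference(points, weights):
    """Slow, loop-based reference for checking a vectorized weighted_centroid."""
    n, d = points.shape
    weighted_sum = np.zeros(d)
    weight_total = 0.0
    for i in range(n):
        weighted_sum += weights[i] * points[i]  # accumulate w_i * p_i
        weight_total += weights[i]              # accumulate w_i
    return weighted_sum / weight_total

points = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 4.0]])
weights = np.array([1.0, 1.0, 2.0])

c = weighted_centroid_reference(points, weights)
print(c.shape)  # (2,)
print(c)        # the centroid is (0.5, 2.0)
```

Comparing your vectorized result to this one with np.allclose on a few random inputs is a quick sanity check.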

Check your understanding

Question (select one)

What is the shape of np.einsum("ijk,kl->ijl", A, B) if A has shape (2, 3, 4) and B has shape (4, 5)?

(2, 3, 4)

(2, 3, 5)

(4, 5)

(2, 5)

Question (select one)

Which of the following array operations on a 2-D NumPy array returns a copy (rather than a view)?

A[1:5, ::2]

A.T

A.reshape(-1) on a C-contiguous array

A[A > 0] (boolean indexing)
