
Vectorized Computation

Why R lets you write `prices * 1.08` to add tax to a thousand prices at once — and why that one idea changes how you think about programming.

In most programming languages, if you want to add 1 to every number in a list, you write a loop:

for each item in list:
    item = item + 1

In R, you write:

xs + 1

That's it. No loop. R applies the + 1 to every element of xs automatically. This is called vectorized computation, and once you internalize it, your R code becomes shorter, clearer, and often much faster.
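
To make the comparison concrete, here is a minimal sketch that writes the same computation both ways in R. The values in `xs` and the names `looped` and `vectorized` are made up for illustration.

```r
# Hypothetical example data
xs <- c(3, 7, 10)

# Loop version: preallocate the result, then fill it one element at a time
looped <- numeric(length(xs))
for (i in seq_along(xs)) {
  looped[i] <- xs[i] + 1
}

# Vectorized version: one expression covers every element
vectorized <- xs + 1

print(looped)      # 4 8 11
print(vectorized)  # 4 8 11
```

Both versions give the same answer; the vectorized one simply leaves the iteration to R.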

Arithmetic on whole vectors

Every basic operator works element-by-element on vectors:

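As a minimal sketch of element-wise arithmetic (the `prices` vector and its values are invented for illustration), each operator below is applied to every element in turn:

```r
# Made-up prices for illustration
prices <- c(10, 25, 40)

with_tax   <- prices * 1.08  # 10.8 27.0 43.2
discounted <- prices - 5     # 5 20 35
halved     <- prices / 2     # 5.0 12.5 20.0
squared    <- prices^2       # 100 625 1600
remainders <- prices %% 3    # 1 1 1 (remainder after dividing by 3)

print(with_tax)
print(squared)
```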

No loops, no temporary variables. The expression describes the shape of the answer, and R fills it in.

Two vectors of the same length

When you combine two vectors, R lines them up element-by-element:

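A small sketch of two parallel vectors combined position by position. The `price` and `quantity` vectors are hypothetical columns, not data from this lesson.

```r
# Two hypothetical parallel columns: one entry per product
price    <- c(2.50, 4.00, 1.25)
quantity <- c(4, 1, 8)

# First price times first quantity, second times second, and so on
revenue <- price * quantity
print(revenue)       # 10 4 10

# A summary of the combined column
total <- sum(revenue)
print(total)         # 24
```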

This is the heart of how data analysts think: you have parallel columns, and you operate on them as wholes.

Recycling: when lengths differ

If the two vectors are different lengths, R recycles the shorter one — it repeats it as needed to match the length of the longer one.

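A minimal sketch of recycling, using made-up values. In both cases the longer length (4) is a whole multiple of the shorter one, so R recycles silently.

```r
x <- c(1, 2, 3, 4)

# Length-1 vector: 10 is reused for every element of x
scaled <- x * 10
print(scaled)    # 10 20 30 40

# Length-2 vector: c(100, 200) is repeated to c(100, 200, 100, 200)
shifted <- x + c(100, 200)
print(shifted)   # 101 202 103 204
```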

This is most useful in the very common case where one vector is length 1 (a "scalar"). When the longer length is not a whole multiple of the shorter one, R will warn you, and that warning almost always means a bug:

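A sketch of the mismatched case, assuming a length-5 vector added to `c(10, 20)`; the names `long` and `short` are illustrative only.

```r
long  <- c(1, 2, 3, 4, 5)
short <- c(10, 20)

# 5 is not a multiple of 2, so R still computes a result but emits a warning:
# "longer object length is not a multiple of shorter object length"
result <- long + short
print(result)   # 11 22 13 24 15
```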

The result is still computed (R recycles c(10, 20) to c(10, 20, 10, 20, 10) before adding), but the warning is R saying: "Are you sure you meant this?" Almost always: no. Either trim or extend one of the vectors deliberately.

Built-in summary functions

R has a small army of functions that take a vector and return a single summary value. You will use these constantly:

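A short tour of some base R summary functions, applied to a hypothetical `scores` vector. Each call takes the whole vector and returns one value.

```r
# Made-up exam scores
scores <- c(72, 85, 90, 66, 85)

n       <- length(scores)  # 5     number of elements
total   <- sum(scores)     # 398
average <- mean(scores)    # 79.6
middle  <- median(scores)  # 85
lowest  <- min(scores)     # 66
highest <- max(scores)     # 90
spread  <- sd(scores)      # sample standard deviation

print(c(n = n, total = total, average = average, middle = middle))
print(spread)
```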

Notice how every one of these functions has the same shape: vector in, summary out. Once you have this mental model, half of "data analysis in R" is just knowing which summary function to call.

Cumulative and rolling operations

A few functions take a vector in and give a vector of the same length out, computing as they go:

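A sketch of base R's cumulative functions on a made-up `daily_sales` vector. Each result has the same length as its input, and element i summarizes everything up to position i.

```r
# Hypothetical sales figures for four consecutive days
daily_sales <- c(5, 3, 8, 2)

running_total <- cumsum(daily_sales)     # 5 8 16 18
running_max   <- cummax(daily_sales)     # 5 5 8 8
running_min   <- cummin(daily_sales)     # 5 3 3 2
factorials    <- cumprod(c(1, 2, 3, 4))  # 1 2 6 24

print(running_total)
print(running_max)
```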

A small but mighty example: standardizing a vector

A common data-prep step is standardizing a vector — subtracting the mean and dividing by the standard deviation, so the result has mean 0 and standard deviation 1.

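One way to write the standardization described above, sketched with a hypothetical `heights` vector. The intermediate name `centered` is an illustrative choice.

```r
heights  <- c(150, 160, 170, 180, 190)   # made-up data
centered <- heights - mean(heights)      # subtract the mean from every element
z        <- centered / sd(heights)       # divide every element by the standard deviation

# Sanity checks: the standardized vector has mean 0 and standard deviation 1
# (up to floating-point rounding)
print(round(z, 3))
print(mean(z))
print(sd(z))
```

Base R's `scale()` function performs the same centering and scaling, though it returns a matrix rather than a plain vector.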

Three lines, no loops. This is the archetypal R operation: a small chain of vectorized expressions producing a transformed column.

Why vectorized code is faster, too

In most interpreted languages, a for loop in user code pays interpreter overhead on every iteration. In R, the built-in vectorized operations are implemented in compiled C code under the hood: when you write x + y, R hands the whole computation to a single fast internal routine instead of interpreting one step per element.

This means vectorized R is often much faster than the equivalent loop, sometimes by one or two orders of magnitude depending on the operation and the vector length, on top of being shorter and easier to read.
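
If you want to check the speed difference on your own machine, a rough sketch like the following can help. It uses base R's `system.time()`; the vector size and variable names are arbitrary, and the measured times will vary with hardware and R version, so no expected timings are shown.

```r
set.seed(1)          # make the random inputs reproducible
n <- 1e6             # arbitrary size, large enough to measure
x <- runif(n)
y <- runif(n)

# Loop version: add one pair of elements per iteration
loop_time <- system.time({
  out_loop <- numeric(n)
  for (i in seq_len(n)) {
    out_loop[i] <- x[i] + y[i]
  }
})

# Vectorized version: a single call handles all n elements
vec_time <- system.time(out_vec <- x + y)

# Both approaches produce exactly the same numbers
stopifnot(identical(out_loop, out_vec))

print(loop_time["elapsed"])
print(vec_time["elapsed"])
```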

You can, and sometimes should, write loops in R. But your first instinct should be: "is there a vectorized way to express this?" Most of the time, there is.

Test your understanding

Question (select one)

What does c(1, 2, 3) * 10 return?

60 (the product of all elements times 10)

10 20 30

An error — different lengths

c(10, 2, 3)

Question (select one)

What value does mean(c(2, 4, 6, 8)) produce?

4

5

6

20

Question (select one)

In R, you almost never need to write a for loop to apply an operation to every element of a vector. Why?

Because for loops are forbidden in R.

Because operations and most built-in functions are vectorized — they automatically apply to every element.

Because R secretly rewrites loops as parallel code.

Because R cannot iterate over vectors.

Mini challenge: convert Celsius to Fahrenheit

Given a vector of temperatures in Celsius, produce a vector of temperatures in Fahrenheit using the formula F = C * 9/5 + 32.

Vectorized temperature conversion

Given celsius, compute fahrenheit as a vector. No loops needed — use vectorized arithmetic.

Next we will look at the two specialized vector types that get heavy use in data work: logical vectors (which power filtering) and character vectors (which carry every label, name, and category in your dataset).
