Dataslope

Logical and Character Vectors

Two specialized vector types that power filtering, categorization, and labeling — the bread and butter of real data work.

So far our vectors have held numbers. But in a real dataset, you will see two other types just as often:

  • Logical vectors — every element is TRUE or FALSE. These are the engine of filtering.
  • Character vectors — every element is a string. These hold names, categories, labels, IDs, anything textual.

This page is about both.

Logical vectors

A logical vector is just a vector whose elements are TRUE or FALSE:
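
As a minimal sketch (the variable name `passed_exam` and its values are illustrative assumptions, not data from this page), a logical vector can be built with c() like any other vector:

```r
# Hypothetical example data: did each of five students pass?
passed_exam <- c(TRUE, FALSE, TRUE, TRUE, FALSE)

print(passed_exam)
print(class(passed_exam))   # "logical"
print(length(passed_exam))  # 5
```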

You rarely type logical vectors out by hand. Almost always, you produce them by comparing a vector to something:
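
For instance, assuming a small made-up vector of exam scores (the name `scores` and the numbers are illustrative), each comparison below returns one TRUE or FALSE per score:

```r
# Hypothetical exam scores
scores <- c(72, 88, 95, 61, 79)

at_least_80 <- scores >= 80   # FALSE  TRUE  TRUE FALSE FALSE
exactly_95  <- scores == 95   # FALSE FALSE  TRUE FALSE FALSE
not_61      <- scores != 61   #  TRUE  TRUE  TRUE FALSE  TRUE

print(at_least_80)
print(exactly_95)
print(not_61)
```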

Notice the result is a vector of TRUE/FALSE values, one per element of the input. Every comparison operator is vectorized, just like arithmetic.

Counting and summarizing logicals

Here is a particularly useful R idiom: sum() and mean() work on logical vectors, because R coerces TRUE to 1 and FALSE to 0 whenever a logical is used in arithmetic.
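
A short sketch of the idea, assuming a made-up `scores` vector (the values are illustrative only):

```r
# Hypothetical exam scores
scores <- c(72, 88, 95, 61, 79)

n_high    <- sum(scores >= 80)    # 2: two scores are 80 or above
prop_high <- mean(scores >= 80)   # 0.4: that is 2 out of 5
any_100   <- any(scores == 100)   # FALSE: nobody scored 100
all_50    <- all(scores > 50)     # TRUE: every score is above 50

print(n_high)
print(prop_high)
print(any_100)
print(all_50)
```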

Read that out loud:

  • sum(scores >= 80) = "count the scores that are 80 or above"
  • mean(scores >= 80) = "proportion of scores that are 80 or above"
  • any(scores == 100) = "is there at least one perfect score?"
  • all(scores > 50) = "are they all above 50?"

These four lines answer a surprising number of real-world analytical questions.

Combining logical conditions

You can combine logicals with the operators & (and), | (or), and ! (not):
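
As an illustration, again assuming a made-up `scores` vector, each operator works element by element:

```r
# Hypothetical exam scores
scores <- c(72, 88, 95, 61, 79)

# and: at least 80 AND below 90
in_80s <- scores >= 80 & scores < 90    # FALSE  TRUE FALSE FALSE FALSE

# or: below 65 OR above 90
extreme <- scores < 65 | scores > 90    # FALSE FALSE  TRUE  TRUE FALSE

# not: flip every TRUE and FALSE
not_high <- !(scores >= 80)             #  TRUE FALSE FALSE  TRUE  TRUE

print(in_80s)
print(extreme)
print(not_high)
```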

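Use & and | (single character) for element-wise logic on vectors. R also has && and || (double character), but these expect a single TRUE or FALSE on each side and are meant for control flow such as if() conditions. In R 4.3.0 and later, passing a longer vector to && or || is an error. For data analysis on vectors, always use & and |.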
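
To make the difference concrete, here is a small sketch with a made-up `scores` vector; the name `class_status` is an illustrative assumption:

```r
# Hypothetical exam scores
scores <- c(72, 88, 95, 61, 79)

# & works element by element and returns one value per score.
passing_not_top <- scores >= 70 & scores < 90
print(passing_not_top)   # TRUE TRUE FALSE FALSE TRUE

# && takes one TRUE/FALSE on each side, which suits an if() condition.
# Both length(scores) > 0 and mean(scores) >= 70 are single values.
if (length(scores) > 0 && mean(scores) >= 70) {
  class_status <- "class average is passing"
} else {
  class_status <- "class average is below 70"
}
print(class_status)
```

The && form also stops early: if the left side is FALSE, the right side is never evaluated.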

Filtering with logicals

The real payoff: you can use a logical vector as an index into another vector. R keeps only the positions where the logical is TRUE.
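
A minimal sketch of logical indexing, assuming the same kind of made-up `scores` vector:

```r
# Hypothetical exam scores
scores <- c(72, 88, 95, 61, 79)

# Step 1: build the logical vector
is_high <- scores >= 80          # FALSE TRUE TRUE FALSE FALSE

# Step 2: use it as an index; only the TRUE positions survive
high_scores <- scores[is_high]   # 88 95

# The two steps are usually written as one expression
low_scores <- scores[scores < 75]   # 72 61

print(high_scores)
print(low_scores)
```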

This is the foundation of filtering. Later we will see how dplyr::filter() builds on this exact pattern for whole data frames — but understanding it on vectors first is what makes the whole library click.

Character vectors

Strings in R live inside character vectors. Each element is a string enclosed in double or single quotes:
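
For example, with a made-up vector of city names (the name `cities` and its values are illustrative assumptions):

```r
# Double and single quotes both create strings
cities <- c("lima", 'Oslo', "Nairobi")

print(class(cities))     # "character"
print(toupper(cities))   # "LIMA" "OSLO" "NAIROBI"
print(tolower(cities))   # "lima" "oslo" "nairobi"
print(nchar(cities))     # 4 4 7 (number of characters in each string)
```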

Like arithmetic and comparisons, the common character functions are vectorized: toupper() doesn't take just one string; it takes a whole vector and returns a transformed one.

Pasting strings together

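paste() and paste0() glue their arguments together into strings, converting any non-character values to character first. paste() puts a space between the pieces by default; paste0() puts nothing between them.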
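
A short sketch with made-up data (the names `first_names` and `ages` are illustrative assumptions):

```r
# Hypothetical example data
first_names <- c("Ana", "Bo", "Cara")
ages <- c(40, 25, 33)

# paste() joins parallel elements with a space by default;
# the numbers in ages are converted to strings first
sentences <- paste(first_names, "is", ages)
print(sentences)   # "Ana is 40" "Bo is 25" "Cara is 33"

# paste0() joins with no separator at all
ids <- paste0("id_", 1:3)
print(ids)         # "id_1" "id_2" "id_3"
```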

The two arguments people most confuse are sep and collapse. The rule is:

  • sep = "what goes between the parallel inputs"
  • collapse = "what goes between the elements of the result, when folding into one string"
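
The two rules above can be sketched side by side; the vectors `first_names` and `last_names` are made up for illustration:

```r
# Hypothetical example data
first_names <- c("Ana", "Bo", "Cara")
last_names  <- c("Diaz", "Lee", "Moss")

# sep goes between the parallel inputs; the result still has 3 elements
with_sep <- paste(first_names, last_names, sep = "_")
print(with_sep)        # "Ana_Diaz" "Bo_Lee" "Cara_Moss"

# collapse folds the elements of the result into ONE string
with_collapse <- paste(first_names, collapse = ", ")
print(with_collapse)   # "Ana, Bo, Cara"

# Both together: sep is applied first, then collapse
with_both <- paste(first_names, last_names, sep = " ", collapse = "; ")
print(with_both)       # "Ana Diaz; Bo Lee; Cara Moss"
```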

Detecting patterns in text

For checking which strings contain something, the workhorse is grepl() ("global regular expression, logical"). It takes a pattern and a vector of strings, and returns a logical vector.
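
As a sketch, assume a made-up vector of email addresses (the name `emails` and its values are illustrative). Because grepl() returns a logical vector, it plugs straight into the counting and filtering patterns from earlier on this page:

```r
# Hypothetical example data
emails <- c("ana@example.com", "bo@example.org", "cara@example.com")

# Which addresses contain the literal text ".com"?
is_com <- grepl(".com", emails, fixed = TRUE)
print(is_com)           # TRUE FALSE TRUE

# Count the matches, then keep only the matching addresses
n_com <- sum(is_com)
com_emails <- emails[is_com]
print(n_com)            # 2
print(com_emails)       # "ana@example.com" "cara@example.com"
```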

(Setting fixed = TRUE tells grepl() to treat the pattern as a plain string, not a regular expression. Regex is a whole topic of its own — for now, fixed = TRUE keeps things simple.)

Categorical data: a peek at factors

When a character vector represents one of a fixed set of categories (like "low" / "medium" / "high"), R has a specialized type called a factor. We won't dwell on factors here, but here is a quick look:
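
A minimal sketch, assuming a made-up vector of ratings (the names `sizes` and `size_factor` are illustrative):

```r
# Hypothetical example data
sizes <- c("low", "high", "medium", "low", "high")

# levels = lists the valid categories in the order we want
size_factor <- factor(sizes, levels = c("low", "medium", "high"))

print(size_factor)
print(levels(size_factor))       # "low" "medium" "high"
print(table(size_factor))        # counts per level: low 2, medium 1, high 2
print(as.integer(size_factor))   # 1 3 2 1 3 (position of each value in levels)
```

Without the levels argument, factor() sorts the categories alphabetically, which would put "high" before "low".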

Factors look like character vectors but carry information about which categories are valid and in what order. They become important when you plot or run models — but for today, character vectors will do.

Test your understanding

Question (select one)

Given x <- c(10, 20, 30, 40), what does sum(x > 15) return?

Hint: x > 15 is a logical vector, and sum() adds up TRUE (1) and FALSE (0) — it never sees the original numbers.

90 (the sum of values greater than 15)

3

TRUE

0.75

Question (select one)

What does mean(c(TRUE, TRUE, FALSE, TRUE)) evaluate to?

3

0.75

TRUE

An error

Question (select one)

Given names <- c("Ana", "Bo", "Cara") and ages <- c(40, 25, 33), what does names[ages > 30] return?

c(40, 33)

c(TRUE, FALSE, TRUE)

c("Ana", "Cara")

An error

Mini challenge: who passed?

You're given names and exam scores. Using a logical index, build a character vector passed containing only the names of students who scored 70 or higher.

One last topic before we leave vectors behind: what happens when a value is simply missing — R's special NA marker, and the small set of rules for dealing with it.
