Collections and Data Organization
Lists, dictionaries, sets, and arrays — choosing the right container for the job
A single variable holds one value. Real programs need to organize lots of values: a list of users, a phone book keyed by name, a set of seen URLs, a grid of game tiles.
C# ships with a small set of go-to collections that cover the vast majority of everyday needs. Picking the right one for a given problem is one of the most valuable habits you can build.
The big four
| Collection | Think of it as | Lookup by |
|---|---|---|
| T[] (array) | Fixed-size sequence | Position (index) |
| List<T> | Growable sequence | Position (index) |
| Dictionary<TKey, TValue> | Phone book / map | Key |
| HashSet<T> | Bag of distinct items | Membership ("is X in here?") |
Each one is good at something the others are bad at.
Arrays: fixed-size, indexed
An array's length is fixed when it is created. Arrays are fast and
lightweight, but they have no Add or Remove: to change the size you
must allocate a new array and copy the elements over.
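As a rough illustration of arrays in practice, here is a minimal sketch written as C# top-level statements (C# 9 or later is assumed). The variable names and values are made up for the example.

```csharp
using System;

// The length is chosen at creation time; each slot starts at 0.
int[] scores = new int[3];
scores[0] = 90;
scores[1] = 75;
scores[2] = 88;

// Initializer form: the length is inferred from the listed values.
string[] days = { "Mon", "Tue", "Wed" };

Console.WriteLine(scores[1]);     // prints 75
Console.WriteLine(days.Length);   // prints 3 (arrays expose Length, not Count)

// There is no Add. "Growing" means allocating a new array and copying.
int[] bigger = new int[scores.Length + 1];
Array.Copy(scores, bigger, scores.Length);
bigger[3] = 60;
```

The manual copy at the end is exactly the chore that a growable collection takes off your hands.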
List<T>: the everyday sequence
If you don't know exactly how many items you'll have, reach for
List<T>.
Mental model: a List<T> is "an array that grows for you." It
is backed by an array internally; when that array is full and you
add another item, the list allocates a bigger one and copies the
old contents across.
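The sketch below shows everyday List<T> usage and makes the "array that grows for you" idea visible through the Count and Capacity properties. It assumes C# 9 top-level statements, and the sample names are invented. The exact Capacity value depends on the runtime's growth policy, so no particular number is promised.

```csharp
using System;
using System.Collections.Generic;

var names = new List<string>();   // starts empty
names.Add("Ada");
names.Add("Grace");
names.Add("Linus");

names.Remove("Grace");            // removes the first match; later items shift down
Console.WriteLine(names[1]);      // prints Linus
Console.WriteLine(names.Count);   // prints 2

// Count is how many items are stored; Capacity is the size of the
// backing array. Capacity grows automatically as the list fills up.
var numbers = new List<int>();
for (int i = 0; i < 100; i++)
{
    numbers.Add(i);
}
Console.WriteLine($"Count={numbers.Count}, Capacity={numbers.Capacity}");
```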
Dictionary<TKey, TValue>: a phone book
A dictionary maps keys to values. Lookup by key is fast, typically taking about the same time whether the dictionary holds ten entries or millions.
TryGetValue is the idiomatic way to look up a key that might be
missing: it returns false instead of throwing. Reading dict[key]
for a missing key throws a KeyNotFoundException.
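To make the difference between the two lookup styles concrete, here is a small sketch using C# 9 top-level statements. The phone book entries are invented sample data, not taken from any real source.

```csharp
using System;
using System.Collections.Generic;

var phoneBook = new Dictionary<string, string>
{
    ["Ada"] = "555-0100",
    ["Grace"] = "555-0101",
};

phoneBook["Linus"] = "555-0102";   // the indexer adds a new key or overwrites an existing one

// Safe lookup: TryGetValue reports success through its return value.
bool foundAda = phoneBook.TryGetValue("Ada", out var adaNumber);
Console.WriteLine($"{foundAda}: {adaNumber}");   // prints True: 555-0100

bool foundBob = phoneBook.TryGetValue("Bob", out var bobNumber);
Console.WriteLine(foundBob);                     // prints False

// Unsafe lookup: the indexer throws when the key is missing.
bool threw = false;
try
{
    Console.WriteLine(phoneBook["Bob"]);
}
catch (KeyNotFoundException)
{
    threw = true;
    Console.WriteLine("No entry for Bob");
}
```

Prefer TryGetValue whenever a missing key is a normal, expected case; keep the indexer for keys you know are present.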
HashSet<T>: "is this thing in here?"
Use a set when you only care about membership and uniqueness, not order.
A set's Contains is roughly as fast as a dictionary's lookup —
much faster than scanning a list.
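A brief sketch of a set in action, again as C# 9 top-level statements. The URLs are placeholder values chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

var seenUrls = new HashSet<string>();

// Add returns true if the item was new, false if it was already present.
bool firstAdd = seenUrls.Add("https://example.com/");
bool secondAdd = seenUrls.Add("https://example.com/");
seenUrls.Add("https://example.com/about");

Console.WriteLine(firstAdd);         // prints True
Console.WriteLine(secondAdd);        // prints False (the duplicate is ignored)
Console.WriteLine(seenUrls.Count);   // prints 2

bool hasAbout = seenUrls.Contains("https://example.com/about");
Console.WriteLine(hasAbout);         // prints True
```

Because Add already reports whether the item was new, a separate Contains check before adding is usually unnecessary.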
Picking the right collection
Rough cost picture:
| Operation | Array / List | Dictionary | HashSet |
|---|---|---|---|
| Access by index | very fast | n/a | n/a |
| Lookup by key | slow (scan) | very fast | n/a |
| Test membership | slow (scan) | very fast | very fast |
| Add an item | fast (List only; arrays can't grow) | very fast | very fast |
| Insert in the middle | slow (List only) | n/a | n/a |
| Maintain order | yes | no | no |
"Slow" here means "time grows with the size of the collection."
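Two rows of the table can be seen in a few lines of code. This is only a sketch with made-up data, written as C# 9 top-level statements; it shows what the operations do, not a benchmark of how long they take.

```csharp
using System;
using System.Collections.Generic;

var letters = new List<char> { 'a', 'b', 'd' };

// Insert in the middle: every item from index 2 onward shifts one slot right.
letters.Insert(2, 'c');
string joined = string.Join(",", letters);
Console.WriteLine(joined);            // prints a,b,c,d

// Membership on a list: Contains checks the items one by one.
bool inList = letters.Contains('d');

// Membership on a set: Contains hashes the item and looks it up directly.
var letterSet = new HashSet<char>(letters);
bool inSet = letterSet.Contains('d');

Console.WriteLine(inList == inSet);   // prints True: same answer, different cost
```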
Iterating with foreach
foreach works for every standard collection:
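A minimal sketch of that, looping over one of each collection type. It assumes C# 9 top-level statements, and all of the values are invented for the example.

```csharp
using System;
using System.Collections.Generic;

int[] primes = { 2, 3, 5 };
var fruits = new List<string> { "apple", "pear" };
var ages = new Dictionary<string, int> { ["Ada"] = 36, ["Grace"] = 45 };
var tags = new HashSet<string> { "csharp", "dotnet" };

int total = 0;
foreach (int p in primes)
{
    total += p;
}
Console.WriteLine(total);   // prints 10

foreach (string fruit in fruits)
{
    Console.WriteLine(fruit);
}

// A dictionary yields KeyValuePair<TKey, TValue> items.
int entriesSeen = 0;
foreach (KeyValuePair<string, int> entry in ages)
{
    Console.WriteLine($"{entry.Key} is {entry.Value}");
    entriesSeen++;
}

foreach (string tag in tags)
{
    Console.WriteLine(tag);
}
```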
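Inside a foreach, don't modify the collection you're iterating
over; a List<T>, for example, throws an InvalidOperationException
on the next step of the loop. If you really need to mutate, iterate
over a copy, or loop over a list's indices with a plain for loop.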
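Both workarounds can be sketched as follows, using C# 9 top-level statements and made-up data. Each one removes the even numbers from a list without touching the collection that the loop itself is reading.

```csharp
using System;
using System.Collections.Generic;

// Option 1: iterate over a copy, mutate the original.
var numbers = new List<int> { 1, 2, 3, 4, 5, 6 };
foreach (int n in new List<int>(numbers))
{
    if (n % 2 == 0)
    {
        numbers.Remove(n);
    }
}
Console.WriteLine(string.Join(",", numbers));   // prints 1,3,5

// Option 2: walk the indices backwards, so a removal never shifts
// an item that has not been visited yet.
var others = new List<int> { 1, 2, 3, 4, 5, 6 };
for (int i = others.Count - 1; i >= 0; i--)
{
    if (others[i] % 2 == 0)
    {
        others.RemoveAt(i);
    }
}
Console.WriteLine(string.Join(",", others));    // prints 1,3,5
```

The index-based loop avoids allocating a second list, which matters more as the collection gets larger.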
A worked example: word frequencies
A classic use of Dictionary<string, int>:
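Here is one possible sketch of the idea, counting a small hard-coded array of words. It uses C# 9 top-level statements, and the input words are arbitrary sample data.

```csharp
using System;
using System.Collections.Generic;

string[] words = { "red", "blue", "red", "green", "blue", "red" };
var counts = new Dictionary<string, int>();

foreach (string word in words)
{
    if (counts.ContainsKey(word))
    {
        counts[word] = counts[word] + 1;   // key exists: increment
    }
    else
    {
        counts[word] = 1;                  // first sighting: initialize
    }
}

// Dictionary enumeration order is not guaranteed, so the lines may
// appear in any order.
foreach (KeyValuePair<string, int> pair in counts)
{
    Console.WriteLine($"{pair.Key}={pair.Value}");
}
```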
Notice the pattern: "if key exists, increment; otherwise initialize." You'll write this dozens of times in real programs.
Practice
Implement WordCounter.Counts(string text) returning a Dictionary<string, int> where each key is a word from the input and the value is how many times it appeared.
Words are separated by single spaces. Compare words as-is — case-sensitive, no trimming.
Program.cs uses the sentence "the cat and the dog and the cat" and prints (one line per word, in any order, but matching the expected lines exactly):
and=2
cat=2
dog=1
the=3
Test your understanding
You want to remember every URL the crawler has visited and quickly check whether a given URL has already been seen. Which collection fits best?
List<string>
string[]
HashSet<string>
Dictionary<string, string>
What happens if you read dict[key] for a key that doesn't exist in a Dictionary?
It returns null
It returns a default value silently
It throws a KeyNotFoundException
It adds the key with a default value