
The Birth of LINQ

How Microsoft Research unified querying across collections, XML, databases, and APIs — and the language features that had to ship first

By the early 2000s, .NET developers were querying data in three or four mutually incompatible ways. In the same application you might loop over a List<Customer>, write an XPath expression to dig into an XML config file, build a SQL string by hand for the database, and call a SOAP web service that returned yet another shape. Each one had its own syntax, its own errors, its own debugging story, and zero overlap with the others.

LINQ — Language Integrated Query — was the audacious attempt to make a single query syntax that worked over all of them, with the type checker on your side.

The state of querying in 2005

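Each of those approaches (collection loops, XPath, hand-built SQL strings, SOAP calls) was a different language stitched into C#. None of them could see the others. A typo inside a SQL string was invisible to the compiler and failed only at runtime, in production, sometimes hours after deployment.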

The Microsoft Research team — led by Anders Hejlsberg, Erik Meijer, and others — asked a deceptively simple question: "What if where and select were part of the C# language, and worked the same way regardless of what they were querying?"

The features that had to ship first

LINQ could not exist in C# 2.0. Several language features had to land together in C# 3.0 to make it possible. Each one is interesting on its own.

| Feature | What it gave us | Without it… |
| --- | --- | --- |
| Generics (C# 2.0) | IEnumerable<T> instead of IEnumerable of object | Every query would be untyped |
| Lambda expressions | n => n * 2 instead of named delegate methods | Inline predicates would be unbearable |
| Extension methods | numbers.Where(...) without modifying IEnumerable<T> | We'd need a LinqHelper.Where(numbers, ...) style |
| Anonymous types | new { Name, Age } mid-pipeline | Each projection would need a named class |
| Expression trees | Expression<Func<T,bool>> for translating to SQL | LINQ to SQL/EF would be impossible |
| Query syntax | from x in xs where ... select ... | Method-chain syntax only |
| Type inference (var) | Hide the long types pipelines produce | Variable declarations would be painful |

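Apart from generics, which had already arrived in C# 2.0, every row above is a feature that shipped together in C# 3.0 in 2007, specifically to make LINQ feel like a built-in part of the language. Many C# developers use these features daily without realizing they share that origin.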
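To make the table concrete, here is a minimal sketch of how those features combine in one small query. The `people` data and variable names are hypothetical, chosen only for illustration. The three forms below should produce the same result: query syntax, the extension-method chain it is translated into, and the plain static calls you would be left with if extension methods did not exist.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data (generics: a typed List of named tuples).
var people = new List<(string Name, int Age)>
{
    ("Ada", 36),
    ("Linus", 19),
    ("Grace", 45),
};

// Query syntax + anonymous type + var (the element type has no name you could write).
var adultsQuery =
    from p in people
    where p.Age >= 21
    select new { p.Name, p.Age };

// The same query as extension methods taking lambda expressions.
var adultsMethods = people
    .Where(p => p.Age >= 21)
    .Select(p => new { p.Name, p.Age });

// The same query without extension-method syntax: ordinary static calls, read inside-out.
var adultsStatic = Enumerable.Select(
    Enumerable.Where(people, p => p.Age >= 21),
    p => new { p.Name, p.Age });

foreach (var adult in adultsQuery)
{
    Console.WriteLine($"{adult.Name} ({adult.Age})"); // Ada (36), then Grace (45)
}

// Anonymous types compare by value, so the three results are element-wise equal.
Console.WriteLine(adultsQuery.SequenceEqual(adultsMethods)); // True
Console.WriteLine(adultsQuery.SequenceEqual(adultsStatic));  // True
```

The static-call version is the hardest to read because the first operation ends up innermost, which is the problem extension methods were designed to remove.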

The "unified query" idea

The big claim of LINQ was that the same query syntax should work over anything that produces a sequence of values. Watch the same shape of query target three different data sources.
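A minimal sketch of that idea, assuming three illustrative in-memory sources (an array, a dictionary, and a small XML document). The data and names are hypothetical. Each query uses the same Where, OrderBy, Select shape.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Source 1: a plain array of integers.
int[] numbers = { 5, 12, 8, 21, 3 };
var bigNumbersDoubled = numbers
    .Where(n => n > 7)
    .OrderBy(n => n)
    .Select(n => n * 2);

// Source 2: a dictionary, queried as a sequence of key/value pairs.
var stock = new Dictionary<string, int>
{
    ["apples"] = 12,
    ["pears"] = 0,
    ["plums"] = 7,
};
var inStock = stock
    .Where(kv => kv.Value > 0)
    .OrderBy(kv => kv.Key)
    .Select(kv => kv.Key);

// Source 3: an XML document (LINQ to XML), queried as a sequence of elements.
var xml = XDocument.Parse(
    "<people><person name='Ada' age='36'/><person name='Linus' age='19'/></people>");
var adultNames = xml.Descendants("person")
    .Where(e => (int?)e.Attribute("age") >= 21)
    .OrderBy(e => e.Attribute("name")?.Value)
    .Select(e => e.Attribute("name")?.Value);

Console.WriteLine(string.Join(", ", bigNumbersDoubled)); // 16, 24, 42
Console.WriteLine(string.Join(", ", inStock));           // apples, plums
Console.WriteLine(string.Join(", ", adultNames));        // Ada
```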


Three different data sources. The same operators. That is the LINQ idea in one screenshot.

In real applications you'd see two more:

  • LINQ to SQL / Entity Framework — the same operators, translated into a SELECT statement and shipped to the database. The filtering happens on the database server, not in your process.
  • LINQ to Provider X: IQueryable providers exist for MongoDB, Elasticsearch, in-memory event streams, you name it.

We will not spend time on EF specifically in this course (it's a big topic of its own), but the same mental model — pipelines of operators that describe a result — covers them all.

The two faces of LINQ

There are two kinds of LINQ, and the difference matters.

  • LINQ to Objects runs on IEnumerable<T>. Lambdas are Func<T, ...> — compiled to ordinary methods. The pipeline executes in your process, in CLR memory.
  • LINQ to Providers (EF Core, etc.) runs on IQueryable<T>. Lambdas are Expression<Func<T, ...>>: data structures describing the lambda, which a provider walks and translates into another language (SQL, an HTTP query, etc.).
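The difference between the two faces can be seen with the same lambda text assigned to two different types. This is a small sketch using only in-memory data; `AsQueryable()` here stands in for a real provider such as EF Core, which is an assumption made to keep the example self-contained.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

int[] numbers = { 1, 2, 3, 4, 5, 6 };

// Same lambda text, two very different things:
Func<int, bool> isEvenCode = n => n % 2 == 0;             // compiled, callable code
Expression<Func<int, bool>> isEvenData = n => n % 2 == 0; // a tree describing the code

Console.WriteLine(isEvenCode(4));            // True

// The expression tree can be inspected node by node, which is what a provider does.
var comparison = (BinaryExpression)isEvenData.Body;
Console.WriteLine(comparison.NodeType);      // Equal
Console.WriteLine(comparison.Left.NodeType); // Modulo
Console.WriteLine(comparison.Right);         // 0

// LINQ to Objects: Enumerable.Where takes the delegate and runs it in process.
var evensInMemory = numbers.Where(isEvenCode).ToList();

// IQueryable: Queryable.Where takes the tree and records it as part of the query.
// AsQueryable() uses the built-in in-memory provider as a stand-in for a database provider.
IQueryable<int> query = numbers.AsQueryable().Where(isEvenData);
Console.WriteLine(query.Expression.NodeType); // Call (a recorded call to Where)
var evensViaTree = query.ToList();

Console.WriteLine(evensInMemory.SequenceEqual(evensViaTree)); // True
```

A database provider would walk `query.Expression` and emit SQL instead of executing the lambda locally; the in-memory provider simply compiles the tree and runs it.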

This course is about LINQ to Objects and the functional thinking behind it. Once you have that mental model, IQueryable is a small extra step.

Why this was a big deal

The cultural shift LINQ produced was bigger than the technical one. Before LINQ, querying was something special — a separate language you bolted onto your "real" code. After LINQ:

  • You could refactor a query the same way you refactor any other C# method.
  • Types flowed end-to-end. If you renamed Person.City to Person.Town, the compiler caught every query that referenced it.
  • The same person who wrote the domain model wrote the queries, using the same tools.
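  • Other languages offer the same style: Scala collections, Kotlin sequences, Java Streams, JavaScript array methods, Rust iterators, and Swift sequence operations all use the composable operator pipelines that LINQ brought into a mainstream, statically typed language.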

Why this matters for this course

The rest of the course is going to treat LINQ less as a feature and more as a way of thinking. The historical note matters because the design choices LINQ made (lazy by default, generic over any sequence, composable through method chaining) are exactly the choices that make functional collection processing work in any language.

If you internalize them in C#, you will recognize the same patterns the next time you read a Kotlin Sequence, a Java Stream, or a Rust Iterator pipeline.
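The "lazy by default" choice is the easiest one to observe directly. The sketch below uses a hypothetical `log` list to record when each lambda actually runs: nothing happens when the pipeline is defined, and only as many elements are processed as the consumer asks for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var log = new List<string>(); // hypothetical trace of which lambdas ran, in order
int[] numbers = { 1, 2, 3, 4, 5, 6 };

// Defining the pipeline runs nothing; it only describes the work.
var pipeline = numbers
    .Where(n => { log.Add($"filter {n}"); return n % 2 == 0; })
    .Select(n => { log.Add($"map {n}"); return n * 10; });

Console.WriteLine(log.Count); // 0

// Enumerating pulls elements through one at a time, and Take(2) stops early.
var firstTwo = pipeline.Take(2).ToList();

Console.WriteLine(string.Join(", ", firstTwo)); // 20, 40
Console.WriteLine(string.Join(" | ", log));
// filter 1 | filter 2 | map 2 | filter 3 | filter 4 | map 4
```

Elements 5 and 6 are never filtered, because the consumer stopped asking after two results. Kotlin's Sequence, Java's Stream, and Rust's Iterator behave in the same pull-based way.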

Question (select one)

Which language feature, introduced in C# 3.0 alongside LINQ, allows methods like Where and Select to appear to be members of IEnumerable<T> without actually modifying that interface?

Generics

Anonymous types

Extension methods

Expression trees
