Modules and Packages
Splitting code across files with import, packages, and __main__
A module is just a .py file. A package is a directory of modules (usually marked by an __init__.py). import is how you pull them in.
Modules are the backbone of Python's ecosystem: every library you pip install — requests, numpy, pandas — is a package made of modules.
Why modules matter: the real-world impact
Python's module system is the foundation of its success:
- Code reuse: write once, import everywhere.
- PyPI ecosystem: over 500,000 packages available. pip install requests gives you a production-grade HTTP client in one line.
- Testability: each module can be tested in isolation.
- Namespace isolation: foo.bar and baz.bar don't collide.
When you do import pandas as pd, you're pulling in hundreds of modules, all organized into a single namespace. Understanding how import works is critical for debugging import errors, managing dependencies, and organizing your own code.
Importing
When to use each import style
- import math: use when you call multiple functions (math.sqrt, math.cos, ...). Keeps the namespace clean.
- from math import sqrt: use when you call one or two functions repeatedly. But don't overdo it; sqrt(x) is less clear than math.sqrt(x) to a reader who doesn't know where sqrt comes from.
- import numpy as np: use for libraries with established conventions. Everyone knows np, pd, plt.
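The three styles can be compared side by side. This is a minimal sketch that sticks to the standard library so it runs without installing anything; the alias stats is chosen here for illustration and is not an established convention the way np or pd are.

```python
import math                   # whole module: every name stays behind the "math." prefix
from math import sqrt         # single name: shorter to call, but its origin is less obvious
import statistics as stats    # alias: "stats" is an illustrative choice, not a convention

hypotenuse = math.hypot(3, 4)          # 5.0
root = sqrt(16)                        # 4.0
average = stats.mean([1, 2, 3, 4])     # 2.5

print(hypotenuse, root, average)
```

All three forms load the module the same way; they differ only in which names end up in your namespace.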
The standard library
Python comes with a famously rich standard library. A small tour:
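As a rough illustration, the sketch below touches five standard-library modules. The sample strings, dates, and paths are made up for the demo; nothing is read from or written to disk.

```python
import collections
import datetime
import json
import pathlib
import re

# collections: count how often each character occurs
counts = collections.Counter("mississippi")
print(counts["s"])                     # 4

# json: serialize a dict to text and parse it back
payload = json.dumps({"name": "Ada", "langs": ["python"]})
restored = json.loads(payload)
print(restored["name"])                # Ada

# re: pull the runs of digits out of a string
numbers = re.findall(r"\d+", "3 apples, 12 pears")
print(numbers)                         # ['3', '12']

# pathlib: take a path apart (a "pure" path never touches the file system)
suffix = pathlib.PurePosixPath("reports/2024/summary.csv").suffix
print(suffix)                          # .csv

# datetime: date arithmetic (2024 is a leap year, so February has 29 days)
delta = datetime.date(2024, 3, 1) - datetime.date(2024, 2, 1)
print(delta.days)                      # 29
```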
The standard library has modules for everything: file I/O, networking, threading, regular expressions, unit testing, profiling, etc. Always check the standard library before reaching for a third-party package.
What import actually does
When you write import foo:
- Check the cache: Python looks for foo in sys.modules (a dict of already-imported modules). If it's there, done.
- Find the module: if not, Python checks for a built-in module (compiled into the interpreter, like sys), then searches each directory in sys.path for:
  - A foo/ directory with an __init__.py (package)
  - A foo.py file
- Execute: the module's code runs top to bottom, the resulting module object is stored in sys.modules, and it is bound to the name foo in the importing namespace.
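The steps above can be traced by hand with importlib. This is only a sketch of the observable behavior, not the interpreter's actual implementation; colorsys is an arbitrary small standard-library module picked for the demo.

```python
import importlib
import importlib.util
import sys

name = "colorsys"  # arbitrary standard-library module, chosen for illustration

# Step 1: the cache. The answer depends on what this process has imported so far.
print(name in sys.modules)

# Step 2: the search. find_spec locates a top-level module without running its code.
spec = importlib.util.find_spec(name)
print(spec.origin)  # on a typical install, the path to colorsys.py

# Step 3: execute and bind. import_module is the function form of "import colorsys".
module = importlib.import_module(name)
print(sys.modules[name] is module)  # True: the module object now lives in the cache
```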
sys.path includes the directory of the running script, any directories listed in the PYTHONPATH environment variable, and the installation's site-packages directory. You can inspect it by printing sys.path.
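A short way to look at the search path is sketched below. The entries differ from machine to machine, so no particular output is shown.

```python
import sys

# Copy the list so the demo cannot accidentally modify the real search path.
search_path = list(sys.path)

# Directories are searched in order; repr makes an empty-string entry visible.
for position, entry in enumerate(search_path):
    print(position, repr(entry))
```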
Module execution is cached
A module is only executed once, the first time it is imported. Subsequent imports return the cached object from sys.modules. This is why global state in a module is shared across all imports.
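The caching can be observed directly. The sketch below writes a throwaway module named counter_demo (a hypothetical name used only here) into a temporary directory, imports it twice, and shows that its body runs once and that its state is shared.

```python
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module: prints when its body runs and holds one global.
    Path(tmp, "counter_demo.py").write_text(
        "print('counter_demo is executing')\n"
        "hits = 0\n"
    )
    sys.path.insert(0, tmp)
    try:
        import counter_demo as first    # first import: the body runs and prints
        first.hits += 1
        import counter_demo as second   # second import: served from sys.modules, no print
    finally:
        sys.path.remove(tmp)

same_object = first is second   # both names refer to one module object
shared_hits = second.hits       # the change made through "first" is visible here
print(same_object, shared_hits) # True 1
```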
Avoid from module import *
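It pulls every public name (the names listed in the module's __all__, or every name without a leading underscore if __all__ is not defined) into the current namespace, which makes it hard to tell where a name came from. Stick to explicit imports.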
Star imports hide bugs
from foo import * can silently overwrite names. If foo defines a function called max, it shadows the built-in max. A linter will warn about this, but it's better to never write it in the first place.
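The shadowing described above can be reproduced in a single file. In this sketch, shadowy is a hypothetical module built in memory, and exec is used only so that the star import runs in a separate namespace; real code would simply have two .py files.

```python
import sys
import types

# Hypothetical module that defines its own max().
shadowy = types.ModuleType("shadowy")
exec("def max(*args):\n    return 'not the built-in'\n", shadowy.__dict__)
sys.modules["shadowy"] = shadowy   # register it so "from shadowy import *" can find it

# Run three statements in a fresh namespace, the way a separate script would.
namespace = {}
exec(
    "before = max(1, 2)\n"         # resolves to the built-in max
    "from shadowy import *\n"      # silently rebinds the name max
    "after = max(1, 2)\n",         # now calls shadowy.max
    namespace,
)

print(namespace["before"], namespace["after"])   # 2 not the built-in
```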
The if __name__ == "__main__": idiom
When you run a file directly, __name__ is set to "__main__". When it is imported by something else, __name__ is the module's name. This lets a single file serve as both a library and a script:
# greet.py
def greet(name):
    return f"Hello, {name}!"

if __name__ == "__main__":
    print(greet("World"))

Running python greet.py prints Hello, World!; doing from greet import greet in another file does not run the if block.
Always use this idiom
Put your "script" code (CLI argument parsing, main() calls, etc.) inside if __name__ == "__main__":. It makes your file importable for testing and reuse.
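One common way to apply this advice is to move the script logic into a main() function and call it from the guard. The sketch below extends the greet example; the main(argv) signature and the "first argument is the name" rule are assumptions made for illustration.

```python
import sys


def greet(name):
    return f"Hello, {name}!"


def main(argv):
    # Illustrative rule: the first command-line argument, if any, is the name.
    name = argv[0] if argv else "World"
    print(greet(name))
    return 0


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    main(sys.argv[1:])
```

Because main takes its arguments as a parameter, a test can call main(["Ada"]) without touching sys.argv.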
Packages
A directory containing an __init__.py is a package. The __init__.py runs when the package is first imported and typically re-exports the package's public API.
myapp/
    __init__.py
    cli.py
    db/
        __init__.py
        connection.py
        queries.py

Then:

from myapp.db.queries import top_users
from myapp.db import connection

__init__.py can be empty; just having it makes the directory a package.
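To see a package and a re-exporting __init__.py in action without creating files by hand, the sketch below builds a small package in a temporary directory. The package name demo_app and the body of top_users are hypothetical stand-ins for the layout above.

```python
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    db_dir = Path(tmp, "demo_app", "db")
    db_dir.mkdir(parents=True)

    # An empty __init__.py is enough to mark demo_app as a package.
    Path(tmp, "demo_app", "__init__.py").write_text("")
    (db_dir / "queries.py").write_text(
        "def top_users():\n"
        "    return ['ada', 'grace']\n"
    )
    # This __init__.py re-exports top_users as part of the public API of demo_app.db.
    (db_dir / "__init__.py").write_text("from demo_app.db.queries import top_users\n")

    sys.path.insert(0, tmp)
    try:
        from demo_app.db import top_users                        # via the re-export
        from demo_app.db.queries import top_users as same_func   # via the full path
    finally:
        sys.path.remove(tmp)

users = top_users()
print(users, top_users is same_func)   # ['ada', 'grace'] True
```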
__init__.py in Python 3.3+
Python 3.3+ has "namespace packages" which don't require __init__.py. But most projects still use __init__.py for clarity and to control what gets imported when you do import mypackage.
Relative imports inside packages
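A leading dot refers to the current package and two dots refer to its parent, as in from . import connection or from .. import cli. Relative imports only work when the file is imported as part of a package, not when it is run directly with python myapp/db/queries.py. To run such a file, use python -m myapp.db.queries from the directory that contains myapp/.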
When to use relative imports
Use relative imports inside a package when modules import from each other. They make refactoring easier (you can rename the top-level package without updating every import). Use absolute imports everywhere else.
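The sketch below shows one-dot and two-dot relative imports working inside a package. The package rel_demo, its settings module, and the DSN string are all hypothetical; the files are written to a temporary directory so the example is self-contained.

```python
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    db_dir = Path(tmp, "rel_demo", "db")
    db_dir.mkdir(parents=True)
    Path(tmp, "rel_demo", "__init__.py").write_text("")
    Path(tmp, "rel_demo", "settings.py").write_text("DSN = 'sqlite:///demo.db'\n")
    (db_dir / "__init__.py").write_text("")
    (db_dir / "connection.py").write_text(
        "from ..settings import DSN   # two dots: the parent package, rel_demo\n"
        "def connect():\n"
        "    return f'connected to {DSN}'\n"
    )
    (db_dir / "queries.py").write_text(
        "from .connection import connect   # one dot: the current package, rel_demo.db\n"
        "def top_users():\n"
        "    return connect() + ': ada, grace'\n"
    )

    sys.path.insert(0, tmp)
    try:
        # Importing through the package name gives the relative imports their context.
        from rel_demo.db.queries import top_users as rel_top_users
    finally:
        sys.path.remove(tmp)

result = rel_top_users()
print(result)   # connected to sqlite:///demo.db: ada, grace
```

Renaming rel_demo would require no changes inside connection.py or queries.py, which is the refactoring benefit mentioned above.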
Real-world: the PyPI ecosystem
PyPI (the Python Package Index) is a major reason Python is popular. Want to:
- Make HTTP requests? pip install requests
- Parse HTML? pip install beautifulsoup4
- Work with dataframes? pip install pandas
- Build a web API? pip install flask
- Do machine learning? pip install scikit-learn
Each of these is a package made of modules. When you pip install, you're downloading a package from PyPI and putting it in your site-packages/ directory (which is in sys.path).
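If you want to check where installed packages land on your own machine, the standard sysconfig module reports the location. This is a small sketch; the paths it prints vary by platform and virtual environment, and whether the directory appears in sys.path can depend on how the interpreter was started.

```python
import json
import sys
import sysconfig

# "purelib" is the directory where pure-Python third-party packages are installed.
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)

# Usually True, though some interpreter configurations leave it out.
print(site_packages in sys.path)

# Any imported module can report where it was loaded from; json is in the standard library.
print(json.__file__)
```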
Sharing and reusing code at this scale (hundreds of thousands of packages, billions of downloads) depends on a module system that is simple and consistent.
Multi-file challenge
This challenge uses two files. Open the utils.py tab and implement the functions; main.py already imports and uses them.
Open utils.py and implement two functions:
- mean(values) returns the arithmetic mean of values. Assume the list is non-empty.
- variance(values) returns the population variance (mean of squared deviations from the mean). Assume the list is non-empty.
main.py already imports both and prints their results. Do not edit main.py.
Define a function filter_even(numbers) that returns a new list containing only the even numbers from numbers.
Multiple choice questions
What is the purpose of if __name__ == "__main__": at the bottom of a module?
It prevents the file from being imported at all.
It runs the indented block only when the file is executed directly, not when it is imported.
It is required syntax in every Python script.
It is the entry point recognized by Python's package installer.
What happens the second time you import foo?
The module is re-executed from scratch.
Python returns the cached module from sys.modules.
An ImportError is raised (duplicate import).
The module is reloaded from disk with the latest changes.
Which import style is generally preferred for readability?
from math import *
from math import sqrt, cos, sin, tan, log, exp, pi, e
import math (then call math.sqrt(...))
All three are equally good.
Where does Python search for modules when you import foo?
Only the current directory.
All directories in sys.path, which includes the current directory, PYTHONPATH, and site-packages.
Only the standard library.
Only installed packages from pip install.
When something goes wrong inside an imported module, you need a way to handle the failure. That is what exceptions are for.