Databases for Developers

Lesson 3 of 10 · 22 min

x
30%

Schema Design & Normalization

Schema design is the most consequential decision in a relational database. A good schema prevents anomalies: inserting a row should not require duplicating data, and deleting a row should not accidentally destroy unrelated information. Normalization is the process of organizing tables to eliminate redundancy. First Normal Form (1NF) requires atomic values — no arrays in a cell. Second Normal Form (2NF) removes partial dependencies — every non-key column must depend on the whole primary key. Third Normal Form (3NF) removes transitive dependencies — non-key columns should not depend on other non-key columns. In practice, most application schemas target 3NF, then strategically denormalize hot read paths for performance.

Before
Denormalized (redundant data)
orders
┌────────┬──────────────┬───────────────────┬───────────┐
│ id     │ customer_name│ customer_email     │ product   │
├────────┼──────────────┼───────────────────┼───────────┤
│ 1      │ Alice        │ alice@example.com  │ Laptop    │
│ 2      │ Alice        │ alice@example.com  │ Mouse     │
└────────┴──────────────┴───────────────────┴───────────┘
After
Normalized (3NF)
customers              orders
┌────┬───────┬──────┐   ┌────┬─────────────┬────────┐
│ id │ name  │email │   │ id │ customer_id │product │
├────┼───────┼──────┤   ├────┼─────────────┼────────┤
│ 1  │ Alice │alice@│   │ 1  │ 1           │ Laptop │
└────┴───────┴──────┘   │ 2  │ 1           │ Mouse  │
                        └────┴─────────────┴────────┘

Key Takeaway

Normalize to eliminate redundancy, then denormalize only when profiling proves you need the speed.

PreviousNext Lesson