🧩 Module 4: Normalization Demystified


🤔 Why Normalize?

Imagine you’re managing a spreadsheet with customer orders. Every time a customer places a new order, you copy their full name, email, and address again. This leads to:

  • ✅ Repetitive data
  • ⚠️ Higher chance of inconsistencies
  • 😵 Difficulty updating info

Here’s what that might look like:

❌ Non-Normalized Orders Table

OrderIDCustomerNameEmailAddressItemQuantity
101Alice Smithalice@example.com123 Maple StCoffee Mug1
102Alice Smithalice.smith@gmail.com123 Maple StNotebook2
103Bob Jonesbob@example.com456 Oak StPen3

Now let’s clean it up…


✅ Normalized Version

🧾 Customers Table

CustomerIDCustomerNameEmailAddress
1Alice Smithalice@example.com123 Maple St
2Bob Jonesbob@example.com456 Oak St

📦 Orders Table

OrderIDCustomerIDItemQuantity
1011Coffee Mug1
1021Notebook2
1032Pen3

Now if Alice changes her email address, you only update it in one place.


🧱 First Normal Form (1NF): Atomicity and No Repeating Groups

Rule: Each field must contain only one value — no lists, no nesting.

❌ 1NF Violation:

NamePhone Numbers
Rachel555-1234, 555-5678

✅ 1NF Compliant:

NamePhone Number
Rachel555-1234
Rachel555-5678

🧱 Second Normal Form (2NF): Eliminate Partial Dependencies

Rule: Must be in 1NF and all non-key columns must depend on the entire primary key.

❌ 2NF Violation:

CourseIDStudentIDCourseName
ENG1011001English Lit

CourseName depends on CourseID, not the full key (CourseID, StudentID).

✅ 2NF Compliant:

Courses Table

CourseIDCourseName
ENG101English Lit

StudentCourses Table

CourseIDStudentID
ENG1011001

🧱 Third Normal Form (3NF): Eliminate Transitive Dependencies

Rule: Must be in 2NF, and no non-key column should depend on another non-key column.

❌ 3NF Violation:

StudentIDAdvisorNameAdvisorOffice
1001Dr. SmithRoom 210

Office depends on the advisor, not directly on StudentID.

✅ 3NF Compliant:

Students Table

StudentIDAdvisorName
1001Dr. Smith

Advisors Table

AdvisorNameAdvisorOffice
Dr. SmithRoom 210

🧱 Boyce-Codd Normal Form (BCNF): Eliminate Hidden Dependencies

Rule: Every determinant (a column that uniquely identifies others) must be a candidate key.

❌ BCNF Violation:

ProfessorCourseRoom
Dr. LeePhysics101R201

If a professor teaches only one course, Professor → Course is a dependency, but Professor isn’t a candidate key.

✅ BCNF Compliant:

ProfessorCourse Table

ProfessorCourse
Dr. LeePhysics101

CourseRoom Table

CourseRoom
Physics101R201

🔄 When to Denormalize (And Why)

Sometimes, you want to duplicate data. Denormalization means purposefully breaking some normalization rules to make reading data faster or simpler.

✅ Pros:

  • Faster data retrieval
  • Fewer joins in queries
  • Better performance for reports and dashboards

❌ Cons:

  • Harder to keep data in sync
  • Redundant data increases storage and risk of error

Everyday Analogy:

It’s like printing menus for every table at a restaurant. Easy for diners to access, annoying to update when prices change.


💾 Did NoSQL Kill Normalization?

Not at all. NoSQL systems like MongoDB often denormalize by default, but database designers still rely on normalized thinking to plan good structure.

  • Normalization is still crucial in relational databases (e.g., PostgreSQL, MySQL)
  • Even in NoSQL, you often start normalized, then decide what to embed
  • In critical systems (banking, healthcare), normalized data is still king

🧠 Recap Table

Normal FormUse When…
1NFYou need atomic fields and no lists
2NFYou’re using composite keys
3NFYou want clarity and no indirect dependencies
BCNFYou need strict dependency control
DenormalizedYou want speed and simplified access

Module 4: Normalization Demystified – PDF Version

Scroll to Top