rickardgarnau-byte/centralized-db-architecture-postgresql
A scalable PostgreSQL database (3NF) designed to replace manual Excel workflows. Includes a GDPR-compliant architecture and analytical SQL views.
YrkesCo Database Design
A relational database system designed for a Swedish vocational college (yrkeshögskola), replacing manual Excel-based workflows with a centralized, GDPR-compliant PostgreSQL database.
Background
YrkesCo manages students, educators, courses, programs, and internship placements across two campuses (Stockholm and Gothenburg). The goal was to replace scattered Excel files with a normalized, scalable database that supports future growth.
Tech Stack
- PostgreSQL – Relational database
- SQL – Schema creation, constraints, and queries
- dbdiagram.io – Physical model visualization
Project Structure
yh_labb/
├── create_tables.sql # Physical model - table definitions and constraints
├── insert_data.sql # Seed data for all tables
├── queries.sql # Sample JOIN queries demonstrating the model
└── presentation.pdf # Slides covering conceptual, logical, and physical models
Data Model Overview
The database is designed in three stages:
Conceptual → Logical → Physical
Key Design Decisions
GDPR compliance — Sensitive personal data (social security numbers, salaries, addresses) is isolated in separate tables (student_info, employee_info) to enable strict access control independently of operational data.
3NF normalization — Postal addresses are extracted into a shared postal_address table, eliminating redundancy across schools, companies, students, and agencies.
Hybrid teacher model — Both permanently employed teachers and external consultants are handled through a shared teacher table, with consultants extending it via a consultant table linked to consultant_agency.
Course instances — A course_instance table separates the definition of a course from its scheduled delivery, allowing the same course to be taught multiple times across different classes and terms.
Tables
| Table | Description |
|---|---|
postal_address |
Shared address reference (postal code → city) |
school |
Campus locations (Stockholm, Gothenburg) |
program |
Vocational programs (e.g. Data Engineer) |
course |
Course catalog with credits and description |
program_content |
M:N link between programs and courses |
education_leader |
Program managers (max 3 classes each) |
teacher |
All educators (permanent and consultant) |
consultant |
Consultant-specific data (agency, hourly rate) |
consultant_agency |
Agency info with org number and F-tax status |
employee_info |
Sensitive employee data (salary, address) |
class |
A cohort of students tied to a program and leader |
student |
Student enrollment and class assignment |
student_info |
Sensitive student data (SSN, contact details) |
course_instance |
A scheduled run of a course for a specific class |
teacher_course_rel |
M:N link between teachers and course instances |
attendance |
Daily attendance log per student per instance |
course_grade |
Grade and points per student per course instance |
company |
Companies offering LIA internship placements |
lia_placement |
Internship records linking students to companies |
Sample Queries
-- All students and their programs
SELECT s.first_name || ' ' || s.last_name AS student,
c.class_name, COALESCE(p.program_name, 'Standalone Course') AS program
FROM student s
JOIN class c ON s.class_id = c.class_id
LEFT JOIN program p ON c.program_id = p.program_id;
-- Average grade points per course (lowest first)
SELECT c.course_name, ROUND(AVG(cg.points), 1) AS avg_points
FROM course_grade cg
JOIN course_instance ci ON cg.instance_id = ci.instance_id
JOIN course c ON ci.course_code = c.course_code
GROUP BY c.course_name ORDER BY avg_points ASC;
-- LIA placements with supervisor and location
SELECT s.first_name, s.last_name, c.company_name,
pa.city, l.supervisor_name, l.start_date, l.end_date
FROM student s
JOIN lia_placement l ON s.student_id = l.student_id
JOIN company c ON l.company_id = c.company_id
JOIN postal_address pa ON c.postal_code = pa.postal_code;Normalization (3NF)
The schema satisfies Third Normal Form:
- Every non-key attribute depends on the whole primary key (no partial dependencies)
- No transitive dependencies — e.g. city is stored in
postal_address, not repeated in every table that has an address - Sensitive attributes are in dedicated tables, not mixed into operational tables
Relationship Statements
- A School administers many Classes, but a Class belongs to exactly one School.
- An Education Leader is responsible for one or more Classes (max 3).
- A Program consists of many Courses, and a Course can belong to many Programs (M:N via
program_content). - A Student belongs to exactly one Class and may have one LIA placement.
- A Company can offer multiple LIA Placements, but each placement is tied to a specific student.
- A Course Instance is a time-bound delivery of a Course, assigned to one Class and taught by one or more Teachers.
Bonus Features Implemented
- Standalone courses (not tied to a program) via
is_standaloneflag oncourse - Multi-campus support via the
schooltable — expandable without schema changes - Fixed employee tracking via
employee_infowithis_consultantflag - LIA placement tracking with supervisor contact and grade
- Attendance logging per student per course instance