GitHunt
RI

rickardgarnau-byte/centralized-db-architecture-postgresql

A scalable PostgreSQL database (3NF) designed to replace manual Excel workflows. Includes a GDPR-compliant architecture and analytical SQL views.

YrkesCo Database Design

A relational database system designed for a Swedish vocational college (yrkeshögskola), replacing manual Excel-based workflows with a centralized, GDPR-compliant PostgreSQL database.


Background

YrkesCo manages students, educators, courses, programs, and internship placements across two campuses (Stockholm and Gothenburg). The goal was to replace scattered Excel files with a normalized, scalable database that supports future growth.


Tech Stack

  • PostgreSQL – Relational database
  • SQL – Schema creation, constraints, and queries
  • dbdiagram.io – Physical model visualization

Project Structure

yh_labb/
├── create_tables.sql     # Physical model - table definitions and constraints
├── insert_data.sql       # Seed data for all tables
├── queries.sql           # Sample JOIN queries demonstrating the model
└── presentation.pdf      # Slides covering conceptual, logical, and physical models

Data Model Overview

The database is designed in three stages:

Conceptual → Logical → Physical

Key Design Decisions

GDPR compliance — Sensitive personal data (social security numbers, salaries, addresses) is isolated in separate tables (student_info, employee_info) to enable strict access control independently of operational data.

3NF normalization — Postal addresses are extracted into a shared postal_address table, eliminating redundancy across schools, companies, students, and agencies.

Hybrid teacher model — Both permanently employed teachers and external consultants are handled through a shared teacher table, with consultants extending it via a consultant table linked to consultant_agency.

Course instances — A course_instance table separates the definition of a course from its scheduled delivery, allowing the same course to be taught multiple times across different classes and terms.


Tables

Table Description
postal_address Shared address reference (postal code → city)
school Campus locations (Stockholm, Gothenburg)
program Vocational programs (e.g. Data Engineer)
course Course catalog with credits and description
program_content M:N link between programs and courses
education_leader Program managers (max 3 classes each)
teacher All educators (permanent and consultant)
consultant Consultant-specific data (agency, hourly rate)
consultant_agency Agency info with org number and F-tax status
employee_info Sensitive employee data (salary, address)
class A cohort of students tied to a program and leader
student Student enrollment and class assignment
student_info Sensitive student data (SSN, contact details)
course_instance A scheduled run of a course for a specific class
teacher_course_rel M:N link between teachers and course instances
attendance Daily attendance log per student per instance
course_grade Grade and points per student per course instance
company Companies offering LIA internship placements
lia_placement Internship records linking students to companies

Sample Queries

-- All students and their programs
SELECT s.first_name || ' ' || s.last_name AS student,
       c.class_name, COALESCE(p.program_name, 'Standalone Course') AS program
FROM student s
JOIN class c ON s.class_id = c.class_id
LEFT JOIN program p ON c.program_id = p.program_id;

-- Average grade points per course (lowest first)
SELECT c.course_name, ROUND(AVG(cg.points), 1) AS avg_points
FROM course_grade cg
JOIN course_instance ci ON cg.instance_id = ci.instance_id
JOIN course c ON ci.course_code = c.course_code
GROUP BY c.course_name ORDER BY avg_points ASC;

-- LIA placements with supervisor and location
SELECT s.first_name, s.last_name, c.company_name,
       pa.city, l.supervisor_name, l.start_date, l.end_date
FROM student s
JOIN lia_placement l ON s.student_id = l.student_id
JOIN company c ON l.company_id = c.company_id
JOIN postal_address pa ON c.postal_code = pa.postal_code;

Normalization (3NF)

The schema satisfies Third Normal Form:

  • Every non-key attribute depends on the whole primary key (no partial dependencies)
  • No transitive dependencies — e.g. city is stored in postal_address, not repeated in every table that has an address
  • Sensitive attributes are in dedicated tables, not mixed into operational tables

Relationship Statements

  • A School administers many Classes, but a Class belongs to exactly one School.
  • An Education Leader is responsible for one or more Classes (max 3).
  • A Program consists of many Courses, and a Course can belong to many Programs (M:N via program_content).
  • A Student belongs to exactly one Class and may have one LIA placement.
  • A Company can offer multiple LIA Placements, but each placement is tied to a specific student.
  • A Course Instance is a time-bound delivery of a Course, assigned to one Class and taught by one or more Teachers.

Bonus Features Implemented

  • Standalone courses (not tied to a program) via is_standalone flag on course
  • Multi-campus support via the school table — expandable without schema changes
  • Fixed employee tracking via employee_info with is_consultant flag
  • LIA placement tracking with supervisor contact and grade
  • Attendance logging per student per course instance