Data Cleaning & Analysis Using MS Excel Functions

Hands-On No-Code Data Cleaning, Standardization & Statistical Analysis Using Microsoft Excel on a Real-World Internship Eligibility Dataset

Project Overview

This repository demonstrates end-to-end data cleaning and analysis using MS Excel on a structured student internship eligibility dataset.
The focus areas include:

Cleaning inconsistent department names
Standardizing categorical values
Handling percentages and scores
Computing weighted scores
Determining eligibility status
Generating statistical insights

This project reflects industry-style Excel usage commonly required in operations, analytics, and product roles.

Dataset File

Excel Workbook (Cleaned & Processed)

Click on the button below to download the cleaned dataset

📊 Student Internship Eligibility Dataset (Sample Preview)

Student ID	Name	Department	Std. Dept	T1	T2	T3	Attendance	Application Date	Internship Pref	Assignment %	Weighted Score	Status
S001	Aadhya Jain	CSE	Computer Science	75	59	65	77%	2025-08-24	Company C	90%	64.40	Eligible
S002	Myra Chatterjee	Comp Sci	Computer Science	66	75	70	86%	2025-06-08	Company B	82%	71.15	Eligible
S003	Diya Gupta	CS	Computer Science	98	67	82	91%	2025-08-24	Company D	79%	79.15	Eligible
S004	Sai Agarwal	Computer Science	Computer Science	96	69	76	99%	2025-07-27	Company B	84%	76.55	Eligible
S005	Anika Mukherjee	ECE	Electronics	78	91	96	81%	2025-09-15	Company A	73%	91.55	Eligible

Data Cleaning & Standardization

Application_Raw Spreadsheet

🔹 Department Standardization

To handle inconsistent department naming (CSE, CS, Comp Sci, etc.), a lookup-driven standardization was applied.

Formula used:

=VLOOKUP(C2, Dept_Lookup!$A$2:$B$17, 2, FALSE)

Ensures consistent department naming
Prevents analytical duplication
Enables accurate aggregation & reporting

Eligibility Status Logic

Eligibility was calculated using attendance + assignment thresholds, dynamically driven by company-specific criteria.
Formula used:

=IF(
  AND(
    H3 >= (VLOOKUP(J3, Interview_Slots!$A$2:$D$5, 4, FALSE) / 100),
    L3 >= VLOOKUP(J3, Interview_Slots!$A$2:$D$5, 3, FALSE)
  ),
  "Eligible",
  "Not Eligible"
)

Company-wise rules
No hardcoded values
Scalable to additional companies

Allocation Spreadsheet

🔹 Rank Calculation (Within Department)

Students were ranked department-wise based on weighted score.
Formula used:

=IFERROR(
  RANK(E3, FILTER($E$2:$E$1000, $D$2:$D$1000 = D3), 0),
  98
)

Fair intra-department ranking
Dynamic filtering
Error-handled for edge cases

🔹Allocation Status

Final allocation is determined based on rank vs available interview slots.
Formula used:

=IF(
  G5 <= VLOOKUP(D5, Interview_Slots!$A$2:$B$5, 2, FALSE),
  "Allocated",
  "Not Allocated"
)

Slot-aware allocation
Realistic hiring workflow simulation

Data Visualization & Insight Dashboard

This section highlights the visual analytics layer of the project, designed to transform cleaned operational data into actionable insights for decision-making teams.

📊 Chart 1: Best Performing Departments by Average Score

Data Flow:

application_raw → Pivot Table → Bar Chart

Compares average scores across standardized departments
Highlights top-performing academic departments
Includes a Grand Total benchmark for overall comparison

Chart 2: Company Preference Distribution