MicroDet – Lightweight Drone-Based Object Detection

A lightweight, anchor-free object detection system built using MicroDet, optimized for aerial / drone imagery.
This project focuses on efficient person detection with minimal computational overhead, making it suitable for edge devices and UAV applications.

Features

Lightweight MicroDet architecture
Anchor-free detection (DFL-based regression)
COCO-format dataset support
Mixed-precision (AMP) training
EMA (Exponential Moving Average) weights
End-to-end pipeline: Train → Validate → Infer
Bounding box visualization with NMS
Config-driven (.toml) model & training setup

Project Structure

microdet/
├── tmp/
│   ├── model/
│   │   ├── backbone/
│   │   ├── neck/
│   │   ├── detect/
│   │   ├── loss/
│   │   └── model_wrapper.py
│   ├── train/
│   │   ├── train.py
│   │   └── validate.py
│   ├── infer/
│   │   ├── run_infer.py
│   │   └── image.png
│   └── data/
│       ├── coco_dataset.py
│       └── collate.py
├── runs/
│   └── microdet_drone/
│       ├── weights/
│       │   ├── last.ckpt
│       │   └── best.ckpt
│       └── logs/
├── microdet.toml
├── requirements.txt
└── README.md

Model Overview

MicroDet is a one-stage, anchor-free detector designed for speed and efficiency.

Architecture

Backbone: Lightweight CNN for feature extraction
Neck: Multi-scale feature aggregation
Head:
- Classification branch (Quality Focal Loss)
- Regression branch (Distribution Focal Loss – DFL)
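DFL-based regression predicts, for each box side, a discrete distribution over distance bins rather than a single number. The decoded distance is the expected bin index under a softmax. The sketch below illustrates that decoding step in plain Python. The bin count of 8 and the function name `dfl_decode` are assumptions made for illustration and are not taken from the MicroDet code.

```python
import math

def dfl_decode(logits):
    """Return the expected bin index under softmax(logits).

    The result is a distance in stride units; multiply by the feature
    map stride to get pixels.
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return sum(i * e / total for i, e in enumerate(exps))

# Hypothetical logits for one box side, 8 bins (assumed bin count).
uniform = [0.0] * 8          # no preference: decodes to the middle, 3.5
peaked = [0.0] * 8
peaked[3] = 50.0             # strongly prefers bin 3

stride = 8                   # one of the strides listed in this README
distance_px = dfl_decode(peaked) * stride
```

A sharply peaked distribution decodes to roughly its peak bin, while a flat one decodes to the midpoint, which is one reason early-epoch boxes can look uniformly sized.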

Feature Map Strides

[8, 16, 32]
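To make the strides concrete, the sketch below enumerates the prediction points that a 640×640 input would produce at strides 8, 16 and 32. Placing each point at the center of its cell, `(x + 0.5) * stride`, is an assumption; some detectors use the cell's top-left corner instead, and this README does not say which convention MicroDet follows.

```python
import math

def grid_points(input_size, strides):
    """List (cx, cy, stride) for every cell of every feature map."""
    height, width = input_size
    points = []
    for s in strides:
        grid_h, grid_w = math.ceil(height / s), math.ceil(width / s)
        for y in range(grid_h):
            for x in range(grid_w):
                # Assumed convention: point sits at the cell center.
                points.append(((x + 0.5) * s, (y + 0.5) * s, s))
    return points

points = grid_points((640, 640), [8, 16, 32])
# 80*80 + 40*40 + 20*20 = 8400 prediction points in total
print(len(points))
```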

Loss Functions

Loss Type                      Purpose
Quality Focal Loss (QFL)       Classification confidence
Distribution Focal Loss (DFL)  Bounding box regression
GIoU Loss                      Box overlap accuracy

Configured in microdet.toml:

[model.head.loss]
config = [2.0, 0.25, 1.0, 7, "giou"]
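For reference, here is a minimal plain-Python sketch of the GIoU term for two boxes in `(x1, y1, x2, y2)` form. It follows the standard GIoU definition (IoU minus the fraction of the enclosing box not covered by the union) and is not copied from this project's loss module. The function name `giou` is illustrative.

```python
def giou(a, b):
    """Generalized IoU of two (x1, y1, x2, y2) boxes, in [-1, 1]."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter

    # Smallest axis-aligned box enclosing both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    if union <= 0.0 or hull <= 0.0:
        return 0.0  # degenerate boxes: no meaningful overlap score

    return inter / union - (hull - union) / hull

def giou_loss(a, b):
    """Loss form: 0 for identical boxes, up to 2 for distant ones."""
    return 1.0 - giou(a, b)
```

Unlike plain IoU, the score keeps decreasing as non-overlapping boxes move further apart, so the loss still gives a useful signal when a prediction misses the target entirely.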

Dataset

COCO-style annotation format
Single class: person
Input resolution: 640×640
Supports training & validation splits

Example:

[data.train]
config = [
  "tmp/data/dataset/images",
  "tmp/data/dataset/result.json",
  [640, 640],
  true,
  {}
]
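COCO annotations store each box as `[x, y, width, height]` measured from the top-left corner, whereas most detection code works with corner coordinates. The sketch below shows that conversion on a tiny in-memory annotation dictionary. The sample values and helper names are made up for illustration; the project's own loader lives in `tmp/data/coco_dataset.py` and may differ.

```python
from collections import defaultdict

def coco_to_xyxy(bbox):
    """Convert a COCO [x, y, w, h] box to [x1, y1, x2, y2]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]

def boxes_by_image(coco):
    """Group converted boxes by image_id."""
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        grouped[ann["image_id"]].append(coco_to_xyxy(ann["bbox"]))
    return dict(grouped)

# Hypothetical single-class annotation file content.
sample = {
    "images": [{"id": 1, "file_name": "frame_0001.jpg", "width": 1920, "height": 1080}],
    "categories": [{"id": 1, "name": "person"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 20, 30, 40]},
        {"id": 2, "image_id": 1, "category_id": 1, "bbox": [100, 50, 20, 60]},
    ],
}

grouped = boxes_by_image(sample)
```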

Training

Train from scratch

python -m tmp.train.train \
  --config microdet.toml \
  --device cuda

Resume training

python -m tmp.train.train \
  --config microdet.toml \
  --device cuda \
  --resume runs/microdet_drone/weights/last.ckpt

Evaluation

Validation runs every val_interval epochs
Metrics:
- mAP
- Confidence stability
- Qualitative bounding box accuracy

Inference

Run inference on a test image:

python tmp/infer/run_infer.py

Output

Bounding boxes drawn on original image
Non-Maximum Suppression (NMS) applied

Output saved to:

tmp/infer/output.png
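The NMS step mentioned above can be sketched in plain Python as a greedy loop: keep the highest-scoring box, drop any remaining box that overlaps it too much, and repeat. The thresholds `iou_thr=0.5` and `score_thr=0.3` are placeholder values chosen for this example, not the settings used by `run_infer.py`.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_thr=0.5, score_thr=0.3):
    """Return indices of the boxes kept, highest score first."""
    # Discard low-confidence boxes, then visit the rest by descending score.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_thr),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.1]
kept = nms(boxes, scores)
```

In this example the second box is suppressed by the first (their IoU is about 0.68) and the fourth is filtered out by the score threshold.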

Sample Output

✔ Detected persons from aerial view
✔ Bounding boxes after NMS
✔ Scaled correctly to original image resolution
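Scaling boxes back to the original resolution can be sketched as follows. This assumes the image was resized directly to the 640×640 network input with no letterbox padding; if the pipeline pads to preserve aspect ratio, the padding offset would have to be subtracted first. The function name `scale_boxes` is illustrative.

```python
def scale_boxes(boxes, input_size, orig_size):
    """Map (x1, y1, x2, y2) boxes from network input to original pixels.

    input_size and orig_size are (height, width) tuples.
    """
    in_h, in_w = input_size
    orig_h, orig_w = orig_size
    sx, sy = orig_w / in_w, orig_h / in_h

    scaled = []
    for x1, y1, x2, y2 in boxes:
        # Clamp so boxes never extend past the image border.
        scaled.append((
            min(max(x1 * sx, 0.0), orig_w),
            min(max(y1 * sy, 0.0), orig_h),
            min(max(x2 * sx, 0.0), orig_w),
            min(max(y2 * sy, 0.0), orig_h),
        ))
    return scaled

# A box covering the bottom-right quadrant of the 640x640 input,
# mapped onto a hypothetical 1920x1080 drone frame.
result = scale_boxes([(320, 320, 640, 640)], (640, 640), (1080, 1920))
```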

Early in training the model may emit many low-confidence boxes; adjusting the assigner radius and raising the confidence threshold typically reduces them.

Configuration Highlights

Assigner (Important)

[model.head.assigner_cfg]
config = ["CenterAssigner", { "8" = 5.0, "16" = 5.0, "32" = 5.0 }]
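The README does not spell out how `CenterAssigner` uses these per-stride radii. One common form of center-based assignment marks a grid point as positive when it lies within `radius * stride` pixels of a ground-truth box center, and the sketch below illustrates that idea only. The function `center_assign`, the square (per-axis) distance test, and the interpretation of the radius in stride units are all assumptions, not the project's confirmed behavior.

```python
def center_assign(points, gt_box, radius_by_stride):
    """Return indices of grid points treated as positives for one box.

    points: list of (cx, cy, stride) tuples
    gt_box: (x1, y1, x2, y2) ground-truth box
    radius_by_stride: dict keyed by stride as a string, as in the config
    """
    center_x = (gt_box[0] + gt_box[2]) / 2.0
    center_y = (gt_box[1] + gt_box[3]) / 2.0

    positives = []
    for idx, (px, py, stride) in enumerate(points):
        # Assumed: radius is expressed in stride units.
        reach = radius_by_stride[str(stride)] * stride
        if abs(px - center_x) <= reach and abs(py - center_y) <= reach:
            positives.append(idx)
    return positives

# Hypothetical points on the stride-8 and stride-32 maps.
points = [(4.0, 4.0, 8), (12.0, 4.0, 8), (100.0, 100.0, 8), (16.0, 16.0, 32)]
positives = center_assign(points, (0, 0, 16, 16), {"8": 1.0, "32": 1.0})
```

Under this interpretation, a larger radius yields more positive samples per object, which tends to help small aerial targets get enough training signal but can also increase duplicate detections.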

Optimizer

[schedule.optimizer]
config = ["adamw", 0.0005, 0.05, true, true]

Known Challenges

Low confidence during early epochs
Over-detection without NMS
DFL decoding tensor contiguity issues
Correct stride handling during inference

✔ All addressed through architectural tuning and post-processing.

Tech Stack

Language: Python
Framework: PyTorch
Vision: OpenCV
Model: MicroDet
Data Format: COCO
Hardware: CUDA GPU

Applications

Drone surveillance
Crowd monitoring
Search & rescue
Smart city analytics
Edge AI deployments

Future Work

Multi-class detection
Video inference & tracking
Model quantization
Edge deployment (Jetson / TPU)
Knowledge distillation

Author - Ravindran S

Developer • ML Enthusiast • Neovim Customizer • Linux Power User

Hi! I'm Ravindran S, an engineering student passionate about:

Linux & System Engineering
AI/ML (Artificial Intelligence & Machine Learning)
Full-stack Web Development
Hackathon-grade project development

🔗 Connect With Me

You can reach me here:

ravindran-dev/microdet_v2