ravindran-dev/microdet_v2
MicroDet – Lightweight Drone-Based Object Detection
A lightweight, anchor-free object detection system built using MicroDet, optimized for aerial / drone imagery.
This project focuses on efficient person detection with minimal computational overhead, making it suitable for edge devices and UAV applications.
Features
- Lightweight MicroDet architecture
- Anchor-free detection (DFL-based regression)
- COCO-format dataset support
- Mixed-precision (AMP) training
- EMA (Exponential Moving Average) weights
- End-to-end pipeline: Train → Validate → Infer
- Bounding box visualization with NMS
- Config-driven (.toml) model & training setup
Project Structure
microdet/
├── tmp/
│   ├── model/
│   │   ├── backbone/
│   │   ├── neck/
│   │   ├── detect/
│   │   ├── loss/
│   │   └── model_wrapper.py
│   ├── train/
│   │   ├── train.py
│   │   └── validate.py
│   ├── infer/
│   │   ├── run_infer.py
│   │   └── image.png
│   └── data/
│       ├── coco_dataset.py
│       └── collate.py
├── runs/
│   └── microdet_drone/
│       ├── weights/
│       │   ├── last.ckpt
│       │   └── best.ckpt
│       └── logs/
├── microdet.toml
├── requirements.txt
└── README.md
Model Overview
MicroDet is a one-stage, anchor-free detector designed for speed and efficiency.
Architecture
- Backbone: Lightweight CNN for feature extraction
- Neck: Multi-scale feature aggregation
- Head:
  - Classification branch (Quality Focal Loss)
  - Regression branch (Distribution Focal Loss – DFL)
- Feature Map Strides: [8, 16, 32]
Loss Functions
| Loss Type | Purpose |
|---|---|
| Quality Focal Loss (QFL) | Classification confidence |
| Distribution Focal Loss (DFL) | Bounding box regression |
| GIoU Loss | Box overlap accuracy |
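As a rough illustration of the GIoU row in the table above, the following is a minimal pure-Python sketch of the standard Generalized IoU formula for two axis-aligned boxes. The function names `giou` and `giou_loss` are illustrative and are not taken from this repository; the actual loss module may be batched and implemented differently.

```python
def giou(box_a, box_b):
    """Generalized IoU for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection rectangle (zero area if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Smallest axis-aligned box that encloses both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h
    if enclose <= 0:
        return iou

    # GIoU subtracts the fraction of the enclosing box not covered by the union.
    return iou - (enclose - union) / enclose


def giou_loss(pred, target):
    """Loss form: 0 for a perfect match, up to 2 for distant boxes."""
    return 1.0 - giou(pred, target)
```

Unlike plain IoU, GIoU stays informative when the boxes do not overlap: two unit boxes one unit apart give a negative value instead of zero, so the loss still reflects how far apart they are.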
The loss settings are configured in microdet.toml:
[model.head.loss]
config = [2.0, 0.25, 1.0, 7, "giou"]
Dataset
- COCO-style annotation format
- Single class: person
- Input resolution: 640×640
- Supports training & validation splits
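To make the COCO-style format concrete, here is a small sketch that groups annotations by image and converts COCO's `[x, y, width, height]` boxes to corner form. It relies only on the standard COCO keys (`categories`, `annotations`, `image_id`, `category_id`, `bbox`); the helper name `boxes_per_image` is hypothetical and the repository's `coco_dataset.py` may work differently.

```python
from collections import defaultdict


def boxes_per_image(coco, class_name="person"):
    """Group COCO annotations by image id as (x1, y1, x2, y2) tuples."""
    # Resolve the class name to its category id(s).
    wanted = {c["id"] for c in coco["categories"] if c["name"] == class_name}

    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        if ann["category_id"] not in wanted:
            continue
        x, y, w, h = ann["bbox"]  # COCO stores the top-left corner plus size
        grouped[ann["image_id"]].append((x, y, x + w, y + h))
    return dict(grouped)


# Tiny in-memory example; real use would json.load() the annotation file.
sample = {
    "images": [{"id": 1, "file_name": "frame_0001.jpg", "width": 1920, "height": 1080}],
    "categories": [{"id": 1, "name": "person"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10.0, 20.0, 30.0, 60.0]},
    ],
}
result = boxes_per_image(sample)
```

Images without any `person` annotation simply do not appear in the result, so a loader built on this would need to decide whether to keep them as negatives.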
Example configuration:
[data.train]
config = [
  "tmp/data/dataset/images",
  "tmp/data/dataset/result.json",
  [640, 640],
  true,
  {}
]
Training
Train from scratch
python -m tmp.train.train \
  --config microdet.toml \
  --device cuda
Resume training
python -m tmp.train.train \
  --config microdet.toml \
  --device cuda \
  --resume runs/microdet_drone/weights/last.ckpt
Evaluation
- Validation runs every val_interval epochs
- Metrics:
  - mAP
  - Confidence stability
  - Qualitative bounding box accuracy
Inference
Run inference on a test image:
python tmp/infer/run_infer.py
Output
- Bounding boxes drawn on original image
- Non-Maximum Suppression (NMS) applied
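The NMS step listed above can be sketched as greedy suppression: keep the highest-scoring box, drop any remaining box that overlaps it too much, and repeat. This is a generic pure-Python illustration, not necessarily what `run_infer.py` does; the `iou_thr=0.5` default is an assumed value.

```python
def iou(a, b):
    """Intersection over union for two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thr=0.5):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_thr for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

In the example, the second box overlaps the first with an IoU of about 0.68 and is suppressed, while the distant third box survives.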
Output saved to:
tmp/infer/output.png
Sample Output
✔ Detected persons from aerial view
✔ Bounding boxes after NMS
✔ Scaled correctly to original image resolution
Early in training the model may produce many low-confidence boxes; tuning the assigner radius and the confidence threshold typically improves results.
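The confidence threshold and the scaling back to the original resolution can be illustrated together. The sketch below assumes the image was resized directly to the network input (no letterbox padding), that sizes are given as (width, height), and that `conf_thr=0.3` is a reasonable starting point; all three are assumptions, and `to_original` is a hypothetical helper name.

```python
def to_original(boxes, scores, input_size, orig_size, conf_thr=0.3):
    """Drop low-confidence boxes and map the rest to original-image pixels."""
    in_w, in_h = input_size
    orig_w, orig_h = orig_size
    scale_x = orig_w / in_w
    scale_y = orig_h / in_h

    results = []
    for (x1, y1, x2, y2), score in zip(boxes, scores):
        if score < conf_thr:
            continue
        # Scale each coordinate and clamp it to the image bounds.
        scaled = (
            min(max(x1 * scale_x, 0.0), orig_w),
            min(max(y1 * scale_y, 0.0), orig_h),
            min(max(x2 * scale_x, 0.0), orig_w),
            min(max(y2 * scale_y, 0.0), orig_h),
        )
        results.append((scaled, score))
    return results


detections = to_original(
    boxes=[(320, 320, 640, 640), (0, 0, 10, 10)],
    scores=[0.9, 0.1],
    input_size=(640, 640),
    orig_size=(1920, 1080),
)
```

Raising `conf_thr` trades recall for fewer spurious boxes, which is usually the first knob to try when early checkpoints over-detect.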
Configuration Highlights
Assigner (Important)
[model.head.assigner_cfg]
config = ["CenterAssigner", { "8" = 5.0, "16" = 5.0, "32" = 5.0 }]
Optimizer
[schedule.optimizer]
config = ["adamw", 0.0005, 0.05, true, true]
Known Challenges
- Low confidence during early epochs
- Over-detection without NMS
- DFL decoding tensor contiguity issues
- Correct stride handling during inference
✔ All addressed through architectural tuning and post-processing.
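To show where the stride enters DFL decoding, here is a minimal pure-Python sketch for a single grid point. Each box side is predicted as a distribution over discrete bins; the decoded distance is the expected bin index multiplied by the feature-map stride. The bin count of 8 (that is, `reg_max = 7`) and the function names are assumptions for illustration, and the repository's head operates on batched tensors instead.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_bin(logits):
    """Expected bin index under the softmax distribution (the DFL read-out)."""
    return sum(i * p for i, p in enumerate(softmax(logits)))


def decode_box(cx, cy, side_logits, stride):
    """Turn four per-side distributions into (x1, y1, x2, y2) in input pixels.

    side_logits holds logits for (left, top, right, bottom). Distances are
    predicted in grid units, so each one is multiplied by the level's stride.
    """
    left, top, right, bottom = (expected_bin(s) * stride for s in side_logits)
    return (cx - left, cy - top, cx + right, cy + bottom)


# Uniform logits over 8 bins (assumed reg_max = 7) at the stride-8 level.
uniform = [0.0] * 8
box = decode_box(100.0, 100.0, [uniform] * 4, stride=8)
```

Forgetting the stride multiplication produces boxes that are far too small at the coarser levels. In a PyTorch implementation, reshaping the logits after a `permute` generally needs `.contiguous()` or `reshape` rather than `view`, which is one common source of the contiguity errors listed above.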
Tech Stack
- Language: Python
- Framework: PyTorch
- Vision: OpenCV
- Model: MicroDet
- Data Format: COCO
- Hardware: CUDA GPU
Applications
- Drone surveillance
- Crowd monitoring
- Search & rescue
- Smart city analytics
- Edge AI deployments
Future Work
- Multi-class detection
- Video inference & tracking
- Model quantization
- Edge deployment (Jetson / TPU)
- Knowledge distillation
Author - Ravindran S
Developer • ML Enthusiast • Neovim Customizer • Linux Power User
Hi! I'm Ravindran S, an engineering student passionate about:
- Linux & System Engineering
- AIML (Artificial Intelligence & Machine Learning)
- Full-stack Web Development
- Hackathon-grade project development
🔗 Connect With Me
You can reach me here: