32 results for “topic:sota-model”
A collection and survey of cutting-edge vision-language model papers and model GitHub repositories. Continuously updated.
A repository for organizing papers, code, and other resources related to virtual try-on models
[IEEE TII 2025] Official Implementation for "Dual-Detector Reoptimization for Federated Weakly Supervised Video Anomaly Detection via Adaptive Dynamic Recursive Mapping"
Code and data for "Timo: Towards Better Temporal Reasoning for Language Models" (COLM 2024)
[DEPRECATED] Very fast, large music transformer with 8k sequence length, efficient heptabit MIDI notes encoding, true full MIDI instruments range, chord counters and outro tokens
Researchers who published code, models (in some cases), and demo apps (in a few cases) along with their SOTA papers
[IEEE JSTARS 2026] Mamba-FCS: Joint Spatio-Frequency Feature Fusion, Change-Guided Attention, and SeK Inspired Loss for Enhanced Semantic Change Detection in Remote Sensing
[SOTA] MIDI Tempo Detection AI implementation and model (94% accuracy on any MIDI)
figsr — a frequency-domain (FFT-based) SISR architecture. Enhances detail reconstruction and inference speed, combining the strengths of CNNs and Transformers while mitigating their core limitations.
[DEPRECATED] [339M] [88% acc] Fast full-featured drums inpainting transformer with octo-velocity
SOTA pure drums transformer capable of generating drum tracks for any source composition
This repository includes multiple competition solutions and tutorials in deep learning and machine learning
B.Sc. thesis: deep learning and NLP research on medical image captioning
SOTA quality fast music transformer with symmetrical quad MIDI notes encoding
Investigation of the capabilities of foundation models in the context of time series forecasting
Contributions to ML tasks in the form of tools, videos, notebooks, apps, and APIs
A multi-agent real-time local discovery system with intent parsing, live place retrieval, review synthesis, transit-aware ranking, explainable recommendations, and SSE progress streaming.
Papers and a survey of the literature on the semantic similarity task
Implementation of the MCNN-14 model for fashion image classification, achieving 93.08% accuracy on Fashion-MNIST. Based on our paper “An Efficient Multiple Convolutional Neural Network Model (MCNN-14) for Fashion Image Classification.”
A from-scratch SOTA PyTorch implementation of the Inception-ResNet-V2 model designed by Szegedy et al., adapted for Face Emotion Recognition (FER), with custom dataset support.
2021.10~2022.4
2021.10~2022.1
SOTA model from scratch
Comparing a customised CNN model and a pre-trained MobileNetV2 on image classification of houses damaged during a hurricane
QTrack: Query Driven Reasoning for Multimodal MOT
Deep learning analysis for craft beer label classification using some custom and state-of-the-art models
Domain-specific ML implementations, including CV, NLP, and bioinformatics.
Explore and run popular and interesting SOTA models across computer vision, NLP, audio, and multimodal tasks.
Detecting faces using various computer vision methods, such as Haar cascades and cutting-edge YOLO (You Only Look Once).
This repository contains code for detecting Personal Protective Equipment (PPE) using YOLOv8 and YOLO-World's Custom Model with Custom Classes. The goal of this project is to identify whether individuals in images are wearing appropriate PPE such as helmets, safety vests, goggles, etc.