Ilya Borovik
ilya16
PhD candidate: music generation / text-to-speech synthesis
Top Repositories
ScorePerformer: Expressive Piano Performance Rendering with Fine-Grained Control (ISMIR 2023)
DeepHumor: Image-based Meme Generation using Deep Learning
SyMuPe: Affective and Controllable Symbolic Music Performance (ACM MM '25, Outstanding Paper Award)
Graph Database Engine
An introduction course on Speech Synthesis and Voice Cloning (Skoltech ISP'25)
Material for the paper "Automatic Quality Assessment of Transcribed Piano MIDI" (ISMIR 2025 LBD)
Repositories (35)
SyMuPe: Affective and Controllable Symbolic Music Performance (ACM MM '25, Outstanding Paper Award)
A python package for handling modern staff notation of music
A convenient MIDI tokenizer for Deep Learning networks, with multiple encoding strategies
An introduction course on Speech Synthesis and Voice Cloning (Skoltech ISP'25)
ScorePerformer: Expressive Piano Performance Rendering with Fine-Grained Control (ISMIR 2023)
DeepHumor: Image-based Meme Generation using Deep Learning
Material for the paper "Automatic Quality Assessment of Transcribed Piano MIDI" (ISMIR 2025 LBD)
A storage for audio samples used in other repos
A simple TTS model developed for the Speech Synthesis and Voice Cloning course (Skoltech ISP'25)
Transfer Learning and DNN tuning on the MNIST dataset
Graph Database Engine
Vector Quantization, in Pytorch
Multi-Instrumental Neural Network for multi-track sequence generation
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch
A simple but complete full-attention transformer with a set of promising experimental features from various papers
Financial Quest with Oleg, Tinkoff voice assistant
Audio watermarking based on the Singular Value Decomposition
Simple Taxi Ordering Web App written in Java EE
Flowtron is an auto-regressive flow-based generative network for text to speech synthesis with control over speech variation and style transfer
An implementation of local windowed attention for language modeling
Fully featured implementation of Routing Transformer
Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
No description provided.
Personalized tasks & achievements based on customer retail transaction data
Lifestyle advice based on the transaction data
Reformer, the efficient Transformer, in Pytorch
Magenta: Music and Art Generation with Machine Intelligence
Simple analysis of attacks on RSA
MNIST dataset model in a Docker image
Application of Simulated Annealing to the Travelling Salesman problem