Adrián Bazaga
AdrianBZG
Senior Researcher @ Microsoft. Foundational LLM Development. PhD, Machine Learning @ University of Cambridge. Ex: Amazon AGI, Microsoft Research
Languages
Repos: 205
Stars: 275
Forks: 89
Top Language: Python
Top Repositories
Efficiently tune any LLM from HuggingFace using distributed training (multiple GPUs) and DeepSpeed. Uses Ray AIR to orchestrate training across multiple AWS GPU instances
Automated Twitter mass account creation and follow using Selenium and Tor VPN
Multimodal Instruction Tuning for Llama 3
[Nature Scientific Reports] Translating synthetic natural language to database queries: a polyglot deep learning framework
[EMNLP 2024] HyperBERT: Mixing Hypergraph-Aware Layers with Language Models for Node Classification on Text-Attributed Hypergraphs
Virtual Reality game for the Intelligent Interfaces subject, made with Unity Engine.
Repositories
205
Multimodal Instruction Tuning for Llama 3
[ICML 2024] TabMDA: Tabular Manifold Data Augmentation for Any Classifier using Transformers with In-context Subsetting
[EMNLP 2024] HyperBERT: Mixing Hypergraph-Aware Layers with Language Models for Node Classification on Text-Attributed Hypergraphs
Automated Twitter mass account creation and follow using Selenium and Tor VPN
Efficiently tune any LLM from HuggingFace using distributed training (multiple GPUs) and DeepSpeed. Uses Ray AIR to orchestrate training across multiple AWS GPU instances
Personal website
Multilingual Sentence & Image Embeddings with BERT
FLUID-LLM: Learning Computational Fluid Dynamics with Spatiotemporal-aware Large Language Models
[Nature Scientific Reports] Translating synthetic natural language to database queries: a polyglot deep learning framework
[ACL 2025] Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models
[ICLR 2024] Unsupervised Pretraining for Fact Verification by Language Model Distillation
[Applied Soft Computing] A Convolutional Neural Network for the automatic diagnosis of collagen VI-related muscular dystrophies
Genome-wide investigation of gene-cancer associations for the prediction of novel therapeutic targets in oncology
SQLformer: Deep Auto-Regressive Query Graph Generation for Text-to-SQL Translation
[ICLR 2024] Language Model Knowledge Distillation for Efficient Question Answering in Spanish
Content for the CaRAML website
Fast Gene Set Enrichment Analysis
Official implementation of the TabPFN paper (https://arxiv.org/abs/2207.01848) and the tabpfn package.
Virtual Reality game for the Intelligent Interfaces subject, made with Unity Engine.
ScanCode scans code and detects licenses, copyrights, package manifests & dependencies and more ... to discover and inventory open source and third-party packages used in your code.
No description provided.
Machine Learning Lectures at the European Space Agency (ESA) in 2018
Pseudocode descriptions of the algorithms from Russell And Norvig's "Artificial Intelligence - A Modern Approach"
Java implementation of algorithms from Russell And Norvig's "Artificial Intelligence - A Modern Approach"
My personal website, deployed using Docker
No description provided.
Natural Language Processing Tutorial for Deep Learning Researchers
Autoencoders for Link Prediction and Semi-Supervised Node Classification
InterMine Data Browser: a tool for exploring semi-homogeneous biological datasets
Code for the paper "BIOLITMAP: a web-based geolocated, temporal and thematic visualization of the evolution of bioinformatics publications", Bazaga et al. (2018). Accepted in Oxford Bioinformatics. doi:10.1093/bioinformatics/bty967