53 results for “topic:llm-fine-tuning”
An innovative library for efficient LLM inference via low-bit quantization
Collection of resources for finetuning Large Language Models (LLMs).
Synthetic dataset generation workflow using local file resources for finetuning LLMs.
Sustain-LC is a benchmarking environment for traditional, reinforcement-learning-based, and LLM-based control.
Distributed Reinforcement Learning for LLM Fine-Tuning with multi-GPU utilization
This repository contains code associated with Neuro-LIFT: A Neuromorphic, LLM-based Interactive Framework for Autonomous Drone FlighT at the Edge
A sacred space for heartfelt conversations, where wisdom flows freely and memories gently fade like whispers at sunset.
The Personal Knowledge Graph You Didn’t Know You Already Wrote
Advanced Data Analysis with Causality and Reinforcement Learning
Experiments in Latin dactylic hexameter generation with transformers: A hybrid post hoc feedback framework
The course teaches how to fine-tune LLMs using Group Relative Policy Optimization (GRPO), a reinforcement learning method that improves model reasoning with minimal data. Learn RFT concepts, reward design, and LLM-as-a-judge evaluation, and deploy jobs on the Predibase platform.
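The distinguishing step in GRPO is that it replaces a learned value model with group-relative reward normalization: several completions are sampled per prompt, each is scored, and the advantage of each completion is its reward standardized within the group. A minimal sketch of that computation (function name and toy rewards are illustrative, not from the course):

```python
# Group-relative advantage computation, the core idea of GRPO:
# sample several completions per prompt, score each with a reward,
# and normalize rewards within the group (no learned value model).

def grpo_advantages(rewards, eps=1e-8):
    """Advantage per completion: (r - group mean) / (group std + eps)."""
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# Toy group of 4 completions: two scored 1.0, two scored 0.0.
advs = grpo_advantages([1.0, 0.0, 0.0, 1.0])
print(advs)  # roughly [1.0, -1.0, -1.0, 1.0]
```

Completions scoring above the group mean get positive advantages and are reinforced; those below are suppressed, which is what drives the reasoning improvements with small datasets.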
Fully Connected Neural Networks, Multilayer Neural Networks, MAdaline, CNNs, Segmentation, Detection, RNNs, CNN-LSTM, LSTM, Bi-LSTM, GRU, Transformers, Huber Loss, ViT, DGMs, Triplet VAE, AdvGAN, Image Caption Generation, attention, LLM Fine-Tuning, Soft Prompting, LoRA, Layer Freezing, SlimOrca
FlowerTune LLM on Coding Dataset
ARC-Test-Time-Training (ARC-TTT)
This repository contains all the notebooks, resources, and documentation used to develop and evaluate models for the Automated Essay Scoring (AES) Kaggle competition. The project aims to build an open-source solution for automated essay evaluation to support educators and provide timely feedback to students.
Chaining thoughts and LLMs to learn DNA structural biophysics
A web app for detecting plagiarism between two PDFs. Users can upload PDF files, and the app detects plagiarism with a fine-tuned LLM (SmolLM2-135M) trained on the MIT Plagiarism Detection Dataset. 700+ monthly downloads on the Hugging Face model repo.
Comparing QLoRA, Prompt & Prefix Tuning on Mistral-7B for medical instruction-following
Orion employs mode-specific prompt templates that dynamically incorporate user preferences. Précis Mode: fast-track synthesis with executive summaries (100-500 words, ~4K tokens). Synopsis Mode: balanced analytical reports with structured sections (1500-2500 words, ~8K tokens). Treatise Mode: academic-grade research with abstracts (2000-4000 words).
FlowerTune LLM on NLP Dataset
No description provided.
Clone your Discord friends with AI!
FlowerTune LLM on Medical Dataset
Daily ML practice notebooks covering tabular data, deep learning, and weekend LLM fine-tuning experiments.
Django implementation of CodeBERT for detecting vulnerable code.
A small dialogue dataset exploring the boundaries of machine decision-making, agency, and alignment. Useful for fine-tuning conversational agents or testing moral reasoning
LLM fine-tuning tool: faster, with lower memory usage.
Schematic blueprint for finetuning an LLM (e.g. Qwen or Llama) for text classification using LoRA. The output model can keep the original head or use a modified one (e.g. for SequenceClassification).
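The LoRA technique behind the blueprint above keeps the pretrained weight matrix W frozen and learns only a low-rank pair (A, B), merging them at the end as W + (alpha / r) * B @ A. A minimal pure-Python illustration of that merge (all names and the toy matrices are hypothetical, not taken from the repository):

```python
# LoRA weight merge sketch: W is the frozen d_out x d_in weight,
# A (r x d_in) and B (d_out x r) are the trained low-rank factors.

def matmul(X, Y):
    """Plain nested-list matrix multiply."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_merge(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A), the standard LoRA merge."""
    scale = alpha / r
    delta = matmul(B, A)
    return [[W[i][j] + scale * delta[i][j]
             for j in range(len(W[0]))] for i in range(len(W))]

# Toy example: d_out = d_in = 2, rank r = 1.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]            # r x d_in
B = [[0.5], [0.25]]         # d_out x r
merged = lora_merge(W, A, B, alpha=2, r=1)
print(merged)  # [[2.0, 2.0], [0.5, 2.0]]
```

Because only A and B are trained (r is typically 4-64, far smaller than the model dimension), the number of trainable parameters drops by orders of magnitude, which is why LoRA pairs well with the layer-freezing approach the blueprint describes.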
A comprehensive, production-ready tutorial for fine-tuning Google's FunctionGemma-270M-IT model to build an intelligent E-Commerce Customer Support AI Agent with advanced function calling capabilities.
No description provided.