55 results for “topic:kl-divergence”
A PyTorch package for non-negative matrix factorization.
IJCAI 2021, "Comparing Kullback-Leibler Divergence and Mean Squared Error Loss in Knowledge Distillation"
A PyTorch Implementation of Generating Sentences from a Continuous Space by Bowman et al. 2015.
This repository summarizes techniques for the KL divergence vanishing problem.
Repository for "Blending Data-Driven Priors in Dynamic Games" - RSS 2024
Experiments with variational autoencoders in Julia
Implementations of basic concepts under the Reinforcement Learning umbrella. This project is a collection of assignments from CS747: Foundations of Intelligent and Learning Agents (Autumn 2017) at IIT Bombay.
This repository contains the lab work for the Coursera course "Generative AI with Large Language Models".
Code for enumerating and evaluating numerical methods for Langevin dynamics using near-equilibrium estimates of the KL-divergence. Accompanies https://doi.org/10.3390/e20050318
PyTorch implementation of α-geodesical skew divergence
PyTorch implementations of the beta divergence loss.
KL-loss
A self-distillation based training method for long context reasoning in a single LLM without reinforcement learning
Machine Learning algorithms built from scratch for AMMI Machine Learning course
Code for "EMA Policy Gradient: Taming Reinforcement Learning for LLMs with EMA Anchor and Top-k KL" (arxiv.org/abs/2602.04417)
My MSc project on applying, tuning, and modifying the PPO and A2C algorithms for the two-player poker game in the PettingZoo MARL library.
No description provided.
Relative entropy, mutual information, and KL divergence of two given images 🖼
Change point detection using KL divergence
Basic GANs with a variety of loss functions (KL, reverse KL, JS, and Wasserstein) as an exercise for my thesis with Prof. Randy Paffenroth.
Hyperspectral unmixing using Variational Autoencoders with Dirichlet latent distributions, achieving state-of-the-art performance on endmember and abundance reconstruction.
This repository includes some detailed proofs of "Bias Variance Decomposition for KL Divergence".
Implementation of KL divergence and an inverted vector model for plagiarism detection in text files.
Coordinate ascent mean-field variational inference (CAVI) using the evidence lower bound (ELBO) to iteratively perform the optimal variational factor distribution parameter updates for clustering.
A collection of summarizer algorithms
The Dirichlet Mechanism for Differentially Private KL Divergence Minimization
Implementation of diffusion models with varying noise distributions (Gaussian, GMM, Gamma) and scheduling techniques (cosine, sigmoid) to assess generative performance using KL divergence and dynamic scheduling approaches.
NLP implementations like information-theoretic measures of distributional similarity, text preprocessing using shell commands, Naive Bayes text categorization model, Cocke-Younger-Kasami parsing.
Python information theory computation
A fully transparent Boltzmann machine is trained on Monte-Carlo-simulated 1-D Ising chain data to predict model couplers in the absence of past coupler values, showcasing machine-learning methods applied to theoretical physics.