53 results for “topic:trustworthy-machine-learning”
An open-source Python toolbox for backdoor attacks and defenses.
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
Open-source framework for uncertainty quantification in deep learning models in PyTorch 🌱
Neural Network Verification Software Tool
[ICML 2022 Long Talk] Official PyTorch implementation of "To Smooth or Not? When Label Smoothing Meets Noisy Labels"
A project that adds scalable, state-of-the-art out-of-distribution detection (open-set recognition) support by changing two lines of code! Performs efficient inference (i.e., no increase in inference time) and detection without a classification accuracy drop, hyperparameter tuning, or collecting additional data.
Papers and online resources related to machine learning fairness
[ICCV2021 Oral] Fooling LiDAR by Attacking GPS Trajectory
Papers related to Federated Learning in all top venues
PyTorch package to train and audit ML models for Individual Fairness
Welcome! 👋 This is the working draft of the Aalto Dictionary of Machine Learning (ADictML) — a growing collection of short, clear definitions for key terms in machine learning.
A project to improve out-of-distribution detection (open-set recognition) and uncertainty estimation by changing a few lines of code in your project! Performs efficient inference (i.e., no increase in inference time) without repetitive model training, hyperparameter tuning, or collecting additional data.
A list of research papers of explainable machine learning.
[NeurIPS 2025] Official repo for "Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning"
Privacy-Preserving Machine Learning (PPML) Tutorial
SyReNN: Symbolic Representations for Neural Networks
Framework for Adversarial Malware Evaluation.
Trustworthy AI method based on Dempster-Shafer theory, applied to fetal brain 3D T2w MRI segmentation
A tool for comparing the predictions of any text classifiers.
Morphence: An implementation of a moving target defense against adversarial example attacks demonstrated for image classification models trained on MNIST and CIFAR-10.
MERLIN is a global, model-agnostic, contrastive explainer for any tabular or text classifier. It provides contrastive explanations of how the behaviour of two machine learning models differs.
[Findings of EMNLP 2022] Holistic Sentence Embeddings for Better Out-of-Distribution Detection
A project to train your model from scratch or fine-tune a pretrained model using the losses provided in this library to improve out-of-distribution detection and uncertainty estimation performance. Calibrate your model to produce better uncertainty estimates. Detect out-of-distribution data using the defined score type and threshold.
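The score-and-threshold recipe this entry describes can be sketched generically (this is not the library's actual API; the maximum-softmax-probability score and the threshold value below are illustrative assumptions):

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class dimension.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def msp_score(logits):
    """Maximum softmax probability: higher means more in-distribution."""
    return softmax(logits).max(axis=1)

def detect_ood(logits, threshold):
    """Flag inputs whose score falls below the threshold as out-of-distribution."""
    return msp_score(logits) < threshold

# Confident (in-distribution-like) vs. near-uniform (OOD-like) logits.
logits = np.array([[8.0, 0.5, 0.2],    # peaked -> high MSP score
                   [0.4, 0.5, 0.45]])  # flat -> low MSP score
flags = detect_ood(logits, threshold=0.6)
print(flags.tolist())  # [False, True]
```

In practice the threshold is chosen on held-out in-distribution data (e.g., to fix a false-positive rate), and libraries like the one above typically swap in stronger score functions than MSP.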
Safety Mirage: How Spurious Correlations Undermine VLM Safety Fine-tuning
Repository for the NeurIPS 2023 paper "Beyond Confidence: Reliable Models Should Also Consider Atypicality"
A School for All Seasons on Trustworthy Machine Learning
Initiating a paradigm shift in reporting and helping make ML advances more considerate of sustainability and trustworthiness.
TRIAGE: Characterizing and auditing training data for improved regression (NeurIPS 2023)
Implementation for the paper "Approximating full conformal prediction at scale via influence functions"
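For context on what this entry approximates: full conformal prediction is expensive, which is why the cheaper split-conformal variant is the common baseline. A minimal sketch of split conformal regression (generic textbook procedure, not this paper's influence-function method; the residuals are simulated):

```python
import numpy as np

def split_conformal_quantile(cal_residuals, alpha):
    """Quantile of calibration residuals giving >= 1 - alpha coverage.

    Uses the finite-sample-corrected level ceil((n + 1) * (1 - alpha)) / n.
    """
    n = len(cal_residuals)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(cal_residuals, level)

rng = np.random.default_rng(0)
# Hypothetical calibration residuals |y - f(x)| from a held-out split.
residuals = np.abs(rng.normal(0.0, 1.0, size=500))
q = split_conformal_quantile(residuals, alpha=0.1)
# A prediction interval for a new input x is then [f(x) - q, f(x) + q].
print(q > 0)  # True
```

The appeal of the split approach is that the model is trained once; full conformal prediction instead refits (or, as in the paper above, approximates refitting) for every candidate label, which is what influence functions make tractable.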
Code from PLDI '21 paper "Provable Repair of Deep Neural Networks."