31 results for “topic:trustworthiness”
Survey of Small Language Models from Penn State, ...
Deep Fact Validation
Provides web credibility models (Likert scale) to assign a trustworthiness score to a given website.
[USENIX Security 2025] Topic-FlipRAG: Topic-Orientated Adversarial Opinion Manipulation Attacks to Retrieval-Augmented Generation Models
A matrix providing clarified definitions of trustworthiness characteristics, and the relationships between them, in the AI/ML standards
A list of tools and methods for building trustworthy software following TrustOps principles.
In this paper, we introduce SAShA, a new attack strategy that leverages semantic features extracted from a knowledge graph to strengthen the efficacy of attacks against standard CF models. We performed an extensive experimental evaluation to investigate whether SAShA is more effective than baseline attacks against CF models, taking into account the impact of various semantic features.
In the dynamic landscape of medical artificial intelligence, this study explores the vulnerabilities of the Pathology Language-Image Pretraining (PLIP) model, a Vision Language Foundation model, under targeted attacks such as the Projected Gradient Descent (PGD) adversarial attack.
Trustworthiness Monitoring & Assessment Framework
Codes and Datasets for our WSDM 2022 Paper: "MTLTS: A Multi-Task Framework To Obtain Trustworthy Summaries From Crisis-Related Microblogs"
Visualization and embedding of large datasets using various Dimensionality Reduction (DR) techniques such as t-SNE, UMAP, PaCMAP & IVHD. Implementation of custom metrics to assess DR quality, with a complete explanation and workflow.
Independent continuation of a project from AstonHack 2017
CodeGenLink is a Visual Studio Code extension that interacts with GitHub Copilot Chat to generate code, analyze its origin, and identify the associated license.
Proof-Carrying Numbers (PCN): Trust is earned only by proof — the absence of a verification mark communicates uncertainty.
Website for health data science at KDD 2021
Emotion architecture from Reddit comments: rater behavior, semantic clusters, and contradiction mapping in GoEmotions.
Secure and trustworthy mobile AI.
Which LLM do you actually trust? Blind-test 100+ AI models with truth scoring and reasoning failure classification. No branding, no marketing — just data.
Squeeze your model with pressure prompts to see if its behavior leaks.
Proposal of a novel adversarial attack approach, called Target Adversarial Attack against Multimedia Recommender Systems (TAaMR), to investigate how MR behavior changes when the images of a category of low-recommended products (e.g., socks) are perturbed, with alterations that appear slight to humans, so that the deep neural classifier misclassifies them as the class of more recommended products (e.g., running shoes).
Proof of Freshness: collate proof of an authorship date.
Initializes IOTA Tangle peering between the K8s nodes of an aeriOS K8s domain
Component K - Trustworthiness Monitoring & Assessment Framework
In this work, we provide 24 combinations of attack/defense strategies and visual-based recommenders to 1) assess performance alteration on recommendation and 2) empirically verify the effect on final users through offline visual metrics.
Required files and configurations to bootstrap and operate an IOTA Tangle for trustworthiness management in aeriOS
Component M - Trustworthiness Monitoring & Assessment Framework
[Frontend] -- Web-based platform enabling users to inspect every step of the RAG methodology in the KG fact-checking process
An Assurance Process for Big Data Trustworthiness - Marco Anisetti, Claudio A. Ardagna, Filippo Berto
REST API to insert messages into an IOTA Tangle
This repository is an implementation of the paper "Trustworthy Medical Image Segmentation with improved performance for in-distribution samples" published in Neural Networks.