"topic:ai-interpretability" — Search

13 results for “topic:ai-interpretability”

A Mechanistic Interpretability Toolkit for Cross-Layer Transcoder Training and Attribution-Graph Visualization

ai-interpretability attribution-graphs auto-interpretability cross-layer-transcoder mechanistic-interpretability transcoder transformer-circuits visual-interface

drKeeman/glitch_core

AI Safety research platform for studying personality drift in AI systems using mechanistic interpretability and clinical assessment tools. Complete simulation framework with neural circuit analysis, statistical drift detection, and intervention protocols.

Jupyter Notebook · 3 · 1 · Updated 7 months ago

ai-behavior ai-interpretability ai-safety mechanistic-interpretability neural-circuits

scouzi1966/MLXLMProbe

Universal probing and interpretability tool for MLX language models on Apple Silicon

Python · 2 · 0 · Updated 1 month ago

ablation-studies ai-interpretability apple-intelligence expert-routing gpt-oss-20b llm-evaluation llm-inference mechanistic-interpretability mixture-of-experts mlx mlx-llms model-interpretability

skyline-GTRr32/OKI-TRACE

OKI TRACE: Local LLM observability. See step-by-step, layer-by-layer what your AI thinks. Logit Lens & Attention for HuggingFace models.

Python · 1 · 0 · Updated 1 month ago

ai ai-interpretability ai-tools ai-transparency attention-mechanism blackbox developer-tools glass-box-ai huggingface llm-debugging llm-observability local-llm logit-lens mechanistic-interpretability open-source python transformers

Heimdall-Organization/wpe-tme-language

WPE/TME: Text-native language for encoding semantic structure and temporal relationships. Geometric calculus with formal semantics. AI reasoning.

1 · 0 · Updated 3 months ago

ai-interpretability ai-reasoning ai-reasoning-systems cognitive-architecture cognitive-architectures explainable-ai field-theory geometric-calculus knowledge-representation language llm-scaffolding neuro-symbolic-ai notation phase-space reasoning-system reasoning-systems semantic-framework specification symbolic-reasoning temporal-logic

PotatoInfinity/Versor

Conformal Geometric Algebra (CGA) with efficient sequence modeling by introducing a recurrent rotor mechanism and a novel bit-masked hardware kernel that solves the computational bottleneck of Clifford products.

Python · 1 · 0 · Updated 1 month ago

ai-interpretability clifford-algebras efficient-deep-learning explainable-ai geometric-deep-learning geometry isotropic-architecture manifold-constrained-recurrence mathematical-physics paradigm-shift physical-ai scientific-machine-learning sequence-modeling sub-quadratic-attention

kou-saki/i-asked-it-to-forget

I Asked It to Forget, but It Didn't — A Case of Miscommunication Between AI and Humans

0 · 0 · Updated 11 months ago

ai-behaviour ai-errors ai-human-interaction ai-interpretability ai-research chatgpt chatgpt4 context-persistence hallucination inference-based-ai language-model miscommunication openai prompt-analysis prompt-design prompt-recall stateless-ai technical-report token-limit user-intent

harleone/Bernoulli-Inspired-Analysis-of-Neural-Information-Propagation

A NeuroAI project using a Bernoulli-inspired fluid-flow analogy to explore how information moves through neural networks. The signal strength in the network is treated as the "pressure" from Bernoulli's equation, the speed of information propagation as the "flow speed of fluid", and the activation level as the "opening and closing of valves".

0 · 0 · Updated 4 months ago

ai-interpretability bernoulli deep-learning machine-learning mnist neuralnetwork neuroai physics python visualisation

FrancyJGLisboa/whiteboard-suite

Human Retention Layer for AI Work — hand-solvable math shadow models + session reasoning distillation. Cross-platform agent skills for Claude Code, Copilot, Cursor, Windsurf, Cline, Codex CLI, Gemini CLI, and 10+ more.

PowerShell · 0 · 0 · Updated 1 week ago

agent-skill ai-interpretability math-transparency

AlexTMjugador/redwoodresearch-interp-docker

📦 Redwood Research's transformer interpretability tools, conveniently packaged in a Docker container for simple and reproducible deployments.

Dockerfile · 0 · 0 · Updated 1 year ago

ai ai-interpretability ai-safety docker redwood-research

EvezArt/evez-os

Open-source AI cognition layer — circuit-level topology engine producing verifiable FIRE events, bus validation receipts, and falsifiable cognition records in real time. AGPL-3.0.

Python · 0 · 0 · Updated 2 days ago

agpl ai-interpretability ai-transparency autonomous-agents decision-graph neural-network-visualization topology visual-cognition

rusparrish/Visual-Thinking-Lens

Framework for evaluating and steering generative image systems using geometry-first metrics, structural stress testing, and constraint-based analysis. Designed to expose compositional collapse, spatial priors, and model failure modes without accessing training data or model internals.

Jupyter Notebook · 0 · 0 · Updated 3 weeks ago

ai-art-research ai-critique-framework ai-image-analysis ai-interpretability artificial-vision generative-ai image-quality-assessment image-recursion machine-vision prompt-architecture recursive-systems structural-critique visual-intelligence visual-reasoning

Arehman782/wpe-tme-language

🌐 Explore WPE and TME, text-native languages designed for structural and temporal reasoning, enhancing clarity in semantic calculus.

0 · 0 · Updated just now

ai-interpretability ai-reasoning cognitive-architectures deterministic-ai explainable-ai field-theory geometric-calculus knowledge-representation language multi-domain neuro-symbolic neuro-symbolic-ai notation phase-space semantic-framework specification symbolic-reasoning temporal-logic