"topic:e-values" — Search

3 results for “topic:e-values”

Human-Centric-Machine-Learning/token-audit

Repository for the paper "Auditing Pay-Per-Token in Large Language Models"

A kernel-userland protocol enforcing information-theoretic bounds on AI adaptivity leakage, benchmark gaming, and capability spillover.

Rust · 1 · 0 · Updated 1 week ago

agentic-ai, ai-safety, ai-safety-research, benchmark-integrity, cryptography, deterministic-execution, differential-privacy, e-values, epistemic-security, formal-methods, kernel, reference-monitor, rust, sandboxing, systems-programming, verification, wasm, webassembly

kadubon/audit-closed-ai-scientist

Benchmark for statistically valid AI scientist systems, using audit-closed protocols, transparency logs, and sequential inference to prevent false discoveries in autonomous research agents.

Python · 0 · 0 · Updated 4 days ago

agentic-ai, ai-agents, ai-governance, ai-scientist, audit-log, automated-science, autonomous-research, deterministic-replay, e-process, e-values, optional-stopping, p-hacking, reproducible-science, research-automation, scientific-discovery, scientific-machine-learning, self-driving-lab, sequential-inference, statistical-validity, transparency-log