17 results for “topic:cpu-only”
Minimal CPU-only Ollama Docker Image
🦙 chat-o-llama: A lightweight, modern web interface for AI conversations with support for both Ollama and llama.cpp backends. Features persistent conversation management, real-time backend switching, intelligent context compression, and a clean responsive UI.
A high-performance Python library for extracting structured content from PDF documents with layout-aware text extraction. pdf_to_json preserves document structure including headings (H1-H6) and body text, outputting clean JSON format.
An LLM-based content moderator. Firefox extension that blocks webpages unrelated to work, based on page title and URL. Uses local LLMs via Ollama and LangChain so your browsing history never leaves your device, for complete privacy. Google Gemini is also supported.
CPU-only local audio transcription with a browser UI, powered by faster-whisper and run from a single Python script
No description provided.
Image classification with on-device inference, built with Flutter; the AI model runs on the mobile CPU
Ternsig Virtual Mainframe Runtime (TVMR) — extensible VM with 10 standard extensions (121 instructions), Signal ISA, mastery learning, hot-reload firmware, and thermogram persistence.
Face locking system built on ArcFace (ONNX) and 5-point alignment that recognizes a selected identity, locks onto it, tracks facial actions, and records behavior over time.
Probabilistic Signed Distance Fusion with View Planning on CPU
CPU-friendly experience-based reasoning framework combining meta-learning (MAML), state space models (SSM), and memory buffers for fast few-shot adaptation. Pure NumPy implementation for edge devices and low-compute environments.
Pre-built Llama-CPP Wheel for HF Spaces (Python 3.13)
CPU-optimized RAG pipeline reducing latency 2.7× (247 ms → 92 ms). Implements caching, filtering, and quantization for production. Complete with FastAPI, Docker, benchmarks, and investor materials. The engineering showcase that sells itself.
CPU-only RAG stack: PDFs→Docling→Ollama→pgvector. Windows/macOS/Linux. Docker Compose. Graph-aware code search + scanned PDF OCR.
Face detection service with super-fast inference using a nano model
A lightweight reproduction and analysis inspired by recent work on presentation-aware deepfake / spoofing detection, with a focus on codec-induced presentation mismatch (AMR) under CPU-only constraints.
Chat-O-Llama is a user-friendly web interface for managing conversations with Ollama, featuring persistent chat history. Easily set up and start your chat sessions with just a few commands. 🐙💻