#+TITLE: Emacs RAG with LibSQL
#+AUTHOR: John Kitchin
#+DATE: 2025-10-03
* Overview
=emacs-rag-libsql= is a complete Retrieval-Augmented Generation (RAG) system designed for Emacs integration. It provides semantic search capabilities over your local documents using vector embeddings and advanced two-stage reranking for improved relevance.
The system consists of two main components:
- Python FastAPI Server (=emacs-rag-server=) - A REST API service providing document indexing, vector search, and reranking
- Emacs Lisp Package (=emacs-rag=) - An Emacs package for server management, file indexing, and search interface
** Why This Matters
Traditional text search finds what you type. Semantic search finds what you mean.
- Search for "machine learning algorithms" and find documents about "neural networks" and "deep learning"
- Find relevant content even when different terminology is used
- Navigate directly to the most relevant sections in your notes
- Two-stage retrieval ensures both speed and accuracy
* Features
** 🔍 Multiple Search Modes
- Vector Search: Semantic similarity using embeddings for conceptual matching
- Full-Text Search: Fast FTS5-powered keyword search with BM25 ranking
- Hybrid Search: Combines vector and full-text search with configurable weighting
- Org Heading Navigation: Jump directly to any org heading across all indexed files
- Semantic Org Heading Search: Dynamic real-time semantic search across headings (with Ivy)
- Configurable Models: Choose from multiple embedding models based on your needs
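One common way to realize the "configurable weighting" of hybrid search is a weighted sum of normalized scores. The sketch below is an illustration in Python, not the server's actual code: the function names, the min-max normalization, and the =vector_weight= parameter are assumptions. Both inputs are assumed to be higher-is-better (raw FTS5 =bm25()= values are lower-is-better, so they would need to be negated first).

#+begin_src python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them all to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def hybrid_scores(vector_scores, text_scores, vector_weight=0.5):
    """Blend two {chunk_id: score} maps into one ranked list of (chunk_id, score).

    vector_weight is a hypothetical knob: 1.0 means vector-only, 0.0 text-only.
    A chunk missing from one result set contributes 0.0 for that component.
    """
    v = min_max(vector_scores)
    t = min_max(text_scores)
    combined = {
        chunk_id: vector_weight * v.get(chunk_id, 0.0)
        + (1.0 - vector_weight) * t.get(chunk_id, 0.0)
        for chunk_id in set(v) | set(t)
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


ranked = hybrid_scores({"a": 0.9, "b": 0.5, "c": 0.1}, {"b": 10.0, "c": 2.0})
print(ranked[0][0])
#+end_src

Normalizing before blending matters because cosine similarities and BM25 scores live on different scales; without it, one component would dominate regardless of the weight.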
** 🎯 Two-Stage Reranking
#+begin_src
Stage 1: Fast Bi-Encoder Retrieval
├─ Encode query → embedding vector
├─ Vector search → Top-K candidates (e.g., K=20)
└─ Fast but approximate ranking
Stage 2: Precise Cross-Encoder Reranking
├─ Score each query-document pair directly
├─ Re-sort by cross-encoder scores
└─ Return Top-N results (N=user limit)
#+end_src
This approach combines the speed of vector search with the accuracy of cross-encoder scoring.
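The two stages can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the server's implementation: =two_stage_search= is a hypothetical name, documents are =(text, vector)= pairs, and the =rerank_score= callable stands in for a real cross-encoder model.

#+begin_src python
import math
from typing import Callable, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def two_stage_search(
    query_vec: Sequence[float],
    query_text: str,
    docs: Sequence[tuple],          # (text, vector) pairs
    rerank_score: Callable[[str, str], float],
    top_k: int = 20,
    limit: int = 5,
):
    # Stage 1: cheap vector similarity keeps only the top_k candidates.
    candidates = sorted(
        docs, key=lambda doc: cosine(query_vec, doc[1]), reverse=True
    )[:top_k]
    # Stage 2: the (expensive) pairwise scorer sees only those candidates.
    rescored = [(rerank_score(query_text, text), text) for text, _ in candidates]
    rescored.sort(key=lambda pair: pair[0], reverse=True)
    return rescored[:limit]
#+end_src

The cost of stage 2 grows with =top_k=, not with the size of the index, which is why the expensive scorer stays affordable.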
** 📝 Smart Document Processing
- Automatic Chunking: Documents split into overlapping chunks with configurable size
- Line Number Tracking: Navigate directly to the exact line in your files
- Metadata Support: Attach custom metadata (author, tags, etc.) to indexed documents
- Batch Processing: Efficient embedding generation in batches
- Multiple File Types: Extensible to support any text-based format (default: org-mode)
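To make overlapping chunks and line tracking concrete, here is a minimal character-based sketch. It is an assumption about how such a chunker could work, not the server's code; =chunk_text= and the dictionary keys are hypothetical names chosen to mirror the database columns described later in this document.

#+begin_src python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100):
    """Split text into overlapping character chunks, recording each start line."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        content = text[start:start + chunk_size]
        chunks.append({
            "chunk_index": len(chunks),                       # 0-based position
            "line_number": text.count("\n", 0, start) + 1,    # 1-based start line
            "content": content,
            "chunk_size": len(content),
        })
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


for chunk in chunk_text("line1\nline2\nline3\n", chunk_size=10, overlap=4):
    print(chunk["chunk_index"], chunk["line_number"], repr(chunk["content"]))
#+end_src

Each chunk repeats the last =overlap= characters of its predecessor, so a sentence that straddles a boundary still appears intact in at least one chunk.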
** 🔄 Seamless Emacs Integration
- Auto-indexing: Automatically reindex files when you save them
- Direct Navigation: Jump straight to relevant lines in your documents
- Transient Menu: Beautiful, organized interface for all operations
- Ivy Integration: Enhanced search result selection with dynamic collections (fallback to completing-read)
- Real-time Search: Dynamic Ivy collections update results as you type
- Async Operations: Non-blocking directory indexing
- Server Lifecycle: Automatic server management - starts when needed
- gptel Integration: LLM function calling tools for RAG-augmented AI interactions
** 🗄️ LibSQL Backend
- SQL + Vectors: Combines the power of SQL with vector similarity search
- Efficient Storage: Separate tables for documents and embeddings
- Foreign Key Constraints: Data integrity with cascading deletes
- Fallback Support: Works even without vector extension (slower but functional)
- Local First: Your data stays on your machine
** ⚙️ Highly Configurable
All aspects are configurable through environment variables or Emacs customization:
- Chunk size and overlap
- Embedding models (sentence-transformers)
- Reranking models (cross-encoder)
- Search parameters
- File extensions to index
- Database location
- Server settings
* Architecture
** System Overview
#+begin_src
┌─────────────────────────────────────────────────────┐
│                    Emacs Client                     │
│  ┌──────────────┐  ┌──────────────┐  ┌────────────┐ │
│  │    Server    │  │   Indexing   │  │   Search   │ │
│  │  Management  │  │   Commands   │  │ Interface  │ │
│  └──────────────┘  └──────────────┘  └────────────┘ │
└─────────────────────────────────────────────────────┘
                           │
                     HTTP/REST API
                           │
┌─────────────────────────────────────────────────────┐
│                Python FastAPI Server                │
│  ┌──────────────────────────────────────────────┐   │
│  │                  API Routes                  │   │
│  │     /index  /search/vector  /search/text     │   │
│  │    /search/hybrid  /org-headings  /files     │   │
│  └──────────────────────────────────────────────┘   │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────┐   │
│  │ File Service │  │Search Service│  │  Stats   │   │
│  └──────────────┘  └──────────────┘  └──────────┘   │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────┐   │
│  │   Chunking   │  │  Embeddings  │  │ Reranker │   │
│  └──────────────┘  └──────────────┘  └──────────┘   │
│  ┌──────────────────────────────────────────────┐   │
│  │     LibSQL Database with Vector Storage      │   │
│  └──────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────┘
#+end_src
** Database Schema
*** Documents Table
Stores text chunks with metadata and line tracking:
#+begin_src sql
CREATE TABLE documents (
    id          TEXT PRIMARY KEY,    -- {path}:{chunk_index}
    source_path TEXT NOT NULL,       -- Absolute file path
    chunk_index INTEGER NOT NULL,    -- 0-based chunk position
    line_number INTEGER NOT NULL,    -- Starting line (1-based)
    content     TEXT NOT NULL,       -- Chunk text
    chunk_size  INTEGER NOT NULL,    -- Actual character count
    chunk_total INTEGER NOT NULL,    -- Total chunks for this file
    metadata    JSON,                -- Custom metadata as JSON
    created_at  INTEGER,
    updated_at  INTEGER
);
#+end_src
*** Embeddings Table
Stores vector embeddings linked to documents:
#+begin_src sql
CREATE TABLE embeddings (
    id         TEXT PRIMARY KEY,     -- Same as documents.id
    vector     BLOB NOT NULL,        -- Float32 vector
    model      TEXT NOT NULL,        -- Embedding model identifier
    created_at INTEGER,
    FOREIGN KEY (id) REFERENCES documents(id) ON DELETE CASCADE
);

CREATE INDEX idx_embeddings_vector ON embeddings(vector) USING vector_cosine;
#+end_src
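The cascading delete and the Float32 BLOB encoding can be tried out with Python's built-in =sqlite3= module. This is a sketch under assumptions: it uses plain SQLite rather than LibSQL, trims the tables to a few columns, omits the vector index (which needs the LibSQL vector extension), and uses a made-up three-dimensional vector and file path.

#+begin_src python
import sqlite3
from array import array

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE documents (
    id          TEXT PRIMARY KEY,
    source_path TEXT NOT NULL,
    chunk_index INTEGER NOT NULL,
    line_number INTEGER NOT NULL,
    content     TEXT NOT NULL
);
CREATE TABLE embeddings (
    id     TEXT PRIMARY KEY,
    vector BLOB NOT NULL,
    model  TEXT NOT NULL,
    FOREIGN KEY (id) REFERENCES documents(id) ON DELETE CASCADE
);
""")

doc_id = "/notes/a.org:0"                  # {path}:{chunk_index}
vector = array("f", [0.1, 0.2, 0.3])       # 'f' = C float (4 bytes on common platforms)
conn.execute(
    "INSERT INTO documents VALUES (?, ?, ?, ?, ?)",
    (doc_id, "/notes/a.org", 0, 1, "Example chunk text"),
)
conn.execute(
    "INSERT INTO embeddings VALUES (?, ?, ?)",
    (doc_id, vector.tobytes(), "sentence-transformers/all-MiniLM-L6-v2"),
)

# Read the BLOB back and decode it into floats again.
blob = conn.execute(
    "SELECT vector FROM embeddings WHERE id = ?", (doc_id,)
).fetchone()[0]
restored = array("f")
restored.frombytes(blob)

# Deleting the file's documents removes its embeddings through the cascade.
conn.execute("DELETE FROM documents WHERE source_path = ?", ("/notes/a.org",))
remaining = conn.execute("SELECT COUNT(*) FROM embeddings").fetchone()[0]
print(len(blob), remaining)
#+end_src

Because embeddings are keyed by the same =id= as their document, removing a file only requires deleting its =documents= rows.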
** ML Models
*** Default Embedding Model
Model: =sentence-transformers/all-MiniLM-L6-v2=
- Dimensions: 384
- Size: ~80MB
- Speed: Very fast inference
- Quality: Good general-purpose semantic similarity
- Training: More than 1 billion sentence pairs drawn from multiple datasets (MS MARCO among them)
*** Default Reranker Model
Model: =cross-encoder/ms-marco-MiniLM-L-6-v2=
- Size: ~90MB
- Speed: Moderate (only applied to top-K candidates)
- Quality: Significantly better than distance metrics alone
- Training: MS MARCO passage reranking dataset
* Installation
** Prerequisites
- Python 3.10 or higher
- Emacs 27.1 or higher
- =uv= (recommended) or =pip= for Python dependencies
- =transient= package for Emacs (usually included with modern Emacs)
** Install Python Server
#+begin_src bash
# Navigate to server directory
cd emacs-rag-libsql/emacs-rag-server

# Install with uv (recommended)
uv sync

# Or install with pip
pip install -e .

# Verify installation
emacs-rag-server --help
#+end_src
** Install Emacs Package
Add to your Emacs configuration:
#+begin_src emacs-lisp
;; Add to load path
(add-to-list 'load-path "/Users/jkitchin/Dropbox/emacs/user/emacs-rag-libsql/emacs-rag/")
;; Load the package
(require 'emacs-rag)
;; Optional: Set custom database path
(setq emacs-rag-db-path "/Users/jkitchin/Dropbox/emacs/cache/rag-database")
;; Optional: Configure indexed file extensions
(setq emacs-rag-indexed-extensions '("org" "txt" "md"))
;; Optional: Disable auto-indexing on save
(setq emacs-rag-auto-index-on-save nil)
#+end_src
** How do I force it to reload after changing the files?
#+begin_src emacs-lisp :results silent
;; Load the specific file with full path
(load-file (expand-file-name "emacs-rag/emacs-rag-server.el" default-directory))
(load-file (expand-file-name "emacs-rag/emacs-rag-index.el" default-directory))
(load-file (expand-file-name "emacs-rag/emacs-rag-search.el" default-directory))
(load-file (expand-file-name "emacs-rag/emacs-rag.el" default-directory))
(emacs-rag-stop-server)
(emacs-rag-start-server)
#+end_src
* Quick Start Guide
** Using the Transient Menu
The easiest way to use emacs-rag is through the transient menu:
#+begin_src emacs-lisp
M-x emacs-rag-menu
#+end_src
This opens an organized menu with all commands:
Top Row:
- Search (v/t/y/h/F): Vector, text, hybrid search, org headings, open files
- Server (a/p/r/S/l): Start, stop, restart, stats, logs
- Index (b/f/d/o): Buffer, file, directory, open buffers
Bottom Row:
- Delete (x/X/R): Buffer, file, database
- Maintenance (M/B): Rebuild FTS index, rebuild database
- Debug (D): Debug information
** 1. Start the Server
#+begin_src emacs-lisp
M-x emacs-rag-start-server
#+end_src
Or from the transient menu:
#+begin_src emacs-lisp
M-x emacs-rag-menu
;; Press 'a' to start server
#+end_src
The server will start on =http://127.0.0.1:8765= by default.
** 2. Index Your Documents
*** Index Current Buffer
#+begin_src emacs-lisp
M-x emacs-rag-index-buffer
#+end_src
This indexes the current buffer, including any unsaved changes.
*** Index a Directory
#+begin_src emacs-lisp
M-x emacs-rag-index-directory
;; Select directory to index
#+end_src
This will recursively index all eligible files (based on =emacs-rag-indexed-extensions=).
*** Index a Specific File
#+begin_src emacs-lisp
M-x emacs-rag-index-file
;; Select file to index
#+end_src
** 3. Search Your Documents
#+begin_src emacs-lisp
M-x emacs-rag-search-vector
;; Enter your search query: "machine learning concepts"
#+end_src
Results will be displayed with scores. Select one to navigate directly to that location in the file.
** 4. Other Useful Commands
*** Search with Selected Text
All search commands (vector, text, hybrid) automatically use selected region as the query:
#+begin_src emacs-lisp
;; Select text, then:
M-x emacs-rag-search-vector ; Semantic search
M-x emacs-rag-search-text ; Keyword search
M-x emacs-rag-search-hybrid ; Combined search
#+end_src
*** Jump to Org Headings
#+begin_src emacs-lisp
M-x emacs-rag-jump-to-org-heading
#+end_src
Browse all org headings from indexed files with instant navigation.
*** Search Org Headings Semantically
#+begin_src emacs-lisp
M-x emacs-rag-search-org-headings
#+end_src
Perform semantic search across org headings. When using Ivy, this provides a dynamic search interface - results update in real-time as you type, continuously re-querying the semantic search engine with your current input.
This is particularly useful for:
- Finding headings by concept rather than exact wording
- Exploring related topics across multiple org files
- Quick navigation when you remember the topic but not the exact heading text
With Ivy: Type continuously and watch results update dynamically
Without Ivy: Enter query once, then select from static results
*** View Statistics
#+begin_src emacs-lisp
M-x emacs-rag-stats
#+end_src
Shows total indexed chunks and files.
*** Debug Information
#+begin_src emacs-lisp
M-x emacs-rag-debug
#+end_src
Displays comprehensive diagnostic information.
** Reload After Changes (with gptel Tools)
#+BEGIN_SRC emacs-lisp
;; Load the specific file with full path
(load-file "/Users/jkitchin/Dropbox/emacs/user/emacs-rag-libsql/emacs-rag/emacs-rag-server.el")
(load-file "/Users/jkitchin/Dropbox/emacs/user/emacs-rag-libsql/emacs-rag/emacs-rag-index.el")
(load-file "/Users/jkitchin/Dropbox/emacs/user/emacs-rag-libsql/emacs-rag/emacs-rag-search.el")
(load-file "/Users/jkitchin/Dropbox/emacs/user/emacs-rag-libsql/emacs-rag/emacs-rag-gptel-tools.el")
(load-file "/Users/jkitchin/Dropbox/emacs/user/emacs-rag-libsql/emacs-rag/emacs-rag.el")
(emacs-rag-stop-server)
(emacs-rag-start-server)
(emacs-rag-gptel-enable-tool)
#+END_SRC
#+RESULTS:
: RAG search tool enabled for gptel
* Usage Examples
** Example 1: Research Notes
You have a directory of research notes in org-mode:
#+begin_src emacs-lisp
;; Index your research directory
M-x emacs-rag-index-directory
;; → ~/Documents/research/
;; Search across all notes
M-x emacs-rag-search-vector
;; Query: "neural network optimization techniques"
;; Results show relevant sections from multiple files
;; Select one to jump directly to that content
#+end_src
** Example 2: Code Documentation
Search across your project documentation:
#+begin_src emacs-lisp
;; Add markdown files to indexed types
(setq emacs-rag-indexed-extensions '("org" "md"))
;; Index docs directory
M-x emacs-rag-index-directory
;; → ~/projects/myapp/docs/
;; Search for specific topics
M-x emacs-rag-search-vector
;; Query: "authentication flow"
#+end_src
** Example 3: Journal Entries
Search your daily journal by topic:
#+begin_src emacs-lisp
;; Auto-index enabled - journals update as you save
(setq emacs-rag-auto-index-on-save t)
;; Search across all journal entries
M-x emacs-rag-search-vector
;; Query: "project planning discussions"
;; Find relevant journal entries even if they use different wording
#+end_src
* Configuration
** Emacs Configuration Variables
*** Server Settings
#+begin_src emacs-lisp
(setq emacs-rag-server-host "127.0.0.1") ; Server hostname
(setq emacs-rag-server-port 8765) ; Server port
(setq emacs-rag-db-path "~/.emacs-rag/libsql") ; Database location
#+end_src
*** Indexing Settings
#+begin_src emacs-lisp
(setq emacs-rag-indexed-extensions '("org" "txt" "md")) ; File types
(setq emacs-rag-auto-index-on-save t) ; Auto-reindex on save
#+end_src
*** Search Settings
#+begin_src emacs-lisp
(setq emacs-rag-search-limit 5) ; Default result count
(setq emacs-rag-search-enable-rerank t) ; Enable reranking
(setq emacs-rag-result-display-width 80) ; Result text width
#+end_src
** Server Configuration (Environment Variables)
*** Database
#+begin_src bash
export EMACS_RAG_DB_PATH="$HOME/.emacs-rag/libsql"
#+end_src
*** Chunking
#+begin_src bash
export EMACS_RAG_CHUNK_SIZE="800" # Characters per chunk
export EMACS_RAG_CHUNK_OVERLAP="100" # Overlap between chunks
#+end_src
*** Models
#+begin_src bash
# Embedding model
export EMACS_RAG_EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-v2"

# Alternative: higher quality but slower (uncomment to use instead of the line above)
# export EMACS_RAG_EMBEDDING_MODEL="sentence-transformers/all-mpnet-base-v2"

# Reranking model
export EMACS_RAG_RERANK_MODEL="cross-encoder/ms-marco-MiniLM-L-6-v2"

# Enable/disable reranking
export EMACS_RAG_RERANK_ENABLED="true"

# Number of candidates to rerank
export EMACS_RAG_RERANK_TOP_K="20"
#+end_src
*** Server
#+begin_src bash
export EMACS_RAG_HOST="127.0.0.1"
export EMACS_RAG_PORT="8765"
#+end_src
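For readers who want to see how such variables are typically consumed, the following sketch reads them into a typed settings object. It is not the server's actual configuration code: the =Settings= class and =load_settings= function are hypothetical, and the fallback values are simply the example values shown above, assumed to be the defaults.

#+begin_src python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    db_path: str
    chunk_size: int
    chunk_overlap: int
    embedding_model: str
    rerank_model: str
    rerank_enabled: bool
    rerank_top_k: int
    host: str
    port: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, falling back to example values."""
    env = os.environ if env is None else env
    enabled = env.get("EMACS_RAG_RERANK_ENABLED", "true").strip().lower()
    return Settings(
        db_path=os.path.expanduser(env.get("EMACS_RAG_DB_PATH", "~/.emacs-rag/libsql")),
        chunk_size=int(env.get("EMACS_RAG_CHUNK_SIZE", "800")),
        chunk_overlap=int(env.get("EMACS_RAG_CHUNK_OVERLAP", "100")),
        embedding_model=env.get(
            "EMACS_RAG_EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"
        ),
        rerank_model=env.get(
            "EMACS_RAG_RERANK_MODEL", "cross-encoder/ms-marco-MiniLM-L-6-v2"
        ),
        rerank_enabled=enabled in {"1", "true", "yes", "on"},
        rerank_top_k=int(env.get("EMACS_RAG_RERANK_TOP_K", "20")),
        host=env.get("EMACS_RAG_HOST", "127.0.0.1"),
        port=int(env.get("EMACS_RAG_PORT", "8765")),
    )


settings = load_settings({"EMACS_RAG_CHUNK_SIZE": "1000"})
print(settings.chunk_size, settings.port)
#+end_src

Environment values are always strings, so numeric and boolean settings need explicit conversion; a non-numeric =EMACS_RAG_PORT= would raise =ValueError= here.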
* API Reference
** REST API Endpoints
*** POST /index
Index a file with automatic chunking and embedding.
Request:
#+begin_src json
{
  "path": "/absolute/path/to/file.org",
  "content": "optional content override",
  "metadata": {
    "author": "John Doe",
    "tags": ["research", "ML"]
  }
}
#+end_src
Response:
#+begin_src json
{
  "path": "/absolute/path/to/file.org",
  "chunks_indexed": 15
}
#+end_src
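A client outside Emacs could build this request with the Python standard library. The sketch below only constructs the request; =build_index_request= is a hypothetical helper, and the base URL assumes the default host and port mentioned in this document.

#+begin_src python
import json
import urllib.request


def build_index_request(path, content=None, metadata=None,
                        base_url="http://127.0.0.1:8765"):
    """Return a POST request for /index carrying a JSON body."""
    payload = {"path": path}
    if content is not None:
        payload["content"] = content      # optional content override
    if metadata:
        payload["metadata"] = metadata    # optional custom metadata
    return urllib.request.Request(
        f"{base_url}/index",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_index_request(
    "/absolute/path/to/file.org",
    metadata={"author": "John Doe", "tags": ["research", "ML"]},
)
print(request.get_method(), request.full_url)

# With the server running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["chunks_indexed"])
#+end_src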
*** GET /search/vector
Semantic similarity search.
Parameters:
- =query= (required): Search text
- =limit= (optional, default: 5): Max results
- =rerank= (optional, default: true): Enable reranking
Response:
#+begin_src json
{
  "results": [
    {
      "source_path": "/path/to/file.org",
      "chunk_index": 2,
      "line_number": 45,
      "content": "Relevant text content...",
      "score": 0.8534
    }
  ]
}
#+end_src
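As a rough illustration of calling this endpoint from a script, the following sketch builds the query URL and turns a response body into =file:line= strings. The helper names are hypothetical, the base URL assumes the default host and port, and passing =rerank= as the string "true" or "false" mirrors the programmatic Emacs example in this document rather than a confirmed server requirement.

#+begin_src python
import json
from urllib.parse import urlencode


def build_search_url(query, limit=5, rerank=True,
                     base_url="http://127.0.0.1:8765"):
    """Return the GET URL for /search/vector with URL-encoded parameters."""
    params = urlencode({
        "query": query,
        "limit": limit,
        "rerank": "true" if rerank else "false",
    })
    return f"{base_url}/search/vector?{params}"


def format_results(response_text):
    """Turn a JSON response body into 'path:line (score)' strings."""
    data = json.loads(response_text)
    return [
        f"{item['source_path']}:{item['line_number']} ({item['score']:.3f})"
        for item in data.get("results", [])
    ]


print(build_search_url("machine learning", limit=10))

# With the server running, the two helpers could be combined like this:
# import urllib.request
# with urllib.request.urlopen(build_search_url("machine learning")) as response:
#     print(format_results(response.read().decode("utf-8")))
#+end_src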
*** DELETE /files
Remove all chunks for a file.
Parameters:
- =path= (required): Absolute file path
Response:
#+begin_src json
{
  "path": "/path/to/file.org",
  "deleted": true
}
#+end_src
*** GET /stats
Database statistics.
Response:
#+begin_src json
{
  "total_chunks": 1234,
  "total_unique_files": 56,
  "sample_chunk": {...}
}
#+end_src
*** GET /health
Health check.
Response:
#+begin_src json
{
  "status": "ok"
}
#+end_src
** Emacs Commands
*** Server Management
| Command | Description |
|--------------------------------+------------------------------|
| =emacs-rag-start-server= | Start the RAG server |
| =emacs-rag-stop-server= | Stop the RAG server |
| =emacs-rag-restart-server= | Restart the RAG server |
| =emacs-rag-show-server-buffer= | Show server log buffer |
*** Indexing
| Command | Description |
|--------------------------------------+----------------------------------|
| =emacs-rag-index-file= | Index a specific file |
| =emacs-rag-index-buffer= | Index current buffer |
| =emacs-rag-index-directory= | Recursively index directory |
| =emacs-rag-reindex-all-open-buffers= | Reindex all open eligible buffers |
| =emacs-rag-delete-file= | Remove file from index |
| =emacs-rag-delete-buffer= | Remove current buffer from index |
*** Search
| Command | Description |
|---------------------------------+----------------------------------------------------|
| =emacs-rag-search-vector= | Semantic vector search (uses region) |
| =emacs-rag-search-text= | Full-text FTS5 search (uses region) |
| =emacs-rag-search-hybrid= | Hybrid vector + text search (uses region) |
| =emacs-rag-search-org-headings= | Semantic search of org headings (dynamic with Ivy) |
| =emacs-rag-jump-to-org-heading= | Navigate to any org heading |
| =emacs-rag-open-indexed-file= | Browse and open indexed files |
| =emacs-rag-stats= | Show database statistics |
*** Utilities
| Command | Description |
|------------------------------+--------------------------------------|
| =emacs-rag-menu= | Open transient menu |
| =emacs-rag-debug= | Show debug information |
| =emacs-rag-quick-start= | Show quick start guide |
| =emacs-rag-delete-database= | Delete entire database |
| =emacs-rag-rebuild-database= | Rebuild database with new schema |
| =emacs-rag-rebuild-fts-index= | Rebuild FTS5 index from documents |
* Advanced Usage
** Custom Metadata
Add custom metadata when indexing:
#+begin_src emacs-lisp
(emacs-rag-index-file
 "~/notes/research.org"
 '((author . "John Doe")
   (project . "ML Research")
   (tags . ("neural-networks" "optimization"))))
#+end_src
** Programmatic Search
#+begin_src emacs-lisp
(let* ((results (emacs-rag--request
                 "GET" "/search/vector" nil
                 '((query . "machine learning")
                   (limit . 10)
                   (rerank . "true"))))
       (top-result (car (alist-get 'results results))))
  ;; Process results programmatically
  (message "Top result: %s (score: %.3f)"
           (alist-get 'source_path top-result)
           (alist-get 'score top-result)))
#+end_src
** Batch Indexing with Progress
#+begin_src emacs-lisp
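(require 'project) ; `project-root' needs Emacs 28+ (or a newer project.el)

(defun my-index-project ()
  "Index all eligible files in the current project."
  (interactive)
  ;; Look up the project first so a nil project never reaches `project-root'.
  (if-let* ((project (project-current))
            (root (project-root project)))
      (progn
        (message "Indexing project: %s" root)
        (emacs-rag-index-directory root))
    (user-error "Not inside a project")))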
#+end_src
** LLM Integration with gptel
The =emacs-rag-gptel-tools= module provides function calling tools that allow LLMs (via gptel) to search your indexed documents and retrieve relevant information during AI interactions.
*** Setup
First, ensure you have gptel installed with tool support:
#+begin_src emacs-lisp
;; Load the gptel tools module
(require 'emacs-rag-gptel-tools)
;; Enable the RAG search tool
(emacs-rag-gptel-enable-tool)
#+end_src
*** Usage
Once enabled, when you interact with an LLM through gptel, it can automatically call the =rag_search= tool to retrieve relevant information from your indexed documents:
#+begin_src emacs-lisp
;; Example: Ask the LLM a question about your documents
;; The LLM will automatically use rag_search if it needs information
M-x gptel-send
Prompt: "What did I write about machine learning optimization in my notes?"
;; The LLM will:
;; 1. Call rag_search with query "machine learning optimization"
;; 2. Receive the full text of the most relevant file
;; 3. Use that information to answer your question
#+end_src
*** Available Tool
=rag_search=: Searches through indexed documents using semantic vector search and returns the full text of the top matching file.
Parameters:
- =query= (string): The search query to find relevant documents
The tool automatically handles:
- Server availability checking
- Vector search with reranking enabled
- Retrieving the full file content
- Returning formatted results with relevance scores
*** Disabling the Tool
To disable the RAG search tool:
#+begin_src emacs-lisp
(emacs-rag-gptel-disable-tool)
#+end_src
** Different Embedding Models
For better quality (but slower):
#+begin_src bash
export EMACS_RAG_EMBEDDING_MODEL="sentence-transformers/all-mpnet-base-v2"
emacs-rag-server serve
#+end_src
For multilingual support:
#+begin_src bash
export EMACS_RAG_EMBEDDING_MODEL="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
emacs-rag-server serve
#+end_src
* Troubleshooting
** Server Won't Start
#+begin_src emacs-lisp
;; Check server buffer for errors
M-x emacs-rag-show-server-buffer
;; Check debug info
M-x emacs-rag-debug
;; Verify Python installation
M-x shell-command RET python3 --version
#+end_src
** No Search Results
- Verify files are indexed: =M-x emacs-rag-stats=
- Check if server is running: =M-x emacs-rag-debug=
- Try disabling reranking temporarily
- Increase search limit with prefix argument: =C-u 10 M-x emacs-rag-search-vector=
** Poor Search Quality
- Enable reranking: =(setq emacs-rag-search-enable-rerank t)=
- Increase reranking pool: =export EMACS_RAG_RERANK_TOP_K=30=
- Try a different embedding model (see Advanced Usage)
- Adjust chunk size: =export EMACS_RAG_CHUNK_SIZE=1000=
** Indexing Fails
- Check file permissions
- Verify file encoding (UTF-8 recommended)
- Check available disk space
- Review server logs: =M-x emacs-rag-show-server-buffer=
** High Memory Usage
- Use a smaller embedding model
- Reduce chunk overlap: =export EMACS_RAG_CHUNK_OVERLAP=50=
- Clear old indexes: =M-x emacs-rag-delete-database=
* Performance Considerations
** Indexing Speed
- Chunk size: Larger chunks = fewer embeddings = faster indexing
- Batch size: Currently fixed at 8 documents per batch
- Model: =all-MiniLM-L6-v2= (the default) is the fastest of the embedding models mentioned in this document
** Search Speed
- Vector search: Very fast (milliseconds)
- Reranking: Slower but only applied to top-K candidates
- Rerank pool size (=EMACS_RAG_RERANK_TOP_K=): Lower values = faster search, potentially less accurate
** Storage
- Embeddings: 384 floats × 4 bytes = ~1.5KB per chunk
- Text: Depends on chunk size (default 800 chars ≈ 800 bytes)
- Typical: ~2-3KB per chunk including metadata
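The arithmetic above can be turned into a rough sizing helper. This is a back-of-the-envelope sketch: the function names are hypothetical, the 200-byte per-chunk overhead for metadata is an assumption picked to land inside the 2-3KB range quoted above, and one byte per character only holds for mostly ASCII text.

#+begin_src python
def embedding_bytes(dimensions: int = 384, bytes_per_float: int = 4) -> int:
    """Size of one Float32 embedding vector in bytes."""
    return dimensions * bytes_per_float


def estimate_index_bytes(num_chunks: int, dimensions: int = 384,
                         chunk_chars: int = 800, overhead_bytes: int = 200) -> int:
    """Rough total: (embedding + text + assumed metadata overhead) per chunk."""
    per_chunk = embedding_bytes(dimensions) + chunk_chars + overhead_bytes
    return num_chunks * per_chunk


total = estimate_index_bytes(1234)  # chunk count from the /stats example
print(f"{embedding_bytes()} bytes per embedding, {total / 1024 / 1024:.1f} MiB total")
#+end_src

Doubling the embedding dimension (for example, moving to a 768-dimensional model) doubles the embedding share of each chunk, which is the largest component at the default chunk size.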
* Development
** Project Structure
#+begin_src
emacs-rag-libsql/
├── emacs-rag/                    # Emacs Lisp package
│   ├── emacs-rag.el              # Main entry point + menu
│   ├── emacs-rag-server.el       # Server management
│   ├── emacs-rag-index.el        # Indexing commands
│   └── emacs-rag-search.el       # Search interface
├── emacs-rag-server/             # Python FastAPI server
│   ├── src/emacs_rag_server/
│   │   ├── main.py               # FastAPI app
│   │   ├── cli.py                # CLI interface
│   │   ├── api/routes.py         # API endpoints
│   │   ├── models/               # Database, embeddings, schemas
│   │   ├── services/             # Business logic
│   │   └── utils/                # Utilities
│   ├── pyproject.toml
│   └── README.org
├── software-design.org           # Design documentation
└── readme.org                    # This file
#+end_src
** Running Tests
#+begin_src bash
cd emacs-rag-server
uv sync --dev
uv run pytest
#+end_src
** Development Mode
Start server with auto-reload:
#+begin_src bash
emacs-rag-server serve --reload
#+end_src
** Interactive API Documentation
When the server is running:
- Swagger UI: =http://127.0.0.1:8765/docs=
- ReDoc: =http://127.0.0.1:8765/redoc=
* Comparison with Other Tools
** vs. Traditional Grep/Ripgrep
| Feature | emacs-rag-libsql | grep/ripgrep |
|----------------------+----------------------+---------------------|
| Search Type | Semantic | Keyword/Regex |
| Finds Concepts | ✓ | ✗ |
| Speed | Fast (indexed) | Very Fast |
| Setup Required | Yes | No |
| Memory Usage | Moderate | Low |
| Ranking | ML-based | None |
** vs. Org-roam
| Feature | emacs-rag-libsql | org-roam |
|----------------------+----------------------+---------------------|
| Search Type | Semantic full-text | Links + Tags |
| Structure Required | No | Yes (IDs, links) |
| Content Search | ✓ Advanced | Basic |
| Relationship Mapping | ✗ | ✓ |
| Backlinks | ✗ | ✓ |
** vs. Deft
| Feature | emacs-rag-libsql | Deft |
|----------------------+----------------------+---------------------|
| Search Type | Semantic vector | Keyword |
| Relevance Ranking | ML-based | Frequency |
| File Navigation | Line-level | File-level |
| Performance | Indexed (fast) | Live search |
* Future Enhancements
Potential features for future development:
- PDF/DOCX indexing (via docling)
- Multiple collection support
- Project-scoped search
- org-db integration
- Metadata-based filtering in search
- Incremental indexing (detect changes)
- Search result caching
- Export/import database
- Remote server support
- Date-based filtering
- Duplicate detection
- Integration with GPT for RAG (via gptel tools)
* License
This project is licensed under the MIT License. See the [[file:LICENSE][LICENSE]] file for details.
Copyright (c) 2025 John Kitchin
* Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
* Support
For issues, questions, or suggestions:
- Check the troubleshooting section above
- Review =M-x emacs-rag-debug= output
- Check server logs with =M-x emacs-rag-show-server-buffer=
- File an issue on the project repository
* Acknowledgments
This project uses:
- [[https://fastapi.tiangolo.com/][FastAPI]] - Modern web framework for Python
- [[https://github.com/tursodatabase/libsql][LibSQL]] - SQLite fork with vector support
- [[https://www.sbert.net/][Sentence Transformers]] - State-of-the-art text embeddings
- [[https://magit.vc/manual/transient/][Transient]] - Emacs transient command interface
- [[https://github.com/abo-abo/swiper][Ivy]] - Completion framework for Emacs (optional)
* References
- [[file:software-design.org][Software Design Document]] - Detailed architecture and implementation
- [[file:emacs-rag-server/README.org][Server README]] - Python server documentation
- [[https://www.sbert.net/][Sentence-BERT Documentation]]
- [[https://github.com/tursodatabase/libsql][LibSQL Documentation]]
- [[https://fastapi.tiangolo.com/][FastAPI Documentation]]