# Hiberstack

Automatic cold storage & memory lifecycle management for search indexes.
Hiberstack is a lightweight sidecar service built specifically for Typesense. It automatically offloads inactive collections from memory to cold storage (disk / S3), then reloads them on demand when traffic returns.
It is designed to solve a very specific but recurring operational problem:
Typesense is extremely fast because its indexes live entirely in RAM — but when you have many collections with bursty access patterns (e.g. multi-tenant architectures), RAM becomes prohibitively expensive.
Hiberstack adds elasticity to Typesense without modifying its internals.
## Why Hiberstack exists
Typesense is optimized for:
- blazing-fast queries
- in-memory indexes
- always-hot datasets
This works extremely well for single or few collections that are accessed continuously.
However, problems appear when you have:
- hundreds or thousands of collections
- most of them idle most of the time
- limited RAM budgets
- write operations that stall once memory is exhausted
Typical examples:
- multi-tenant SaaS platforms
- research or project-based tools
- AI / RAG systems with per-project indexes
- internal tools with bursty usage
Today, teams handle this manually using:
- cron jobs
- ad-hoc scripts
- manual cleanup
- over-provisioned RAM
Hiberstack makes this automatic, safe, and observable.
## What Hiberstack does
At a high level:
- Tracks collection-level activity (queries & writes)
- Identifies inactive (cold) collections based on policy (e.g. 6h idle)
- Exports those collections to cold storage (JSONL + schema)
- Deletes them from the search engine to free RAM
- Transparently reloads them on demand when traffic returns
All of this happens outside the search engine, as a sidecar proxy.
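At its core, the cold-detection policy is a time comparison against the configured idle threshold. A minimal sketch (function and variable names here are hypothetical, not Hiberstack's actual API):

```python
from datetime import datetime, timedelta

def is_cold(last_access: datetime, now: datetime, idle_threshold: timedelta) -> bool:
    """A collection qualifies for offload once it has been idle longer than the policy threshold."""
    return now - last_access > idle_threshold

now = datetime(2024, 1, 1, 12, 0)
print(is_cold(now - timedelta(hours=7), now, timedelta(hours=6)))  # True: idle 7h exceeds the 6h policy
print(is_cold(now - timedelta(hours=1), now, timedelta(hours=6)))  # False: still hot
```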
## What Hiberstack does NOT try to do
This project is intentionally opinionated.
Hiberstack does NOT:
- optimize single, always-hot large collections
- replace or fork Typesense
- page data at document or segment level
- act as a general cache or LRU store
If you have:
- one large collection
- accessed continuously
Hiberstack is not for you.
## Who this project is for

### ✅ Good fit
Hiberstack is designed for teams that have:
- many collections (per project / tenant / workspace)
- bursty or sporadic access patterns
- RAM-constrained environments
- need for predictable memory usage
Examples:
- SaaS with per-customer indexes
- research platforms with per-study datasets
- AI tools creating indexes per workflow
- internal platforms running on limited infra
### ❌ Not a good fit
- single-collection deployments
- always-hot datasets
- latency-sensitive systems that cannot tolerate cold-starts
## Built exclusively for Typesense

Unlike Elasticsearch or Meilisearch, which use disk paging or memory-mapped files to balance RAM and disk, Typesense intentionally keeps everything in RAM for maximum performance.
Hiberstack embraces this design choice by managing the collection lifecycle externally, giving you the extreme speed of Typesense when needed, and the cost-efficiency of S3 when idle.
## Architecture overview
Hiberstack runs as a standalone sidecar service.
```
Client
  │
  ▼
Hiberstack (proxy + control plane)
  │
  ▼
Typesense (unmodified)
```
Cold storage:
- Local filesystem (default)
- S3-compatible object storage
## Operating modes

### Proxy mode (default)
Client → Hiberstack → Typesense
- All requests pass through Hiberstack
- Enables precise activity tracking
- Allows transparent reload-on-demand
- Adds sub-millisecond latency on hot paths
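The decision the proxy makes on each request can be sketched like this (helper names are assumptions for illustration, not the real API):

```python
def handle_request(state, trigger_reload, forward):
    """Forward traffic for HOT collections; for COLD ones, kick off a background
    reload and tell the client to retry shortly. LOADING also answers 503."""
    if state == "HOT":
        return forward()
    if state == "COLD":
        trigger_reload()  # runs in the background; never blocks this request
    return 503, {"Retry-After": "2"}

reloads = []
assert handle_request("HOT", lambda: reloads.append(1), lambda: (200, {})) == (200, {})
status, headers = handle_request("COLD", lambda: reloads.append(1), lambda: (200, {}))
print(status, headers, reloads)  # 503 {'Retry-After': '2'} [1]
```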
### Observer mode (planned)
- Hiberstack does not proxy traffic
- Activity inferred from Node metrics
- No latency impact
- Limited reload automation
## Collection lifecycle
Each collection is managed via a simple state machine:
```
HOT     → loaded in memory
COLD    → offloaded to storage
LOADING → reload in progress
FAILED  → last operation failed
```
Only one transition is allowed at a time per collection.
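A sketch of that state machine; the exact transition set shown here is an assumption for illustration:

```python
# Legal transitions; anything else is rejected (assumed set, for illustration).
ALLOWED = {
    ("HOT", "COLD"),        # offload completed
    ("COLD", "LOADING"),    # reload triggered
    ("LOADING", "HOT"),     # reload finished
    ("HOT", "FAILED"), ("COLD", "FAILED"), ("LOADING", "FAILED"),
}

class CollectionState:
    def __init__(self):
        self.state = "HOT"

    def transition(self, new_state):
        if (self.state, new_state) not in ALLOWED:
            raise ValueError(f"illegal transition: {self.state} -> {new_state}")
        self.state = new_state

c = CollectionState()
c.transition("COLD")     # offloaded
c.transition("LOADING")  # access triggered reload
c.transition("HOT")      # back in memory
```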
### Offload flow (background only)
Offloading never happens on the request path.
1. Background scheduler scans collections
2. Idle collections exceeding the policy threshold are selected
3. The collection is exported (schema + documents)
4. The snapshot is stored safely
5. The collection is deleted from the engine
6. State transitions to `COLD`
This immediately frees RAM.
### Reload (on-demand) flow
Reload is triggered by access to a cold collection.
```
Client → request
Proxy  → state=COLD → trigger background reload
Proxy  → 503 Service Unavailable (Retry-After: 2)
```
Standard client retry logic, honoring `Retry-After`, transparently handles the brief warm-up period.
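On the client side, plain retry handling is enough. A sketch, assuming a simple callable that performs one request (the simulation below uses `Retry-After: 0` to avoid real sleeps):

```python
import time

def get_with_retry(send_request, max_attempts=5):
    """Retry on 503, honoring the Retry-After header, until the reload finishes."""
    for _ in range(max_attempts):
        status, headers, body = send_request()
        if status != 503:
            return status, body
        time.sleep(float(headers.get("Retry-After", "1")))
    return 503, None

# Simulated warm-up: two 503s while the collection reloads, then success.
responses = iter([
    (503, {"Retry-After": "0"}, None),
    (503, {"Retry-After": "0"}, None),
    (200, {}, {"hits": []}),
])
print(get_with_retry(lambda: next(responses)))  # (200, {'hits': []})
```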
## Snapshot format
Snapshots are intentionally simple and portable:
```
snapshots/
  collection_name/
    schema.json
    documents.jsonl.gz
    metadata.json
```
This keeps recovery, debugging, and portability easy.
## Configuration example
```yaml
engine:
  type: typesense
  url: http://typesense:8108

offload:
  after: 6h

storage:
  type: s3
  bucket: index-hibernate
```
---
## Safety guarantees
Hiberstack is designed to be conservative:
* Collections are **never deleted** unless snapshot upload succeeds
* All operations are **idempotent**
* Per-collection locks prevent races
* Failures move collections to `FAILED` state
No silent data loss.
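The per-collection locking can be sketched with non-blocking lock acquisition, so a second offload or reload attempt on the same collection is simply skipped rather than racing (names here are assumptions about the internal design):

```python
import threading
from collections import defaultdict

locks = defaultdict(threading.Lock)

def with_collection_lock(name, operation):
    """Run operation only if no other transition is in flight for this collection."""
    lock = locks[name]
    if not lock.acquire(blocking=False):
        return "SKIPPED"  # idempotent: the in-flight transition will finish the job
    try:
        return operation()
    finally:
        lock.release()
```

Because operations are idempotent, skipping a duplicate attempt is always safe: either the in-flight transition succeeds, or the collection moves to `FAILED` and is retried later.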
---
## Observability
Hiberstack exposes Prometheus metrics:
* `hiberstack_collections_hot`
* `hiberstack_collections_cold`
* `hiberstack_offloads_total`
* `hiberstack_reloads_total`
* `hiberstack_reload_duration_seconds`
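How lifecycle events map onto the gauges and counters, sketched with a plain dict standing in for a real Prometheus client (the reload-duration histogram is omitted for brevity):

```python
# In-process stand-in for a Prometheus registry.
metrics = {
    "hiberstack_collections_hot": 0,
    "hiberstack_collections_cold": 0,
    "hiberstack_offloads_total": 0,
    "hiberstack_reloads_total": 0,
}

def record_offload():
    metrics["hiberstack_collections_hot"] -= 1
    metrics["hiberstack_collections_cold"] += 1
    metrics["hiberstack_offloads_total"] += 1

def record_reload():
    metrics["hiberstack_collections_cold"] -= 1
    metrics["hiberstack_collections_hot"] += 1
    metrics["hiberstack_reloads_total"] += 1

metrics["hiberstack_collections_hot"] = 3  # three collections start out hot
record_offload()  # one goes cold
record_reload()   # ...and comes back on demand
```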
---
## Why a sidecar (and not internal patching)
* No fork required
* Safe Typesense version upgrades
* Clear separation of concerns
* Easier to reason about failures
Hiberstack manages **lifecycle**, not search logic.
## Project status
* 🚧 Early-stage (v0.x)
* API may change
* Focused on correctness and safety first
---
## Roadmap
* [ ] Typesense native snapshot support integration
* [ ] Local + S3 storage
* [ ] Prometheus metrics dashboarding
* [ ] Pre-warming policies
---
## License
Apache 2.0
---
## Philosophy
Hiberstack is intentionally boring.
No magic. No heuristics. No clever tricks.
Just predictable, explicit control over memory —
for teams who need it.
