# Hiberstack

Automatic cold storage & memory lifecycle management for search indexes.
Hiberstack is a lightweight sidecar service built specifically for Typesense. It automatically offloads inactive collections from memory to cold storage (disk / S3), then reloads them on demand when traffic returns.
It is designed to solve a very specific but recurring operational problem:
Typesense is extremely fast because its indexes live entirely in RAM — but when you have many collections with bursty access patterns (e.g. multi-tenant architectures), RAM becomes prohibitively expensive.
Hiberstack adds elasticity to Typesense without modifying its internals.
## Why Hiberstack exists
Typesense is optimized for:
- blazing-fast queries
- in-memory indexes
- always-hot datasets
This works extremely well for single or few collections that are accessed continuously.
However, problems appear when you have:
- hundreds or thousands of collections
- most of them idle most of the time
- limited RAM budgets
- write operations that stall once memory is exhausted
Typical examples:
- multi-tenant SaaS platforms
- research or project-based tools
- AI / RAG systems with per-project indexes
- internal tools with bursty usage
Today, teams handle this manually using:
- cron jobs
- ad-hoc scripts
- manual cleanup
- over-provisioned RAM
Hiberstack makes this automatic, safe, and observable.
## What Hiberstack does
At a high level:
- Tracks collection-level activity (queries & writes)
- Identifies inactive (cold) collections based on policy (e.g. 6h idle)
- Exports those collections to cold storage (JSONL + schema)
- Deletes them from the search engine to free RAM
- Transparently reloads them on demand when traffic returns
All of this happens outside the search engine, as a sidecar proxy.
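At its core, the cold-detection policy is a time comparison against the configured idle threshold. A minimal sketch (function and variable names here are hypothetical, not Hiberstack's actual API):

```python
from datetime import datetime, timedelta

def is_cold(last_access: datetime, now: datetime, idle_threshold: timedelta) -> bool:
    """A collection qualifies for offload once it has been idle longer than the policy threshold."""
    return now - last_access > idle_threshold

now = datetime(2024, 1, 1, 12, 0)
print(is_cold(now - timedelta(hours=7), now, timedelta(hours=6)))  # True: idle 7h exceeds the 6h policy
print(is_cold(now - timedelta(hours=1), now, timedelta(hours=6)))  # False: still hot
```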
## What Hiberstack does NOT try to do
This project is intentionally opinionated.
Hiberstack does NOT:
- optimize single, always-hot large collections
- replace or fork Typesense
- page data at document or segment level
- act as a general cache or LRU store
If you have:
- one large collection
- accessed continuously
Hiberstack is not for you.
## Who this project is for

### ✅ Good fit
Hiberstack is designed for teams that have:
- many collections (per project / tenant / workspace)
- bursty or sporadic access patterns
- RAM-constrained environments
- need for predictable memory usage
Examples:
- SaaS with per-customer indexes
- research platforms with per-study datasets
- AI tools creating indexes per workflow
- internal platforms running on limited infra
### ❌ Not a good fit
- single-collection deployments
- always-hot datasets
- latency-sensitive systems that cannot tolerate cold-starts
## Built exclusively for Typesense

Unlike Elasticsearch or Meilisearch, which use disk paging or memory-mapped files to balance RAM and disk, Typesense intentionally keeps everything in RAM for maximum performance.
Hiberstack embraces this design choice by managing the collection lifecycle externally, giving you the extreme speed of Typesense when needed, and the cost-efficiency of S3 when idle.
## Architecture overview
Hiberstack runs as a standalone sidecar service.
```
Client
  │
  ▼
Hiberstack (proxy + control plane)
  │
  ▼
Typesense (unmodified)
```
Cold storage:
- Local filesystem (default)
- S3-compatible object storage
## Operating modes

### Proxy mode (default)
Client → Hiberstack → Typesense
- All requests pass through Hiberstack
- Enables precise activity tracking
- Allows transparent reload-on-demand
- Adds sub-millisecond latency on hot paths
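The decision the proxy makes on each request can be sketched like this (helper names are assumptions for illustration, not the real API):

```python
def handle_request(state, trigger_reload, forward):
    """Forward traffic for HOT collections; for COLD ones, kick off a background
    reload and tell the client to retry shortly. LOADING also answers 503."""
    if state == "HOT":
        return forward()
    if state == "COLD":
        trigger_reload()  # runs in the background; never blocks this request
    return 503, {"Retry-After": "2"}

reloads = []
assert handle_request("HOT", lambda: reloads.append(1), lambda: (200, {})) == (200, {})
status, headers = handle_request("COLD", lambda: reloads.append(1), lambda: (200, {}))
print(status, headers, reloads)  # 503 {'Retry-After': '2'} [1]
```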
### Observer mode (planned)
- Hiberstack does not proxy traffic
- Activity inferred from Node metrics
- No latency impact
- Limited reload automation
## Collection lifecycle
Each collection is managed via a simple state machine:
```
HOT     → loaded in memory
COLD    → offloaded to storage
LOADING → reload in progress
FAILED  → last operation failed
```
Only one transition is allowed at a time per collection.
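A sketch of that state machine; the exact transition set shown here is an assumption for illustration:

```python
# Legal transitions; anything else is rejected (assumed set, for illustration).
ALLOWED = {
    ("HOT", "COLD"),        # offload completed
    ("COLD", "LOADING"),    # reload triggered
    ("LOADING", "HOT"),     # reload finished
    ("HOT", "FAILED"), ("COLD", "FAILED"), ("LOADING", "FAILED"),
}

class CollectionState:
    def __init__(self):
        self.state = "HOT"

    def transition(self, new_state):
        if (self.state, new_state) not in ALLOWED:
            raise ValueError(f"illegal transition: {self.state} -> {new_state}")
        self.state = new_state

c = CollectionState()
c.transition("COLD")     # offloaded
c.transition("LOADING")  # access triggered reload
c.transition("HOT")      # back in memory
```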
### Offload flow (background only)
Offloading never happens on the request path.
1. Background scheduler scans collections
2. Idle collections exceeding the policy threshold are selected
3. The collection is exported (schema + documents)
4. The snapshot is stored safely
5. The collection is deleted from the engine
6. State transitions to `COLD`
This immediately frees RAM.
### Reload (on-demand) flow
Reload is triggered by access to a cold collection.
```
Client → request
Proxy  → state=COLD → trigger background reload
Proxy  → 503 Service Unavailable (Retry-After: 2)
```
Standard client retry logic, honoring `Retry-After`, transparently handles the brief warm-up period.
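On the client side, plain retry handling is enough. A sketch, assuming a simple callable that performs one request (the simulation below uses `Retry-After: 0` to avoid real sleeps):

```python
import time

def get_with_retry(send_request, max_attempts=5):
    """Retry on 503, honoring the Retry-After header, until the reload finishes."""
    for _ in range(max_attempts):
        status, headers, body = send_request()
        if status != 503:
            return status, body
        time.sleep(float(headers.get("Retry-After", "1")))
    return 503, None

# Simulated warm-up: two 503s while the collection reloads, then success.
responses = iter([
    (503, {"Retry-After": "0"}, None),
    (503, {"Retry-After": "0"}, None),
    (200, {}, {"hits": []}),
])
print(get_with_retry(lambda: next(responses)))  # (200, {'hits': []})
```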
## Snapshot format
Snapshots are intentionally simple and portable:
```
snapshots/
  collection_name/
    schema.json
    documents.jsonl.gz
    metadata.json
```
This keeps recovery, debugging, and portability easy.
## Configuration example
```yaml
engine:
  type: typesense
  url: http://typesense:8108

offload:
  after: 6h

storage:
  type: s3
  bucket: index-hibernate
```
---
## Safety guarantees
Hiberstack is designed to be conservative:
* Collections are **never deleted** unless snapshot upload succeeds
* All operations are **idempotent**
* Per-collection locks prevent races
* Failures move collections to `FAILED` state
No silent data loss.
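The per-collection locking can be sketched with non-blocking lock acquisition, so a second offload or reload attempt on the same collection is simply skipped rather than racing (names here are assumptions about the internal design):

```python
import threading
from collections import defaultdict

locks = defaultdict(threading.Lock)

def with_collection_lock(name, operation):
    """Run operation only if no other transition is in flight for this collection."""
    lock = locks[name]
    if not lock.acquire(blocking=False):
        return "SKIPPED"  # idempotent: the in-flight transition will finish the job
    try:
        return operation()
    finally:
        lock.release()
```

Because operations are idempotent, skipping a duplicate attempt is always safe: either the in-flight transition succeeds, or the collection moves to `FAILED` and is retried later.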
---
## Observability
Hiberstack exposes Prometheus metrics:
* `hiberstack_collections_hot`
* `hiberstack_collections_cold`
* `hiberstack_offloads_total`
* `hiberstack_reloads_total`
* `hiberstack_reload_duration_seconds`
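How lifecycle events map onto the gauges and counters, sketched with a plain dict standing in for a real Prometheus client (the reload-duration histogram is omitted for brevity):

```python
# In-process stand-in for a Prometheus registry.
metrics = {
    "hiberstack_collections_hot": 0,
    "hiberstack_collections_cold": 0,
    "hiberstack_offloads_total": 0,
    "hiberstack_reloads_total": 0,
}

def record_offload():
    metrics["hiberstack_collections_hot"] -= 1
    metrics["hiberstack_collections_cold"] += 1
    metrics["hiberstack_offloads_total"] += 1

def record_reload():
    metrics["hiberstack_collections_cold"] -= 1
    metrics["hiberstack_collections_hot"] += 1
    metrics["hiberstack_reloads_total"] += 1

metrics["hiberstack_collections_hot"] = 3  # three collections start out hot
record_offload()  # one goes cold
record_reload()   # ...and comes back on demand
```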
---
## Why a sidecar (and not internal patching)
* No fork required
* Safe Typesense version upgrades
* Clear separation of concerns
* Easier to reason about failures
Hiberstack manages **lifecycle**, not search logic.
## Project status
* 🚧 Early-stage (v0.x)
* API may change
* Focused on correctness and safety first
---
## Roadmap
* [ ] Typesense native snapshot support integration
* [ ] Local + S3 storage
* [ ] Prometheus metrics dashboarding
* [ ] Pre-warming policies
---
## License
Apache 2.0
---
## Philosophy
Hiberstack is intentionally boring.
No magic. No heuristics. No clever tricks.
Just predictable, explicit control over memory —
for teams who need it.
