3,202 results for “topic:information-retrieval”
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, and more.
Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing, memory, and generation. Built for scalable agents, RAG, multimodal applications, semantic search, and conversational systems.
📑 PageIndex: Document Index for Vectorless, Reasoning-based RAG
Open Source AI Platform - AI Chat with advanced features that works with every LLM
Private AI platform for agents, assistants and enterprise search. Built-in Agent Builder, Deep research, Document analysis, Multi-model support, and API connectivity for agents.
Topic Modelling for Humans
Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking and embedding.
💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows
Retrieval and Retrieval-augmented LLMs
Open-source context retrieval layer for AI agents
Apache Lucene and Solr open-source search software
AdalFlow: The library to build & auto-optimize LLM applications.
Fetches system/theme information in the terminal for Linux desktop screenshots.
Harness LLMs with Multi-Agent Programming
Accelerated deep learning R&D
Up to 100x faster strings for C, C++, CUDA, Python, Rust, Swift, JS, & Go, leveraging NEON, AVX2, AVX-512, SVE, GPGPU, & SWAR to accelerate search, hashing, sorting, edit distances, sketches, and memory ops 🦖
Apache Lucene open-source search software
MTEB: Massive Text Embedding Benchmark
Track any IP address with IP-Tracer. IP-Tracer is developed for Linux and Termux. You can retrieve information about any IP address using IP-Tracer.
Learning to Rank in TensorFlow
Deep neural network to extract intelligent information from invoice documents.
The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
Comprehensive and timely academic information on federated learning (papers, frameworks, datasets, tutorials, workshops)
A collection of research on knowledge graphs
Efficient Retrieval Augmentation and Generation Framework
Up to 200x Faster Dot Products & Similarity Metrics for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD on AVX2, AVX-512, NEON, SVE, & SVE2 📐