"topic:information-retrieval" — Search

3,202 results for “topic:information-retrieval”

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.

Python · 29.1k · 3.5k · Updated 1 hour ago

cnn, crnn, data-mining, deep-learning, easyocr, image-processing, information-retrieval, lstm, machine-learning, ocr, optical-character-recognition, python, pytorch, scene-text, scene-text-recognition

deepset-ai/haystack

Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing, memory, and generation. Built for scalable agents, RAG, multimodal applications, semantic search, and conversational systems.

MDX · 24.4k · 2.6k · Updated 1 hour ago

agent, agents, ai, gemini, generative-ai, gpt-4, information-retrieval, large-language-models, llm, machine-learning, nlp, orchestration, python, pytorch, question-answering, rag, retrieval-augmented-generation, semantic-search, summarization, transformers

VectifyAI/PageIndex

📑 PageIndex: Document Index for Vectorless, Reasoning-based RAG

Python · 21.0k · 1.6k · Updated just now

agentic-ai, agents, ai, ai-agents, context-engineering, information-retrieval, llm, rag, reasoning, retrieval, retrieval-augmented-generation, vector-database

onyx-dot-app/onyx

Open Source AI Platform - AI Chat with advanced features that works with every LLM

Python · 17.8k · 2.4k · Updated just now

ai, ai-chat, chatgpt, chatui, enterprise-search, gen-ai, information-retrieval, llm, llm-ui, nextjs, python, rag, self-hosted, vector-search

arc53/DocsGPT

Private AI platform for agents, assistants and enterprise search. Built-in Agent Builder, Deep research, Document analysis, Multi-model support, and API connectivity for agents.

Python · 17.7k · 2.0k · Updated 7 hours ago

agent-builder, agents, ai, chatgpt, docsgpt, hacktoberfest, hacktoberfest2025, information-retrieval, language-model, llm, machine-learning, natural-language-processing, python, pytorch, rag, react, search, semantic-search, transformers

piskvorky/gensim

Topic Modelling for Humans

Python · 16.4k · 4.4k · Updated 2 hours ago

data-mining, data-science, document-similarity, fasttext, gensim, information-retrieval, machine-learning, natural-language-processing, neural-network, nlp, python, topic-modeling, word-embeddings, word-similarity, word2vec

weaviate/weaviate

Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.

Go · 15.8k · 1.2k · Updated 1 hour ago

approximate-nearest-neighbor-search, generative-search, grpc, hnsw, hybrid-search, image-search, information-retrieval, mlops, nearest-neighbor-search, neural-search, recommender-system, search-engine, semantic-search, semantic-search-engine, similarity-search, vector-database, vector-search, vector-search-engine, vectors, weaviate

Unstructured-IO/unstructured

Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking and embedding.

HTML · 14.2k · 1.2k · Updated just now

data-pipelines, deep-learning, document-image-analysis, document-image-processing, document-parser, document-parsing, docx, donut, information-retrieval, langchain, llm, machine-learning, ml, natural-language-processing, nlp, ocr, pdf, pdf-to-json, pdf-to-text, preprocessing

neuml/txtai

💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows

Python · 12.3k · 788 · Updated 4 hours ago

agents, ai, ai-agents, embeddings, information-retrieval, language-model, large-language-models, llm, nlp, python, rag, retrieval-augmented-generation, search, search-engine, semantic-search, sentence-embeddings, transformers, txtai, vector-database, vector-search

FlagOpen/FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Python · 11.4k · 840 · Updated 1 hour ago

embeddings, information-retrieval, llm, retrieval-augmented-generation, sentence-embeddings, text-semantic-similarity

airweave-ai/airweave

Open-source context retrieval layer for AI agents

Python · 6.0k · 725 · Updated 1 hour ago

agent-infrastructure, ai, ai-agents, ai-infrastructure, api, context-retrieval, data-connectors, developer-tools, enterprise-data, information-retrieval, integration, llm, open-source, rag, retrieval, retrieval-augmented-generation, sdk, search, search-api, semantic-search

apache/lucene-solr

Apache Lucene and Solr open-source search software

4.4k · 2.6k · Updated 1 week ago

backend, information-retrieval, java, lucene, nosql, search, search-engine, solr

SylphAI-Inc/AdalFlow

AdalFlow: The library to build & auto-optimize LLM applications.

Python · 4.1k · 364 · Updated just now

agent, ai, auto-prompting, bm25, chatbot, faiss, framework, generative-ai, information-retrieval, llm, machine-learning, nlp, optimizer, python, question-answering, rag, reranker, retriever, summarization, trainer

KittyKatt/screenFetch

Fetches system/theme information in terminal for Linux desktop screenshots.

Shell · 4.0k · 450 · Updated 11 hours ago

bash, desktop, information-retrieval, shell

langroid/langroid

Harness LLMs with Multi-Agent Programming

Python · 3.9k · 361 · Updated 1 hour ago

agents, ai, chatgpt, function-calling, gpt, gpt-4, gpt4, information-retrieval, language-model, llama, llm, llm-agent, llm-framework, local-llm, multi-agent-systems, openai-api, rag, retrieval-augmented-generation

catalyst-team/catalyst

Accelerated deep learning R&D

Python · 3.4k · 400 · Updated 1 week ago

computer-vision, deep-learning, distributed-computing, image-classification, image-processing, image-segmentation, information-retrieval, infrastructure, machine-learning, metric-learning, natural-language-processing, object-detection, python, pytorch, recommender-system, reinforcement-learning, reproducibility, research, text-classification, text-segmentation

ashvardanian/StringZilla

Up to 100x faster strings for C, C++, CUDA, Python, Rust, Swift, JS, & Go, leveraging NEON, AVX2, AVX-512, SVE, GPGPU, & SWAR to accelerate search, hashing, sorting, edit distances, sketches, and memory ops 🦖

C · 3.4k · 120 · Updated 2 days ago

dataset, edit-distance, gpu, hash, hashing, information-retrieval, levenshtein-distance, parser, search, simd, sorting-algorithms, string, string-manipulation, string-matching, string-parsing, string-search, substring, unicode

apache/lucene

Apache Lucene open-source search software

Java · 3.4k · 1.3k · Updated just now

backend, information-retrieval, java, lucene, nosql, search, search-engine

embeddings-benchmark/mteb

MTEB: Massive Text Embedding Benchmark

Python · 3.2k · 568 · Updated 14 hours ago

benchmark, bitext-mining, clustering, information-retrieval, low-resource-nlp, mteb, multilingual-nlp, multimodal, neural-search, reranking, retrieval, sbert, semantic-search, sentence-transformers, sts, text-classification, text-embedding

rajkumardusad/IP-Tracer

Track any IP address with IP-Tracer. IP-Tracer is developed for Linux and Termux. You can retrieve information about any IP address using IP-Tracer.

PHP · 2.8k · 512 · Updated just now

gnuroot-debian, hacking-tool, hacking-tools, information-gathering, information-retrieval, ip-geolocation, ip-location, ip-tracer, linux, linux-tools, termux, termux-hacking, termux-tool

tensorflow/ranking (Archived)

Learning to Rank in TensorFlow

Python · 2.8k · 480 · Updated 1 day ago

deep-learning, information-retrieval, learning-to-rank, machine-learning, ranking, recommender-systems

naiveHobo/InvoiceNet

Deep neural network to extract intelligent information from invoice documents.

Python · 2.7k · 415 · Updated 1 hour ago

billing, classification, deep-learning, deep-neural-networks, deeplearning, information-extraction, information-retrieval, invoice, invoice-insight, invoice-management, invoice-parser, invoice-pdf, invoice-software, invoices, keras, keras-neural-networks, keras-tensorflow

illuin-tech/colpali

The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.

Python · 2.5k · 235 · Updated 9 hours ago

colpali, colqwen2, colsmol, information-retrieval, retrieval-augmented-generation, vision-language-model

beir-cellar/beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.

Python · 2.1k · 235 · Updated 6 hours ago

benchmark, bert, colbert, dataset, deep-learning, dpr, elasticsearch, information-retrieval, llm, nlp, passage-retrieval, pytorch, question-generation, rag, retrieval, retrieval-models, sbert, sentence-transformers, zero-shot-retrieval

castorini/pyserini

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Python · 2.0k · 492 · Updated 10 hours ago

information-retrieval

xlang-ai/instructor-embedding

[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings

Python · 2.0k · 156 · Updated 3 days ago

embeddings, information-retrieval, language-model, prompt-retrieval, text-classification, text-clustering, text-embedding, text-evaluation, text-reranking, text-semantic-similarity

youngfish42/Awesome-FL

Comprehensive and timely academic information on federated learning (papers, frameworks, datasets, tutorials, workshops)

Python · 2.0k · 219 · Updated 2 days ago

artificial-intelligence, awesome, computer-vision, data-mining, database, deep-learning, efficiency, federated-learning, federated-learning-framework, graph, graph-neural-networks, information-retrieval, knowledge-graph, machine-learning, natural-language-processing, paper, privacy, security, system, tabular-data

shaoxiongji/knowledge-graphs

A collection of research on knowledge graphs

JavaScript · 1.8k · 296 · Updated 5 days ago

commonsense, cross-modal, dialogue-systems, information-retrieval, knowledge-graph, knowledge-graph-completion, meta-relational-learning, natural-language-processing, ner, paper, question-answering, reasoning, recommendation-systems, relation-extraction, representation-learning, survey, temporal-knowledge-graph

IntelLabs/fastRAG (Archived)

Efficient Retrieval Augmentation and Generation Framework

Python · 1.8k · 165 · Updated 3 days ago

benchmark, colbert, diffusion, generative-ai, information-retrieval, knowledge-graph, llm, multi-modal, nlp, question-answering, semantic-search, sentence-transformers, summarization, transformers

ashvardanian/SimSIMD

Up to 200x Faster Dot Products & Similarity Metrics — for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for both AVX2, AVX-512, NEON, SVE, & SVE2 📐

C · 1.7k · 104 · Updated 3 hours ago

arm-neon, arm-sve, assembly, avx2, avx512, bfloat16, blas, blas-libraries, distance-calculation, float16, information-retrieval, metrics, neon, numpy, scipy, simd, simd-instructions, similarity-measures, similarity-search, vector-search
