CRoM-EfficientLLM
A Python toolkit to optimize LLM context by intelligently selecting, re-ranking, and managing text chunks to fit a model's budget while maximizing relevance.
README Core
CRoM (Context Rot Mitigation)-EfficientLLM is a Python toolkit designed to optimize the context provided to Large Language Models (LLMs). It provides a suite of tools to intelligently select, re-rank, and manage text chunks to fit within a model's context budget while maximizing relevance and minimizing performance drift.
This project is ideal for developers building RAG (Retrieval-Augmented Generation) pipelines who need to make the most of limited context windows.
Install the package directly from source using pip. For development, it's recommended to install in editable mode with the extras.
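The install steps above would look roughly like this from a source checkout; the `dev` extras name is an assumption here, so check the repository's `pyproject.toml` for the extras actually defined:

```shell
# Standard install from a source checkout
pip install .

# Editable (development) install; the "dev" extras name is assumed
pip install -e ".[dev]"
```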
Use & Documentation
Detailed installation, commands, examples, and deeper usage notes live in the repository README and docs.
README Map
- Key Features
- Installation
- Quickstart
- 🚀 Interactive Demo
- Local Demo
- CLI Benchmarking Examples
Key Signals
- Budget Packer: Greedily packs the highest-scoring text chunks into a defined token budget using a stable sorting algorithm.
- Hybrid Reranker: Combines sparse (TF-IDF) and dense (Sentence-Transformers) retrieval scores for robust and high-quality reranking of documents.
- Drift Estimator: Monitors the semantic drift between sequential model responses using L2 or cosine distance with EWMA smoothing.
- Observability: Exposes Prometheus metrics for monitoring token savings and drift alerts in production.
- Comprehensive Benchmarking: Includes a CLI for end-to-end pipeline evaluation, budget sweeps, and quality-vs-optimal analysis.
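The Budget Packer bullet above (greedy packing of the highest-scoring chunks into a token budget, with stable ordering for ties) can be sketched as follows. `Chunk` and `pack_chunks` are illustrative names for this sketch, not the toolkit's actual API:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    score: float
    tokens: int

def pack_chunks(chunks: list[Chunk], budget: int) -> list[Chunk]:
    """Greedily fill the token budget with the highest-scoring chunks.

    Python's sort is stable, so chunks with equal scores keep their
    original (e.g. document) order, which keeps packing deterministic.
    """
    ranked = sorted(chunks, key=lambda c: -c.score)
    packed, used = [], 0
    for c in ranked:
        if used + c.tokens <= budget:
            packed.append(c)
            used += c.tokens
    return packed
```

Note that this is a simple greedy pass, not an optimal knapsack solution; lower-scoring chunks can still slip in if a higher-scoring one was too large to fit.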
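The Hybrid Reranker bullet reduces, at its core, to fusing two score lists (sparse TF-IDF similarities and dense embedding similarities) into one ranking signal. A minimal sketch of such score fusion, where min-max normalization and the `alpha` weighting are assumptions of this sketch rather than the toolkit's documented behavior:

```python
def hybrid_score(sparse: list[float], dense: list[float], alpha: float = 0.5) -> list[float]:
    """Fuse sparse and dense relevance scores for the same candidate list.

    Each list is min-max normalized to [0, 1] so the two score scales
    are comparable, then combined as a weighted sum controlled by alpha
    (alpha = 1.0 trusts only the sparse scores, 0.0 only the dense ones).
    """
    def norm(xs: list[float]) -> list[float]:
        lo, hi = min(xs), max(xs)
        return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in xs]

    s, d = norm(sparse), norm(dense)
    return [alpha * a + (1 - alpha) * b for a, b in zip(s, d)]
```

In a real pipeline the sparse scores would come from a TF-IDF vectorizer and the dense scores from Sentence-Transformers embeddings; the fusion step itself stays this simple.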
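The Drift Estimator bullet (semantic drift between sequential responses, smoothed with an EWMA) can be sketched as below, using cosine distance between consecutive response embeddings. The class name, `alpha` default, and plain-list vectors are assumptions for illustration:

```python
import math

class DriftEstimator:
    """Track EWMA-smoothed cosine distance between consecutive embeddings."""

    def __init__(self, alpha: float = 0.3):
        self.alpha = alpha      # EWMA weight for the newest observation
        self.prev = None        # embedding of the previous response
        self.ewma = 0.0         # smoothed drift value

    def update(self, vec: list[float]) -> float:
        """Feed the next response embedding; return the smoothed drift."""
        if self.prev is not None:
            dot = sum(a * b for a, b in zip(self.prev, vec))
            na = math.sqrt(sum(a * a for a in self.prev))
            nb = math.sqrt(sum(b * b for b in vec))
            dist = 1.0 - dot / (na * nb)  # cosine distance in [0, 2]
            self.ewma = self.alpha * dist + (1 - self.alpha) * self.ewma
        self.prev = vec
        return self.ewma
```

Swapping cosine distance for L2 distance (as the bullet also mentions) only changes the `dist` line; the smoothed value is what would feed a drift alert threshold.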
Announcements
synced Mar 13, 2026