llm-prompt-optimizer
A deterministic prompt optimization middleware for AI coding agents that preserves intent, compresses context, and detects drift, available as an MCP server with tools for optimize_prompt, classify_prompt, and more.
Deterministic prompt optimization middleware for AI coding agents.
A universal, installable, MCP-compatible Prompt Compiler + Context Governance Engine that acts as a deterministic middleware layer for AI coding agents.
Works as a standalone engine while optionally integrating with graph providers or external tools when available.
Core Philosophy
| The optimizer MUST | The optimizer MUST NEVER |
|---|---|
| Preserve semantic intent | Hallucinate user goals |
| Remain deterministic | Invent broader tasks |
| Discover minimal required context | Rewrite prompts creatively |
| Explain why dependencies were included | Widen repo scope unnecessarily |
| Gracefully degrade when tools are unavailable | Assume architecture redesign |
| Optimize at line-level granularity | Depend on hardcoded limits |
| Remain agent-agnostic | Require external graph tools to function |
Highest priority rule: Intent preservation > Compression. Always.
Pipeline Architecture
User Prompt
↓
Intent Guard ← extracts & seals user intent (immutable)
↓
Prompt Classification ← 7-layer multi-label classification (not keyword-only)
↓
Dependency Resolution ← FallbackGraphEngine (AST, imports, symbols, git)
↓
Adaptive Context Expansion ← value-based: (relevance × confidence × proximity) / token_cost
↓
Precise Context Extraction ← line-level, never full files
↓
Prompt Optimization ← compression + constraint injection (no creative rewriting)
↓
Semantic Validation ← reject if similarity < 0.90
↓
Drift Detection ← scope widening, hallucinated files, altered intent
↓
Prompt Compiler ← final assembly for LLM
↓
AI Agent / LLM
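The Semantic Validation stage above rejects an optimized prompt when its similarity to the original drops below 0.90. A minimal sketch of such a gate follows. It uses `difflib.SequenceMatcher` as a stand-in metric, which is an assumption: the package's real similarity measure is not described here, and the `[ml]` extra suggests it may be embedding-based. `validate` and `ValidationResult` are hypothetical names.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

THRESHOLD = 0.90  # the rejection threshold named in the pipeline diagram


@dataclass(frozen=True)
class ValidationResult:
    similarity: float
    accepted: bool


def validate(raw: str, optimized: str, threshold: float = THRESHOLD) -> ValidationResult:
    """Hypothetical gate: accept only when similarity >= threshold."""
    # Stand-in metric (assumption): case-insensitive character-level ratio in [0, 1].
    similarity = SequenceMatcher(None, raw.lower(), optimized.lower()).ratio()
    return ValidationResult(similarity=similarity, accepted=similarity >= threshold)


same = validate("debug condition mismatch", "Debug condition mismatch")
drifted = validate("debug condition mismatch", "refactor the whole repository and add tests")
print(same.accepted, drifted.accepted)
```

A character-level ratio is only a placeholder; the point is the shape of the gate, where anything under the threshold is rejected rather than silently passed on.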
Quick Start
Install
pip install llm-prompt-optimizer
# With ML-powered semantic similarity (recommended):
pip install "llm-prompt-optimizer[ml]"
# With Anthropic adapter:
pip install "llm-prompt-optimizer[anthropic]"
# Everything:
pip install "llm-prompt-optimizer[all]"
SDK Usage
from llm_prompt_optimizer import Optimizer
optimizer = Optimizer()
result = optimizer.optimize(
prompt="Debug condition mismatch in the provided /path/repository/file. No code changes.",
strict_mode=True,
repo_root="/path/to/your/repo",
)
print(result.optimized_prompt.text)
print(f"Tokens: {result.optimized_prompt.token_estimate}")
print(f"Similarity: {result.optimized_prompt.semantic_similarity:.3f}")
print(f"Category: {result.classification.primary_category.value}")
print(f"Intent: {result.intent_lock.intent_summary}")
print(f"Drift clean: {result.drift_report.is_clean}")
CLI Usage
# Optimize a prompt
llm-prompt-optimizer optimize "debug condition mismatch in path/to/folder"
# Strict mode (no dependency expansion)
llm-prompt-optimizer optimize "debug condition mismatch" --strict
# JSON output
llm-prompt-optimizer optimize "debug condition mismatch" --json
# Classify a prompt
llm-prompt-optimizer classify "implement /health endpoint in api/routes.py"
# Validate semantic similarity
llm-prompt-optimizer validate \
--raw "debug condition mismatch" \
--optimized "# Task\ndebug condition mismatch\n## Constraints\n- No code changes."
# Detect drift
llm-prompt-optimizer detect-drift \
--raw "debug condition mismatch" \
--optimized "debug condition mismatch. Also refactor everything."
# Estimate token cost
llm-prompt-optimizer estimate-cost "debug condition mismatch"
# Run benchmark suite
llm-prompt-optimizer benchmark
# Start MCP server (stdio — for Claude Desktop / Cursor)
llm-prompt-optimizer serve --transport stdio
# Start HTTP server
llm-prompt-optimizer serve --transport http --port 8765
# WebSocket server
llm-prompt-optimizer serve --transport websocket --port 8766
MCP Server Integration
Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"llm-prompt-optimizer": {
"command": "llm-prompt-optimizer",
"args": ["serve", "--transport", "stdio"],
"env": {
"LPO_STRICT_INTENT": "true",
"LPO_SEMANTIC_THRESHOLD": "0.90",
"LPO_LOG_LEVEL": "WARNING"
}
}
}
}
Cursor
Add to .cursor/mcp.json:
{
"mcpServers": {
"lpo": {
"command": "lpo",
"args": ["serve", "--transport", "stdio"]
}
}
}
Available MCP Tools
| Tool | Description |
|---|---|
| optimize_prompt | Full pipeline optimization |
| classify_prompt | Multi-label task classification |
| resolve_precise_context | Line-level context extraction |
| discover_dependencies | Minimal dependency discovery |
| detect_prompt_drift | Scope/intent drift detection |
| estimate_prompt_cost | Token cost estimation |
| compress_context | Context compression |
| validate_intent | Semantic similarity validation |
| benchmark_prompt | Run built-in benchmarks |
Registering the MCP server only makes these tools available to the agent; it does not force the agent to call them on every prompt. To make optimize_prompt run automatically before each coding turn, see the next section.
Default-On Optimization (install / uninstall / reinstall)
By default, registering the MCP server makes optimize_prompt available to your agent but doesn't force it to be called. To make every coding-related prompt auto-route through the optimizer, the package ships an installer that writes a small, sentinel-bracketed instruction block into your agent's host configuration files.
TL;DR
# 1) install the package
pip install llm-prompt-optimizer
# 2) register the MCP server in your host (see "MCP Server Integration" above)
# 3) install the auto-routing rule
llm-prompt-optimizer install-rules
# inspect / undo at any time
llm-prompt-optimizer rules-status
llm-prompt-optimizer uninstall-rules
What install-rules actually does
It auto-detects which agent hosts are present on your machine (Claude Code, Cursor, Continue) and writes a rule block into each one's user-global config. The rule tells the agent to call optimize_prompt first for any coding-related message and includes explicit fail-open wording: if the optimizer is unavailable, errors, or times out, proceed normally with the user's original message. So a server outage or a missing tool degrades to "no optimization", never to "broken assistant".
A health check runs first and refuses to install if the MCP server is not registered in any host config. This prevents shipping a rule that points at a tool that demonstrably doesn't exist. Pass --force to override.
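The registration check described above can be pictured as a lookup in each host's JSON config. The sketch below is an illustration only, not the installer's actual code: it assumes the `mcpServers` layout shown in the Claude Desktop and Cursor examples, and it treats both launcher names used in those examples as a match. `is_server_registered` and `KNOWN_COMMANDS` are hypothetical names.

```python
import json
import tempfile
from pathlib import Path

# Assumption: either launcher name from the config examples counts as registered.
KNOWN_COMMANDS = ("llm-prompt-optimizer", "lpo")


def is_server_registered(config_path: Path, commands=KNOWN_COMMANDS) -> bool:
    """Return True if any mcpServers entry in the config launches a known command."""
    try:
        config = json.loads(config_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return False  # a missing or unparsable config counts as "not registered"
    servers = config.get("mcpServers") if isinstance(config, dict) else None
    if not isinstance(servers, dict):
        return False
    return any(
        isinstance(entry, dict) and entry.get("command") in commands
        for entry in servers.values()
    )


with tempfile.TemporaryDirectory() as tmp:
    registered = Path(tmp) / "mcp.json"
    registered.write_text(
        json.dumps({"mcpServers": {"lpo": {"command": "lpo", "args": ["serve"]}}}),
        encoding="utf-8",
    )
    empty = Path(tmp) / "other.json"
    empty.write_text("{}", encoding="utf-8")
    checks = {
        "registered": is_server_registered(registered),
        "empty": is_server_registered(empty),
        "missing": is_server_registered(Path(tmp) / "nope.json"),
    }
print(checks)
```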
Where rule files are written
| Host | User-global path | Project-local path (with --scope project) |
|---|---|---|
| Claude Code | ~/.claude/CLAUDE.md | <repo>/CLAUDE.md |
| Cursor | ~/.cursor/rules/llm-prompt-optimizer.mdc | <repo>/.cursorrules |
| Continue | ~/.continue/.continuerules | <repo>/.continuerules |
The install manifest itself lives at ~/.config/llm-prompt-optimizer/install-manifest.json (or %APPDATA%\llm-prompt-optimizer\install-manifest.json on Windows). Override with $LPO_INSTALL_HOME.
Each inserted block is wrapped in sentinels — only this region is ever touched on uninstall:
<!-- BEGIN llm-prompt-optimizer v1 -->
…rule text…
<!-- END llm-prompt-optimizer -->
Inspect what's installed
llm-prompt-optimizer rules-status
Reports the block version, health-check result, which hosts are detected on the machine, which already have the MCP server registered, and the full list of currently managed files.
Reinstall / upgrade
Just run install-rules again. The installer detects an existing block (any prior version, e.g. v0, v1) and replaces it in place. There is no duplication, and the original file_existed_before state is preserved across re-installs — which means a later uninstall-rules still correctly distinguishes files we created from files you already owned.
llm-prompt-optimizer install-rules # safe to run multiple times
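Replace-in-place is what makes a repeated install safe. One way to get that behavior is to match the sentinel-bracketed region with a regular expression and substitute it, appending only when no block exists yet. This is a sketch under assumptions, not the installer's code: `upsert_block` is a hypothetical helper, and the `v<number>` marker pattern is inferred from the v0/v1 examples above.

```python
import re

END = "<!-- END llm-prompt-optimizer -->"
# Matches a block of any version, from its BEGIN marker through its END marker.
BLOCK_RE = re.compile(
    r"<!-- BEGIN llm-prompt-optimizer v\d+ -->.*?" + re.escape(END),
    re.DOTALL,
)


def upsert_block(text: str, rule_text: str, version: int = 1) -> str:
    """Insert the managed block, or replace an existing one of any version."""
    block = f"<!-- BEGIN llm-prompt-optimizer v{version} -->\n{rule_text}\n{END}"
    if BLOCK_RE.search(text):
        # A function replacement keeps backslashes in rule_text literal.
        return BLOCK_RE.sub(lambda _match: block, text, count=1)
    separator = "" if not text or text.endswith("\n") else "\n"
    return f"{text}{separator}{block}\n"


original = "# My notes\n"
once = upsert_block(original, "rule A", version=0)
twice = upsert_block(once, "rule B", version=1)
print(twice)
```

Running `upsert_block` again with the same rule text returns the input unchanged, which is the property that makes the operation idempotent.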
Uninstall
llm-prompt-optimizer uninstall-rules
Manifest-driven and surgical:
- Only the BEGIN/END-bracketed region is removed; everything you wrote above and below stays byte-for-byte intact.
- Files the installer created from scratch are deleted (so you don't get an orphaned empty .mdc lying around).
- Files you already owned are kept with just the block stripped.
- Running it twice is a no-op — no errors, nothing changed.
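The uninstall rules above can be sketched per file as follows. This is illustrative only: `uninstall_from` is a hypothetical helper, and the `file_existed_before` flag stands in for whatever the manifest records. It works on bytes so that content outside the managed region is written back untouched.

```python
import re
import tempfile
from pathlib import Path

# The managed region, plus the single line break that follows it.
BLOCK_RE = re.compile(
    rb"<!-- BEGIN llm-prompt-optimizer[^>]*-->.*?<!-- END llm-prompt-optimizer -->\r?\n?",
    re.DOTALL,
)


def uninstall_from(path: Path, file_existed_before: bool) -> str:
    """Strip the managed block; return 'absent', 'unchanged', 'stripped' or 'deleted'."""
    if not path.exists():
        return "absent"
    stripped, count = BLOCK_RE.subn(b"", path.read_bytes())
    if count == 0:
        return "unchanged"  # nothing to remove, so a second run is a no-op
    if not file_existed_before and not stripped.strip():
        path.unlink()  # the installer created this file; don't leave it empty
        return "deleted"
    path.write_bytes(stripped)
    return "stripped"


BLOCK = b"<!-- BEGIN llm-prompt-optimizer v1 -->\nrule\n<!-- END llm-prompt-optimizer -->\n"

with tempfile.TemporaryDirectory() as tmp:
    owned = Path(tmp) / "CLAUDE.md"
    owned.write_bytes(b"# My notes\n" + BLOCK)
    created = Path(tmp) / "llm-prompt-optimizer.mdc"
    created.write_bytes(BLOCK)
    results = [
        uninstall_from(owned, file_existed_before=True),
        uninstall_from(owned, file_existed_before=True),
        uninstall_from(created, file_existed_before=False),
    ]
    owned_after = owned.read_bytes()
    created_exists = created.exists()
print(results)
```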
Recovering from a stale install
If you removed the package without first running uninstall-rules (for example, you ran pip uninstall llm-prompt-optimizer and noticed an orphaned block weeks later), you can sweep host-standard paths for any leftover BEGIN/END region, even with no manifest entry:
llm-prompt-optimizer uninstall-rules --purge-unknown
If you don't have the package installed anywhere, the block is plain text in a known file — open the rule file in any editor and delete everything between the <!-- BEGIN llm-prompt-optimizer … --> and <!-- END llm-prompt-optimizer --> markers (inclusive).
Flag reference
llm-prompt-optimizer install-rules \
[--host claude-code|cursor|continue] \ # repeat for multiple; default = auto-detect
[--scope user-global|project] \ # default: user-global
[--project-root <path>] \ # required if --scope project
[--force] \ # skip MCP-registration health check
[--dry-run] \ # show what would change, don't write
[--json] # machine-readable output
llm-prompt-optimizer uninstall-rules \
[--purge-unknown] \ # sweep orphans not in the manifest
[--dry-run] \
[--json]
llm-prompt-optimizer rules-status [--json]
What happens if the MCP server is disconnected or removed?
| Scenario | Behavior |
|---|---|
| Server registered but down (crashed, port conflict, dependency upgrade broke it) | Host marks the tool unavailable. The rule's fail-open clause kicks in: agent answers normally with the user's original message. No retry loop, no error surfaced. |
| Server fully uninstalled (entry removed from claude_desktop_config.json / mcp.json) | The tool isn't visible. Fail-open clause still applies, but you should also run uninstall-rules to remove the now-pointless instruction. |
| Package removed without cleanup (pip uninstall without uninstall-rules first) | Orphaned block remains in rule files. Run uninstall-rules --purge-unknown from any environment that still has the package, or delete the BEGIN/END region by hand. |
| Re-installing on top of an existing install | Old block (any version) is found and replaced in place. No duplication. |
| Manifest file is missing or corrupt | Uninstall does not crash. Use --purge-unknown to clean orphans on a best-effort basis. |
| You manually edited the rule file between install and uninstall | Edits outside the BEGIN/END region are preserved. Edits inside the region are lost on uninstall (this is by design — it's our managed region). |
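The fail-open rule in the table is an instruction to the agent, but the same behavior can be mirrored in code that calls an optimizer directly. The sketch below is an assumption-laden illustration: `optimize_or_passthrough` is a hypothetical wrapper, and `optimize` is any callable that takes and returns a string. It makes a single attempt, with no retries, and falls back to the original prompt on any error or timeout.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def optimize_or_passthrough(prompt, optimize, timeout_s=2.0):
    """Return optimize(prompt), or the original prompt on error or timeout."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(optimize, prompt).result(timeout=timeout_s)
    except Exception:
        # Covers exceptions raised by the optimizer and the timeout from result().
        return prompt
    finally:
        pool.shutdown(wait=False)  # don't block the caller on a slow worker


def broken(_prompt):
    raise RuntimeError("server down")


def slow(_prompt):
    time.sleep(0.3)
    return "too late"


on_error = optimize_or_passthrough("fix bug", broken)
on_timeout = optimize_or_passthrough("fix bug", slow, timeout_s=0.05)
on_success = optimize_or_passthrough("fix bug", str.upper)
print(on_error, on_timeout, on_success)
```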
Project-scope install (advanced)
For teams that want every developer who clones the repo to get auto-routing, you can write a project-local rule file that gets committed to git:
llm-prompt-optimizer install-rules --scope project --project-root .
Be aware: this writes <repo>/CLAUDE.md, <repo>/.cursorrules, etc. Anyone who clones the repo without the optimizer installed will see the rule but the tool won't resolve — the fail-open clause means their agent still works, but they'll incur a failed tool call per turn until they install the package or remove the rule. Use this only when the team is committed to the workflow.
REST API
# Start API server
uvicorn llm_prompt_optimizer.api.app:app --host 0.0.0.0 --port 8765
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check |
| POST | /optimize | Optimize a prompt |
| POST | /classify | Classify a prompt |
| POST | /validate | Semantic validation |
| POST | /detect-drift | Drift detection |
| POST | /estimate-cost | Token cost estimation |
| POST | /resolve-context | Precise context resolution |
| GET | /plugins | List loaded plugins |
| POST | /benchmark | Run benchmarks |
Interactive docs: http://localhost:8765/docs
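A client call to the /optimize endpoint might look like the sketch below, which uses only the standard library. The request body shape is an assumption: the `prompt` field name is a guess modeled on the SDK's `prompt=` argument, so check the interactive docs for the actual schema. `build_optimize_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_optimize_request(prompt: str, base_url: str = "http://localhost:8765"):
    """Build (but do not send) a POST request for the /optimize endpoint."""
    # Assumption: the endpoint accepts a JSON object with a "prompt" field.
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/optimize",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_optimize_request("debug condition mismatch")
print(request.get_method(), request.full_url)

# To send it against a running server:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(json.load(response))
```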
Docker
# Build and run the API server
docker compose up optimizer-api
# Run tests
docker compose --profile test up optimizer-test
# MCP stdio mode (for agent integration)
docker compose --profile mcp up optimizer-mcp
Configuration
Copy .env.example to .env and configure:
# Policy
LPO_STRICT_INTENT=true
LPO_SEMANTIC_THRESHOLD=0.90
LPO_TOKEN_BUDGET=8000
# MCP Server
LPO_MCP_HOST=127.0.0.1
LPO_MCP_PORT=8765
LPO_MCP_TRANSPORT=stdio
# Logging
LPO_LOG_LEVEL=INFO
# Graph Provider (optional — uses FallbackGraphEngine by default)
LPO_GRAPH_PROVIDER=
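For readers wiring these variables into their own tooling, here is a rough sketch of parsing them into typed values. It is not the package's loader. `EnvSettings` and `load_settings` are hypothetical names, and using the values from the example above as defaults is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvSettings:
    strict_intent: bool
    semantic_threshold: float
    token_budget: int
    log_level: str


def _as_bool(value: str) -> bool:
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None) -> EnvSettings:
    """Parse LPO_* variables; defaults mirror the example values (assumption)."""
    env = os.environ if env is None else env
    threshold = float(env.get("LPO_SEMANTIC_THRESHOLD", "0.90"))
    if not 0.0 <= threshold <= 1.0:
        raise ValueError(f"LPO_SEMANTIC_THRESHOLD must be in [0, 1], got {threshold}")
    return EnvSettings(
        strict_intent=_as_bool(env.get("LPO_STRICT_INTENT", "true")),
        semantic_threshold=threshold,
        token_budget=int(env.get("LPO_TOKEN_BUDGET", "8000")),
        log_level=env.get("LPO_LOG_LEVEL", "INFO").upper(),
    )


settings = load_settings({"LPO_STRICT_INTENT": "false", "LPO_TOKEN_BUDGET": "4000"})
print(settings)
```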
Python Configuration
from llm_prompt_optimizer import Optimizer, OptimizerConfig
from llm_prompt_optimizer.config.settings import PolicyConfig, TokenBudgetConfig
config = OptimizerConfig()
config.policy.strict_intent_mode = True
config.policy.semantic_similarity_threshold = 0.90
config.policy.allow_scope_expansion = "controlled"
config.token_budget.default_budget_tokens = 8000
config.token_budget.adaptive_budgeting = True
optimizer = Optimizer(config=config)
Plugin System
from llm_prompt_optimizer.plugins.system import PolicyPlugin, OptimizerPlugin, PluginSystem
from llm_prompt_optimizer import Optimizer
class MyEnterprisePolicy(PolicyPlugin):
    """Flags prompts whose sealed intent mentions a blocked term."""

    name = "enterprise_policy"
    version = "1.0.0"
    plugin_type = "policy"

    def initialize(self, config):
        # Normalize once so the match below is case-insensitive on both sides.
        self.blocked_terms = [term.lower() for term in config.get("blocked_terms", [])]

    def evaluate(self, context):
        violations = []
        intent = context.get("intent_lock")
        if intent:
            summary = intent.intent_summary.lower()
            for term in self.blocked_terms:
                if term in summary:
                    violations.append(f"Blocked term: {term}")
        return violations
plugins = PluginSystem()
plugins.register(MyEnterprisePolicy(), config={"blocked_terms": ["prod_db"]})
optimizer = Optimizer(plugin_system=plugins)
Plugin types
| Type | Base Class | Purpose |
|---|---|---|
| optimizer | OptimizerPlugin | Pre/post prompt middleware |
| policy | PolicyPlugin | Custom governance rules |
| classifier | ClassifierPlugin | Custom classification signals |
| graph | GraphPlugin | Alternative dependency discovery |
| telemetry | TelemetryPlugin | Custom telemetry backends |
Dependency Resolution
Priority order (graceful degradation):
- External Graph MCP — Graphify, Code Review Graph (when available)
- Local AST Graph — Python AST import chain analysis
- Import Resolver — dotted module → file path resolution
- Symbol Resolver — function/class definition finding
- Execution Path Discovery — call graph tracing
- Git Context — co-changed file analysis
- Folder Heuristic — directory proximity scoring
- Strict User Scope — only explicitly stated files
The system always works standalone — external tools only enhance it.
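The priority order above reads as a fallback chain. A minimal sketch of that idea follows, assuming a first-success strategy where the first provider to return any result wins. Whether the real resolver stops at the first hit or merges results from several providers is not stated here, so treat this as one possible reading. `resolve_with_fallback` and the resolver names are hypothetical.

```python
def resolve_with_fallback(prompt, resolvers):
    """Try each (name, resolver) in priority order; return the first non-empty result."""
    for name, resolver in resolvers:
        try:
            found = resolver(prompt)
        except Exception:
            continue  # provider unavailable or failing: degrade to the next one
        if found:
            return name, found
    # Last resort: only files the user named explicitly (none in this sketch).
    return "strict_user_scope", []


def external_graph(_prompt):
    raise ConnectionError("graph MCP not reachable")


def local_ast(_prompt):
    return []  # available, but found nothing


def import_resolver(_prompt):
    return ["utils/condition.py"]


result = resolve_with_fallback(
    "debug condition mismatch",
    [
        ("external_graph", external_graph),
        ("local_ast", local_ast),
        ("import_resolver", import_resolver),
    ],
)
print(result)
```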
Adaptive Context Expansion
Context expansion uses a value formula — not fixed limits:
context_value_score = (relevance × confidence × execution_proximity) / token_cost
Stop conditions (not arbitrary limits):
- Confidence gain below threshold
- Token cost exceeds marginal value
- Execution relevance weakens
- Semantic confidence decreases
- Budget exhausted
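The value formula and stop conditions can be illustrated with a small greedy selector. This is a sketch under assumptions, not the engine's algorithm: the greedy strategy, the `min_score` cutoff, and the `Candidate` and `expand` names are all hypothetical, and only two of the listed stop conditions (low marginal value and budget exhaustion) are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    path: str
    relevance: float            # 0..1
    confidence: float           # 0..1
    execution_proximity: float  # 0..1
    token_cost: int             # estimated tokens, must be > 0


def value_score(c: Candidate) -> float:
    """(relevance * confidence * execution_proximity) / token_cost"""
    return (c.relevance * c.confidence * c.execution_proximity) / c.token_cost


def expand(candidates, budget_tokens: int, min_score: float):
    """Add candidates by descending value until a stop condition fires."""
    selected, spent = [], 0
    for c in sorted(candidates, key=value_score, reverse=True):
        if value_score(c) < min_score:
            break  # marginal value too low; every later candidate scores lower
        if spent + c.token_cost > budget_tokens:
            continue  # does not fit the budget; a cheaper candidate still might
        selected.append(c)
        spent += c.token_cost
    return selected, spent


candidates = [
    Candidate("docs/notes.md", 0.2, 0.5, 0.1, 500),
    Candidate("utils/condition.py", 0.9, 0.9, 1.0, 300),
    Candidate("api/routes.py", 0.6, 0.8, 0.5, 400),
]
selected, spent = expand(candidates, budget_tokens=800, min_score=0.0005)
print([c.path for c in selected], spent)
```

Dividing by `token_cost` is what lets a small, highly relevant span outrank a large, loosely related file without any fixed file-count limit.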
Precise Context Extraction
Never sends entire files. Returns exact line spans:
{
"file": "utils/condition.py",
"start_line": 61,
"end_line": 95,
"symbol": "calculate_condition",
"confidence": 0.93,
"reason": "Symbol definition: calculate_condition"
}
Principle: Relevant Lines > Relevant Files
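Given a span record like the one above, pulling out just those lines is straightforward. The sketch below assumes `start_line` and `end_line` are 1-indexed and inclusive, which the example suggests but does not state. `extract_span` is a hypothetical helper, not part of the package's API.

```python
import tempfile
from pathlib import Path


def extract_span(repo_root: Path, span: dict) -> str:
    """Return lines start_line..end_line of span['file'] (1-indexed, inclusive)."""
    lines = (repo_root / span["file"]).read_text(encoding="utf-8").splitlines()
    start, end = span["start_line"], span["end_line"]
    if not 1 <= start <= end <= len(lines):
        raise ValueError(f"span {start}-{end} is outside a file of {len(lines)} lines")
    return "\n".join(lines[start - 1:end])


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "demo.py").write_text("a = 1\nb = 2\nc = 3\nd = 4\n", encoding="utf-8")
    snippet = extract_span(root, {"file": "demo.py", "start_line": 2, "end_line": 3})
print(snippet)
```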
Benchmarks
python benchmarks/run_benchmarks.py
python benchmarks/run_benchmarks.py --json
python benchmarks/run_benchmarks.py --case debug_condition_mismatch
Metrics measured:
- Semantic similarity preservation
- Token reduction %
- Classification accuracy
- Context usefulness
- Dependency precision
- Adaptive expansion efficiency
- Constraint preservation
Development
# Clone
git clone https://github.com/your-org/llm-prompt-optimizer
cd llm-prompt-optimizer
# Install in dev mode
pip install -e ".[dev]"
# Run tests
pytest tests/ -v
# Run tests with coverage
pytest tests/ --cov=llm_prompt_optimizer --cov-report=html
# Lint
ruff check llm_prompt_optimizer/
# Type check
mypy llm_prompt_optimizer/
# Run benchmarks
python benchmarks/run_benchmarks.py
Project Structure
llm-prompt-optimizer/
├── llm_prompt_optimizer/
│ ├── core/
│ │ ├── intent_guard/ # IntentGuard — locks user intent
│ │ ├── classifier/ # Multi-layer prompt classification
│ │ ├── dependency_resolution/ # DependencyResolver orchestrator
│ │ ├── adaptive_context_expansion/ # Value-based expansion
│ │ ├── precise_context/ # Line-level extraction
│ │ ├── optimizer/ # PromptOptimizer + PromptCompiler
│ │ ├── semantic_validator/ # Similarity validation
│ │ ├── drift_detection/ # Drift detector
│ │ ├── fallback_graph/ # Standalone AST graph engine
│ │ ├── token_budget/ # Token budget management
│ │ ├── policy/ # PolicyEngine
│ │ ├── telemetry/ # TelemetryEngine + AuditLogger
│ │ └── benchmarking/ # BenchmarkEngine
│ ├── adapters/ # LLM adapters (Anthropic, OpenAI, Ollama…)
│ ├── mcp_server/ # MCP server (stdio, HTTP, WebSocket, IPC)
│ ├── installer/ # Host-rule install/uninstall (CLAUDE.md, .cursorrules, …)
│ ├── api/ # FastAPI REST endpoints
│ ├── sdk/ # Optimizer SDK (main entry point)
│ ├── plugins/ # Plugin system
│ ├── models/ # Typed data models
│ ├── config/ # Configuration
│ └── utils/ # Helpers, token estimation, AST tools
├── examples/
│ ├── claude_mcp/ # Claude Desktop MCP integration
│ ├── langchain/ # LangChain integration
│ └── local_agents/ # Custom plugin examples
├── tests/
│ ├── core/ # Unit tests per component
│ ├── integration/ # Full pipeline + MCP tests
│ └── api/ # API endpoint tests
├── benchmarks/ # Benchmark runner + cases
├── docs/ # Extended documentation
├── Dockerfile
├── docker-compose.yml
├── pyproject.toml
└── .env.example
Supported Agent Ecosystems
- Claude Desktop (MCP stdio)
- Cursor (MCP stdio)
- Continue.dev (MCP stdio)
- LangChain (SDK middleware)
- AutoGen (SDK middleware)
- CrewAI (SDK middleware)
- Any HTTP client (REST API)
- Any stdio MCP client (MCP server)
License
MIT — see LICENSE