local_lense
A production-ready RAG (Retrieval-Augmented Generation) system that enables semantic search across local documentation using vector embeddings and similarity search. Built with TypeScript, this tool demonstrates modern AI integration patterns including vector databases, embedding generation, and MCP (Model Context Protocol) tooling.
Perfect for: Engineering teams needing intelligent documentation search, knowledge bases, or RAG system implementations.
What is local_lense?
local_lense is a RAG (Retrieval-Augmented Generation) powered documentation search tool that:
- Indexes your local documentation - Processes markdown, HTML, JSON, YAML, and text files to create a searchable vector index
- Semantic search - Uses vector embeddings to find relevant content based on meaning, not just keywords
- Cursor integration - Exposes search capabilities via MCP so Cursor AI can search your docs
- Fast and local - Everything runs locally with Qdrant vector database
- Extensible - Supports custom source processors for indexing content from web, databases, or other sources
How it works
local_lense uses a RAG architecture:
1. Indexing Phase:
   - Scans your configured documentation directory
   - Splits documents into chunks
   - Generates vector embeddings using transformer models
   - Stores the embeddings in the Qdrant vector database
2. Search Phase:
   - Takes a natural language query
   - Generates an embedding for the query
   - Searches Qdrant for similar document chunks
   - Returns relevant sections with relevance scores
3. Refresh Mechanism:
   - Uses a single "docs" collection that is dropped and re-indexed on initialization
   - A simple, reliable approach to keeping the index current
4. MCP Integration (Future):
   - Exposes search as MCP tools
   - Cursor AI can query your docs directly
   - Seamless integration with your workflow
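The indexing phase above can be sketched as code. This is a hypothetical illustration of the chunking step, not local_lense's actual implementation; the chunk size and overlap values are assumptions chosen for the example.

```typescript
// Hypothetical sketch of the indexing phase's chunking step: split a
// document into fixed-size, overlapping chunks before embedding.
// chunkSize and overlap are illustrative defaults, not real settings.
export function chunkText(
  text: string,
  chunkSize = 512,
  overlap = 64
): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    // Take the next window of text
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
    // Step forward, keeping `overlap` characters of context
    start += chunkSize - overlap;
  }
  return chunks;
}
```

Overlapping chunks help preserve context that would otherwise be cut at chunk boundaries, at the cost of some duplicated content in the index.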
Prerequisites
- Node.js (v18 or higher)
- Docker and Docker Compose (for Qdrant vector database)
- TypeScript (installed as dev dependency)
Quick Start
1. Clone the repository
git clone <repository-url>
cd local_lense
2. Install dependencies
npm install
3. Start Qdrant vector database
docker-compose up -d
This starts a Qdrant container on localhost:6333. The data persists in a Docker volume.
4. Configure your documentation path
Edit configs.json:
{
  "sourcePath": "/Users/username/Documents/my-docs",
  "searchResultLimit": 3
}
- sourcePath: Full absolute path to your documentation directory (avoid ~; see the Configuration section below)
- searchResultLimit: Maximum number of search results to return
5. Build the project
npm run build
6. Run indexing and search
Currently, the tool runs as a test script. Edit src/main.ts to configure your search query, then:
npm run dev
Configuration
configs.json
The main user configuration file located in the project root:
- sourcePath (string, required): Path to your documentation directory
  - Important: Use full absolute paths; avoid ~ (tilde) for home directory expansion
  - Example: use "/Users/username/Documents/my-docs" instead of "~/Documents/my-docs"
  - Full paths ensure reliable operation across different contexts and environments
- searchResultLimit (number, optional): Maximum number of results per search
  - Default: 3
- keywordBoost (boolean, optional): Enable keyword-based score boosting to improve relevance with local embedding models
  - Boosts scores when query keywords appear in document content or file paths
  - Default: true
- keywordBoostWeight (number, optional): Strength of keyword boosting (0.0 to 1.0)
  - Higher values increase the boost effect
  - Default: 0.2 (20% boost weight)
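The keywordBoost behavior described above can be sketched as follows. This is a hypothetical illustration of the idea (raise a chunk's similarity score when query keywords appear in its content or path); the function and field names are assumptions, not local_lense's real API.

```typescript
// Hypothetical sketch of keyword-based score boosting. A chunk's raw
// vector-similarity score is raised by up to boostWeight, scaled by the
// fraction of query keywords found in its content or source path.
interface ScoredChunk {
  sourceLocation: string;
  content: string;
  score: number; // raw vector-similarity score in [0, 1]
}

export function applyKeywordBoost(
  result: ScoredChunk,
  query: string,
  boostWeight = 0.2
): number {
  const keywords = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (keywords.length === 0) return result.score;
  const haystack =
    (result.content + " " + result.sourceLocation).toLowerCase();
  const hits = keywords.filter((k) => haystack.includes(k)).length;
  // Scale the boost by the fraction of matched keywords, cap at 1.0
  const boost = boostWeight * (hits / keywords.length);
  return Math.min(1, result.score * (1 + boost));
}
```

With the default weight of 0.2, a chunk matching every query keyword gets at most a 20% score increase, which nudges ranking without letting keyword matches overwhelm semantic similarity.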
Note: Collection management is handled automatically by the system. The system uses a single "docs" collection that is always dropped and re-indexed on initialization.
Docker Compose
The docker-compose.yaml file configures Qdrant:
- Port: 6333 (Qdrant HTTP API)
- Storage: persistent volume qdrant_storage
- Health checks: automatic container health monitoring
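The setup described above corresponds to a compose file along these lines. This is an illustrative sketch, not the repository's actual docker-compose.yaml; consult the real file for the authoritative configuration, including its health-check definition.

```yaml
# Illustrative docker-compose.yaml matching the description above.
services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"   # Qdrant HTTP API
    volumes:
      - qdrant_storage:/qdrant/storage   # persistent index data

volumes:
  qdrant_storage:
```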
Supported File Types
The default FileSourceProcessor (see src/ragIndexer/implementations/fileSourceProcessor.ts) supports the following file types:
Fully Supported
- Markdown: .md, .markdown
- HTML: .html, .htm
- JSON: .json
- YAML: .yaml, .yml
- Text: .txt, .text
Other Files
Files with unsupported extensions are processed as ContentType.OTHER. While they will be indexed, the content may not be optimally formatted for search.
The processor recursively scans directories and automatically detects file types based on their extensions. All supported files are read as UTF-8 text.
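Extension-based detection like the above can be sketched as a simple lookup table. This is a hypothetical illustration mirroring the supported-type list; the enum members and mapping are assumed names, so see src/ragIndexer/types.ts for the real definitions.

```typescript
// Hypothetical sketch of extension-based content-type detection,
// mirroring the supported file types listed above.
export enum ContentType {
  MARKDOWN = "markdown",
  HTML = "html",
  JSON = "json",
  YAML = "yaml",
  TEXT = "text",
  OTHER = "other",
}

const EXTENSION_MAP: Record<string, ContentType> = {
  ".md": ContentType.MARKDOWN,
  ".markdown": ContentType.MARKDOWN,
  ".html": ContentType.HTML,
  ".htm": ContentType.HTML,
  ".json": ContentType.JSON,
  ".yaml": ContentType.YAML,
  ".yml": ContentType.YAML,
  ".txt": ContentType.TEXT,
  ".text": ContentType.TEXT,
};

export function detectContentType(filePath: string): ContentType {
  const dot = filePath.lastIndexOf(".");
  const ext = dot === -1 ? "" : filePath.slice(dot).toLowerCase();
  // Unknown or missing extensions fall through to OTHER
  return EXTENSION_MAP[ext] ?? ContentType.OTHER;
}
```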
Custom Source Processors
local_lense uses a pluggable source processor architecture. While the default implementation processes local files, you can implement custom source processors to index content from other sources.
Implementing a Custom Processor
To create a custom source processor, implement the ISourceProcessor interface (see src/ragIndexer/types.ts):
import { ISourceProcessor, SourceItem } from './types';

export class MyCustomProcessor implements ISourceProcessor {
  private items: SourceItem[] = [];

  public get sourceItems(): ReadonlyArray<SourceItem> {
    // Expose the items produced by the last process() call
    return this.items;
  }

  public process(): ReadonlyArray<SourceItem> {
    // Fetch and process content from your source, then return an array
    // of SourceItem objects with:
    // - sourceLocation: identifier (file path, URL, etc.)
    // - contentType: ContentType enum value
    // - content: the actual content string
    this.items = [];
    return this.items;
  }
}
Example Use Cases for Custom Processors
- Web Scraping: Index content from websites or web APIs
- Database Sources: Query and index content from databases
- Cloud Storage: Index documents from Google Drive, Dropbox, etc.
- RSS Feeds: Index blog posts or news articles
- Git Repositories: Index code documentation from git repos
See src/ragIndexer/types.ts for the complete interface definition and type definitions.
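To make the interface concrete, here is a hypothetical, self-contained processor that serves a fixed list of notes from memory. The interface and enum are re-declared locally so the example runs on its own; in a real processor you would import them from src/ragIndexer/types.ts, and the note content here is purely illustrative.

```typescript
// Minimal, self-contained example of a custom source processor.
// Types are re-declared locally for the sketch; use the real ones
// from src/ragIndexer/types.ts in an actual implementation.
enum ContentType {
  MARKDOWN = "markdown",
  OTHER = "other",
}

interface SourceItem {
  sourceLocation: string;
  contentType: ContentType;
  content: string;
}

interface ISourceProcessor {
  readonly sourceItems: ReadonlyArray<SourceItem>;
  process(): ReadonlyArray<SourceItem>;
}

class InMemoryNotesProcessor implements ISourceProcessor {
  private items: SourceItem[] = [];

  public get sourceItems(): ReadonlyArray<SourceItem> {
    return this.items;
  }

  public process(): ReadonlyArray<SourceItem> {
    // A real processor would fetch from the web, a database, etc.
    this.items = [
      {
        sourceLocation: "memory://notes/welcome",
        contentType: ContentType.MARKDOWN,
        content: "# Welcome\nThese notes are indexed from memory.",
      },
    ];
    return this.items;
  }
}
```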
Using in Cursor
Prerequisites: Before using local_lense in Cursor, ensure Docker is running and Qdrant is started:
# In your local_lense directory
docker-compose up -d
This starts the Qdrant vector database on localhost:6333, which the MCP tool requires.
Configuration: Add local_lense to your Cursor MCP server settings:
{
"mcpServers": {
"local_lense": {
"command": "node",
"args": ["/full/path/to/your/local_lense/build/index.js"],
"env": {}
}
}
}
Important Notes:
- Use the full absolute path to build/index.js in the args field
- Avoid hyphens in repository/directory names (use underscores instead) due to Cursor MCP configuration parsing issues
- See "When local_lense Works Best" below for important usage limitations and recommendations
When local_lense Works Best
Reliable Usage
When documentation is indexed from paths OUTSIDE Cursor's working directory:
- Examples: /Users/name/Documents/my-docs, /Users/name/notes, or any directory separate from your codebase
- Cursor will use the MCP tool because its built-in tools can't access those paths
Limited Usage
When documentation is indexed from paths WITHIN Cursor's working directory:
- Cursor may use built-in grep instead of the MCP tool
- This is an acceptable limitation: built-in tools handle searches within the working directory
- Work around it with more explicit queries, e.g. "use 'search' from your registered MCP tools with query: {query}"
Best Practices
- Directory Placement: index documentation from directories outside your project workspace for the most reliable MCP tool usage
- Path Configuration: always use full absolute paths in configs.json (see the Configuration section above)
- Repository Naming: use underscores instead of hyphens in directory names to avoid MCP path parsing issues
Example Use Cases
- Engineering Documentation: Search team wikis, architecture docs, API documentation
- Personal Knowledge Base: Index your notes, research, and personal documentation
- Project Documentation: Quick access to project-specific docs and guides
- Research Notes: Semantic search across research papers and notes
Architecture
┌─────────────────────────────────────────────────────────┐
│ RAG Pipeline │
└─────────────────────────────────────────────────────────┘
Indexing Flow:
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Documents │ --> │ Chunking │ --> │ Embeddings │
│ (MD/HTML) │ │ Strategy │ │ Generation │
└──────────────┘ └──────────────┘ └──────┬───────┘
│
▼
┌──────────────┐
│ Qdrant │
│ Vector Store │
└──────┬───────┘
│
Search Flow: │
┌──────────────┐ ┌──────────────┐ │
│ User Query │ --> │ Embed │ ----------┘
│ (Natural Lang)│ │ Query │
└──────────────┘ └──────────────┘
│
▼
┌──────────────┐
│ Similarity │
│ Search │
└──────┬───────┘
│
▼
┌──────────────┐
│ Ranked │
│ Results │
└──────────────┘
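The search flow in the diagram can be sketched in miniature. In local_lense the similarity search is delegated to Qdrant; this hypothetical in-memory version only illustrates the ranking step (cosine similarity over stored vectors, top-k results), and all names are assumptions for the example.

```typescript
// Hypothetical in-memory sketch of the search flow: compare a query
// embedding to stored chunk vectors and return the top-k ranked results.
// Qdrant performs this step in the real system.
interface IndexedChunk {
  sourceLocation: string;
  vector: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Guard against zero vectors to avoid division by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB) || 1);
}

export function rankChunks(
  queryVector: number[],
  chunks: IndexedChunk[],
  limit = 3
): Array<{ sourceLocation: string; score: number }> {
  return chunks
    .map((c) => ({
      sourceLocation: c.sourceLocation,
      score: cosineSimilarity(queryVector, c.vector),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

The `limit` parameter plays the role of searchResultLimit from configs.json: it caps how many ranked chunks are returned.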
Troubleshooting
Qdrant connection errors
- Ensure Docker is running: docker ps
- Check the Qdrant container: docker-compose ps
- Verify Qdrant is responding on port 6333: curl http://localhost:6333/health
Path not found errors
- Verify the sourcePath in configs.json exists (see the Configuration section for path requirements)
- Check file permissions
- Ensure the path is accessible from the local_lense working directory
Empty search results
- Run indexing first: await ragIndexer.init() in main.ts
- Verify documents were processed (check the Qdrant dashboard at http://localhost:6333/dashboard)
- Ensure the "docs" collection exists and contains indexed documents
Build errors
- Ensure TypeScript is installed: npm install
- Check your Node.js version: node --version (should be v18+)
- Clear the build cache: rm -rf build && npm run build
MCP tool not being used
- Symptom: Cursor uses grep instead of local_lense search tool
- Cause: Documentation path is within Cursor's working directory
- Solution: Move documentation to a separate directory or accept the limitation
Repository name with hyphens causes path truncation
- Symptom: the MCP server path gets truncated (e.g., local-lense becomes local)
- Cause: Cursor's MCP server configuration has issues parsing paths containing hyphens
- Solution: use underscores instead of hyphens in repository/directory names (e.g., local_lense instead of local-lense)
- Note: this is a Cursor MCP configuration limitation, not a local_lense issue
Development
Project Structure
local_lense/
├── src/
│ ├── main.ts # Entry point (test script)
│ ├── services/ # Core services
│ │ ├── configService.ts # Configuration management
│ │ └── embedService.ts # Embedding generation
│ ├── ragIndexer/ # Indexing logic
│ │ ├── ragIndexer.ts
│ │ └── implementations/
│ │ └── fileSourceProcessor.ts
│ └── ragSearch/ # Search logic
│ ├── ragSearch.ts
│ └── implementations/
│ ├── qdrantVectorSearchService.ts
│ ├── qdrantVectorCollectionService.ts
│ └── qdrantVectorStorageService.ts
├── configs.json # Configuration file
├── docker-compose.yaml # Qdrant setup
└── package.json
Building
npm run build
Output goes to build/ directory.
Running Development Mode
npm run dev
Uses tsx to run TypeScript directly without building.
Roadmap
- [ ] MCP server implementation for Cursor integration
- [ ] Relevance score tuning and filtering
License
ISC