Notes MCP Server
A local note-taking Model Context Protocol (MCP) server that provides intelligent note management with markdown storage and powerful search: hybrid keyword and AI-powered semantic search, vector embeddings, and a built-in web viewer for creating, finding, and retrieving notes.
Features
- Markdown Storage: Notes stored as markdown files with frontmatter metadata
- Intelligent Indexing: CSV-based index for fast retrieval and search
- Powerful Search: Full-text search with relevance scoring and fuzzy matching
- Semantic Search: AI-powered meaning-based search using embeddings (finds conceptually similar notes)
- Hybrid Search: Combines keyword and semantic search for best results
- ULID-based Naming: Chronologically sortable unique identifiers
- Database-Ready: Structure designed for easy migration to relational databases
- MCP Tools: Four core tools for note management
Quick Start
Installation
```bash
pip install -e .
```
Configuration
Create a configuration file `.notes_config.yaml`:

```yaml
notes:
  directory: "~/notes"
  max_file_size: 10485760        # 10 MB
  auto_rebuild_index: true
  index_rebuild_interval: 3600   # 1 hour

search:
  max_results: 50
  fuzzy_threshold: 0.8
  stop_words_file: "stopwords.txt"

security:
  validate_paths: true
  sanitize_input: true
```
Running the Server
Local Development
```bash
# Activate virtual environment
source venv/bin/activate

# Run the MCP server
notes-mcp
```
MCP Server Configuration
The server can be configured through:
- Configuration File: `.notes_config.yaml` (recommended)
- Environment Variables: Override config file settings
- Command Line Arguments: Runtime overrides
Example configuration file:

```yaml
notes:
  directory: "~/.notes"
  max_file_size: 10485760        # 10 MB
  auto_rebuild_index: true
  index_rebuild_interval: 3600   # 1 hour

search:
  max_results: 50
  fuzzy_threshold: 0.8
  stop_words_file: "stopwords.txt"

security:
  validate_paths: true
  sanitize_input: true
```
Environment variables:
```bash
export NOTES_DIR="~/my-notes"
export SEARCH_MAX_RESULTS=100
export SEARCH_FUZZY_THRESHOLD=0.7
```
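Taken together, the three configuration sources form a simple precedence chain: command-line arguments override environment variables, which override the config file. A minimal sketch of that lookup, with an illustrative helper name:

```python
import os

# Sketch of layered configuration resolution: CLI args win over
# environment variables, which win over the config file.
# resolve_setting is illustrative, not the server's actual API.
def resolve_setting(key, env_var, file_config, cli_args, default=None):
    if key in cli_args:                   # highest precedence: command line
        return cli_args[key]
    if env_var in os.environ:             # then environment variables
        return os.environ[env_var]
    return file_config.get(key, default)  # finally the config file

file_config = {"max_results": 50}
os.environ["SEARCH_MAX_RESULTS"] = "100"

# Environment variable overrides the file value...
print(resolve_setting("max_results", "SEARCH_MAX_RESULTS", file_config, {}))
# ...but a CLI argument overrides both.
print(resolve_setting("max_results", "SEARCH_MAX_RESULTS", file_config,
                      {"max_results": 25}))
```

Note that environment variables arrive as strings, so numeric settings still need a cast after resolution.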
MCP Server Integration
Claude Desktop
1. Install the package:

   ```bash
   cd /path/to/notes-mcp
   pip install -e .
   ```

2. Configure Claude Desktop:
   - Open Claude Desktop settings
   - Add the MCP server configuration:

   ```json
   {
     "mcpServers": {
       "notes-mcp": {
         "command": "notes-mcp",
         "args": [],
         "env": {
           "NOTES_DIR": "~/.notes"
         }
       }
     }
   }
   ```

3. Restart Claude Desktop to load the MCP server.
Windsurf
1. Install the package:

   ```bash
   cd /path/to/notes-mcp
   pip install -e .
   ```

2. Configure Windsurf:
   - Open Windsurf settings
   - Add the MCP server:

   ```json
   {
     "name": "notes-mcp",
     "command": "notes-mcp",
     "args": [],
     "env": {
       "NOTES_DIR": "~/notes"
     }
   }
   ```

3. Restart Windsurf to enable the MCP tools.
Other MCP Clients
The server follows the MCP specification and can be integrated with any MCP-compatible client. Use the following server configuration:
```json
{
  "name": "notes-mcp",
  "command": "notes-mcp",
  "args": [],
  "env": {
    "NOTES_DIR": "~/notes"
  }
}
```
Usage Examples
Creating Notes
```python
# Using the MCP tools in Claude
create_note(
    title="Project Planning",
    content="# Project Planning\n\n## Goals\n- Define objectives\n- Set timeline",
    tags=["planning", "project"]
)
```
Searching Notes
```python
# Find notes by content
search_note(query="planning")

# Find notes by tags
find_note(tags=["project", "urgent"])

# Get recent notes
find_note(limit=10)
```
Managing Notes
```python
# Get a specific note
get_note(identifier="01H8X9V2P3R5Y7T8Q0W2E4R6T8Y0U2I3")

# Update a note
# (Note: this would be implemented through the MCP tools)
```
Server Commands
The `notes-mcp` command supports several options:

```bash
# Run with a custom config file
notes-mcp --config /path/to/config.yaml

# Run with a custom notes directory
notes-mcp --notes-dir /path/to/notes

# Show server status
notes-mcp --status

# Rebuild the index
notes-mcp --rebuild-index

# Validate index integrity
notes-mcp --validate-index
```
MCP Tools
create_note
Create a new note with automatic indexing.
Parameters:
- `title` (required): Note title
- `content` (required): Markdown content
- `tags` (optional): User-defined tags
- `friendly_name` (optional): Custom friendly name
- `date` (optional): Custom date
get_note
Retrieve a specific note by ULID or filename.
Parameters:
- `identifier` (required): ULID or filename
- `include_metadata` (optional): Include frontmatter metadata
- `include_content` (optional): Include note body
find_note
Find notes by date range, tags, or friendly name pattern.
Parameters:
- `date_from` (optional): Start date (YYYYMMDDHHmmss)
- `date_to` (optional): End date (YYYYMMDDHHmmss)
- `tags` (optional): Filter by tags
- `friendly_name_pattern` (optional): Regex pattern for friendly name
- `limit` (optional): Maximum results (default: 50)
search_note
Full-text search across note content and metadata.
Parameters:
- `query` (required): Search query
- `search_in` (optional): Search targets `["content", "title", "tags", "summary"]`
- `fuzzy` (optional): Enable fuzzy matching (default: true)
- `limit` (optional): Maximum results (default: 50)
Web Viewer
A lightweight web interface for browsing notes.
Quick Start
```bash
# Install frontend dependencies
just web-install

# Option 1: Development mode (with hot reload)
# Terminal 1: start the backend
just web-server
# Terminal 2: start the frontend dev server
just web-dev
# Open http://localhost:5173

# Option 2: Production mode
just web
# Open http://localhost:8000
```
Features
- Card-based layout showing last 12 notes
- Full-text search with fuzzy matching
- Filter by tags
- Sort by date or title
- Markdown rendering with syntax highlighting
- Stats panel with word frequency
Tech Stack
- Frontend: Svelte + Vite (portable to mobile/desktop via Capacitor/Tauri)
- Backend: FastAPI
- Markdown: marked.js
See WEB.md for detailed documentation.
File Structure
Notes are stored with the filename format: `<ULID>_<YYYYMMDDHHmmss>_<friendly_name>.md`

Example: `01H8X9V2P3R5Y7T8Q0W2E4R6T8Y0U2I3_20260120143000_meeting_notes.md`
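Because the filename joins its three components with underscores and the friendly name may itself contain underscores, parsing is a matter of splitting at most twice from the left. A small sketch (the function name is illustrative):

```python
# Sketch: split a note filename into its components.
# Format: <ULID>_<YYYYMMDDHHmmss>_<friendly_name>.md
# The friendly name may contain underscores, so split at most twice.
def parse_note_filename(filename):
    stem = filename.removesuffix(".md")
    ulid, date_key, friendly_name = stem.split("_", 2)
    return {"ulid": ulid, "date_key": date_key, "friendly_name": friendly_name}

parts = parse_note_filename(
    "01H8X9V2P3R5Y7T8Q0W2E4R6T8Y0U2I3_20260120143000_meeting_notes.md"
)
print(parts["friendly_name"])  # meeting_notes
```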
Each note contains frontmatter:
```markdown
---
title: Project Meeting Notes
date: 2026-01-20
tags: project;meeting;urgent
ulid: 01H8X9V2P3R5Y7T8Q0W2E4R6T8Y0U2I3
---

# Project Meeting Notes

Meeting content in markdown format...
```
Index Format
The `.notes_index.csv` file contains metadata for fast searching, one pipe-delimited row per note:

```text
ulid|date_key|friendly_name|title|tags|frequent_words|summary|file_path|created_at|updated_at|last_accessed|file_size|checksum
```
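Because the index is plain pipe-delimited text, the standard library's `csv` module can read it directly. A sketch, assuming the header shown above and an illustrative sample row:

```python
import csv
import io

# Sketch: read the pipe-delimited index with the stdlib csv module.
# FIELDS mirrors the column list shown above; the sample row is illustrative.
FIELDS = ("ulid|date_key|friendly_name|title|tags|frequent_words|summary|"
          "file_path|created_at|updated_at|last_accessed|file_size|checksum").split("|")

sample = (
    "01H8X9V2P3R5Y7T8Q0W2E4R6T8Y0U2I3|20260120143000|meeting_notes|"
    "Project Meeting Notes|project;meeting|meeting;agenda|Weekly sync|"
    "notes/meeting_notes.md|2026-01-20|2026-01-20|2026-01-20|1024|abc123\n"
)

rows = list(csv.DictReader(io.StringIO(sample), fieldnames=FIELDS, delimiter="|"))
print(rows[0]["title"])            # Project Meeting Notes
print(rows[0]["tags"].split(";"))  # ['project', 'meeting']
```

The same `delimiter="|"` argument works with `csv.DictWriter` for appends, which is what makes the format easy to migrate to a relational database later.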
Architecture
The server is entirely file-based — there is no database. Storage is split across three layers:
Storage layers
- Markdown files — each note is a `.md` file with YAML frontmatter (title, tags, timestamps) and a filename encoding the ULID, date key, and friendly name.
- CSV text index (`.notes_index.csv`) — a flat CSV file that mirrors note metadata for fast lookups without reading every markdown file. Protected by a file lock for concurrent writes.
- FAISS semantic index (`.semantic_index.faiss` + `.semantic_metadata.json`) — vector embeddings generated by `bge-small-en-v1.5` (384-dim) stored in a FAISS `IndexFlatL2` index. Optional; gracefully disabled when `faiss` or `sentence-transformers` are not installed.
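The "gracefully disabled" behaviour for the optional FAISS layer is typically a guarded import plus a capability flag. A minimal sketch (the flag and function names are illustrative, not the server's actual API):

```python
# Sketch: degrade gracefully when the optional semantic-search
# dependencies are not installed. Names here are illustrative.
try:
    import faiss                                      # optional dependency
    from sentence_transformers import SentenceTransformer
    SEMANTIC_AVAILABLE = True
except ImportError:
    SEMANTIC_AVAILABLE = False

def search(query, keyword_search, semantic_search=None):
    # Keyword search always runs; the semantic path is skipped
    # whenever faiss/sentence-transformers are missing.
    results = keyword_search(query)
    if SEMANTIC_AVAILABLE and semantic_search is not None:
        results += semantic_search(query)
    return results

print(search("planning", keyword_search=lambda q: ["keyword-hit"]))
```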
Search pipeline
```text
Query
 │
 ├─► Keyword search
 │     1. Load all notes from the CSV index into memory
 │     2. Score each note, per field:
 │          fuzzy=false: substring match (+0.5 for word-boundary match)
 │          fuzzy=true:  best of partial_ratio / token_set_ratio, / 100
 │     3. Apply field weights:
 │          title ×3.0, source_dir ×3.0, source_repo ×3.0,
 │          tags ×2.0, summary ×1.5, content ×1.0
 │     4. Recency boost (max +20%):
 │          factor = 1 - (days since update / 365)
 │          score += score × factor × 0.2
 │
 └─► Semantic search
       1. Encode query with bge-small-en-v1.5 (384-dim embedding)
       2. FAISS IndexFlatL2 brute-force L2 nearest-neighbour search
       3. Convert L2 distance to similarity (0-1):
            sim = 1 - (d / 10.0)
            score = sim × 2.0

Hybrid merge (weights are configurable):
    combined = keyword_score × 0.5 + semantic_score × 0.5

Sort by combined score (descending), return top N
```
The two paths run independently and are merged at the end:
- Keyword path: loads all notes from the CSV index, scores each field using either exact substring matching or fuzzy matching (the `thefuzz` library with `partial_ratio` + `token_set_ratio`), applies field weights (title/source_dir/source_repo ×3, tags ×2, summary ×1.5, content ×1) and a recency boost (max +20%).
- Semantic path: encodes the query with the same sentence-transformer model used at indexing time, performs brute-force L2 nearest-neighbour search against the FAISS index, and converts distances to normalised similarity scores.
- Merge: per-note scores from both paths are combined using weighted addition, then sorted by descending score. Notes found by only one path still appear (the missing path contributes 0).
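The scoring arithmetic above can be sketched end-to-end in a few lines. The weights and constants come from the pipeline description; the function names are illustrative:

```python
# Sketch of the scoring arithmetic described above.
# Field weights and constants come from the pipeline; names are illustrative.
FIELD_WEIGHTS = {"title": 3.0, "source_dir": 3.0, "source_repo": 3.0,
                 "tags": 2.0, "summary": 1.5, "content": 1.0}

def keyword_score(field_scores, days_since_update):
    # field_scores: raw 0-1 match score per field
    score = sum(FIELD_WEIGHTS[f] * s for f, s in field_scores.items())
    factor = max(0.0, 1 - days_since_update / 365)  # recency, max +20% boost
    return score + score * factor * 0.2

def semantic_score(l2_distance):
    sim = 1 - l2_distance / 10.0   # map L2 distance to a 0-1 similarity
    return sim * 2.0

def hybrid(kw, sem, kw_weight=0.5, sem_weight=0.5):
    # A note found by only one path contributes 0 on the other path.
    return kw * kw_weight + sem * sem_weight

# Title match 1.0 (×3.0) + content match 0.4 (×1.0) = 3.4;
# a freshly updated note gets the full +20% boost: 3.4 × 1.2 = 4.08.
kw = keyword_score({"title": 1.0, "content": 0.4}, days_since_update=0)
print(round(kw, 2))                     # 4.08
print(hybrid(kw, semantic_score(2.0)))  # combined hybrid score
```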
Reindexing
`NoteManager.rebuild_index()` orchestrates a full rebuild:

1. Truncates the CSV and re-scans all `*.md` files in the notes directory, parsing frontmatter and computing checksums.
2. Fetches the freshly built note list and passes it to `SemanticEngine.rebuild_index()`, which creates a new FAISS index from scratch by re-encoding every note.
A `validate_index` command detects orphaned files (on disk but not in the CSV) and missing files (in the CSV but deleted from disk).
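The orphan/missing check reduces to two set differences between the filenames on disk and the paths recorded in the CSV. A sketch with illustrative inputs:

```python
# Sketch: validate_index as two set differences.
# The real command scans the notes directory and reads .notes_index.csv;
# the inputs here are illustrative.
def validate_index(files_on_disk, files_in_index):
    on_disk, in_index = set(files_on_disk), set(files_in_index)
    return {
        "orphaned": sorted(on_disk - in_index),  # on disk, not indexed
        "missing": sorted(in_index - on_disk),   # indexed, deleted from disk
    }

report = validate_index(
    files_on_disk=["a.md", "b.md", "c.md"],
    files_in_index=["a.md", "b.md", "d.md"],
)
print(report)  # {'orphaned': ['c.md'], 'missing': ['d.md']}
```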
Known Limitations
This implementation is designed for personal use with up to a few hundred notes. The following limitations apply:
Scaling
- Every search loads the entire CSV into memory and scans linearly. Performance degrades noticeably beyond ~1,000 notes.
- Updates and deletes rewrite the entire CSV file. There is no in-place row editing.
- `add_note` scans the full CSV for duplicate checking before appending — O(n) per insert.
- No pagination on `get_all_notes` — it always loads everything into memory.
- No caching — repeated searches re-read and re-parse the CSV each time.
Index consistency
- The CSV index and FAISS index are updated independently with no transactional guarantee. A crash between the two writes can leave them out of sync.
- `get_all_notes()` does not acquire the file lock, so reads can race with in-progress writes.
- Reindexing is all-or-nothing; there is no incremental sync that detects only changed files.
Semantic search
- `IndexFlatL2` is brute-force O(n) per query. FAISS offers approximate indices (IVF, HNSW) that would improve performance for larger collections.
- Uses L2 distance, but sentence-transformer models are typically optimised for cosine similarity. Normalising embeddings or switching to `IndexFlatIP` would be more accurate.
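The normalisation suggestion works because, for unit vectors, squared L2 distance is a monotone function of cosine similarity: ||a − b||² = 2 − 2·cos(a, b), so ranking by L2 over normalised embeddings gives the same order as ranking by cosine. A quick numeric check:

```python
import math

# For unit vectors a, b:  ||a - b||^2 = 2 - 2*cos(a, b),
# so L2 ranking over normalised embeddings equals cosine ranking.
def normalise(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cos(a, b):
    return sum(x * y for x, y in zip(a, b))  # unit vectors: dot = cosine

def l2_sq(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

a = normalise([1.0, 2.0, 3.0])
b = normalise([2.0, 1.0, 0.5])
print(abs(l2_sq(a, b) - (2 - 2 * cos(a, b))) < 1e-12)  # True
```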
Fuzzy search
- Uses `thefuzz` (pure-Python Levenshtein), which is slow on long text fields, especially `content`. There is no short-circuiting for large documents.
- The fuzzy threshold is a single global value with no per-field tuning.
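The "best of `partial_ratio` and `token_set_ratio`" shape can be approximated with the stdlib's `difflib` to see why scoring long fields is expensive — each query is matched against the full field text. This sketch is not `thefuzz`'s exact algorithm, just a stand-in with the same structure:

```python
from difflib import SequenceMatcher

# Rough stdlib stand-in for thefuzz-style scoring: take the best of a
# plain ratio and an order-insensitive token-set ratio, in 0-1.
# This is NOT thefuzz's exact algorithm.
def ratio(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def token_set_ratio(a, b):
    # Compare sorted unique tokens so word order does not matter.
    norm = lambda s: " ".join(sorted(set(s.lower().split())))
    return ratio(norm(a), norm(b))

def fuzzy_score(query, field_text):
    return max(ratio(query, field_text), token_set_ratio(query, field_text))

# Word order is ignored by the token-set path:
print(fuzzy_score("planning project", "project planning"))  # 1.0
```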
Troubleshooting
Common Issues
Server fails to start
```bash
# Check that the MCP library is installed (quote the version specifier
# so the shell does not treat >= as a redirection)
pip install "mcp>=1.25.0"

# Verify the server can start
notes-mcp --status
```
Notes not appearing in search
```bash
# Rebuild the index
notes-mcp --rebuild-index

# Validate index integrity
notes-mcp --validate-index
```
Permission errors
- Ensure the notes directory exists and is writable
- Check file permissions: `chmod 755 ~/notes`
MCP client connection issues
- Verify the server is running: `notes-mcp --status`
- Check stderr output for errors (logs go to stderr)
- Ensure the command path is correct in the MCP client config
Semantic Search
Semantic search finds notes based on meaning, not just keywords. It uses AI embeddings to understand the context and intent of your queries.
How It Works
- When a note is created or updated, its content is converted to a vector embedding
- Embeddings are stored in a FAISS index (`.semantic_index.faiss`)
- Search queries are also converted to embeddings and compared against note embeddings
- Results are ranked by semantic similarity
Enabling Semantic Search
Semantic search is enabled by default. To configure:
```yaml
search:
  enable_semantic: true                # Enable/disable semantic search
  semantic_model: "bge-small-en-v1.5"  # Embedding model to use
  semantic_weight: 0.5                 # Weight in hybrid search (0-1)
  keyword_weight: 0.5                  # Weight in hybrid search (0-1)
```
Example Queries
Without semantic search (keyword only):

```text
Query: "project planning"
Results: only notes containing the exact words "project" or "planning"
```

With semantic search:

```text
Query: "project planning"
Results:
- "Project planning" (keyword match)
- "Quarterly roadmap and objectives" (semantically similar)
- "Team goal-setting session" (semantically similar)
- "Sprint retrospective notes" (loosely related)
```
Performance
- First search: ~2-3 seconds (model loads from cache)
- Subsequent searches: 50-100 ms
- Storage: ~1 KB per note in the FAISS index
- Model: 33 MB (cached in `~/.cache/huggingface`)
Disabling Semantic Search
To use only keyword search (faster and lighter on memory):

```python
search_engine = SearchEngine(config=config)
search_engine.enable_semantic = False
results = search_engine.search_notes("query")
```
Debug Mode
Run the test script to verify server functionality:
```bash
python test_mcp_server.py
```
Development
Project Structure
```text
notes-mcp/
├── notes_mcp/
│   ├── __init__.py
│   ├── server.py          # MCP server implementation
│   ├── note.py            # Note model
│   ├── note_manager.py    # High-level note operations
│   ├── index_manager.py   # CSV index management
│   ├── search_engine.py   # Search and text analysis
│   ├── config.py          # Configuration management
│   ├── utils.py           # Utility functions
│   └── web/               # Web viewer
│       ├── app.py         # FastAPI backend
│       └── frontend/      # Svelte frontend
├── tests/                 # Test suite (146 tests)
├── examples/              # Usage examples
└── requirements.txt
```
Running Tests
```bash
# Run all tests
pytest

# Run with a coverage report
pytest --cov=notes_mcp --cov-report=html

# Run a specific test file
pytest tests/test_search_engine.py -v
```
Code Coverage
Current coverage: 88% (excluding CLI tools)
```bash
pytest --cov=notes_mcp --cov-report=term-missing
```
API Reference
Python API
```python
from notes_mcp.note_manager import NoteManager
from notes_mcp.search_engine import SearchEngine
from notes_mcp.config import Config

# Initialise
config = Config()
note_manager = NoteManager(config)
search_engine = SearchEngine(note_manager, config)

# Create a note
note = note_manager.create_note(
    title="My Note",
    content="Note content here",
    tags=["tag1", "tag2"]
)

# Search notes
results = search_engine.search_notes("query", fuzzy=True)
for result in results:
    print(f"{result.note.title}: {result.score}")

# Find notes by criteria
notes = search_engine.find_notes(
    tags=["project"],
    date_from="20260101",
    date_to="20261231"
)
```
SearchEngine Methods
| Method | Description |
|---|---|
| `search_notes(query, search_in, fuzzy, limit)` | Full-text search with relevance scoring |
| `find_notes(date_from, date_to, tags, pattern, limit)` | Filter notes by criteria |
| `fuzzy_match(text, query, threshold)` | Calculate fuzzy match score |
| `advanced_search(query, tags, date_from, date_to, fuzzy, boost_tags, limit)` | Combined search with boosting |
TextAnalyzer Methods
| Method | Description |
|---|---|
| `tokenize_text(text)` | Split text into tokens |
| `remove_stop_words(words)` | Filter out common words |
| `extract_keywords(text, max_keywords)` | Extract important keywords |
| `generate_summary(text, max_sentences)` | Create a text summary |
| `calculate_word_frequency(text)` | Count word occurrences |
License
MIT License - see LICENSE file for details.