mcp-just-seek-knowledge
Stores and searches AI-generated knowledge about software projects, enabling Cursor to access project structures, design patterns, best practices, and technical documentation.
MCP (Model Context Protocol) server that stores and searches AI-generated knowledge about software projects, allowing Cursor to access information about project structures, design patterns, best practices, and technical documentation.
📋 About the Project
Objective
Create an MCP server that stores and searches AI-generated knowledge about software projects.
Technology Stack
- Language: Python
- Embedding Framework: LangChain
- Database: PostgreSQL with pgVector
- Protocol: MCP (Model Context Protocol) for Cursor integration
Main Features
- Ingest: Create new records in the knowledge base
- Update: Update existing records in the knowledge base
- Search: Semantic search in the database
- List Catalog: List all existing service_name in the database (exposed as MCP tool)
- Delete: Delete records by service_name (available via CLI script, not exposed as MCP tool)
🛠️ Environment Setup
Complete Setup Process
1. Clone the project or navigate to it (if needed)
cd /home/pereirrd/dev/git/pereirrd/mcp-just-seek-knowledge
2. Create and activate virtual environment
# Create virtual environment
python3 -m venv venv
# Activate virtual environment
# On Linux/WSL:
source venv/bin/activate
# On Windows:
# venv\Scripts\activate
3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
4. Configure environment variables
Create a .env file in the project root (copy from .env.example if it exists, or create manually):
# Example .env
PGVECTOR_URL=postgresql://postgres:postgres@localhost:5433/software_design_knowledge
POSTGRES_HOST=localhost
POSTGRES_PORT=5433
POSTGRES_DB=software_design_knowledge
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
OPENAI_API_KEY=your_openai_api_key
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_DIMENSION=1536
Note: PostgreSQL variables can also be configured in Cursor's mcp.json (see section below).
5. Start PostgreSQL (if using Docker Compose)
docker-compose up -d
This will create PostgreSQL with pgvector automatically on port 5433.
Important: docker-compose.yml is configured to publish PostgreSQL on host port 5433 instead of the default 5432, which avoids a conflict when port 5432 is already in use.
6. Test the MCP server (optional)
python src/mcp_server.py
The server should start without errors and automatically create the software_design_knowledge table if it doesn't exist.
Verify Installation
To verify that the dependencies were installed correctly:
pip list | grep -E "langchain|psycopg|openai|python-dotenv"
Or test imports directly:
python -c "from src.database.connection import get_connection_string; from src.mcp.mcp_server import MCPServer; print('✅ Dependencies installed correctly!')"
⚙️ Cursor Configuration
To add this MCP server to Cursor, configure the ~/.cursor/mcp.json file (global configuration) or .cursor/mcp.json in the project root (local configuration).
Example configuration (~/.cursor/mcp.json):
{
"mcpServers": {
"mcp-just-seek-knowledge": {
"command": "python",
"args": ["/absolute/path/to/project/src/mcp_server.py"],
"env": {
"OPENAI_API_KEY": "your_openai_api_key",
"OPENAI_EMBEDDING_MODEL": "text-embedding-3-small",
"EMBEDDING_DIMENSION": "1536"
}
}
}
}
Important:
- Use absolute paths in the args field
- Configure all necessary environment variables
- Cursor loads this file automatically on startup
- After adding, restart Cursor to load the MCP server
Note about Cursor
When configuring MCP in Cursor (~/.cursor/mcp.json), Cursor will use the system Python or the one active in PATH. Recommendations:
Option 1: Use global Python (install dependencies globally)
If you prefer to use the system's global Python:
pip install -r requirements.txt
And configure mcp.json with:
{
"mcpServers": {
"mcp-just-seek-knowledge": {
"command": "python",
"args": ["/absolute/path/to/project/src/mcp_server.py"],
"env": {
"OPENAI_EMBEDDING_MODEL": "text-embedding-3-small",
"EMBEDDING_DIMENSION": "1536"
}
}
}
}
Option 2: Use virtual environment Python (recommended)
To use the project's virtual environment, specify the full path to the venv Python in mcp.json:
{
"mcpServers": {
"mcp-just-seek-knowledge": {
"command": "/absolute/path/to/mcp-just-seek-knowledge/venv/bin/python",
"args": ["/absolute/path/to/mcp-just-seek-knowledge/src/mcp_server.py"],
"env": {
"OPENAI_EMBEDDING_MODEL": "text-embedding-3-small",
"EMBEDDING_DIMENSION": "1536"
}
}
}
}
Advantages of Option 2:
- Isolates project dependencies
- Avoids conflicts with other Python projects
- Facilitates version management
Note: The project's .env file will be automatically loaded by the MCP server, so you don't need to repeat PostgreSQL variables in mcp.json (unless you prefer).
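The Option 2 configuration can also be generated rather than typed by hand. The following is a minimal sketch, not part of the project: build_mcp_config is a hypothetical helper, and it assumes the Linux/WSL venv layout (venv/bin/python) and the src/mcp_server.py entry point shown above.

```python
import json
from pathlib import PurePosixPath


def build_mcp_config(project_root: str) -> dict:
    """Build an mcp.json structure that runs the server with the venv Python.

    Hypothetical helper: assumes a POSIX-style path and the venv/bin/python layout.
    """
    root = PurePosixPath(project_root)
    if not root.is_absolute():
        # Both "command" and "args" should be absolute paths.
        raise ValueError("project_root must be an absolute path")
    return {
        "mcpServers": {
            "mcp-just-seek-knowledge": {
                "command": str(root / "venv" / "bin" / "python"),
                "args": [str(root / "src" / "mcp_server.py")],
                "env": {
                    "OPENAI_EMBEDDING_MODEL": "text-embedding-3-small",
                    "EMBEDDING_DIMENSION": "1536",
                },
            }
        }
    }


def render_mcp_config(project_root: str) -> str:
    """Return the configuration as pretty-printed JSON text."""
    return json.dumps(build_mcp_config(project_root), indent=2)
```

Calling render_mcp_config("/absolute/path/to/mcp-just-seek-knowledge") returns text that could be pasted into ~/.cursor/mcp.json. A relative path raises ValueError instead of producing a configuration that Cursor may fail to resolve.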
🚀 Implementation
Preparation and Structure
Directory Structure
Created src/ structure with organized subdirectories:
- src/database/ - Database management
- src/embeddings/ - Embedding services
- src/services/ - Business services (ingest, update, search)
- src/mcp/ - MCP server and handlers
__init__.py files created in all Python packages.
Dependency Configuration
requirements.txt file created with all necessary dependencies:
- LangChain Framework: langchain, langchain-community, langchain-core, langchain-openai, langchain-postgres
- PostgreSQL: psycopg, pgvector
- OpenAI: openai
- Utilities: python-dotenv
Environment Variables
.env.example file created with all necessary variables:
- PGVECTOR_URL - PostgreSQL connection URL
- POSTGRES_DB, POSTGRES_USER, POSTGRES_PASSWORD
- OPENAI_API_KEY, OPENAI_EMBEDDING_MODEL
- EMBEDDING_DIMENSION
.gitignore file configured to exclude .env and Python and IDE files.
Docker and PostgreSQL
docker-compose.yml file created with:
- PostgreSQL service using pgvector/pgvector:pg16 image
- Volume configuration for persistence
- Healthcheck configured
- Ports and environment variables configured
Initialization script init-scripts/01-init-pgvector.sh to automatically create the pgvector extension.
Database Configuration
Database Schema (src/database/schema.py)
Structure of software_design_knowledge table (software project knowledge):
- id - Unique identifier (SERIAL PRIMARY KEY)
- service_name - Service name (VARCHAR(255) NOT NULL UNIQUE)
- content - Knowledge content (TEXT NOT NULL)
- embedding - Embedding vector (vector(1536) NOT NULL)
- metadata - Additional metadata (JSONB)
- created_at - Creation date (TIMESTAMP DEFAULT CURRENT_TIMESTAMP)
- updated_at - Update date (TIMESTAMP DEFAULT CURRENT_TIMESTAMP)
Indexes:
- IVFFlat index for optimized vector search
- Index on service_name for service searches
Triggers:
- Automatic trigger to update updated_at on updates
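For orientation, the table, indexes, and trigger described above could be expressed as the DDL below. This is a sketch reconstructed from the column list, not the contents of src/database/schema.py: the index, function, and trigger names, the cosine operator class, and lists = 100 are all assumptions.

```python
EMBEDDING_DIMENSION = 1536  # matches EMBEDDING_DIMENSION in .env

# Hypothetical DDL: object names and IVFFlat parameters are illustrative only.
SCHEMA_SQL = f"""
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS software_design_knowledge (
    id SERIAL PRIMARY KEY,
    service_name VARCHAR(255) NOT NULL UNIQUE,
    content TEXT NOT NULL,
    embedding vector({EMBEDDING_DIMENSION}) NOT NULL,
    metadata JSONB,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- IVFFlat index; vector_cosine_ops pairs with the <=> (cosine distance) operator.
CREATE INDEX IF NOT EXISTS idx_sdk_embedding
    ON software_design_knowledge
    USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

CREATE INDEX IF NOT EXISTS idx_sdk_service_name
    ON software_design_knowledge (service_name);

-- Keep updated_at current on every UPDATE.
CREATE OR REPLACE FUNCTION set_updated_at() RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = CURRENT_TIMESTAMP;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS trg_sdk_updated_at ON software_design_knowledge;
CREATE TRIGGER trg_sdk_updated_at
    BEFORE UPDATE ON software_design_knowledge
    FOR EACH ROW EXECUTE FUNCTION set_updated_at();
"""
```

The vector dimension in the table must match the output size of the embedding model (1536 for text-embedding-3-small), which is why it is interpolated from a single constant here.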
Connection Management (src/database/connection.py)
Implemented functions:
- get_connection_string() - Gets connection string from environment variables
- create_connection() - Creates PostgreSQL connections
- schema_exists() - Checks if table exists
- create_schema() - Creates complete schema (table, indexes, triggers)
- initialize_database() - Initializes the database
Error handling and logging implemented.
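As an illustration of how a connection string might be resolved from the variables in .env, here is a minimal sketch. It is not the project's get_connection_string(): the precedence of PGVECTOR_URL over the individual POSTGRES_* variables, the defaults for host and port, and the env parameter are assumptions made for the example.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote


def get_connection_string(env: Optional[Mapping[str, str]] = None) -> str:
    """Resolve a PostgreSQL URL from environment-style settings.

    Assumption: PGVECTOR_URL wins when set; otherwise the URL is assembled
    from the POSTGRES_* variables.
    """
    env = os.environ if env is None else env
    url = env.get("PGVECTOR_URL")
    if url:
        return url

    required = ("POSTGRES_DB", "POSTGRES_USER", "POSTGRES_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")

    host = env.get("POSTGRES_HOST", "localhost")
    port = env.get("POSTGRES_PORT", "5433")  # host port used by docker-compose.yml
    # Percent-encode credentials so characters like '@' or ':' do not break the URL.
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{env['POSTGRES_DB']}"
```

Passing a plain dict as env makes the function easy to exercise without touching the real environment.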
Data Repository (src/database/repository.py)
KnowledgeRepository class implemented using psycopg directly.
Implemented methods:
- insert() - Insert document into database
- update() - Update document by service_name
- upsert() - Insert or update (upsert behavior)
- delete() - Delete document by service_name
- get_by_service_name() - Search document by service_name
- similarity_search() - Semantic search using pgVector (<=> operator)
Features:
- Support for optional filters (similarity threshold, service_name filter)
- Integration with JSONB metadata structure
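To make the role of the <=> operator and the optional filters concrete, the sketch below builds a parameterized query of the kind similarity_search() might run. It is an assumption-laden illustration, not the repository's code: build_similarity_query and to_vector_literal are hypothetical names, and similarity is taken to be 1 minus the cosine distance that <=> returns.

```python
from typing import Any, List, Optional, Sequence, Tuple


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format an embedding in pgvector's text form, e.g. [0.1,0.2]."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_similarity_query(
    embedding: Sequence[float],
    k: int = 5,
    threshold: Optional[float] = None,
    service_name: Optional[str] = None,
) -> Tuple[str, List[Any]]:
    """Return (sql, params) using psycopg-style %s placeholders."""
    if k < 1:
        raise ValueError("k must be at least 1")
    vec = to_vector_literal(embedding)

    conditions: List[str] = []
    where_params: List[Any] = []
    if service_name is not None:
        conditions.append("service_name = %s")
        where_params.append(service_name)
    if threshold is not None:
        # <=> is cosine distance, so similarity = 1 - distance.
        conditions.append("1 - (embedding <=> %s::vector) >= %s")
        where_params.extend([vec, threshold])
    where = f"WHERE {' AND '.join(conditions)} " if conditions else ""

    sql = (
        "SELECT service_name, content, metadata, "
        "1 - (embedding <=> %s::vector) AS similarity "
        "FROM software_design_knowledge "
        f"{where}"
        "ORDER BY embedding <=> %s::vector "
        "LIMIT %s"
    )
    # Parameter order follows placeholder order: SELECT, WHERE, ORDER BY, LIMIT.
    return sql, [vec, *where_params, vec, k]
```

Ordering by the raw distance expression (rather than by the computed similarity) is what allows PostgreSQL to use the IVFFlat index. The returned pair could then be passed to a psycopg cursor's execute(sql, params).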
Embedding Services
EmbeddingService class (src/embeddings/embedding_service.py) using OpenAIEmbeddings from LangChain.
Features:
- Single and batch embedding creation
- Configuration via environment variables (default model: text-embedding-3-small)
- Error handling and logging
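A wrapper along these lines could provide the single and batch operations. This is a sketch rather than the project's EmbeddingService: the method names embed and embed_batch and the injectable backend are assumptions. It relies only on the embed_query and embed_documents methods that LangChain embedding classes such as OpenAIEmbeddings expose.

```python
import os
from typing import List, Optional, Protocol, Sequence


class EmbeddingBackend(Protocol):
    """The two LangChain embedding methods this sketch depends on."""

    def embed_query(self, text: str) -> List[float]: ...

    def embed_documents(self, texts: List[str]) -> List[List[float]]: ...


class EmbeddingService:
    """Hypothetical wrapper; method names are illustrative."""

    def __init__(self, backend: Optional[EmbeddingBackend] = None) -> None:
        if backend is None:
            # Deferred import so the class can be used with a fake backend
            # when langchain-openai or an API key is not available.
            from langchain_openai import OpenAIEmbeddings

            model = os.getenv("OPENAI_EMBEDDING_MODEL", "text-embedding-3-small")
            backend = OpenAIEmbeddings(model=model)
        self._backend = backend

    def embed(self, text: str) -> List[float]:
        """Create one embedding for a single piece of text."""
        if not text or not text.strip():
            raise ValueError("text must not be empty")
        return self._backend.embed_query(text)

    def embed_batch(self, texts: Sequence[str]) -> List[List[float]]:
        """Create embeddings for several texts in one backend call."""
        if not texts:
            return []
        return self._backend.embed_documents(list(texts))
```

Injecting the backend keeps the service testable offline: any object with the same two methods can stand in for the OpenAI-backed one.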
Business Services
Four main services implemented:
Ingest Service (src/services/ingest_service.py)
- Adds new knowledge to the database
- Validates service_name and content
- Automatically creates embedding
- Complete error handling
Update Service (src/services/update_service.py)
- Updates existing knowledge (upsert behavior)
- If service_name doesn't exist, creates new record
- If exists, updates existing record
- Automatically updates embedding
Search Service (src/services/search_service.py)
- Semantic search by similarity
- Optional parameters: k (number of results), threshold (minimum similarity), service_name (filter)
- Returns results ordered by relevance
List Catalog Service (src/services/list_catalog_service.py)
- Lists all existing service_name in the database
- Does not use embeddings (repository only)
Common features:
- Integration with EmbeddingService and KnowledgeRepository
- Input validation
- Error handling
- Detailed logging
- Structured returns
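The pattern shared by these services (validate, embed, call the repository, return a structured result) can be sketched with an in-memory stand-in for the database. Everything here is hypothetical: InMemoryRepository, the function names, and the keys of the returned dictionaries are illustrative and may differ from the project's actual return shapes.

```python
from typing import Any, Callable, Dict, List, Optional

Embedder = Callable[[str], List[float]]


class InMemoryRepository:
    """Hypothetical stand-in for KnowledgeRepository, keyed by service_name."""

    def __init__(self) -> None:
        self.rows: Dict[str, Dict[str, Any]] = {}

    def upsert(self, service_name: str, content: str,
               embedding: List[float], metadata: Optional[dict]) -> bool:
        """Insert or replace a record; return True when a new one was created."""
        created = service_name not in self.rows
        self.rows[service_name] = {
            "content": content,
            "embedding": embedding,
            "metadata": metadata or {},
        }
        return created

    def list_service_names(self) -> List[str]:
        return sorted(self.rows)


def validate(service_name: str, content: str) -> Optional[str]:
    """Return an error message, or None when the input is acceptable."""
    if not service_name or not service_name.strip():
        return "service_name must not be empty"
    if not content or not content.strip():
        return "content must not be empty"
    return None


def update(repo: InMemoryRepository, embed: Embedder, service_name: str,
           content: str, metadata: Optional[dict] = None) -> Dict[str, Any]:
    """Upsert behavior: create the record if missing, otherwise replace it."""
    error = validate(service_name, content)
    if error:
        return {"success": False, "error": error}
    name = service_name.strip()
    # The embedding is recomputed from the new content on every update.
    created = repo.upsert(name, content, embed(content), metadata)
    return {"success": True, "service_name": name,
            "action": "created" if created else "updated"}


def list_catalog(repo: InMemoryRepository) -> Dict[str, Any]:
    """List service names without touching the embedding service."""
    names = repo.list_service_names()
    return {"success": True, "count": len(names), "service_names": names}
```

Returning a dictionary with a success flag instead of raising keeps error reporting uniform across tools, which is one plausible reading of "structured returns".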
🗑️ CLI Scripts
Record Deletion
The project includes a CLI script for record deletion that is not exposed as an MCP tool. This functionality is only available via command line for administrative operations.
Script: src/database/delete_service.py
Functionality:
- Deletes a record from the knowledge base by service_name
- Validates record existence before deletion
- Provides clear feedback on operation result
Usage:
python src/database/delete_service.py <service_name>
Examples:
# Delete a specific service
python src/database/delete_service.py user-service
# The script returns:
# - ✓ "Record deleted successfully" if the record was found and removed
# - ✗ "Record not found" if the service_name doesn't exist
# - ✗ "Error deleting record" in case of operation failure
Features:
- Parameter validation (service_name cannot be empty)
- Error handling with detailed logging
- Appropriate exit codes (0 for success, 1 for failure)
- Clear feedback messages for the user
Note: This functionality is not available as an MCP tool for security and access control reasons. Use only for necessary administrative operations.
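The behavior described above (argument validation, three possible messages, exit code 0 or 1) could be structured as follows. This is a sketch, not the contents of src/database/delete_service.py: delete_command is a hypothetical function, the delete operation is injected as a callable, and treating "Record not found" as exit code 1 is an assumption.

```python
from typing import Callable, List, Tuple

USAGE = "Usage: python src/database/delete_service.py <service_name>"


def delete_command(argv: List[str],
                   delete_record: Callable[[str], bool]) -> Tuple[int, str]:
    """Return (exit_code, message) for a delete request.

    delete_record is assumed to return True when a row was removed and
    False when no record with that service_name exists.
    """
    if len(argv) != 1 or not argv[0].strip():
        return 1, USAGE
    service_name = argv[0].strip()
    try:
        deleted = delete_record(service_name)
    except Exception as exc:  # report any repository failure as exit code 1
        return 1, f"✗ Error deleting record: {exc}"
    if deleted:
        return 0, "✓ Record deleted successfully"
    return 1, "✗ Record not found"
```

A thin entry point would call delete_command(sys.argv[1:], ...) with the real repository's delete method, print the message, and pass the code to sys.exit. Keeping the decision logic free of I/O makes each outcome straightforward to test.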
📚 pgvector Initialization Script
The init-scripts/01-init-pgvector.sh script is automatically used by PostgreSQL during container initialization.
How it works
1. Volume mapped in docker-compose.yml
The local init-scripts/ directory is mapped to /docker-entrypoint-initdb.d inside the container through volume configuration in docker-compose.yml.
2. PostgreSQL automatic behavior
The official PostgreSQL image, which pgvector/pgvector builds on, automatically runs the files in /docker-entrypoint-initdb.d:
- Only when the database is initialized for the first time (when the data volume is empty)
- In alphabetical order (hence the 01- prefix)
- For supported file types such as .sql, .sql.gz, and .sh
3. What the script does
The 01-init-pgvector.sh script:
- Executes CREATE EXTENSION IF NOT EXISTS vector; to create the pgvector extension
- Lists installed extensions for verification
- Uses set -e to stop on error
Important
- Scripts in init-scripts/ are only executed on first initialization (when the volume is empty)
- If the container has been started before, the script will not be executed again
- To re-execute, remove the volume:
docker-compose down -v
⌨️ Cursor Commands (Slash Commands)
This repository includes custom Cursor commands in .cursor/commands/, which help create, update, and list the knowledge base in the MCP mcp-just-seek-knowledge.
Available commands
- /criar_base_conhecimento: analyzes the entire open workspace (all projects/directories), reads documentation (including Swagger/OpenAPI), and creates a single record for the workspace using mcp-just-seek-knowledge.ingest.
- /atualizar_base_conhecimento: performs the same analysis as the previous command, but updates (upserts) the workspace record using mcp-just-seek-knowledge.update.
- /listar_base_conhecimento: lists existing service_name values via mcp-just-seek-knowledge.list_catalog and presents a friendly layout with count, service_name, and metadata (enriched via mcp-just-seek-knowledge.search).
How to use
- Ensure the MCP mcp-just-seek-knowledge is configured in Cursor (~/.cursor/mcp.json or .cursor/mcp.json).
- Open the project(s) in the Cursor workspace.
- In Cursor chat, execute a command by typing:
  - /criar_base_conhecimento
  - /atualizar_base_conhecimento
  - /listar_base_conhecimento