legal_mcp
Enables semantic search over Polish court judgments and legislative acts via MCP. Allows LLMs to retrieve legal documents using natural language queries.
Legal MCP - RAG System for Polish Legal Documents
An implementation of a RAG (Retrieval-Augmented Generation) system utilizing the Model Context Protocol (MCP). This project demonstrates a modular architecture for legal document processing and retrieval using ChromaDB as a vector store and Ollama for local LLM inference.
🏗 Project Structure
The project is organized into three services and two supporting directories:
- mcp_server/: The core MCP server that exposes semantic search tools over the vector database. This is the primary service. Connect any MCP-compatible LLM client directly to it.
- ingestion/: REST API service for fetching and embedding documents from the SAOS (court judgments) and ELI (legislative acts) APIs into ChromaDB.
- frontend/: Optional local chat UI backed by Ollama and the MCP server.
- data/: Local storage for the ChromaDB database and other persistent assets.
- scripts/: Utility scripts for ingestion and maintenance.
🚀 Getting Started
Prerequisites
- Docker & Docker Compose: Required for containerized deployment.
- Python 3.10+: For local development.
- Ollama: Installed locally, or use the integrated service in docker-compose.yml.
- NVIDIA Container Toolkit: (Optional) For GPU acceleration within Docker.
Environment Setup
- Clone the repository:
    git clone https://github.com/barwojcik/legal_mcp.git
    cd legal_mcp
- Copy and review the environment variables:
    cp .env.example .env
    # Edit .env if you want to use OpenAI/Google embeddings instead of Ollama
Running with Docker Compose
Spin up ChromaDB, Ollama, the MCP server, and the ingestion service:
docker compose up -d chroma ollama mcp-server ingestion
To also run the optional frontend:
docker compose up -d
Services will be available at:
- ChromaDB: http://localhost:8000
- Ollama: http://localhost:11434
- MCP Server: http://localhost:8001/mcp
- Ingestion API: http://localhost:8002
- Frontend (optional): http://localhost:8003
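To confirm that the containers are actually listening, a small script can probe each port. The following is a minimal sketch, not part of the repository: it assumes the default port mappings listed above, and the helper names (`host_port`, `is_listening`, `report`) are hypothetical. It only checks that a TCP connection succeeds, not that the service behind the port is healthy.

```python
import socket
from urllib.parse import urlparse

# URLs as listed above; they assume the default port mappings in docker-compose.yml.
SERVICES = {
    "ChromaDB": "http://localhost:8000",
    "Ollama": "http://localhost:11434",
    "MCP Server": "http://localhost:8001/mcp",
    "Ingestion API": "http://localhost:8002",
    "Frontend (optional)": "http://localhost:8003",
}


def host_port(url: str) -> tuple[str, int]:
    """Extract (host, port) from a service URL with an explicit port."""
    parsed = urlparse(url)
    if parsed.hostname is None or parsed.port is None:
        raise ValueError(f"URL needs an explicit host and port: {url}")
    return parsed.hostname, parsed.port


def is_listening(url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the service's port succeeds."""
    try:
        with socket.create_connection(host_port(url), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def report() -> dict[str, bool]:
    """Probe every listed service once and map its name to reachability."""
    return {name: is_listening(url) for name, url in SERVICES.items()}


if __name__ == "__main__":
    for name, up in report().items():
        print(f"{name:<22} {'up' if up else 'down'}")
```

Ollama and the frontend can take a while to start, so a "down" result right after `docker compose up -d` may simply mean the container is still initializing.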
Populate the database
bash scripts/ingest_saso.sh
This fetches one page (20 judgments) from the SAOS API and embeds them into ChromaDB. See scripts/ingest_saso.sh and the Ingestion API docs for more options.
🔌 Connecting a Commercial LLM
The MCP server speaks the Model Context Protocol over HTTP/SSE. Once the stack is running, point your LLM client at http://localhost:8001/mcp.
Claude Desktop
Add the following to your claude_desktop_config.json
(usually at ~/Library/Application Support/Claude/claude_desktop_config.json on macOS
or %APPDATA%\Claude\claude_desktop_config.json on Windows):
{
  "mcpServers": {
    "legal": {
      "url": "http://localhost:8001/mcp",
      "type": "http"
    }
  }
}
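If claude_desktop_config.json already lists other servers, the entry should be merged in and not pasted over the file. A minimal sketch of such a merge is shown here; the function names are hypothetical, and the script assumes the file, if present, contains valid JSON.

```python
import json
from pathlib import Path

# The entry from the snippet above.
LEGAL_ENTRY = {"url": "http://localhost:8001/mcp", "type": "http"}


def add_legal_server(config: dict) -> dict:
    """Return a copy of config with the "legal" entry added under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep any existing servers
    servers["legal"] = dict(LEGAL_ENTRY)
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the config at path (if it exists), add the entry, write it back."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(
        json.dumps(add_legal_server(existing), indent=2) + "\n",
        encoding="utf-8",
    )


# Example (macOS path from the text above):
# update_config_file(
#     Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"
# )
```

Back up the file before running this, and restart Claude Desktop afterwards so that it rereads the configuration.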
Cursor / Zed / other MCP clients
Add an MCP server entry pointing to http://localhost:8001/mcp. Refer to your client's documentation for the exact configuration format.
Once connected, the LLM will have access to 14 tools for searching and retrieving Polish court judgments and legislative acts.
🛠 Ingestion API
The ingestion service exposes a REST API at http://localhost:8002.
Ingest SAOS court judgments:
curl -X POST http://localhost:8002/update \
-H "Content-Type: application/json" \
-d '{"n_pages": 1, "page_size": 20}'
Ingest ELI legislative acts:
curl -X POST http://localhost:8002/eli-update \
-H "Content-Type: application/json" \
-d '{"n_pages": 1, "page_size": 20}'
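The same calls can be made from Python with the standard library. This is a minimal sketch based only on the two endpoints and the request body shown above; the function names are hypothetical, and because the response format is not documented here, the response is returned as raw text.

```python
import json
import urllib.request

INGESTION_URL = "http://localhost:8002"
# Endpoint paths from the curl examples above.
ENDPOINTS = {"saos": "/update", "eli": "/eli-update"}


def build_ingest_request(source: str, n_pages: int = 1, page_size: int = 20) -> urllib.request.Request:
    """Build the POST request for one ingestion run ("saos" or "eli")."""
    if source not in ENDPOINTS:
        raise ValueError(f"unknown source {source!r}; expected one of {sorted(ENDPOINTS)}")
    payload = {"n_pages": n_pages, "page_size": page_size}
    return urllib.request.Request(
        INGESTION_URL + ENDPOINTS[source],
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def max_documents(n_pages: int, page_size: int) -> int:
    """Upper bound on documents fetched: pages times documents per page."""
    return n_pages * page_size


def ingest(source: str, n_pages: int = 1, page_size: int = 20, timeout: float = 600.0) -> str:
    """Send the request and return the raw response body as text."""
    request = build_ingest_request(source, n_pages, page_size)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example: fetch up to 3 * 20 = 60 legislative acts.
# print(ingest("eli", n_pages=3, page_size=20))
```

The generous default timeout is a guess: embedding runs inside the request in this sketch's mental model, so larger `n_pages` values may take a while to return.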
🛠 Development
Linting and Type Checking
# Run ruff
ruff check . --fix
# Run mypy
mypy .
Pre-commit Hooks
pre-commit install
⚖ License
This project is licensed under the Apache-2.0 license.