Legal Contract Review Agent
An AI-powered MCP server for analyzing Japanese legal contracts and identifying risks through a RAG-enhanced workflow. It enables clients to search legal knowledge, analyze clause risks, and generate automated contract review reports.
ContractGuard
Japanese contract risk analysis built as an AI engineering case study — LangGraph workflow + pgvector RAG + multi-modal ingestion + recoverable streaming UX.
⚠️ Not a legal service. This repository has never been operated commercially — Japan Attorney Act §72 (弁護士法第72条) reserves paid legal advice for licensed attorneys. The codebase is published as an open-source technical artifact only. Outputs are not legal opinions.
Status
Production-ready open-source reference implementation. The full stack — frontend, backend, OCR, payment, email, Postgres, Redis, error tracking — is wired with real integrations and ready to deploy. It has simply never been launched, by design (Attorney Act §72).
A synthetic Japanese contract sits in docs/samples/ so the local flow can be exercised end-to-end immediately after clone.
Architecture
flowchart LR
U[React/Vite UI<br/>text, PDF, image upload] --> API[FastAPI routers]
API --> Q[Quote + PII + OCR budget guards]
Q --> PAY[KOMOJU checkout<br/>reference implementation]
PAY --> JOB[Persistent analysis job]
JOB --> SSE[Recoverable SSE stream<br/>status + events + after_seq]
JOB --> LG[LangGraph pipeline]
LG --> P[parse_contract]
P --> A[clause-by-clause risk analysis]
A --> T[tool call: analyze_clause_risk]
T --> RAG[(PostgreSQL pgvector<br/>331 Japanese legal articles)]
A --> S[tool call: generate_suggestion<br/>medium/high risks only]
S --> REP[report generation + translation]
REP --> CACHE[(Redis 72h report cache)]
REP --> DB[(PostgreSQL orders/reports/costs)]
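The control flow in the diagram can be sketched in plain Python without LangGraph. This is a minimal illustration, not the code in backend/agent/graph.py: the line-based `parse_contract` splitter, the `ClauseResult` type, and the `run_pipeline` signature are all hypothetical, and the two tool calls are passed in as ordinary callables so the gating rule (suggestions for medium/high risks only) is easy to see.

```python
from dataclasses import dataclass
from typing import Callable, Optional

RISK_LEVELS = ("low", "medium", "high")


@dataclass
class ClauseResult:
    clause: str
    risk: str
    suggestion: Optional[str] = None


def parse_contract(text: str) -> list:
    """Hypothetical splitter: treat each non-empty line as one clause."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def run_pipeline(
    text: str,
    analyze_clause_risk: Callable[[str], str],
    generate_suggestion: Callable[[str], str],
) -> list:
    """Analyze every clause; call the suggestion tool only for medium/high risk."""
    results = []
    for clause in parse_contract(text):
        risk = analyze_clause_risk(clause)
        if risk not in RISK_LEVELS:
            raise ValueError(f"unexpected risk level: {risk!r}")
        suggestion = None
        if risk in ("medium", "high"):
            suggestion = generate_suggestion(clause)
        results.append(ClauseResult(clause, risk, suggestion))
    return results


# Stand-in tools for a dry run (the real ones would call the LLM and the RAG store).
def fake_risk(clause: str) -> str:
    return "high" if "損害賠償" in clause else "low"


def fake_suggestion(clause: str) -> str:
    return "Consider capping liability."
```

Skipping the suggestion call for low-risk clauses keeps the number of model calls proportional to the number of flagged clauses rather than to contract length.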
Tech Stack
| Layer | Stack |
|---|---|
| Frontend | React, Vite, TypeScript, i18next (9 languages) |
| Backend | FastAPI, SQLAlchemy async, Alembic, APScheduler |
| AI workflow | LangGraph + OpenAI tool calling, MCP server |
| RAG | PostgreSQL pgvector, 331 public e-Gov Japanese statutes |
| OCR | Google Cloud Vision (DOCUMENT_TEXT_DETECTION) |
| Storage | PostgreSQL (orders / reports / events), Redis (72h cache + rate limiting) |
| Payment | KOMOJU checkout |
| Email | Resend |
| Observability | Sentry + PostHog |
| Infra | Docker Compose (local), Fly.io + Vercel (deployment reference) |
Quick Start (local)
A local run requires only an OpenAI API key.
cp .env.example .env
# Edit .env: set OPENAI_API_KEY
docker compose up --build
Then open http://localhost:5173 and upload docs/samples/sample-contract-ja.txt.
In this minimal mode:
- ✅ Plain-text contracts and text-based PDFs (selectable text) work end-to-end.
- ❌ Image / scanned-PDF OCR is disabled. To enable it, add GOOGLE_APPLICATION_CREDENTIALS_JSON and GOOGLE_VISION_PROJECT_ID.
- KOMOJU / Resend auto-bypass in dev: no real payment, no real email.
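One way such a dev bypass could be decided is sketched below. The helper name `integration_mode` is hypothetical, and keying the decision on a single credential (for example KOMOJU_SECRET_KEY or RESEND_API_KEY) is an assumption; the repository may check a different combination of variables.

```python
from typing import Mapping


def integration_mode(env: Mapping[str, str], key_name: str) -> str:
    """Return "live" when the credential is set, "bypass" when it is empty in dev.

    Assumption: an empty credential is only acceptable outside production.
    """
    has_key = bool(env.get(key_name, "").strip())
    if has_key:
        return "live"
    if env.get("APP_ENV", "development") == "production":
        raise RuntimeError(f"{key_name} is required in production")
    return "bypass"
```

With only OPENAI_API_KEY set in .env, a call such as `integration_mode(os.environ, "KOMOJU_SECRET_KEY")` would select the bypass path, so no real checkout is created.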
Production Setup
The repository is shaped to deploy to production by setting APP_ENV=production and supplying credentials for each external service:
| Service | Required env vars |
|---|---|
| OpenAI | OPENAI_API_KEY |
| Google Cloud Vision (OCR) | GOOGLE_APPLICATION_CREDENTIALS_JSON, GOOGLE_VISION_PROJECT_ID |
| KOMOJU (payment) | KOMOJU_SECRET_KEY, KOMOJU_PUBLISHABLE_KEY, KOMOJU_WEBHOOK_SECRET |
| Resend (email) | RESEND_API_KEY |
| Sentry | SENTRY_DSN, VITE_SENTRY_DSN |
| PostHog | POSTHOG_API_KEY, VITE_POSTHOG_KEY |
| Database / Cache | DATABASE_URL (managed Postgres + pgvector), REDIS_URL (managed Redis) |
| App | FRONTEND_URL (non-localhost), ADMIN_API_TOKEN |
When APP_ENV=production, the app refuses to boot if any of the above is missing or FRONTEND_URL still points at localhost. Strict-validation logic lives in backend/config.py (validate_runtime()).
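A strict boot check of this kind might look roughly like the sketch below. It is not a copy of backend/config.py: the signature taking an explicit `env` mapping, the `REQUIRED_IN_PRODUCTION` tuple, and the error wording are assumptions, and the variable list simply mirrors the table above (whether the backend itself checks the VITE_ variables is not stated).

```python
from typing import Mapping
from urllib.parse import urlparse

# Mirrors the "Required env vars" table; the real list may differ.
REQUIRED_IN_PRODUCTION = (
    "OPENAI_API_KEY",
    "GOOGLE_APPLICATION_CREDENTIALS_JSON",
    "GOOGLE_VISION_PROJECT_ID",
    "KOMOJU_SECRET_KEY",
    "KOMOJU_PUBLISHABLE_KEY",
    "KOMOJU_WEBHOOK_SECRET",
    "RESEND_API_KEY",
    "SENTRY_DSN",
    "VITE_SENTRY_DSN",
    "POSTHOG_API_KEY",
    "VITE_POSTHOG_KEY",
    "DATABASE_URL",
    "REDIS_URL",
    "FRONTEND_URL",
    "ADMIN_API_TOKEN",
)

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def validate_runtime(env: Mapping[str, str]) -> None:
    """Raise RuntimeError if a production environment is incomplete."""
    if env.get("APP_ENV") != "production":
        return  # dev mode: nothing is enforced in this sketch

    problems = [
        f"missing {name}"
        for name in REQUIRED_IN_PRODUCTION
        if not env.get(name, "").strip()
    ]

    # hostname is None for an empty or unparsable URL, which is already
    # reported as "missing FRONTEND_URL" above.
    host = urlparse(env.get("FRONTEND_URL", "")).hostname
    if host in LOCAL_HOSTS:
        problems.append("FRONTEND_URL points at localhost")

    if problems:
        raise RuntimeError("refusing to boot: " + "; ".join(problems))
```

Collecting every problem before raising means a misconfigured deployment reports all missing variables in one failed boot instead of one per restart.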
fly.toml and vercel.json describe the deployment topology used during development. The service is not currently hosted.
Flow
- Upload a contract (text, PDF, or image). The upload route runs text extraction, PII checks, token estimation, non-contract detection, and OCR budget guards.
- Checkout reference path creates an order. Empty KOMOJU credentials trigger a local bypass in dev.
- /review/:orderId starts or resumes the persistent analysis job and streams progress events that survive page refresh.
- LangGraph parses clauses, analyzes each clause with RAG-grounded tool calls, and generates suggestions only where the risk warrants it.
- /report/:orderId shows the saved report, clause excerpts, risk filters, and PDF export, retained for 72 hours.
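The recoverable stream in the steps above relies on numbered events that a client can resume from. A minimal in-memory sketch of that idea follows; `JobEventLog` and `to_sse` are hypothetical names, and the real job stores its events in PostgreSQL rather than in a Python list. Only the `after_seq` parameter name comes from the architecture diagram.

```python
from dataclasses import dataclass, field


@dataclass
class JobEventLog:
    """Append-only event log for one analysis job (stand-in for a DB table)."""

    events: list = field(default_factory=list)

    def append(self, kind: str, data: str) -> int:
        """Store an event and return its sequence number (1-based)."""
        seq = len(self.events) + 1
        self.events.append({"seq": seq, "kind": kind, "data": data})
        return seq

    def replay(self, after_seq: int = 0) -> list:
        """Return every event the client has not seen yet."""
        return [e for e in self.events if e["seq"] > after_seq]


def to_sse(event: dict) -> str:
    """Serialize one event as a Server-Sent Events frame.

    Assumes `data` contains no newlines; multi-line payloads would need
    one `data:` line per line of payload.
    """
    return f"id: {event['seq']}\nevent: {event['kind']}\ndata: {event['data']}\n\n"
```

After a page refresh, the client reconnects with the last sequence number it rendered, and the server sends `replay(after_seq=...)` before continuing with live events, so no progress step is shown twice or lost.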
User contract text is deleted after analysis. The vector store contains only public e-Gov statutes; user contracts are never embedded.
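Retrieval over the statute store can be pictured as a nearest-neighbour search by cosine distance. The sketch below does this over an in-memory list purely for illustration; `top_k` and the toy two-dimensional vectors are hypothetical. pgvector exposes cosine distance as the `<=>` operator, but whether backend/rag/store.py uses that metric rather than L2 or inner product is an assumption.

```python
import math


def cosine_distance(a: list, b: list) -> float:
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def top_k(query: list, articles: list, k: int = 3) -> list:
    """Return the ids of the k articles closest to the query embedding.

    `articles` is a list of (article_id, embedding) pairs.
    """
    ranked = sorted(articles, key=lambda item: cosine_distance(query, item[1]))
    return [article_id for article_id, _ in ranked[:k]]
```

Only the clause being analyzed is embedded as the query at request time; the stored vectors are the public statutes, which matches the statement that user contracts never enter the vector store.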
Demo

Repository Map
- backend/agent/graph.py: LangGraph pipeline.
- backend/agent/tools.py: RAG-grounded tool calls.
- backend/services/analysis_executor.py: persistent analysis job + event sourcing.
- backend/rag/store.py: pgvector storage and search.
- backend/config.py: runtime configuration and strict validation.
- frontend/src/pages/ReviewPage.tsx: recoverable analysis progress UI.
- frontend/src/pages/ReportPage.tsx: report UI with risk filters and PDF export.
- tests/: backend pytest suites.
- scripts/smoke_local_flow.sh: end-to-end local smoke test.