Fraud Detection MCP
AI-powered fraud detection and investigation platform that exposes tools for querying, scoring, and investigating financial applications using LangGraph, MLflow, and SQLite in-memory.
AI-powered fraud detection and investigation platform using a real financial fraud dataset, SQLite in-memory SQL, MLflow model tracking, MCP tools, and LangGraph agents.
The project is intentionally built like a production-style AI/ML system rather than a notebook:
- Real dataset: Bank Account Fraud Dataset Suite from Kaggle / NeurIPS 2022.
- SQL layer: the CSV is loaded into an in-memory SQLite database, so all fraud tools query SQL tables.
- MLflow: model training logs metrics, parameters, artifacts, and the trained model.
- OpenAI: investigation summaries require OPENAI_API_KEY from .env.
- LangGraph: orchestrates the fraud investigation workflow.
- MCP: exposes fraud tools as a Model Context Protocol server.
- FastAPI: serves investigation endpoints and a small dashboard page.
Architecture
BAF Kaggle Dataset
↓
scripts/download_dataset.py
↓
data/processed/baf_base_sample.csv
↓
scripts/train_model.py
↓
MLflow run + models/fraud_model.joblib
↓
FastAPI app
↓
SQLite :memory: SQL database
↓
Fraud tools
↓
LangGraph Fraud Agent
↓
OpenAI investigation summary
MCP Server exposes the same fraud tools to MCP-compatible clients.
1. Create and activate venv
Windows:
py -3.11 -m venv .venv
.venv\Scripts\activate
Mac / Linux:
python3 -m venv .venv
source .venv/bin/activate
Install dependencies:
pip install -r requirements.txt
2. Configure OpenAI key
Create .env from the example.
Windows:
copy .env.example .env
Mac / Linux:
cp .env.example .env
Then edit .env:
OPENAI_API_KEY=sk-your-real-key
OPENAI_MODEL=gpt-4.1-mini
Do not commit .env to GitHub.
3. Download the real BAF dataset
This step uses KaggleHub to fetch the real dataset rather than generating synthetic data.
python scripts/download_dataset.py
It downloads:
sgpjesus/bank-account-fraud-dataset-neurips-2022
Then it copies the base CSV and creates:
data/processed/baf_base_sample.csv
By default it samples 50,000 rows for fast local development. Pass --sample-size to change the number of rows.
Example:
python scripts/download_dataset.py --sample-size 100000
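The sampling step can be pictured with a small sketch. This is not the contents of scripts/download_dataset.py; it only assumes that the script parses a --sample-size flag with a default of 50,000 and draws a reproducible random subset. The helper names parse_args and sample_rows and the seed value are hypothetical.

```python
import argparse
import random


def parse_args(argv=None):
    """Parse the --sample-size flag (default mirrors the 50,000 rows mentioned above)."""
    parser = argparse.ArgumentParser(description="Sample rows from the BAF base CSV")
    parser.add_argument("--sample-size", type=int, default=50_000)
    return parser.parse_args(argv)


def sample_rows(rows, sample_size, seed=42):
    """Return a reproducible random subset; the seed is an illustrative assumption."""
    if sample_size >= len(rows):
        return list(rows)
    return random.Random(seed).sample(rows, sample_size)


# Illustrative usage with stand-in rows instead of the real CSV.
args = parse_args(["--sample-size", "100000"])
subset = sample_rows(list(range(1000)), 50)
```

A fixed seed keeps the sample stable between runs, which makes model metrics easier to compare across training runs.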
4. Train the model with MLflow
python scripts/train_model.py
Outputs:
models/fraud_model.joblib
models/fraud_model_metadata.json
mlruns/
Open MLflow UI:
mlflow ui --backend-store-uri ./mlruns
Then open:
http://127.0.0.1:5000
5. Run the FastAPI app
uvicorn app.main:app --reload
Open:
http://127.0.0.1:8000
Useful endpoints:
GET /health
GET /applications/high-risk?limit=10
GET /applications/{application_id}
POST /applications/{application_id}/score
POST /applications/{application_id}/investigate
GET /cases
GET /audit
Example:
curl -X POST http://127.0.0.1:8000/applications/10/investigate
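The same request can be sent from Python using only the standard library. This is a minimal sketch that assumes the app is running locally on port 8000 and that the endpoint returns a JSON body (FastAPI's default, but not confirmed here). The helper names build_investigate_request and investigate are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"


def build_investigate_request(application_id, base_url=BASE_URL):
    """Build a POST request for /applications/{application_id}/investigate."""
    url = f"{base_url}/applications/{application_id}/investigate"
    return urllib.request.Request(url, method="POST")


def investigate(application_id):
    """Send the request and decode the response, assuming it is JSON."""
    with urllib.request.urlopen(build_investigate_request(application_id)) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not need a running server; investigate(10) does.
request = build_investigate_request(10)
```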
6. Run the MCP server
python -m mcp_servers.fraud_mcp_server
The MCP server exposes tools such as:
get_application
list_high_risk_applications
score_application
investigate_application
create_review_case
get_review_cases
get_audit_log
safe_select_query
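A tool named safe_select_query suggests a read-only guard around ad hoc SQL. The sketch below shows one conservative way such a guard could work against SQLite. It is an assumption for illustration, not the project's implementation: the function signature, the max_rows limit, and the demo table are hypothetical.

```python
import sqlite3


def safe_select_query(conn, sql, max_rows=100):
    """Run a single SELECT statement and return at most max_rows rows as dicts."""
    statement = sql.strip().rstrip(";").strip()
    # Conservative check: any remaining semicolon is treated as a second statement.
    if ";" in statement:
        raise ValueError("Only a single statement is allowed")
    if not statement.lower().startswith("select"):
        raise ValueError("Only SELECT statements are allowed")
    cursor = conn.execute(statement)
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchmany(max_rows)]


# Illustrative usage against a throwaway in-memory table.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE applications (application_id INTEGER, fraud_bool INTEGER)")
demo_conn.executemany("INSERT INTO applications VALUES (?, ?)", [(1, 0), (2, 1)])
rows = safe_select_query(demo_conn, "SELECT * FROM applications WHERE fraud_bool = 1;")
```

The check is deliberately strict: it also rejects valid queries that start with WITH or contain a semicolon inside a string literal, trading flexibility for a simpler safety rule.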
LangGraph workflow
START
↓
load_application
↓
score_application
↓
policy_decision
↓
llm_investigation
↓
maybe_create_review_case
↓
END
The agent uses deterministic tools for data access and scoring, then calls OpenAI to produce an analyst-style investigation summary based on evidence.
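The node order above can be sketched as a linear pipeline of functions that pass a shared state dict. This is a plain-Python stand-in rather than the LangGraph API, and every node body is a stub: the fixed score, the 0.5 threshold, the state keys, and the templated summary (in place of an OpenAI call) are illustrative assumptions.

```python
def load_application(state):
    # Stub: the real node would read the record from the SQL layer.
    record = {"application_id": state["application_id"], "income": 0.9}
    return {**state, "application": record}


def score_application(state):
    # Stub: a fixed score stands in for the trained model's prediction.
    return {**state, "fraud_score": 0.82}


def policy_decision(state):
    # Hypothetical threshold; the project's actual policy is not shown here.
    decision = "review" if state["fraud_score"] >= 0.5 else "approve"
    return {**state, "decision": decision}


def llm_investigation(state):
    # Stub: a templated string stands in for the OpenAI-generated summary.
    summary = (
        f"Application {state['application_id']} scored "
        f"{state['fraud_score']:.2f}; policy decision: {state['decision']}."
    )
    return {**state, "summary": summary}


def maybe_create_review_case(state):
    # Only applications flagged for review get a case.
    case = None
    if state["decision"] == "review":
        case = {"application_id": state["application_id"], "status": "open"}
    return {**state, "review_case": case}


PIPELINE = [
    load_application,
    score_application,
    policy_decision,
    llm_investigation,
    maybe_create_review_case,
]


def run_investigation(application_id):
    """Run every node in order, threading the state dict through each one."""
    state = {"application_id": application_id}
    for node in PIPELINE:
        state = node(state)
    return state


result = run_investigation(10)
```

Keeping data access, scoring, and the policy decision deterministic means the language model only writes the narrative summary and never decides the outcome itself.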
Why SQLite in-memory?
The first version uses SQLite :memory: to provide a real SQL query layer without requiring PostgreSQL or Docker. The storage layer is abstracted so it can later be replaced with PostgreSQL while keeping the MCP tools and LangGraph agent almost unchanged.
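As a rough illustration of the in-memory SQL layer, the sketch below loads CSV rows into a fresh SQLite :memory: database using only the standard library. It is an assumption about how such a loader could look, not the project's code: the function name load_csv_into_memory, the table name applications, and the sample columns are hypothetical.

```python
import csv
import io
import sqlite3


def load_csv_into_memory(csv_file, table="applications"):
    """Load CSV rows into a new in-memory SQLite database and return the connection."""
    reader = csv.reader(csv_file)
    header = next(reader)
    conn = sqlite3.connect(":memory:")
    # NUMERIC affinity lets SQLite store numeric-looking text as numbers
    # while leaving genuine strings untouched.
    columns = ", ".join(f'"{name}" NUMERIC' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()
    return conn


# Illustrative usage with a tiny inline CSV instead of the real dataset file.
sample_csv = io.StringIO("application_id,fraud_bool,income\n1,0,0.3\n2,1,0.9\n")
conn = load_csv_into_memory(sample_csv)
fraud_count = conn.execute(
    "SELECT COUNT(*) FROM applications WHERE fraud_bool = 1"
).fetchone()[0]
```

Because the tools only see a connection and SQL, swapping this loader for a PostgreSQL-backed one would leave the callers largely unchanged, which matches the abstraction described above.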
Suggested GitHub description
Fraud detection and investigation platform using OpenAI, LangGraph, MCP tools, MLflow, SQLite in-memory, and the Bank Account Fraud Dataset.
Suggested topics:
fraud-detection, mcp, ai-agents, langgraph, openai, mlflow, sqlite, fastapi, machine-learning, financial-ai