SACL MCP Server

Provides bias-aware code retrieval for AI assistants by detecting textual bias in code search, semantically augmenting code understanding beyond comments and docstrings, and intelligently reranking results based on functional relevance rather than documentation quality.

Semantic-Augmented Reranking and Localization for Code Retrieval

A Model Context Protocol (MCP) server that implements the SACL research framework to provide bias-aware code retrieval for AI coding assistants like Claude Code, Cursor, and other MCP-enabled tools.

🎯 Overview

SACL addresses the critical problem of textual bias in code retrieval systems. Traditional systems over-rely on surface-level features like docstrings, comments, and variable names, leading to biased results that favor well-documented code regardless of functional relevance.

Key Features

  • 🧠 Bias Detection: Identifies over-reliance on textual features
  • 🔍 Semantic Augmentation: Enriches code understanding beyond surface text
  • 📊 Intelligent Reranking: Prioritizes functional relevance over documentation
  • 🎯 Code Localization: Pinpoints functionally relevant code segments
  • 🔗 Relationship Analysis: Maps code dependencies and relationships
  • 🎨 Context-Aware Retrieval: Returns results with related components
  • 🚀 Agent-Controlled Updates: Explicit file updates for Docker compatibility
  • 🗄️ Knowledge Graph: Persistent semantic storage with Graphiti/Neo4j
  • 🔧 MCP Integration: Works with Claude Code, Cursor, and other AI tools

🏗️ Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   AI Assistant  │────│ SACL MCP Server │────│  Graphiti/Neo4j │
│ (Claude, Cursor)│    │                 │    │ Knowledge Graph │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                       ┌─────────────────┐
                       │  SACL Framework │
                       │                 │
                       │ • Bias Detection│
                       │ • Semantic Aug. │
                       │ • Reranking     │
                       │ • Localization  │
                       │ • Relationships │
                       │ • Context-Aware │
                       └─────────────────┘

🚀 Quick Start

Prerequisites

  • Node.js 18+
  • Neo4j database
  • OpenAI API key

Installation

# Clone the repository
git clone <repository-url>
cd sacl

# Install dependencies
npm install

# Copy environment configuration
cp .env.example .env

# Edit .env with your settings, for example:
OPENAI_API_KEY=your_key_here
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_password

Using Docker (Recommended)

# Start Neo4j and SACL server
docker-compose up -d

# Check logs
docker-compose logs -f sacl-mcp-server

Manual Setup

# Build the project
npm run build

# Start the server
npm start

🔧 Configuration

Environment Variables

| Variable | Description | Default |
|---|---|---|
| OPENAI_API_KEY | OpenAI API key (required) | - |
| SACL_REPO_PATH | Repository to analyze | Current directory |
| SACL_NAMESPACE | Unique namespace | Auto-generated |
| SACL_LLM_MODEL | LLM model for analysis | gpt-4 |
| SACL_EMBEDDING_MODEL | Embedding model | text-embedding-3-small |
| SACL_BIAS_THRESHOLD | Bias detection sensitivity (0-1) | 0.5 |
| SACL_MAX_RESULTS | Maximum search results | 10 |
| SACL_CACHE_ENABLED | Enable embedding cache | true |
| NEO4J_URI | Neo4j connection URI | bolt://localhost:7687 |
| NEO4J_USER | Neo4j username | neo4j |
| NEO4J_PASSWORD | Neo4j password | password |
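
The defaults above could be applied by a small loader such as the following. This is only a sketch: the `SaclConfig` interface, its field names, and `loadSaclConfig` are hypothetical and not taken from the SACL source. `SACL_NAMESPACE` is left out because its auto-generation scheme is not documented here.

```typescript
// Hypothetical config shape; field names are illustrative assumptions.
interface SaclConfig {
  openaiApiKey: string;
  repoPath: string;
  llmModel: string;
  embeddingModel: string;
  biasThreshold: number;
  maxResults: number;
  cacheEnabled: boolean;
  neo4jUri: string;
  neo4jUser: string;
  neo4jPassword: string;
}

type Env = Record<string, string | undefined>;

// Reads the documented variables from an environment-like object and
// falls back to the defaults listed in the table.
function loadSaclConfig(env: Env): SaclConfig {
  const apiKey = env["OPENAI_API_KEY"];
  if (!apiKey) {
    throw new Error("OPENAI_API_KEY is required");
  }

  const biasThreshold = Number(env["SACL_BIAS_THRESHOLD"] ?? "0.5");
  if (Number.isNaN(biasThreshold) || biasThreshold < 0 || biasThreshold > 1) {
    throw new Error("SACL_BIAS_THRESHOLD must be a number between 0 and 1");
  }

  const maxResults = Number(env["SACL_MAX_RESULTS"] ?? "10");
  if (!Number.isInteger(maxResults) || maxResults <= 0) {
    throw new Error("SACL_MAX_RESULTS must be a positive integer");
  }

  return {
    openaiApiKey: apiKey,
    repoPath: env["SACL_REPO_PATH"] ?? ".",
    llmModel: env["SACL_LLM_MODEL"] ?? "gpt-4",
    embeddingModel: env["SACL_EMBEDDING_MODEL"] ?? "text-embedding-3-small",
    biasThreshold,
    maxResults,
    cacheEnabled: (env["SACL_CACHE_ENABLED"] ?? "true").toLowerCase() !== "false",
    neo4jUri: env["NEO4J_URI"] ?? "bolt://localhost:7687",
    neo4jUser: env["NEO4J_USER"] ?? "neo4j",
    neo4jPassword: env["NEO4J_PASSWORD"] ?? "password",
  };
}
```

In a Node.js process this would typically be called as `loadSaclConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test.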

🎮 Usage

MCP Tools

The SACL server exposes nine MCP tools for bias-aware code analysis:

1. analyze_repository

Performs full SACL analysis of a repository:

{
  "repositoryPath": "/path/to/repo",
  "incremental": false
}

2. query_code

Bias-aware code search with optional context:

{
  "query": "function that sorts arrays efficiently",
  "repositoryPath": "/path/to/repo",
  "maxResults": 10,
  "includeContext": false  // Set true for relationship context
}

3. query_code_with_context 🆕

Enhanced search with relationship context and related components:

{
  "query": "authentication middleware",
  "repositoryPath": "/path/to/repo",
  "maxResults": 10,
  "includeRelated": true
}

4. update_file 🆕

Explicitly update single file analysis when changes are made:

{
  "filePath": "src/services/auth.js",
  "changeType": "modified"  // "created", "modified", or "deleted"
}

5. update_files 🆕

Batch update multiple files:

{
  "files": [
    { "filePath": "src/index.js", "changeType": "modified" },
    { "filePath": "src/utils/new.js", "changeType": "created" }
  ]
}
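
A client could validate change types before sending a batch. The sketch below builds an `update_files` argument object from path and change-type pairs. `FileChange`, `isChangeType`, and `buildUpdateFilesArgs` are hypothetical helper names, not part of the server.

```typescript
// The three change types documented for update_file / update_files.
type ChangeType = "created" | "modified" | "deleted";

interface FileChange {
  filePath: string;
  changeType: ChangeType;
}

const CHANGE_TYPES: readonly string[] = ["created", "modified", "deleted"];

// Type guard: narrows an arbitrary string to a documented change type.
function isChangeType(value: string): value is ChangeType {
  return CHANGE_TYPES.indexOf(value) !== -1;
}

// Builds the argument object for update_files, rejecting unknown change types
// before anything is sent to the server.
function buildUpdateFilesArgs(changes: Array<[string, string]>): { files: FileChange[] } {
  const files = changes.map(([filePath, changeType]) => {
    if (!isChangeType(changeType)) {
      throw new Error(`Unsupported changeType "${changeType}" for ${filePath}`);
    }
    return { filePath, changeType };
  });
  return { files };
}

const batchArgs = buildUpdateFilesArgs([
  ["src/index.js", "modified"],
  ["src/utils/new.js", "created"],
]);
```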

6. get_relationships 🆕

Analyze code relationships and dependencies:

{
  "filePath": "src/controllers/UserController.js",
  "maxDepth": 3,
  "relationshipTypes": ["imports", "calls", "extends"]  // Optional filter
}

7. get_file_context 🆕

Get comprehensive context for a file:

{
  "filePath": "src/models/User.js",
  "includeSnippets": true  // Include code previews
}

8. get_bias_analysis

Detailed bias metrics and debugging:

{
  "filePath": "src/utils/sort.js"  // Optional
}

9. get_system_stats

System performance and statistics:

{}
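
The JSON objects above are tool arguments. On the wire, an MCP client wraps them in a JSON-RPC 2.0 `tools/call` request. The sketch below shows that envelope. `buildToolCall` is a hypothetical helper, and most clients (Claude Desktop, Cursor) build this message for you.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0 envelope).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Hypothetical helper: wraps a tool name and its arguments in the envelope.
function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Example: the query_code arguments shown earlier, ready to serialize.
const queryRequest = buildToolCall(1, "query_code", {
  query: "function that sorts arrays efficiently",
  repositoryPath: "/path/to/repo",
  maxResults: 10,
  includeContext: false,
});
const queryRequestJson = JSON.stringify(queryRequest);
```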

MCP Client Configuration

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "sacl": {
      "command": "node",
      "args": ["/path/to/sacl/dist/index.js"],
      "env": {
        "OPENAI_API_KEY": "your-key",
        "NEO4J_URI": "bolt://localhost:7687",
        "NEO4J_USER": "neo4j",
        "NEO4J_PASSWORD": "password"
      }
    }
  }
}

Cursor IDE

Configure in your Cursor settings to connect to the SACL MCP server.

📊 SACL Framework

Stage 1: Bias Detection

Identifies three types of textual bias:

  • Docstring Dependency: Over-reliance on documentation
  • Identifier Name Bias: Focusing on variable/function names
  • Comment Over-reliance: Prioritizing commented code
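
As a rough illustration of one such indicator, the sketch below measures what share of a file's non-blank lines are comments. It handles only `//` and `/* ... */` style comments. The function names and the 0.15 threshold are assumptions for illustration, not the values used by `BiasDetector.ts`.

```typescript
// Fraction of non-blank lines that are comment lines (// or /* ... */ blocks).
function commentLineRatio(source: string): number {
  const lines = source
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  if (lines.length === 0) return 0;

  let inBlock = false;
  let commentLines = 0;
  for (const line of lines) {
    if (inBlock) {
      commentLines++;
      if (line.includes("*/")) inBlock = false;
    } else if (line.startsWith("//")) {
      commentLines++;
    } else if (line.startsWith("/*")) {
      commentLines++;
      if (!line.includes("*/")) inBlock = true;
    }
  }
  return commentLines / lines.length;
}

// Flags a file when comments exceed a (hypothetical) share of its lines.
function commentOverReliance(source: string, threshold = 0.15): boolean {
  return commentLineRatio(source) > threshold;
}

const sampleSource = [
  "/**",
  " * Sorts numbers.",
  " */",
  "function sortNumbers(xs: number[]) {",
  "  // copy first",
  "  return [...xs].sort((a, b) => a - b);",
  "}",
].join("\n");

const sampleRatio = commentLineRatio(sampleSource); // 4 comment lines out of 7
```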

Stage 2: Semantic Augmentation

Enriches code representations with:

  • Functional Signatures: What the code actually does
  • Behavior Patterns: Computational patterns (iteration, recursion, etc.)
  • Structural Features: Complexity metrics, AST analysis
  • Augmented Embeddings: Bias-adjusted semantic vectors
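
To make "behavior patterns" concrete, here is a deliberately simplified sketch that tags a function body as iterative and/or recursive using regular expressions. The project describes AST-based analysis for JavaScript/TypeScript, so treat this as an approximation. `detectBehaviorPatterns` and the `BehaviorPattern` type are hypothetical names.

```typescript
type BehaviorPattern = "iteration" | "recursion";

// Tags a function body by looking for loops / array iteration helpers and
// for calls to the function's own name. Regex-based, so it can be fooled by
// strings or comments; a real implementation would walk the AST.
function detectBehaviorPatterns(functionName: string, body: string): BehaviorPattern[] {
  const patterns: BehaviorPattern[] = [];

  const iterationRegex = /\b(for|while)\s*\(|\.(forEach|map|reduce|filter)\s*\(/;
  if (iterationRegex.test(body)) {
    patterns.push("iteration");
  }

  // Escape regex metacharacters in the name before building the pattern.
  const escapedName = functionName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const selfCallRegex = new RegExp(`\\b${escapedName}\\s*\\(`);
  if (selfCallRegex.test(body)) {
    patterns.push("recursion");
  }

  return patterns;
}

const quickBody = [
  "if (xs.length < 2) return xs;",
  "const [p, ...rest] = xs;",
  "return [...quick(rest.filter((x) => x < p)), p, ...quick(rest.filter((x) => x >= p))];",
].join("\n");

const quickPatterns = detectBehaviorPatterns("quick", quickBody);
```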

Stage 3: Reranking & Localization

  • Bias-Aware Ranking: Reduces textual weight based on bias score
  • Code Localization: Identifies functionally relevant segments
  • Semantic Similarity: Uses augmented embeddings
  • Functional Relevance: Considers computational patterns
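
One way to read "reduces textual weight based on bias score" is as a weighted blend in which the textual share shrinks as bias grows. The formula below is an assumption made for illustration, not the scoring used by `SACLReranker.ts`. The `Candidate` fields and `baseTextWeight` are hypothetical.

```typescript
interface Candidate {
  id: string;
  textualScore: number;  // similarity driven by docstrings, comments, names (0-1)
  semanticScore: number; // similarity from augmented embeddings (0-1)
  biasScore: number;     // detected textual bias for this candidate (0-1)
}

interface RankedCandidate extends Candidate {
  finalScore: number;
}

// Assumed blend: the higher the bias score, the less the textual score counts.
function rerankByBias(candidates: Candidate[], baseTextWeight = 0.5): RankedCandidate[] {
  return candidates
    .map((candidate) => {
      const textWeight = baseTextWeight * (1 - candidate.biasScore);
      const finalScore =
        textWeight * candidate.textualScore + (1 - textWeight) * candidate.semanticScore;
      return { ...candidate, finalScore };
    })
    .sort((a, b) => b.finalScore - a.finalScore);
}

const rankedExample = rerankByBias([
  // Heavily documented but functionally weak match.
  { id: "documented", textualScore: 0.9, semanticScore: 0.3, biasScore: 0.8 },
  // Sparse documentation but functionally strong match.
  { id: "terse", textualScore: 0.4, semanticScore: 0.8, biasScore: 0.1 },
]);
```

With these inputs the terse candidate scores about 0.62 and the documented one about 0.36, so the functionally stronger match ranks first even though its textual score is lower.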

Stage 4: Relationship Analysis 🆕

Maps code relationships and dependencies:

  • Import/Export Analysis: Module dependencies and exports
  • Function Call Mapping: Call graphs and method invocations
  • Class Inheritance: Extends/implements relationships
  • Dependency Tracking: External and internal dependencies
  • Context-Aware Results: Related components with each query result
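
A depth-limited traversal of this kind can be sketched as a breadth-first search over relationship edges, mirroring the `maxDepth` and `relationshipTypes` parameters of `get_relationships`. The in-memory edge list and the `relatedFiles` function are assumptions for illustration. The server itself is described as storing relationships in Graphiti/Neo4j.

```typescript
type RelationshipType = "imports" | "calls" | "extends";

interface Relationship {
  from: string;
  to: string;
  type: RelationshipType;
}

// Breadth-first search from `start`, following only allowed relationship types,
// returning each reachable file with the depth at which it was first found.
function relatedFiles(
  edges: Relationship[],
  start: string,
  maxDepth: number,
  types?: RelationshipType[]
): Map<string, number> {
  const allowed = types ? new Set<RelationshipType>(types) : null;
  const depths = new Map<string, number>();
  const seen = new Set<string>([start]);
  let frontier = new Set<string>([start]);

  for (let depth = 1; depth <= maxDepth && frontier.size > 0; depth++) {
    const next = new Set<string>();
    for (const edge of edges) {
      if (allowed && !allowed.has(edge.type)) continue;
      if (!frontier.has(edge.from) || seen.has(edge.to)) continue;
      seen.add(edge.to);
      depths.set(edge.to, depth);
      next.add(edge.to);
    }
    frontier = next;
  }
  return depths;
}

const exampleEdges: Relationship[] = [
  { from: "UserController.js", to: "UserService.js", type: "imports" },
  { from: "UserController.js", to: "Logger.js", type: "calls" },
  { from: "UserService.js", to: "User.js", type: "calls" },
  { from: "User.js", to: "BaseModel.js", type: "extends" },
];

const withinTwoHops = relatedFiles(exampleEdges, "UserController.js", 2);
const importsOnly = relatedFiles(exampleEdges, "UserController.js", 2, ["imports"]);
```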

🧪 Example Workflow

  1. Repository Analysis:

    AI Assistant → analyze_repository → SACL processes all files → Knowledge graph populated

  2. Code Query with Context:

    AI Assistant → query_code_with_context("authentication") → SACL retrieval → Context-aware results

  3. File Updates:

    AI modifies code → update_file("src/auth.js", "modified") → SACL re-analyzes → Relationships updated

  4. Relationship Exploration:

    AI Assistant → get_relationships("UserController.js") → Dependency graph → Related components
    
  5. Results Include:

    • Original textual similarity score
    • Semantic similarity score
    • Bias-adjusted final score
    • Localized code regions
    • Related components and dependencies
    • Context explanation with relationship importance
    • Explanation of ranking decisions

📈 Performance

Based on SACL research benchmarks:

  • 12.8% improvement in Recall@1 on HumanEval
  • 9.4% improvement in Recall@1 on MBPP
  • 7.0% improvement in Recall@1 on SWE-Bench-Lite
  • P95 latency: <300ms for retrieval operations

🔍 Bias Analysis Example

🧠 SACL Bias Analysis

File: src/algorithms/quicksort.js

Bias Metrics:
• Overall Bias Score: 73.2% 🔴
• Semantic Pattern: Recursive divide-and-conquer sorting
• Functional Signature: Array input → sorted array output

Bias Indicators:
• docstring_dependency: High docstring dependency (15.3% of code)
• identifier_name_bias: High reliance on descriptive names
• comment_over_reliance: Excessive comments (18.7% of code)

💡 Improvement Suggestions:
• Reduce reliance on variable naming for semantic understanding
• Focus on structural patterns over comments
• Improve functional signature extraction

🛠️ Development

Project Structure

src/
├── core/                    # SACL framework implementation
│   ├── BiasDetector.ts      # Textual bias detection
│   ├── SemanticAugmenter.ts # Semantic enhancement
│   ├── SACLReranker.ts      # Reranking and localization with context
│   └── SACLProcessor.ts     # Main orchestrator with relationship support
├── mcp/                     # MCP server implementation
│   └── SACLMCPServer.ts     # MCP protocol handlers (9 tools)
├── graphiti/                # Knowledge graph integration
│   └── GraphitiClient.ts    # Graphiti/Neo4j interface with relationships
├── utils/                   # Utility modules
│   └── CodeAnalyzer.ts      # AST analysis and relationship extraction
├── types/                   # TypeScript type definitions
│   ├── index.ts             # Core types and interfaces
│   └── relationships.ts     # Relationship type definitions
└── index.ts                 # Application entry point

Building

npm run build    # Build TypeScript
npm run dev      # Development with auto-reload
npm run lint     # Code linting
npm run format   # Code formatting
npm test         # Run tests

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Implement changes following SACL methodology
  4. Add tests for new functionality
  5. Submit a pull request

📚 Research Background

This implementation is based on the research paper:

"SACL: Understanding and Combating Textual Bias in Code Retrieval with Semantic-Augmented Reranking and Localization"

  • Authors: Dhruv Gupta, Gayathri Ganesh Lakshmy, Yiqing Xie
  • arXiv: 2506.20081v2

Key Research Contributions

  1. Systematic Bias Detection: Identifies textual bias through feature masking
  2. Semantic Augmentation: Enhances code understanding beyond text
  3. Bias-Aware Ranking: Reduces surface-level feature dependency
  4. Localization: Pinpoints functionally relevant code regions

🔗 Integration

Supported AI Tools

  • Claude Code: Direct MCP integration
  • Cursor: MCP server connection
  • VS Code Extensions: Via MCP protocol
  • Custom Tools: Any MCP-compatible client

Language Support

  • JavaScript/TypeScript: Full AST analysis with relationship extraction

    • Import/export tracking
    • Function call analysis
    • Class inheritance detection
    • Dynamic imports support
  • Python: Regex-based analysis

    • Import statement parsing
    • Class inheritance detection
    • Function call patterns
  • Other Languages (Java, C++, C#, Go, Rust): Basic analysis

    • Import/include statements
    • Class declarations
    • Function definitions
  • Extensible: Easy to add new language analyzers

📄 License

MIT License - see LICENSE file for details.

🆘 Support

  • Issues: GitHub Issues
  • Documentation: See /docs directory
  • Research Paper: arXiv:2506.20081v2

🔮 Future Enhancements

  • [ ] Multi-language AST parsing for all supported languages
  • [ ] Real-time Graphiti integration (currently uses mock methods)
  • [ ] Semantic relationship detection beyond syntactic analysis
  • [ ] Visual relationship graphs in MCP responses
  • [ ] Custom bias threshold configuration per project
  • [ ] Integration with Language Server Protocol (LSP)
  • [ ] Advanced localization algorithms with machine learning
  • [ ] Performance optimizations for large codebases (>10k files)
  • [ ] Real-time bias notifications during code writing
  • [ ] Custom relationship type definitions

SACL MCP Server - Bringing research-backed bias-aware code retrieval to AI coding assistants.
