🔍 FastMCP Document Analyzer
A comprehensive document analysis server built with the modern FastMCP framework. It performs sentiment analysis, keyword extraction, readability scoring, and text statistics, and provides document management capabilities including storage, search, and organization.
📋 Table of Contents
- 🌟 Features
- 🚀 Quick Start
- 📦 Installation
- 🔧 Usage
- 🛠️ Available Tools
- 📊 Sample Data
- 🏗️ Project Structure
- 🔄 API Reference
- 🧪 Testing
- 📚 Documentation
- 🤝 Contributing
🌟 Features
📖 Document Analysis
- 🎭 Sentiment Analysis: VADER + TextBlob dual-engine sentiment classification
- 🔑 Keyword Extraction: TF-IDF and frequency-based keyword identification
- 📚 Readability Scoring: Multiple metrics (Flesch, Flesch-Kincaid, ARI)
- 📊 Text Statistics: Word count, sentences, paragraphs, and more
🗂️ Document Management
- 💾 Persistent Storage: JSON-based document collection with metadata
- 🔍 Smart Search: TF-IDF semantic similarity search
- 🏷️ Tag System: Category and tag-based organization
- 📈 Collection Insights: Comprehensive statistics and analytics
🚀 FastMCP Advantages
- ⚡ Simple Setup: 90% less boilerplate than standard MCP
- 🔒 Type Safety: Full type validation with Pydantic
- 🎯 Modern API: Decorator-based tool definitions
- 🌐 Multi-Transport: STDIO, HTTP, and SSE support
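To give a feel for the decorator-based API, here is a minimal FastMCP server sketch. The names are illustrative only (this is not the project's actual code) and it assumes fastmcp 2.x:

```python
# Minimal FastMCP server sketch -- illustrative, not this project's actual code.
from typing import Any, Dict

from fastmcp import FastMCP

mcp = FastMCP("demo-analyzer")

@mcp.tool
def echo(text: str) -> Dict[str, Any]:
    """Return the text that was sent in."""
    return {"echo": text}

if __name__ == "__main__":
    mcp.run()  # STDIO by default; HTTP/SSE are selected via the transport argument
```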
🚀 Quick Start
1. Clone and Setup
git clone <repository-url>
cd document-analyzer
python -m venv venv
venv\Scripts\activate          # Windows
# source venv/bin/activate     # macOS/Linux
2. Install Dependencies
pip install -r requirements.txt
3. Initialize NLTK Data
python -c "import nltk; nltk.download('punkt'); nltk.download('vader_lexicon'); nltk.download('stopwords'); nltk.download('punkt_tab')"
4. Run the Server
python fastmcp_document_analyzer.py
5. Test Everything
python test_fastmcp_analyzer.py
📦 Installation
System Requirements
- Python 3.8 or higher
- 500MB free disk space
- Internet connection (for initial NLTK data download)
Dependencies
fastmcp>=2.3.0 # Modern MCP framework
textblob>=0.17.1 # Sentiment analysis
nltk>=3.8.1 # Natural language processing
textstat>=0.7.3 # Readability metrics
scikit-learn>=1.3.0 # Machine learning utilities
numpy>=1.24.0 # Numerical computing
pandas>=2.0.0 # Data manipulation
python-dateutil>=2.8.2 # Date handling
Optional: Virtual Environment
# Create virtual environment
python -m venv venv
# Activate (Windows)
venv\Scripts\activate
# Activate (macOS/Linux)
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
🔧 Usage
Starting the Server
Default (STDIO Transport)
python fastmcp_document_analyzer.py
HTTP Transport (for web services)
python fastmcp_document_analyzer.py --transport http --port 9000
With Custom Host
python fastmcp_document_analyzer.py --transport http --host 0.0.0.0 --port 8080
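The flags above presumably map straight onto FastMCP's run() call. A hedged sketch of that wiring, assuming `mcp` is the server's FastMCP instance (the real fastmcp_document_analyzer.py may differ):

```python
# Sketch of CLI-to-transport wiring -- the actual server script may differ.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--transport", default="stdio", choices=["stdio", "http", "sse"])
parser.add_argument("--host", default="127.0.0.1")
parser.add_argument("--port", type=int, default=9000)
args = parser.parse_args()

if args.transport == "stdio":
    mcp.run()  # default transport
else:
    mcp.run(transport=args.transport, host=args.host, port=args.port)
```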
Basic Usage Examples
# Analyze a document
result = analyze_document("doc_001")
print(f"Sentiment: {result['sentiment_analysis']['overall_sentiment']}")
# Extract keywords
keywords = extract_keywords("Artificial intelligence is transforming healthcare", 5)
print([kw['keyword'] for kw in keywords])
# Search documents
results = search_documents("machine learning", 3)
print(f"Found {len(results)} relevant documents")
# Get collection statistics
stats = get_collection_stats()
print(f"Total documents: {stats['total_documents']}")
🛠️ Available Tools
Core Analysis Tools
| Tool | Description | Example |
|---|---|---|
| `analyze_document` | 🔍 Complete document analysis | `analyze_document("doc_001")` |
| `get_sentiment` | 😊 Sentiment analysis | `get_sentiment("I love this!")` |
| `extract_keywords` | 🔑 Keyword extraction | `extract_keywords(text, 10)` |
| `calculate_readability` | 📖 Readability metrics | `calculate_readability(text)` |
Document Management Tools
| Tool | Description | Example |
|---|---|---|
| `add_document` | 📝 Add new document | `add_document("id", "title", "content")` |
| `get_document` | 📄 Retrieve document | `get_document("doc_001")` |
| `delete_document` | 🗑️ Delete document | `delete_document("old_doc")` |
| `list_documents` | 📋 List all documents | `list_documents("Technology")` |
Search and Discovery Tools
| Tool | Description | Example |
|---|---|---|
| `search_documents` | 🔍 Semantic search | `search_documents("AI", 5)` |
| `search_by_tags` | 🏷️ Tag-based search | `search_by_tags(["AI", "tech"])` |
| `get_collection_stats` | 📊 Collection statistics | `get_collection_stats()` |
📊 Sample Data
The server comes pre-loaded with 16 diverse documents covering:
| Category | Documents | Topics |
|---|---|---|
| Technology | 4 | AI, Quantum Computing, Privacy, Blockchain |
| Science | 3 | Space Exploration, Healthcare, Ocean Conservation |
| Environment | 2 | Climate Change, Sustainable Agriculture |
| Society | 3 | Remote Work, Mental Health, Transportation |
| Business | 2 | Economics, Digital Privacy |
| Culture | 2 | Art History, Wellness |
Sample Document Structure
{
"id": "doc_001",
"title": "The Future of Artificial Intelligence",
"content": "Artificial intelligence is rapidly transforming...",
"author": "Dr. Sarah Chen",
"category": "Technology",
"tags": ["AI", "technology", "future", "ethics"],
"language": "en",
"created_at": "2024-01-15T10:30:00"
}
🏗️ Project Structure
document-analyzer/
├── 📁 analyzer/ # Core analysis engine
│ ├── __init__.py
│ └── document_analyzer.py # Sentiment, keywords, readability
├── 📁 storage/ # Document storage system
│ ├── __init__.py
│ └── document_storage.py # JSON storage, search, management
├── 📁 data/ # Sample data
│ ├── __init__.py
│ └── sample_documents.py # 16 sample documents
├── 📄 fastmcp_document_analyzer.py # 🌟 Main FastMCP server
├── 📄 test_fastmcp_analyzer.py # Comprehensive test suite
├── 📄 requirements.txt # Python dependencies
├── 📄 documents.json # Persistent document storage
├── 📄 README.md # This documentation
├── 📄 FASTMCP_COMPARISON.md # FastMCP vs Standard MCP
├── 📄 .gitignore # Git ignore patterns
└── 📁 venv/ # Virtual environment (optional)
🔄 API Reference
Document Analysis
analyze_document(document_id: str) -> Dict[str, Any]
Performs comprehensive analysis of a document.
Parameters:
- document_id (str): Unique document identifier
Returns:
{
"document_id": "doc_001",
"title": "Document Title",
"sentiment_analysis": {
"overall_sentiment": "positive",
"confidence": 0.85,
"vader_scores": {...},
"textblob_scores": {...}
},
"keywords": [
{"keyword": "artificial", "frequency": 5, "relevance_score": 2.3}
],
"readability": {
"flesch_reading_ease": 45.2,
"reading_level": "Difficult",
"grade_level": "Grade 12"
},
"basic_statistics": {
"word_count": 119,
"sentence_count": 8,
"paragraph_count": 1
}
}
get_sentiment(text: str) -> Dict[str, Any]
Analyzes sentiment of any text.
Parameters:
- text (str): Text to analyze
Returns:
{
"overall_sentiment": "positive",
"confidence": 0.85,
"vader_scores": {
"compound": 0.7269,
"positive": 0.294,
"negative": 0.0,
"neutral": 0.706
},
"textblob_scores": {
"polarity": 0.5,
"subjectivity": 0.6
}
}
Document Management
add_document(...) -> Dict[str, str]
Adds a new document to the collection.
Parameters:
- id (str): Unique document ID
- title (str): Document title
- content (str): Document content
- author (str, optional): Author name
- category (str, optional): Document category
- tags (List[str], optional): Tags list
- language (str, optional): Language code
Returns:
{
"status": "success",
"message": "Document 'my_doc' added successfully",
"document_count": 17
}
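A hedged usage sketch of the call described above (the keyword-argument form and all field values are illustrative):

```python
# Illustrative add_document call -- field values are made up.
result = add_document(
    id="doc_017",
    title="Edge Computing Basics",
    content="Edge computing moves processing closer to the data source...",
    author="Jane Doe",
    category="Technology",
    tags=["edge", "computing"],
    language="en",
)
print(result["status"], result["message"])
```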
Search and Discovery
search_documents(query: str, limit: int = 10) -> List[Dict[str, Any]]
Performs semantic search across documents.
Parameters:
- query (str): Search query
- limit (int): Maximum results
Returns:
[
{
"id": "doc_001",
"title": "AI Document",
"similarity_score": 0.8542,
"content_preview": "First 200 characters...",
"tags": ["AI", "technology"]
}
]
🧪 Testing
Run All Tests
python test_fastmcp_analyzer.py
Test Categories
- ✅ Server Initialization: FastMCP server setup
- ✅ Sentiment Analysis: VADER and TextBlob integration
- ✅ Keyword Extraction: TF-IDF and frequency analysis
- ✅ Readability Calculation: Multiple readability metrics
- ✅ Document Analysis: Full document processing
- ✅ Document Search: Semantic similarity search
- ✅ Collection Statistics: Analytics and insights
- ✅ Document Management: CRUD operations
- ✅ Tag Search: Tag-based filtering
Expected Test Output
=== Testing FastMCP Document Analyzer ===
✓ FastMCP server module imported successfully
✓ Server initialized successfully
✓ Sentiment analysis working
✓ Keyword extraction working
✓ Readability calculation working
✓ Document analysis working
✓ Document search working
✓ Collection statistics working
✓ Document listing working
✓ Document addition and deletion working
✓ Tag search working
=== All FastMCP tests completed successfully! ===
📚 Documentation
Additional Resources
- 📖 FastMCP Documentation
- 📖 MCP Protocol Specification
- 📖 FASTMCP_COMPARISON.md - FastMCP vs Standard MCP
Key Concepts
Sentiment Analysis
Uses a dual-engine approach, sketched below:
- VADER: Rule-based, excellent for social media text
- TextBlob: Machine learning-based, good for general text
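A minimal sketch of combining the two engines. The thresholds and averaging here are illustrative, not necessarily the server's actual logic, and it requires the vader_lexicon download from Quick Start:

```python
from nltk.sentiment import SentimentIntensityAnalyzer
from textblob import TextBlob

def dual_sentiment(text: str) -> dict:
    # VADER: rule-based, compound score in [-1, 1]
    vader = SentimentIntensityAnalyzer().polarity_scores(text)
    # TextBlob: polarity in [-1, 1], subjectivity in [0, 1]
    blob = TextBlob(text).sentiment
    combined = (vader["compound"] + blob.polarity) / 2
    label = "positive" if combined > 0.05 else "negative" if combined < -0.05 else "neutral"
    return {
        "overall_sentiment": label,
        "vader_scores": vader,
        "textblob_scores": {"polarity": blob.polarity, "subjectivity": blob.subjectivity},
    }
```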
Keyword Extraction
Combines multiple approaches:
- TF-IDF: Term frequency-inverse document frequency
- Frequency Analysis: Simple word frequency counting
- Relevance Scoring: Weighted combination of both methods
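A rough sketch of the TF-IDF side using scikit-learn (the server's combined relevance scoring may differ):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

def tfidf_keywords(text: str, top_n: int = 10) -> list:
    vectorizer = TfidfVectorizer(stop_words="english")
    # Fitted on a single document, so scores reflect within-document term weight
    scores = vectorizer.fit_transform([text]).toarray()[0]
    terms = vectorizer.get_feature_names_out()
    ranked = sorted(zip(terms, scores), key=lambda pair: pair[1], reverse=True)
    return [{"keyword": term, "relevance_score": round(float(score), 3)}
            for term, score in ranked[:top_n]]
```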
Readability Metrics
Provides multiple readability scores:
- Flesch Reading Ease: 0-100 scale (higher = easier)
- Flesch-Kincaid Grade: US grade level
- ARI: Automated Readability Index
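These map directly onto textstat helpers; a quick sketch (the reading-level and grade-level labels in the API output are presumably derived from these raw scores):

```python
import textstat

def readability(text: str) -> dict:
    return {
        "flesch_reading_ease": textstat.flesch_reading_ease(text),    # 0-100, higher = easier
        "flesch_kincaid_grade": textstat.flesch_kincaid_grade(text),  # US grade level
        "automated_readability_index": textstat.automated_readability_index(text),
    }
```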
Document Search
Uses TF-IDF vectorization with cosine similarity:
- Converts documents to numerical vectors
- Calculates similarity between query and documents
- Returns ranked results with similarity scores
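A condensed sketch of that pipeline (the actual server's preprocessing and ranking details may differ):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_documents(query: str, documents: dict, limit: int = 10) -> list:
    """documents maps id -> full text; returns (id, similarity) pairs, best match first."""
    ids = list(documents)
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform([documents[i] for i in ids] + [query])
    # The last row is the query vector; compare it against every document vector
    sims = cosine_similarity(matrix[-1], matrix[:-1])[0]
    ranked = sorted(zip(ids, sims), key=lambda pair: pair[1], reverse=True)
    return ranked[:limit]
```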
🤝 Contributing
Development Setup
# Clone repository
git clone <repository-url>
cd document-analyzer
# Create development environment
python -m venv venv
venv\Scripts\activate          # Windows
# source venv/bin/activate     # macOS/Linux
pip install -r requirements.txt
# Run tests
python test_fastmcp_analyzer.py
Adding New Tools
FastMCP makes it easy to add new tools:
@mcp.tool
def my_new_tool(param: str) -> Dict[str, Any]:
    """
    🔧 Description of what this tool does.

    Args:
        param: Parameter description

    Returns:
        Return value description
    """
    # Implementation here
    return {"result": "success"}
Code Style
- Use type hints for all functions
- Add comprehensive docstrings
- Include error handling
- Follow PEP 8 style guidelines
- Add emoji icons for better readability
Testing New Features
- Add your tool to the main server file
- Create test cases in the test file
- Run the test suite to ensure everything works
- Update documentation as needed
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- FastMCP Team for the excellent framework
- NLTK Team for natural language processing tools
- TextBlob Team for sentiment analysis capabilities
- Scikit-learn Team for machine learning utilities
Made with ❤️ using FastMCP
🚀 Ready to analyze documents? Start with:
python fastmcp_document_analyzer.py