Digital Persona Memory MCP Server
Builds a searchable memory bank from persona-specific PDF content and exposes it as an MCP tool for querying.
Digital Persona Memory App
Builds a searchable memory bank from PDF content and exposes it as an MCP tool.
Add it as an MCP server in your favorite AI client (e.g., LM Studio) with the included system prompt, then chat with your heroes, historical figures, or any collection of PDF content.
Example personas are included for Karl Marx and Friedrich Engels. The personas are built from every work both figures ever published. You can chat with them, ask them questions, or even debate them as though they were still alive, and they will respond based on their own publicly available works.
Alternative uses include adding your school textbooks in PDF format and asking them questions. Instead of hunting for that one page, just ask for an explanation of whatever you want to know.
A modified version of this app is used to bring Facebook users to life as Digital Personas using their exported Facebook data.
What This App Does
- extracts and indexes persona content from PDF-derived JSON sections
- builds per-person Chroma memory databases
- serves an MCP search tool with semantic ranking plus recency scoring
- supports multiple personas by adding folders under pdf/
Core Features
- parser/ contains PDF parsing tools to turn PDF persona text into JSON section files.
- 2_build_memory.py embeds documents and builds a Chroma memory DB.
- 3_avatar_mcp.py exposes search_my_memory() as an MCP tool.
- --name lets you create and query separate memory databases for different personas.
- config.json centralizes paths, embedding server settings, memory DB defaults, and search tuning.
Project Files
- 1_prep_data.py: Prepare imported JSON data and normalize text for memory import.
- 2_build_memory.py: Build the Chroma memory database with optional persona selection.
- 3_avatar_mcp.py: Run the MCP server and query the selected memory database.
- config.json: Configuration for PDF sources, embeddings, memory DB paths, and search settings.
- parser/: PDF downloader and parser utilities for creating JSON section files from PDFs.
Python Requirements
Install dependencies in your virtual environment:
pip install -r requirements.txt
Data Layout
PDF Persona Support
Each persona owns a directory under pdf/.
For example:
- pdf/marx/ (Karl Marx persona, included as a sample)
- pdf/engles/ (Friedrich Engels persona, included as a sample)
Memory DB Layout
The memory database lives under the base path in config.json.
When using --name, a separate collection directory is created:
- avatar_memory_db/marx/
- avatar_memory_db/engles/
Each persona DB contains Chroma persistence files plus a document_index.json.
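The layout above can be sketched as a small path helper. This is a minimal sketch under stated assumptions, not the project's own code: the function name resolve_db_path is hypothetical, and the base directory avatar_memory_db is taken from the example directories shown here (the real base path comes from config.json).

```python
from pathlib import Path
from typing import Optional


def resolve_db_path(base_path: str, name: Optional[str] = None) -> Path:
    """Return the directory that would hold one persona's memory DB.

    Hypothetical helper: with no name the base path itself is used,
    otherwise a per-persona subdirectory under the base path.
    """
    base = Path(base_path)
    if not name:
        return base
    # Reject names that would escape the base directory (e.g. "../x").
    if Path(name).name != name or name in {".", ".."}:
        raise ValueError(f"invalid persona name: {name!r}")
    return base / name
```

Keeping the mapping in one function is one way to make sure the build step and the server resolve the same directory for a given --name.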
LM Studio Setup
Before building or querying memory:
- Start LM Studio's Local Server.
- Load the embedding model configured in config.json.
- Confirm embeddings.api_base matches the Local Server URL.
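A quick connectivity check can save a failed build. The sketch below assumes that embeddings.api_base points at LM Studio's OpenAI-compatible API root (for example http://localhost:1234/v1) and that the server accepts a standard POST to /embeddings. The function names and the placeholder model name are illustrative and not part of this project.

```python
import json
import urllib.request


def build_embedding_request(api_base: str, model: str, text: str) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /embeddings endpoint."""
    url = api_base.rstrip("/") + "/embeddings"
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_embeddings(api_base: str, model: str, timeout: float = 10.0) -> int:
    """Send one test embedding and return the length of the returned vector."""
    request = build_embedding_request(api_base, model, "connectivity check")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return len(body["data"][0]["embedding"])


# Example call (both values are placeholders; copy the real ones from config.json):
# check_embeddings("http://localhost:1234/v1", "your-embedding-model")
```

If the call raises a connection error, the Local Server is probably not running or api_base points at the wrong port. An HTTP error usually means the model name does not match what is loaded.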
End-to-End Workflow
1) Add persona-specific PDF data or formatted JSON sections
Option 1: Add PDF files in pdf/<persona>/ and run the parser tools in parser/ to create JSON section files in pdf/<persona>/json/.
You will need to edit the parser tools to meet your specific PDF structure and content. The included parser/ tools are a starting point for common PDF layouts.
Option 2: Add JSON section files directly in pdf/<persona>/json/.
The JSON files should be structured as follows. The timestamp field is optional but recommended for recency scoring:
{
  "sections": [
    {
      "title": "Section Title",
      "text": "Section text content...",
      "timestamp": "2024-01-01T00:00:00Z"
    },
    ...
  ]
}
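If you write section files by hand, a small validator can catch structural mistakes before the build step. This is a minimal sketch, not the project's loader: validate_sections is a hypothetical name, and treating both title and text as required non-empty strings is an assumption based on the example above.

```python
from datetime import datetime
from typing import Any, Dict, List


def validate_sections(data: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a parsed section file (empty if none)."""
    problems: List[str] = []
    sections = data.get("sections")
    if not isinstance(sections, list):
        return ["top-level 'sections' must be a list"]
    for i, section in enumerate(sections):
        if not isinstance(section, dict):
            problems.append(f"sections[{i}] must be an object")
            continue
        for key in ("title", "text"):
            value = section.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"sections[{i}].{key} must be a non-empty string")
        timestamp = section.get("timestamp")
        if timestamp is not None:
            try:
                # fromisoformat() only accepts a trailing "Z" from Python 3.11,
                # so convert it to an explicit UTC offset first.
                datetime.fromisoformat(str(timestamp).replace("Z", "+00:00"))
            except ValueError:
                problems.append(f"sections[{i}].timestamp is not ISO 8601")
    return problems
```

Load a file with json.load() and pass the result in. An empty list means the file matches the shape shown above.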
2) Prepare JSON data for ingestion
python 1_prep_data.py
This will normalize the text and prepare the JSON sections for embedding. It will prompt you to select a persona if multiple are present.
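The README does not spell out the exact normalization steps. As an illustration only, text cleanup for PDF-extracted content often looks something like the sketch below. The function name normalize_text and the specific rules are assumptions, not a description of what 1_prep_data.py does.

```python
import re
import unicodedata


def normalize_text(text: str) -> str:
    """Clean up text extracted from a PDF before embedding (illustrative only)."""
    # Fold compatibility characters such as the "fi" ligature into plain letters.
    text = unicodedata.normalize("NFKC", text)
    # Re-join words hyphenated across a line break: "capi-\ntal" -> "capital".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Keep paragraph breaks, but collapse every other whitespace run to one space.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)
```

Cleaner input text tends to give more consistent embeddings, so this kind of step is worth adapting to the quirks of your own PDFs.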
3) Build the memory DB
Option 1: Build the default collection. With no persona name specified, the default collection name from config.json is used:
python 2_build_memory.py
Option 2: Build a named persona collection:
JSON files for Karl Marx and Friedrich Engels are included as examples.
python 2_build_memory.py --name marx
python 2_build_memory.py --name engles
4) Start the MCP server
Option 1: Run the default DB server:
python 3_avatar_mcp.py
Option 2: Run a named persona DB server:
python 3_avatar_mcp.py --name marx
python 3_avatar_mcp.py --name engles
The exposed MCP tool is:
search_my_memory(topic, num_results=...)
Search Behavior
Results are ranked by:
- semantic similarity from embeddings
- an optional recency boost
Tune behavior in config.json:
- search.default_num_results
- search.candidate_multiplier
- search.recency_half_life_days
- search.recency_weight_alpha
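To build intuition for how these settings might interact, here is one common way to blend similarity with an exponential recency decay. The README does not give the exact formula, so everything below is an assumption: the linear blend, the function names, and the choice to give undated documents no boost.

```python
from datetime import datetime
from typing import List, Optional, Tuple


def recency_factor(timestamp: Optional[datetime], now: datetime, half_life_days: float) -> float:
    """Exponential decay: 1.0 for a brand-new document, 0.5 after one half-life."""
    if timestamp is None:
        return 0.0  # Assumption: undated documents get no recency boost.
    age_days = max((now - timestamp).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def combined_score(similarity: float, timestamp: Optional[datetime], now: datetime,
                   half_life_days: float, alpha: float) -> float:
    """Blend similarity and recency; alpha=0 ignores recency entirely."""
    boost = recency_factor(timestamp, now, half_life_days)
    return (1.0 - alpha) * similarity + alpha * boost


def rerank(candidates: List[Tuple[str, float, Optional[datetime]]], num_results: int,
           now: datetime, half_life_days: float, alpha: float) -> List[Tuple[str, float]]:
    """Re-score (doc_id, similarity, timestamp) candidates and keep the best ones."""
    scored = [
        (doc_id, combined_score(similarity, timestamp, now, half_life_days, alpha))
        for doc_id, similarity, timestamp in candidates
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:num_results]
```

Under this reading, search.candidate_multiplier would control how many candidates are fetched from the vector store before re-ranking (for example num_results times the multiplier), so that a recent but slightly less similar document still has a chance to surface.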
Troubleshooting
- No memory results:
  - Confirm pdf/<persona>/json/ files exist.
  - Confirm prepared_documents.json exists in the persona folder (or whatever filename you specified in config.json).
  - Rebuild the memory DB.
- Collection or DB errors:
  - Verify memory_db.path and collection_name in config.json.
  - Use --name consistently for build and server.
- LM Studio errors:
  - Confirm the Local Server is running with the embedding model loaded.
  - Check embeddings.api_base, and ensure the model name in config.json matches LM Studio's model name.
- Long build times:
  - Large persona data can take 5-10+ minutes depending on hardware and embedding throughput.
Screenshot of this in action
<img width="1408" height="751" alt="image" src="https://github.com/user-attachments/assets/c374217b-79eb-4d63-8d56-7f8396264fd8" />