Lore Agent
A self-improving knowledge agent that provides local retrieval, web research, and structured answer synthesis for Claude Code and VS Code Copilot via MCP. It enables AI tools to manage a project-specific knowledge lifecycle through automated research and indexing.
Scholar Agent
General-purpose LLMs are often inaccurate or outdated in specialized domains. Scholar Agent combines online research with local knowledge accumulation into a knowledge flywheel, so your AI becomes more accurate in your domain over time. It also builds a human-readable knowledge base for quick learning, and it integrates with Claude Code and VS Code Copilot via MCP.
What It Does
Your question
│
▼
Online research (LLM web search + academic APIs)
│
▼
Structured synthesis (with citations, confidence, uncertainty)
│
▼
Local accumulation (Markdown knowledge cards + BM25 index)
│
▼
Next question: AI checks local first ── hit? ──► use directly, fast & accurate
│ miss
▼
Research again → accumulate → reindex ──► knowledge base keeps growing
Each round compounds. Knowledge cards have full lifecycle management: draft → reviewed → trusted → stale → deprecated.
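The lifecycle above can be read as a small state machine. The sketch below is an illustration only: `CardStatus`, `ALLOWED`, and `advance` are hypothetical names, and the exact transitions Scholar Agent permits (for example, whether a stale card can go back to reviewed) are assumptions rather than something taken from the project's code.

```python
from enum import Enum


class CardStatus(Enum):
    DRAFT = "draft"
    REVIEWED = "reviewed"
    TRUSTED = "trusted"
    STALE = "stale"
    DEPRECATED = "deprecated"


# Assumed transition table: forward along the documented chain, plus a
# hypothetical "re-review" path that lets a stale card be refreshed.
ALLOWED = {
    CardStatus.DRAFT: {CardStatus.REVIEWED},
    CardStatus.REVIEWED: {CardStatus.TRUSTED, CardStatus.STALE},
    CardStatus.TRUSTED: {CardStatus.STALE},
    CardStatus.STALE: {CardStatus.REVIEWED, CardStatus.DEPRECATED},
    CardStatus.DEPRECATED: set(),
}


def advance(current: CardStatus, target: CardStatus) -> CardStatus:
    """Return the new status, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move {current.value} -> {target.value}")
    return target


status = advance(CardStatus.DRAFT, CardStatus.REVIEWED)
print(status.value)
```

Keeping the transitions in one table makes it easy to see which moves a governance check would reject.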
Academic Research Pipeline
Scholar Agent includes a comprehensive academic paper research pipeline:
- Paper Search — Search papers from arXiv, DBLP, and Semantic Scholar. Filter by top conferences (CVPR, ICCV, ECCV, ICLR, AAAI, NeurIPS, ICML, ACL, EMNLP, MICCAI)
- Smart Scoring — Four-dimensional scoring engine (relevance, recency, popularity, quality) ranks papers by your research interests
- Deep Analysis Notes — Auto-generate 20+ section Obsidian-style markdown notes with <!-- LLM: --> placeholders for AI-assisted completion
- Figure Extraction — Extract images from arXiv source archives and PDFs (via PyMuPDF)
- Daily Recommendations — Automated daily paper search, scoring, deduplication, and recommendation note generation
- Paper → Knowledge Card — Convert paper analyses into knowledge cards that feed back into the knowledge flywheel
- Keyword Auto-Linking — Scan notes for technical terms and create [[wiki-links]] automatically
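To make the four-dimensional scoring concrete, here is a minimal sketch of a weighted score over relevance, recency, popularity, and quality. It assumes each dimension is already normalized to the range 0 to 1. The `PaperSignals` class, the `score` function, and the weight values are all hypothetical; the real weights are configured under `academic.scoring` and may be combined differently.

```python
from dataclasses import dataclass


@dataclass
class PaperSignals:
    relevance: float   # 0..1, e.g. keyword overlap with research interests
    recency: float     # 0..1, newer papers closer to 1
    popularity: float  # 0..1, e.g. normalized citation count
    quality: float     # 0..1, e.g. venue tier


# Hypothetical weights, chosen only for this example.
WEIGHTS = {"relevance": 0.4, "recency": 0.2, "popularity": 0.2, "quality": 0.2}


def score(p: PaperSignals, weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted average of the four dimensions, normalized by total weight."""
    total = sum(weights.values())
    return sum(getattr(p, name) * w for name, w in weights.items()) / total


papers = {
    "A": PaperSignals(0.9, 0.5, 0.2, 0.6),
    "B": PaperSignals(0.4, 0.9, 0.8, 0.7),
}
ranked = sorted(papers, key=lambda k: score(papers[k]), reverse=True)
print(ranked)
```

With these example weights, a recent and well-cited paper can outrank a more relevant but older one, which is the trade-off the weights let you tune.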
Quick Start
Embed into an existing project
cd my-project && git clone https://github.com/zfy465914233/scholar-agent.git
bash scholar-agent/setup.sh
# Restart Claude Code to activate
This will create the directory structure, copy config templates, install skills, and build the knowledge index.
Use as a standalone project
# Clone and install
git clone https://github.com/zfy465914233/scholar-agent.git
cd scholar-agent
pip install -r requirements.txt
# Build the knowledge index
python scripts/local_index.py --output indexes/local/index.json
MCP configs ship with the repository:
- Claude Code: .mcp.json is ready. cd into the project and start Claude Code.
- VS Code Copilot: .vscode/mcp.json is ready. Open the project, enable agent mode.
MCP Tools
Core Tools (always available)
| Tool | Description |
|---|---|
| query_knowledge | Search local knowledge base |
| save_research | Save structured research results as a knowledge card |
| list_knowledge | Browse all knowledge cards |
| capture_answer | Quick-capture a Q&A pair as a draft card |
| ingest_source | Ingest a URL or raw text into the knowledge base |
| build_graph | Generate an interactive knowledge graph (vis.js) |
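Local search runs against a BM25 index. As a rough illustration of how BM25 ranks knowledge cards for a query, the sketch below implements the standard Okapi BM25 formula in plain Python. It is not the project's retrieval code: `tokenize`, `bm25_rank`, the whitespace tokenizer, the `k1` and `b` defaults, and the example card IDs are all assumptions.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def bm25_rank(query: str, docs: dict[str, str], k1: float = 1.5, b: float = 0.75):
    """Rank documents for a query with Okapi BM25; returns (doc_id, score) pairs."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for t in tokens.values() for term in set(t))

    results = []
    for doc_id, terms in tokens.items():
        tf = Counter(terms)
        total = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(terms) / avg_len)
            total += idf * tf[term] * (k1 + 1) / norm
        results.append((doc_id, total))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


cards = {
    "bm25-basics": "bm25 ranks documents by term frequency and document length",
    "mcp-intro": "model context protocol connects tools to assistants",
}
print(bm25_rank("bm25 term frequency", cards)[0][0])
```

A card that shares no terms with the query scores zero, which is one simple way to decide that a question is a local miss and needs fresh research.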
Academic Tools (set SCHOLAR_ACADEMIC=1 to enable)
| Tool | Description |
|---|---|
| search_papers | Search arXiv + Semantic Scholar with 4-dim scoring |
| search_conf_papers | Search conference papers via DBLP + S2 enrichment |
| analyze_paper | Generate deep-analysis markdown notes (20+ sections) |
| extract_paper_images | Extract figures from arXiv source / PDF |
| paper_to_card | Convert paper analysis into a knowledge card |
| daily_recommend | Daily paper recommendation workflow |
| link_paper_keywords | Auto-link keywords as [[wikilinks]] in notes |
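Keyword auto-linking can be pictured as a guarded search-and-replace. The sketch below is a hypothetical `link_keywords` helper, not the project's `note_linker.py`: it assumes a fixed keyword list, links only the first unlinked occurrence of each keyword, and leaves text that is already inside [[...]] untouched.

```python
import re

# Splits text into plain segments (even indices) and existing links (odd indices).
LINK = re.compile(r"(\[\[.*?\]\])")


def link_keywords(text: str, keywords: list[str]) -> str:
    """Wrap the first unlinked occurrence of each keyword in [[...]]."""
    # Longest keywords first so "vision transformer" wins over "transformer".
    for kw in sorted(keywords, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(kw) + r"\b", re.IGNORECASE)
        parts = LINK.split(text)
        for i in range(0, len(parts), 2):  # only touch plain segments
            new, count = pattern.subn(
                lambda m: f"[[{m.group(0)}]]", parts[i], count=1
            )
            if count:
                parts[i] = new
                break
        text = "".join(parts)
    return text


note = "The vision transformer beats a plain transformer."
print(link_keywords(note, ["transformer", "vision transformer"]))
```

Because existing links are skipped, running the function twice on the same note does not nest brackets. Keywords that begin or end with punctuation (such as "C++") would need a different boundary rule than `\b`.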
Recommended Workflow
For best analysis quality, follow this order:
- Download the paper: download_paper("2510.24701", title="Paper Title", domain="LLM")
- Extract images: extract_paper_images("2510.24701") (auto-detects local PDF)
- Deep analysis: analyze_paper(paper_json) (auto-detects local PDF, extracts full text)
Tip: Downloading the PDF before analysis enables full-text extraction, producing high-quality notes with specific data, formulas, and experimental results. Without a local PDF, analysis relies on the abstract only.
Configuration
.scholar.json
The .scholar.json file configures knowledge paths and academic research settings. See .scholar.example.json for a full example with comments.
Key sections:
- knowledge_dir — Path to knowledge cards directory
- index_path — Path to BM25 search index
- academic.research_interests — Your research domains, keywords, and arXiv categories
- academic.scoring — Paper scoring weights and dimensions
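As a rough picture of how these keys might be consumed, the sketch below reads .scholar.json and fills in missing keys. It is not the project's `scholar_config.py`. The `load_config` function and the `DEFAULTS` values are assumptions: the two paths are borrowed from the directory layout and Quick Start command in this README, and the shape of the `academic` sub-sections is not documented here, so empty dicts stand in for them. It also assumes the file is plain JSON without comments.

```python
import json
from pathlib import Path

# Hypothetical defaults, used only when a key is missing from the file.
DEFAULTS = {
    "knowledge_dir": "knowledge",
    "index_path": "indexes/local/index.json",
    "academic": {"research_interests": {}, "scoring": {}},
}


def load_config(path: str = ".scholar.json") -> dict:
    """Read the config file and fill in missing keys from DEFAULTS."""
    file = Path(path)
    user = json.loads(file.read_text(encoding="utf-8")) if file.exists() else {}
    config = {**DEFAULTS, **user}
    # Merge the nested "academic" section one level deep.
    config["academic"] = {**DEFAULTS["academic"], **user.get("academic", {})}
    return config


config = load_config()
print(config["knowledge_dir"], config["index_path"])
```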
Environment Variables
Copy .env.example to .env and configure:
| Variable | Required | Description |
|---|---|---|
| SCHOLAR_ACADEMIC | No | Set to 1 to enable academic tools |
| S2_API_KEY | No | Semantic Scholar API key (get one free) |
| LLM_API_KEY | No | LLM API key for advanced synthesis pipeline |
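Since all three variables are optional, code that reads them has to handle the unset case. The sketch below shows one way to do that. `academic_enabled` and `optional_key` are hypothetical helpers, the sketch reads only the process environment (how the project loads the .env file is not covered here), and treating only the exact value 1 as "enabled" is an assumption.

```python
import os


def academic_enabled(env=os.environ) -> bool:
    """Academic tools are opt-in: only the exact value "1" enables them."""
    return env.get("SCHOLAR_ACADEMIC") == "1"


def optional_key(name: str, env=os.environ):
    """Return the key, or None when it is unset or blank."""
    value = env.get(name, "").strip()
    return value or None


print("academic tools:", academic_enabled())
print("S2 key set:", optional_key("S2_API_KEY") is not None)
```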
Project Structure
scholar-agent/
├── mcp_server.py # MCP server (13 tools)
├── setup_mcp.py # Embed into existing projects
├── pyproject.toml # Package configuration
├── .scholar.json # Project & academic configuration
├── schemas/ # Answer + evidence JSON schemas
├── scripts/
│ ├── academic/ # Academic research modules
│ │ ├── arxiv_search.py # arXiv + Semantic Scholar search
│ │ ├── conf_search.py # Conference paper search (DBLP)
│ │ ├── paper_analyzer.py # Deep-analysis note generation
│ │ ├── scoring.py # 4-dim paper scoring engine
│ │ ├── image_extractor.py # Figure extraction from PDFs
│ │ ├── note_linker.py # Wiki-link discovery + keyword linking
│ │ └── daily_workflow.py # Daily recommendation pipeline
│ ├── scholar_config.py # Configuration reader
│ ├── local_index.py # BM25 index builder
│ ├── local_retrieve.py # Knowledge retrieval
│ ├── close_knowledge_loop.py # Knowledge card builder
│ └── ... # Research, synthesis, governance, graph
├── knowledge/ # Knowledge cards (gitignored, user-generated)
├── indexes/ # Generated indexes (gitignored)
└── tests/ # 247 tests
More Features
- Multi-perspective research — Parallel research from 5 perspectives (academic, technical, applied, contrarian, historical)
- Obsidian compatible — Standard Markdown + YAML frontmatter + [[wiki-links]]
- Knowledge governance CLI — Validate frontmatter, detect orphaned cards, find duplicates, manage lifecycle
- Provider fault tolerance — Each search source fails independently; falls back to local retrieval when offline
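The provider fault tolerance described above can be sketched as a loop that isolates each source in its own try/except and falls back to local retrieval when nothing comes back. Everything here is illustrative: `search_all` and the stub providers (`flaky`, `healthy`, `local_cards`) are hypothetical and stand in for the real search modules.

```python
from typing import Callable

Provider = Callable[[str], list[str]]


def search_all(query: str, providers: dict[str, Provider], local: Provider):
    """Query every provider; one failure never aborts the others."""
    results, errors = [], {}
    for name, provider in providers.items():
        try:
            results.extend(provider(query))
        except Exception as exc:  # isolate each source
            errors[name] = str(exc)
    if not results:
        # Every online source failed or returned nothing: fall back to local.
        results = local(query)
    return results, errors


def flaky(query: str) -> list[str]:
    raise TimeoutError("network unreachable")


def healthy(query: str) -> list[str]:
    return [f"online result for {query}"]


def local_cards(query: str) -> list[str]:
    return [f"local card for {query}"]


print(search_all("bm25", {"arxiv": flaky, "s2": healthy}, local_cards))
print(search_all("bm25", {"arxiv": flaky}, local_cards))
```

Returning the collected errors alongside the results lets the caller report which sources were skipped without failing the whole request.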
Testing
python -m pytest tests/ -v
247 tests, ~13s. No external services needed.
License
MIT — see LICENSE.