ResearchTwin

For AI agents and humans: Discover researchers, publications, datasets, and code repositories across a federated network of researcher digital twins. Compute S-Index impact metrics combining citation data from Semantic Scholar and Google Scholar with code and dataset quality scores from GitHub and Figshare.


ResearchTwin: Federated Agentic Web of Research Knowledge


ResearchTwin is an open-source, federated platform that transforms a researcher's publications, datasets, and code repositories into a conversational Digital Twin. Built on a Bimodal Glial-Neural Optimization (BGNO) architecture, it enables dual-discovery where both humans and AI agents collaborate to accelerate scientific discovery.

Live at researchtwin.net | Join the Network


Project Vision

The exponential growth of scientific outputs has created a "discovery bottleneck." Traditional static PDFs and siloed repositories limit knowledge synthesis and reuse. ResearchTwin addresses this by:

  • Integrating multi-modal research artifacts from Semantic Scholar, Google Scholar, GitHub, and Figshare
  • Computing a real-time S-Index metric (Quality × Impact × Collaboration) across all output types
  • Providing a conversational chatbot interface for interactive research exploration
  • Exposing an Inter-Agentic Discovery API with Schema.org types for machine-to-machine research discovery
  • Enabling a federated, Discord-like architecture supporting local nodes, hubs, and hosted edges

Architecture Overview

BGNO (Bimodal Glial-Neural Optimization)

Data Sources          Glial Layer          Neural Layer         Interface
┌────────────────┐    ┌─────────────┐    ┌──────────────┐    ┌────────────┐
│Semantic Scholar│───▶│             │    │              │    │  Web Chat  │
│Google Scholar  │───▶│  SQLite     │───▶│  RAG with    │───▶│  Discord   │
│GitHub API      │───▶│  Cache +    │    │  Claude API  │    │  Agent API │
│Figshare API    │───▶│  Rate Limit │    │              │    │  Embed     │
└────────────────┘    └─────────────┘    └──────────────┘    └────────────┘

  • Connector Layer: Pulls papers (S2+GS with deduplication), repos (GitHub), datasets (Figshare), and ORCID metadata
  • Glial Layer: SQLite caching with 24h TTL, rate limiting, S2+GS title-similarity merge (0.85 threshold)
  • Neural Layer: RAG with Claude — context assembly, prompt engineering, conversational synthesis
  • Interface Layer: D3.js knowledge graph, chat widget, Discord bot, REST API
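
The title-similarity merge in the Glial Layer can be illustrated with a short sketch. This is not the project's actual code: the use of `difflib.SequenceMatcher` as the similarity metric, the record fields (`title`, `citations`), and the rule of keeping the larger citation count are all assumptions made for illustration. Only the 0.85 threshold comes from the description above.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.85  # threshold quoted in the Glial Layer description


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(title.lower().split())


def title_similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1]; SequenceMatcher is an assumed stand-in metric."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def merge_papers(s2_papers: list[dict], gs_papers: list[dict]) -> list[dict]:
    """Merge two paper lists, treating near-identical titles as the same paper."""
    merged = [dict(p) for p in s2_papers]
    for gs in gs_papers:
        match = next(
            (m for m in merged
             if title_similarity(m["title"], gs["title"]) >= SIMILARITY_THRESHOLD),
            None,
        )
        if match is None:
            # No sufficiently similar title found: treat it as a new paper.
            merged.append(dict(gs))
        else:
            # Hypothetical policy: keep the larger of the two citation counts.
            match["citations"] = max(match.get("citations", 0), gs.get("citations", 0))
    return merged
```

With this sketch, two records whose titles differ only in case or trailing punctuation collapse into one entry, while unrelated titles stay separate.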

Federated Network Tiers

Tier   | Name         | Description                                | Status
-------|--------------|--------------------------------------------|--------
Tier 1 | Local Nodes  | Researchers run python run_node.py locally | Live
Tier 2 | Hubs         | Lab aggregators federating multiple nodes  | Planned
Tier 3 | Hosted Edges | Cloud-hosted at researchtwin.net           | Live

Inter-Agentic Discovery API

Machine-readable endpoints with Schema.org @type annotations:

Endpoint                               | Schema.org Type                | Purpose
---------------------------------------|--------------------------------|--------------------------------------
GET /api/researcher/{slug}/profile     | Person                         | Researcher profile with HATEOAS links
GET /api/researcher/{slug}/papers      | ItemList of ScholarlyArticle   | Papers with citations
GET /api/researcher/{slug}/datasets    | ItemList of Dataset            | Datasets with QIC scores
GET /api/researcher/{slug}/repos       | ItemList of SoftwareSourceCode | Repos with QIC scores
GET /api/discover?q=keyword&type=paper | SearchResultSet                | Cross-researcher search
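
A client for these endpoints might look like the following minimal sketch, which uses only the Python standard library. It assumes the API is served from `https://researchtwin.net` (a local node would use its own address) and that responses are JSON; the helper names `researcher_url`, `discover_url`, and `fetch_json` are hypothetical and not part of the project.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "https://researchtwin.net"  # assumed host; a local node would differ

RESOURCES = {"profile", "papers", "datasets", "repos"}  # from the endpoint table


def researcher_url(slug: str, resource: str) -> str:
    """Build a per-researcher endpoint URL for one of the listed resources."""
    if resource not in RESOURCES:
        raise ValueError(f"unknown resource: {resource}")
    return f"{BASE_URL}/api/researcher/{quote(slug, safe='')}/{resource}"


def discover_url(query: str, result_type: str) -> str:
    """Build a cross-researcher search URL with URL-encoded parameters."""
    return f"{BASE_URL}/api/discover?{urlencode({'q': query, 'type': result_type})}"


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """GET a URL and decode the body as JSON (assumes a JSON response)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Performs a real network request; the response shape is not assumed here.
    profile = fetch_json(researcher_url("martin-frasch", "profile"))
    print(profile.get("@type"))
```

Keeping URL construction separate from the network call makes the helpers easy to test offline and to point at a different node by changing `BASE_URL`.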

Getting Started

Hosted (Tier 3) — Zero Setup

  1. Visit researchtwin.net/join.html
  2. Register with your name, email, and research identifiers
  3. Your Digital Twin is live immediately

Local Node (Tier 1) — Full Control

git clone https://github.com/martinfrasch/researchtwin.git
cd researchtwin
pip install -r backend/requirements.txt
cp node_config.json.example node_config.json
# Edit node_config.json with your details
python run_node.py --config node_config.json

Docker Deployment

cp .env.example .env  # Add your API keys
docker-compose up -d --build

Required API keys: ANTHROPIC_API_KEY (for Claude RAG)
Optional: S2_API_KEY, GITHUB_TOKEN, DISCORD_BOT_TOKEN, SMTP credentials
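
Before starting a node, it can help to confirm that the keys are actually present in the environment. The sketch below is a hypothetical pre-flight check, not something shipped with the repository; the variable names come from the list above, and SMTP credentials are left out because their variable names are not given here.

```python
import os
from collections.abc import Mapping

REQUIRED_KEYS = ("ANTHROPIC_API_KEY",)
OPTIONAL_KEYS = ("S2_API_KEY", "GITHUB_TOKEN", "DISCORD_BOT_TOKEN")


def missing_keys(env: Mapping[str, str], names: tuple[str, ...]) -> list[str]:
    """Return the names that are unset or blank in the given environment."""
    return [name for name in names if not env.get(name, "").strip()]


def check_environment(env: Mapping[str, str] = os.environ) -> None:
    """Raise if a required key is missing; print a note for each optional one."""
    missing_required = missing_keys(env, REQUIRED_KEYS)
    if missing_required:
        raise RuntimeError(f"missing required keys: {', '.join(missing_required)}")
    for name in missing_keys(env, OPTIONAL_KEYS):
        print(f"note: optional key {name} is not set")


if __name__ == "__main__":
    check_environment()
```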


Repository Structure

researchtwin/
├── backend/
│   ├── main.py              # FastAPI endpoints (REST + Discovery API)
│   ├── researchers.py        # SQLite researcher CRUD + token management
│   ├── database.py           # SQLite schema, WAL mode, migrations
│   ├── models.py             # Pydantic models for all endpoints
│   ├── rag.py                # RAG context assembly for Claude
│   ├── qic_index.py          # S-Index / QIC computation engine
│   ├── email_service.py      # SMTP service for profile update codes
│   ├── connectors/           # Data source connectors
│   │   ├── semantic_scholar.py
│   │   ├── scholarly_lib.py  # Google Scholar via scholarly
│   │   ├── github_connector.py
│   │   └── figshare.py
│   └── discord_bot/          # Discord bot with /research and /sindex
├── frontend/
│   ├── index.html            # Main dashboard with D3.js knowledge graph
│   ├── join.html             # Self-registration page
│   ├── update.html           # Email-verified profile updates
│   ├── privacy.html          # Privacy policy
│   └── widget-loader.js      # Embeddable chat widget
├── run_node.py               # Tier 1 local node launcher
├── node_config.json.example  # Local node configuration template
├── docker-compose.yml        # Docker orchestration
├── nginx/                    # Nginx reverse proxy + SSL
└── whitepaper.tex            # LaTeX manuscript

Ecosystem

This repository is part of the ResearchTwin Ecosystem project:

Repository   | Description
-------------|----------------------------------------------------------
researchtwin | Federated platform (this repo)
s-index      | S-Index formal specification and reference implementation

Embeddable S-Index Widget

Show your S-Index on your lab website, Google Sites page, or personal homepage:

<iframe
  src="https://researchtwin.net/embed.html?slug=YOUR-SLUG"
  width="440" height="180"
  style="border:none; border-radius:12px;"
  loading="lazy">
</iframe>

Replace YOUR-SLUG with your researcher slug (e.g. martin-frasch).
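
For sites that generate pages programmatically, the snippet can be produced from a slug. The helper below is a hypothetical convenience, not part of ResearchTwin; it reuses the embed URL and attributes shown above and URL-encodes the slug.

```python
from html import escape
from urllib.parse import urlencode

EMBED_URL = "https://researchtwin.net/embed.html"  # from the iframe example above


def embed_iframe(slug: str, width: int = 440, height: int = 180) -> str:
    """Return the iframe markup for a researcher slug."""
    src = f"{EMBED_URL}?{urlencode({'slug': slug})}"
    return (
        f'<iframe src="{escape(src, quote=True)}" '
        f'width="{width}" height="{height}" '
        'style="border:none; border-radius:12px;" loading="lazy"></iframe>'
    )


if __name__ == "__main__":
    print(embed_iframe("martin-frasch"))
```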

Google Sites: Edit page > Insert > Embed > "By URL" tab > paste https://researchtwin.net/embed.html?slug=YOUR-SLUG

WordPress: Add a Custom HTML block and paste the iframe code.

The widget displays the researcher's name, S-Index score, h-index, citation count, and paper count. Data updates automatically from live API sources.

See it in action | Full embed instructions


Documentation

Document             | Description
---------------------|------------------------------------------------------
API Reference        | Full REST API documentation with schemas and examples
Self-Hosting Guide   | Tier 1 Local Node setup and configuration
Hub Federation Guide | Tier 2 Hub architecture and setup (planned)
Security Policy      | Vulnerability reporting and security best practices

Contributing

Contributions are welcome. See the project board for tracked issues. Areas where help is useful:

  • New connectors (ORCID enrichment, PubMed, OpenAlex)
  • Affiliation-based geographic mapping
  • MCP server for inter-agentic discovery
  • UI/UX improvements
  • Bug fixes and optimizations

License

MIT License. See LICENSE.


Empowering researchers and AI agents to discover, collaborate, and innovate together.
