Founder Intelligence Engine

Transforms founder profiles from social media into actionable strategic intelligence through automated scraping, LLM analysis, and personalized news tracking. It leverages vector search and caching to provide deep insights and relevant updates on specific founders.

Founder Intelligence Engine — MCP Server

A production-grade Model Context Protocol (MCP) server that transforms founder profiles into actionable strategic intelligence.


Architecture

┌───────────────────────────────────────────────────────────┐
│                     MCP Client (Claude, etc.)             │
│                          ▲ stdio                          │
│               ┌──────────┴──────────┐                     │
│               │   MCP Server (Node) │                     │
│               │   3 registered tools│                     │
│               └──────┬──────────────┘                     │
│          ┌───────────┬┼──────────────┐                    │
│          ▼           ▼▼              ▼                    │
│  ┌──────────┐  ┌───────────┐  ┌──────────────┐           │
│  │  Apify   │  │   Groq    │  │  Embeddings  │           │
│  │  Scraping│  │   LLM     │  │  API         │           │
│  └────┬─────┘  └─────┬─────┘  └──────┬───────┘           │
│       └──────────────┬┘──────────────┘                    │
│                      ▼                                    │
│            ┌─────────────────┐                            │
│            │  Supabase       │                            │
│            │  (Postgres +    │                            │
│            │   pgvector)     │                            │
│            └─────────────────┘                            │
└───────────────────────────────────────────────────────────┘

Data Flow

  1. collect_profile — Scrapes LinkedIn + Twitter via Apify → merges data → generates embedding → stores in Supabase
  2. analyze_profile — Fetches stored profile → calls Groq LLM for strategic analysis → caches result
  3. fetch_personalized_news — Checks cache freshness → if stale: generates search queries → scrapes Google News → embeds articles → ranks by cosine similarity → summarizes with Groq → stores; if fresh: returns cached articles
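
The ranking step in fetch_personalized_news can be sketched as follows. This is a minimal illustration, not the project's actual similarity.js: the function names, the article shape ({ url, embedding }), and the default of 10 results (borrowed from the "top 10" mentioned in the cost section) are all assumptions.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Returns 0 when either vector has zero magnitude.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: scores each article against the profile embedding
// and returns the topK articles, most similar first.
function rankArticles(profileEmbedding, articles, topK = 10) {
  return articles
    .map((article) => ({
      ...article,
      score: cosineSimilarity(profileEmbedding, article.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

Since pgvector is already part of the stack, the same ranking could probably be pushed into Postgres instead; doing it in application code keeps the sketch independent of the schema.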

Caching & Cost Optimization

| Operation                 | Cost   | When It Runs              |
|---------------------------|--------|---------------------------|
| LinkedIn/Twitter scraping | High   | Only on profile creation  |
| Groq profile analysis     | Medium | Once per profile (cached) |
| Google News + embeddings  | High   | Only when news > 24h stale |
| Read cached articles      | Free   | Every subsequent request  |

The fetch_history table tracks last_profile_scrape and last_news_fetch timestamps. The staleCheck.js module compares these against configurable thresholds.
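
A freshness check of this kind might look like the sketch below. It is an assumption about how staleCheck.js could work, not its actual contents: the function name, the injectable `now` parameter, and the constant name are hypothetical, and the 24-hour value is taken from the news row in the table above.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Returns true when `lastTimestamp` is missing, unparseable, or older than
// `thresholdHours`. `now` is injectable so the check can be tested
// deterministically.
function isStale(lastTimestamp, thresholdHours, now = Date.now()) {
  if (!lastTimestamp) return true; // never fetched counts as stale
  const last = new Date(lastTimestamp).getTime();
  if (Number.isNaN(last)) return true; // unparseable timestamp: refetch
  return now - last > thresholdHours * HOUR_MS;
}

// Hypothetical threshold for last_news_fetch.
const NEWS_STALE_HOURS = 24;

// Usage sketch:
// if (isStale(row.last_news_fetch, NEWS_STALE_HOURS)) { /* refetch news */ }
```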


Setup

1. Prerequisites

  • Node.js 20+
  • Supabase project (with pgvector enabled)
  • API keys: Apify, Groq, OpenAI-compatible Embeddings

2. Install

cd /Users/praveenkumar/Desktop/mcp   # substitute the path to your own checkout
cp .env.example .env
# Edit .env with your real keys
npm install

3. Database

Run the migration in your Supabase project's SQL Editor:

-- Paste contents of migrations/001_init.sql

Or via psql:

psql "$DATABASE_URL" < migrations/001_init.sql

4. Run MCP Server

node src/index.js

5. Configure MCP Client

Add the server to your MCP client config (e.g., claude_desktop_config.json for Claude Desktop), replacing the path in args with the absolute path to src/index.js in your checkout:

{
  "mcpServers": {
    "founder-intelligence": {
      "command": "node",
      "args": ["/Users/praveenkumar/Desktop/mcp/src/index.js"],
      "env": {
        "SUPABASE_URL": "...",
        "SUPABASE_SERVICE_KEY": "...",
        "APIFY_API_TOKEN": "...",
        "GROQ_API_KEY": "...",
        "EMBEDDING_API_URL": "...",
        "EMBEDDING_API_KEY": "..."
      }
    }
  }
}
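
Because every credential arrives through the environment, a startup check can fail fast when one is missing. The sketch below is illustrative only: the variable names come from the config above, but the helper name and the guard are assumptions about how the server might validate them.

```javascript
// Variable names taken from the client config above.
const REQUIRED_ENV = [
  'SUPABASE_URL',
  'SUPABASE_SERVICE_KEY',
  'APIFY_API_TOKEN',
  'GROQ_API_KEY',
  'EMBEDDING_API_URL',
  'EMBEDDING_API_KEY',
];

// Returns the names of required variables that are unset or blank.
function missingEnv(env = process.env) {
  return REQUIRED_ENV.filter((name) => !env[name] || env[name].trim() === '');
}

// Example startup guard (commented out so the snippet has no side effects):
// const missing = missingEnv();
// if (missing.length > 0) {
//   console.error(`Missing environment variables: ${missing.join(', ')}`);
//   process.exit(1);
// }
```

The guard logs with console.error because a stdio MCP server uses stdout for protocol messages, so diagnostics belong on stderr.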

6. Background Worker (Optional)

# Single run (for cron)
node src/backgroundWorker.js

# Daemon mode
BACKGROUND_LOOP=true node src/backgroundWorker.js

Cron example (every 6 hours):

0 */6 * * * cd /app && node src/backgroundWorker.js >> /var/log/worker.log 2>&1
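
The two worker modes can be sketched as one function that runs a refresh pass once, or repeatedly when looping is enabled. Everything here is hypothetical: `runWorker`, `refreshAllProfiles`, and the option names are not taken from backgroundWorker.js, and the 6-hour default interval simply mirrors the cron example.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs `refreshOnce` a single time, or repeatedly when `loop` is true.
// `maxIterations` exists only so the loop can be bounded in tests.
async function runWorker(
  refreshOnce,
  { loop = false, intervalMs = 6 * 60 * 60 * 1000, maxIterations = Infinity } = {},
) {
  let iterations = 0;
  do {
    try {
      await refreshOnce();
    } catch (err) {
      // Keep the daemon alive after a failed pass; log to stderr.
      console.error('Refresh failed:', err.message);
    }
    iterations += 1;
    if (loop && iterations < maxIterations) await sleep(intervalMs);
  } while (loop && iterations < maxIterations);
  return iterations;
}

// Usage sketch (refreshAllProfiles is a hypothetical function):
// runWorker(refreshAllProfiles, { loop: process.env.BACKGROUND_LOOP === 'true' });
```

Note that this sketch swallows errors in single-run mode as well; a cron deployment may prefer to rethrow so that a failed pass produces a non-zero exit code.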

Project Structure

/Users/praveenkumar/Desktop/mcp/
├── migrations/
│   └── 001_init.sql
├── src/
│   ├── db/
│   │   └── supabaseClient.js
│   ├── services/
│   │   ├── apifyService.js
│   │   ├── embeddingService.js
│   │   └── llmService.js
│   ├── tools/
│   │   ├── collectProfile.js
│   │   ├── analyzeProfile.js
│   │   └── fetchPersonalizedNews.js
│   ├── utils/
│   │   ├── similarity.js
│   │   └── staleCheck.js
│   ├── backgroundWorker.js
│   └── index.js
├── .env.example
├── .gitignore
├── .dockerignore
├── Dockerfile
├── package.json
└── README.md

Docker Deployment

Build & Run

docker build -t founder-intelligence-mcp .
docker run --env-file .env founder-intelligence-mcp

Background Worker Container

docker run --env-file .env founder-intelligence-mcp node src/backgroundWorker.js

Docker Compose (production)

version: '3.8'
services:
  mcp-server:
    build: .
    env_file: .env
    stdin_open: true
    restart: unless-stopped

  worker:
    build: .
    env_file: .env
    command: ["node", "src/backgroundWorker.js"]
    environment:
      - BACKGROUND_LOOP=true
    restart: unless-stopped

Scaling Strategy

| Component         | Strategy                                                   |
|-------------------|------------------------------------------------------------|
| MCP Server        | One instance per client (stdio-based)                      |
| Background Worker | Single instance or Cloud Run Job on schedule               |
| Supabase          | Connection pooling via Supavisor; read replicas for scale  |
| Apify             | Concurrent actor runs (up to account limit)                |
| Embeddings        | Batch requests (20 per call) to reduce round trips         |
| Groq              | Rate-limit aware with retry-after header handling          |

For high-profile-count deployments:

  • Move background worker to a Cloud Run Job triggered by Cloud Scheduler
  • Use Supabase Edge Functions for scheduled refresh
  • Add a Redis cache layer for hot profile lookups
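
The embedding batching mentioned in the table (20 per call) can be illustrated with a small chunking helper. This is a sketch under assumptions: `embedBatch` stands in for whatever function calls the embeddings API, and it is assumed to return one vector per input string, in the same order.

```javascript
// Splits `items` into consecutive chunks of at most `size` elements.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error('size must be a positive integer');
  }
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Embeds texts batch by batch. `embedBatch` is a hypothetical function that
// takes an array of strings and resolves to one vector per string, in order.
async function embedAll(texts, embedBatch, batchSize = 20) {
  const vectors = [];
  for (const batch of chunk(texts, batchSize)) {
    const result = await embedBatch(batch);
    if (result.length !== batch.length) {
      throw new Error('Embedding count mismatch');
    }
    vectors.push(...result);
  }
  return vectors;
}
```

Batches run sequentially here to stay within rate limits; running them concurrently would reduce latency at the cost of burstier API usage.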

Security Best Practices

  1. Service-role key only on server side — never expose to clients
  2. All secrets via environment variables — no hardcoded keys
  3. Non-root Docker user — the container runs as the mcp user
  4. Input validation — Zod schemas on all tool inputs
  5. Row Level Security — enable RLS on Supabase tables for multi-tenant deployments
  6. API token rotation — rotate Apify, Groq, and embedding keys periodically
  7. Rate limiting — built-in retry logic with exponential backoff
  8. No PII logging — profile data stays in Supabase, not console
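
The retry behaviour in item 7, combined with the retry-after handling noted for Groq, could look roughly like this. It is a sketch, not the project's implementation: the function names, the defaults (500 ms base, 30 s cap, 4 attempts), and the choice to retry on 429 and 5xx are assumptions, and `request` is a hypothetical function resolving to a fetch-style Response.

```javascript
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Computes the wait before the next attempt. A numeric Retry-After value
// (seconds) takes priority; otherwise the delay doubles with each attempt.
function backoffMs(attempt, retryAfterSeconds, baseMs = 500, maxMs = 30000) {
  if (Number.isFinite(retryAfterSeconds) && retryAfterSeconds >= 0) {
    return Math.min(retryAfterSeconds * 1000, maxMs);
  }
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Retries `request` on HTTP 429 and 5xx, returning the last response once
// attempts are exhausted. `wait` is injectable for testing.
async function withRetry(request, { maxAttempts = 4, wait = delay } = {}) {
  for (let attempt = 0; ; attempt++) {
    const res = await request();
    const retryable = res.status === 429 || res.status >= 500;
    if (!retryable || attempt === maxAttempts - 1) return res;
    const header = res.headers.get('retry-after');
    // An absent or date-formatted header becomes NaN and falls back to backoff.
    const seconds = header === null ? NaN : Number(header);
    await wait(backoffMs(attempt, seconds));
  }
}
```

Capping the server-supplied Retry-After at `maxMs` is a design choice made for this sketch; a stricter client would honour the full value.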

Cost Optimization

| Service    | Cost Driver         | Mitigation                                       |
|------------|---------------------|--------------------------------------------------|
| Apify      | Actor compute units | Scrape only on creation; cache results           |
| Groq       | Token usage         | Analyze once (cached); batch news summaries      |
| Embeddings | API calls           | Batch 20 at a time; embed once per article       |
| Supabase   | Row count + storage | Deduplicate articles by URL; prune old articles  |

Expected cost per profile lifecycle:

  • Initial setup: ~$0.05–0.15 (scrape + embed + analyze)
  • Daily news refresh: ~$0.02–0.08 (scrape + embed + summarize top 10)
  • Cached reads: $0.00
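
To turn these per-event figures into a rough budget, the arithmetic below estimates one profile's first 30 days. It assumes exactly one news refresh per day, which is an assumption rather than a documented default; the function and constant names are illustrative.

```javascript
// Per-profile cost ranges from the list above, in US cents to avoid
// floating-point drift.
const INITIAL_CENTS = { low: 5, high: 15 };
const DAILY_REFRESH_CENTS = { low: 2, high: 8 };

// Estimated cost of one profile's first `days` days, assuming one news
// refresh per day. Cached reads add nothing.
function firstPeriodCostCents(days) {
  return {
    low: INITIAL_CENTS.low + days * DAILY_REFRESH_CENTS.low,
    high: INITIAL_CENTS.high + days * DAILY_REFRESH_CENTS.high,
  };
}

const month = firstPeriodCostCents(30);
console.log(
  `~$${(month.low / 100).toFixed(2)}–${(month.high / 100).toFixed(2)} per profile`,
);
// prints "~$0.65–2.55 per profile"
```

Under that assumption the daily refresh dominates the initial setup cost, so the staleness threshold is likely the main lever on spend.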

Future Improvement Roadmap

  1. HTTP/SSE transport — support remote MCP clients over HTTP
  2. Multi-tenant profiles — user-scoped access with RLS
  3. Real-time alerts — push notifications when high-relevance news drops
  4. Competitor tracking — dedicated tool to monitor named competitors
  5. Founder network graph — map connections between analyzed founders
  6. Custom embedding models — fine-tuned models for startup/VC domain
  7. Article full-text extraction — deep content scraping for richer embeddings
  8. A/B prompt testing — experiment with different Groq prompts for analysis quality
  9. Dashboard UI — web interface for browsing intelligence feeds
  10. Webhook integrations — push intelligence to Slack, email, or CRM
