Data Extractor

A Model Context Protocol server that extracts embedded data (such as i18n translations or key/value configurations) from TypeScript/JavaScript source code into structured JSON configuration files.

sammcj

Tools

extract_data

Extract data content (e.g. i18n translations) from source code to a JSON file. IMPORTANT: When encountering files with data such as i18n content embedded in code, use this tool directly instead of reading the file content first. This tool will programmatically extract all translations into a structured JSON file, preserving nested objects, arrays, template variables, and formatting. This helps keep translations as configuration and prevents filling up the AI context window with translation content. By default, the source file will be replaced with "MIGRATED TO <target absolute path>" and a warning message after successful extraction, making it easy to track where the data was moved to. This behaviour can be disabled by setting the DISABLE_SOURCE_REPLACEMENT environment variable to 'true'. The warning message can be customized by setting the WARNING_MESSAGE environment variable.

extract_svg

Extract SVG components from React/TypeScript/JavaScript files into individual .svg files. This tool will preserve the SVG structure and attributes while removing React-specific code. By default, the source file will be replaced with "MIGRATED TO <target absolute path>" and a warning message after successful extraction, making it easy to track where the SVGs were moved to. This behaviour can be disabled by setting the DISABLE_SOURCE_REPLACEMENT environment variable to 'true'. The warning message can be customized by setting the WARNING_MESSAGE environment variable.

README

mcp-data-extractor MCP Server

A Model Context Protocol server that extracts embedded data (such as i18n translations or key/value configurations) from TypeScript/JavaScript source code into structured JSON configuration files.

Features

  • Data Extraction:

    • Extracts string literals, template literals, and complex nested objects
    • Preserves template variables (e.g., Hello, {{name}}!)
    • Supports nested object structures and arrays
    • Maintains hierarchical key structure using dot notation
    • Handles both TypeScript and JavaScript files with JSX support
    • Replaces source file content with "MIGRATED TO <target absolute path>" after successful extraction (configurable)
  • SVG Extraction:

    • Extracts SVG components from React/TypeScript/JavaScript files
    • Preserves SVG structure and attributes
    • Removes React-specific code and props
    • Creates individual .svg files named after their component
    • Replaces source file content with "MIGRATED TO <target absolute path>" after successful extraction (configurable)

Usage

Add to your MCP Client configuration:

{
  "mcpServers": {
    "data-extractor": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-data-extractor"
      ],
      "disabled": false,
      "autoApprove": [
        "extract_data",
        "extract_svg"
      ]
    }
  }
}

Basic Usage

The server provides two tools:

1. Data Extraction

Use extract_data to extract data (like i18n translations) from source files:

<use_mcp_tool>
<server_name>data-extractor</server_name>
<tool_name>extract_data</tool_name>
<arguments>
{
  "sourcePath": "src/translations.ts",
  "targetPath": "src/translations.json"
}
</arguments>
</use_mcp_tool>

2. SVG Extraction

Use extract_svg to extract SVG components into individual files:

<use_mcp_tool>
<server_name>data-extractor</server_name>
<tool_name>extract_svg</tool_name>
<arguments>
{
  "sourcePath": "src/components/icons/InspectionIcon.tsx",
  "targetDir": "src/assets/icons"
}
</arguments>
</use_mcp_tool>
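
The `<use_mcp_tool>` blocks above use the XML-style tool-call syntax that some MCP clients present to the model. On the wire, an MCP tool call is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for `extract_data`; the helper name `buildToolCall` and the request id are illustrative only and are not part of this server.

```typescript
// Shape of an MCP tool invocation as a JSON-RPC 2.0 request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, string> };
}

// Hypothetical helper (not part of mcp-data-extractor): wraps a tool name
// and its arguments in a tools/call request.
function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, string>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Same arguments as the extract_data example above.
const request = buildToolCall(1, "extract_data", {
  sourcePath: "src/translations.ts",
  targetPath: "src/translations.json",
});

console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library sends this request for you; the sketch only shows how the tool name and arguments map onto it.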

Source File Replacement

By default, after successful extraction, the server will replace the content of the source file with:

  • "MIGRATED TO <target path>" for data extraction
  • "MIGRATED TO <target directory>" for SVG extraction

This helps track which files have already been processed and prevents duplicate extraction. It also makes it easy for LLMs and developers to see where the extracted data now lives when they encounter the source file later.
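
The replacement decision described above can be sketched as follows. This is an illustration under stated assumptions, not the server's code: the function name `migrationNotice`, the layout of the notice (marker line followed by the warning), and the `defaultWarning` parameter are hypothetical. Only the `MIGRATED TO` marker and the two environment variable names come from this document.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the text that would replace the source file,
// or null when replacement is disabled.
function migrationNotice(
  targetPath: string,
  env: Env,
  defaultWarning: string
): string | null {
  // Assumption: only the literal string 'true' disables replacement.
  if (env.DISABLE_SOURCE_REPLACEMENT === "true") {
    return null;
  }
  // WARNING_MESSAGE, when set, overrides the default warning text.
  const warning = env.WARNING_MESSAGE ?? defaultWarning;
  return `MIGRATED TO ${targetPath}\n${warning}`;
}

const target = "/project/src/translations.json";

// Default behaviour: marker plus the (placeholder) default warning.
const defaultNotice = migrationNotice(target, {}, "<default warning>");

// Replacement disabled: nothing is written back to the source file.
const disabledNotice = migrationNotice(
  target,
  { DISABLE_SOURCE_REPLACEMENT: "true" },
  "<default warning>"
);

// Custom warning text supplied through the environment.
const customNotice = migrationNotice(
  target,
  { WARNING_MESSAGE: "Do not edit this file." },
  "<default warning>"
);

console.log(defaultNotice, disabledNotice, customNotice);
```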

To disable this behavior, set the DISABLE_SOURCE_REPLACEMENT environment variable to true in your MCP configuration:

{
  "mcpServers": {
    "data-extractor": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-data-extractor"
      ],
      "env": {
        "DISABLE_SOURCE_REPLACEMENT": "true"
      },
      "disabled": false,
      "autoApprove": [
        "extract_data",
        "extract_svg"
      ]
    }
  }
}

Supported Patterns

Data Extraction Patterns

The data extractor supports various patterns commonly used in TypeScript/JavaScript applications:

  1. Simple Object Exports:
export default {
  welcome: "Welcome to our app",
  greeting: "Hello, {name}!",
  submit: "Submit form"
};
  2. Nested Objects:
export default {
  header: {
    title: "Book Your Flight",
    subtitle: "Find the best deals"
  },
  footer: {
    content: [
      "Please refer to {{privacyPolicyUrl}} for details",
      "© {{year}} {{companyName}}"
    ]
  }
};
  3. Complex Structures with Arrays:
export default {
  faq: {
    heading: "Common questions",
    content: [
      {
        heading: "What if I need to change my flight?",
        content: "You can change your flight online if:",
        list: [
          "You have a flexible fare type",
          "Your flight is more than 24 hours away"
        ]
      }
    ]
  }
};
  4. Template Literals with Variables:
export default {
  greeting: `Hello, {{username}}!`,
  message: `Welcome to {{appName}}`
};
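
Because placeholders such as `{{username}}` are kept verbatim, substituting them at runtime is left to the consuming application. A minimal sketch, assuming double-brace placeholders with word-character names; `interpolate` is a hypothetical helper and is not provided by this server.

```typescript
// Hypothetical helper: replaces {{name}} placeholders with values from vars.
// Unknown placeholders are left untouched so missing values stay visible.
function interpolate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(vars, key) ? vars[key] : match
  );
}

console.log(interpolate("Hello, {{username}}!", { username: "Ada" })); // Hello, Ada!
console.log(interpolate("Welcome to {{appName}}", {})); // Welcome to {{appName}}
```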

Output Formats

Data Extraction Output

The extracted data is saved as a JSON file with dot notation for nested structures:

{
  "welcome": "Welcome to our app",
  "header.title": "Book Your Flight",
  "footer.content.0": "Please refer to {{privacyPolicyUrl}} for details",
  "footer.content.1": "© {{year}} {{companyName}}",
  "faq.content.0.heading": "What if I need to change my flight?"
}
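
The mapping from nested structures to dot-notation keys can be sketched as below. This is an illustration of the output format only, working on plain values rather than on the AST the server traverses; `flatten` and the `Nested` type are hypothetical names.

```typescript
type Nested = string | Nested[] | { [key: string]: Nested };

// Hypothetical helper: flattens nested objects and arrays into a single
// object whose keys are dot-separated paths. Array indices become path parts.
function flatten(
  value: Nested,
  path: string[] = [],
  out: Record<string, string> = {}
): Record<string, string> {
  if (typeof value === "string") {
    out[path.join(".")] = value;
  } else if (Array.isArray(value)) {
    value.forEach((item, index) => flatten(item, [...path, String(index)], out));
  } else {
    for (const key of Object.keys(value)) {
      flatten(value[key], [...path, key], out);
    }
  }
  return out;
}

const flat = flatten({
  header: { title: "Book Your Flight" },
  footer: {
    content: [
      "Please refer to {{privacyPolicyUrl}} for details",
      "© {{year}} {{companyName}}",
    ],
  },
});

console.log(flat);
```

The resulting keys are `header.title`, `footer.content.0`, and `footer.content.1`, matching the key style of the sample output above.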

SVG Extraction Output

SVG components are extracted into individual .svg files, with React-specific code removed. For example:

Input (React component):

const InspectionIcon: React.FC<InspectionIconProps> = ({ title }) => (
  <svg className="c-tab__icon" width="40px" id="Layer_1" data-name="Layer 1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 32 32">
    <title>{title}</title>
    <path className="cls-1" d="M18.89,12.74a3.18,3.18,0,0,1-3.24-3.11..." />
  </svg>
);

Output (InspectionIcon.svg):

<svg width="40px" id="Layer_1" data-name="Layer 1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 32 32">
    <path class="cls-1" d="M18.89,12.74a3.18,3.18,0,0,1-3.24-3.11..." />
</svg>
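
The kind of clean-up involved can be illustrated with a small string-based sketch. It is not how the server is implemented, and it does not reproduce the output above exactly: the documented output drops the root `className`, whereas this sketch renames every `className` to `class`. The function name `jsxToSvgMarkup` is hypothetical.

```typescript
// Hypothetical helper: strips common React-specific syntax from SVG markup.
// It does not handle nested braces such as style={{ ... }}.
function jsxToSvgMarkup(jsx: string): string {
  return jsx
    // Drop elements whose only child is a JSX expression, e.g. <title>{title}</title>.
    .replace(/[ \t]*<(\w+)[^<>]*>\s*\{[^{}]*\}\s*<\/\1>[ \t]*\r?\n?/g, "")
    // Drop attributes whose value is a JSX expression, e.g. onClick={handler}.
    .replace(/\s+[\w-]+=\{[^{}]*\}/g, "")
    // Rename React's className to the plain SVG attribute name.
    .replace(/\bclassName=/g, "class=");
}

const jsxInput = [
  '<svg className="icon" viewBox="0 0 32 32">',
  "  <title>{title}</title>",
  '  <path className="cls-1" d="M0,0" onClick={handler} />',
  "</svg>",
].join("\n");

const svgOutput = jsxToSvgMarkup(jsxInput);
console.log(svgOutput);
```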

Extending Supported Patterns

The extractor uses Babel to parse and traverse the AST (Abstract Syntax Tree) of your source files. You can extend the supported patterns by modifying the source code:

  1. Add New Node Types: The extractStringValue method in src/index.ts handles different types of string values. Extend it to support new node types:
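private extractStringValue(node: t.Node): string | null {
  if (t.isStringLiteral(node)) {
    // Plain string literal: use its value as-is
    return node.value;
  } else if (t.isTemplateLiteral(node)) {
    // Template literal: join the static parts (quasis), marking each
    // ${...} expression with an empty {{}} placeholder
    return node.quasis.map(quasi => quasi.value.raw).join('{{}}');
  }
  // Add support for new node types here
  return null;
}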
  2. Custom Value Processing: The processValue method handles different value types (strings, arrays, objects). Extend it to support new value types or custom processing:
private processValue(value: t.Node, currentPath: string[]): void {
  if (t.isStringLiteral(value) || t.isTemplateLiteral(value)) {
    // Process string values
  } else if (t.isArrayExpression(value)) {
    // Process arrays
  } else if (t.isObjectExpression(value)) {
    // Process objects
  }
  // Add support for new value types here
}
  3. Custom AST Traversal: The server uses Babel's traverse to walk the AST. You can add new visitors to handle different node types:
traverse(ast, {
  ExportDefaultDeclaration(path: NodePath<t.ExportDefaultDeclaration>) {
    // Handle default exports
  },
  // Add new visitors here
});
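
As one example of such an extension, the extractStringValue excerpt above joins the static parts of a template literal with an empty `{{}}` placeholder. The sketch below shows a variant that keeps the identifier name of each `${...}` expression instead. To stay self-contained it declares minimal stand-ins for the Babel node shapes (`quasis`, `expressions`, `value.raw`, `name`); real code would use the types from @babel/types. The variant and the `AstNode` type are hypothetical and not part of the project.

```typescript
// Minimal stand-ins for the Babel node shapes used here (assumption: the
// real code would import these from @babel/types).
interface StringLiteral { type: "StringLiteral"; value: string }
interface Identifier { type: "Identifier"; name: string }
interface TemplateElement { type: "TemplateElement"; value: { raw: string } }
interface TemplateLiteral {
  type: "TemplateLiteral";
  quasis: TemplateElement[];
  expressions: Array<Identifier | StringLiteral>;
}
type AstNode = StringLiteral | Identifier | TemplateLiteral;

// Hypothetical variant of extractStringValue: interleaves the static parts
// with a {{name}} placeholder for each identifier expression.
function extractStringValue(node: AstNode): string | null {
  if (node.type === "StringLiteral") {
    return node.value;
  }
  if (node.type === "TemplateLiteral") {
    let result = "";
    for (let i = 0; i < node.quasis.length; i++) {
      result += node.quasis[i].value.raw;
      if (i < node.expressions.length) {
        const expression = node.expressions[i];
        // Only plain identifiers have an obvious placeholder name;
        // anything else falls back to the empty placeholder.
        result += expression.type === "Identifier" ? `{{${expression.name}}}` : "{{}}";
      }
    }
    return result;
  }
  return null;
}

// Node shape corresponding to the template literal `Hello, ${username}!`.
const greetingNode: TemplateLiteral = {
  type: "TemplateLiteral",
  quasis: [
    { type: "TemplateElement", value: { raw: "Hello, " } },
    { type: "TemplateElement", value: { raw: "!" } },
  ],
  expressions: [{ type: "Identifier", name: "username" }],
};

const extracted = extractStringValue(greetingNode);
console.log(extracted); // Hello, {{username}}!
```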

Development

Install dependencies:

npm install

Build the server:

npm run build

For development with auto-rebuild:

npm run watch

Debugging

Since MCP servers communicate over stdio, debugging can be challenging. We recommend using the MCP Inspector, which is available as a package script:

npm run inspector

The Inspector will provide a URL to access debugging tools in your browser.

License

This project is licensed under the MIT License - see the LICENSE file for details.
