inbound-mcp

MCP.Pizza Chef: bashirk

Inbound-mcp is a production-grade MCP server designed for automated lead generation by scraping and analyzing web data. It leverages the MCP Python SDK for protocol compliance, Crawl4AI for intelligent web crawling, and AsyncIO for high concurrency, enabling efficient and scalable inbound sales lead extraction and generation workflows.

Use This MCP Server To

  • Automate scraping of potential sales leads from targeted websites
  • Generate structured lead data for inbound sales pipelines
  • Integrate with CRM systems to enrich lead information
  • Run high-concurrency web crawling for large-scale lead discovery
  • Provide real-time lead data to AI agents via MCP
  • Enable AI-driven filtering and scoring of inbound leads
  • Support asynchronous lead generation workflows for efficiency

README

Lead Generation Server Documentation


Table of Contents

  1. Overview
  2. Features
  3. Architecture
  4. Prerequisites
  5. Installation
  6. Configuration
  7. Running the Server
  8. API Documentation
  9. Examples
  10. Advanced Configuration
  11. Troubleshooting
  12. Contributing
  13. License
  14. Roadmap
  15. Support

Overview

A production-grade lead generation system built on:

  • MCP Python SDK for protocol-compliant AI services
  • Crawl4AI for intelligent web crawling
  • AsyncIO for high-concurrency operations

Implements a full lead lifecycle from discovery to enrichment with:

  • UUID-based lead tracking
  • Multi-source data aggregation
  • Smart caching strategies
  • Enterprise-grade error handling
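As an illustration of UUID-based tracking and multi-source aggregation, a lead record might look like the sketch below. The `Lead` class, its field names, and the `"enriched"` status are assumptions made for this example, not the server's actual schema.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Lead:
    """Hypothetical lead record; field names are illustrative only."""
    search_terms: str
    lead_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    status: str = "pending"
    sources: dict = field(default_factory=dict)
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def merge(self, source: str, data: dict) -> None:
        """Store the data returned by one source under that source's name."""
        self.sources[source] = data
        self.status = "enriched"  # assumed status name


lead = Lead(search_terms="OpenAI")
lead.merge("hunter", {"emails": ["info@example.com"]})
print(lead.lead_id, lead.status)
```

Keying each source's data by name keeps results from different services separate, so a later enrichment pass can overwrite one source without touching the others.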

Features

| Feature           | Tech Stack                           | Throughput  |
|-------------------|--------------------------------------|-------------|
| Lead Generation   | Google CSE, Crawl4AI                 | 120 req/min |
| Data Enrichment   | Hunter.io, Clearbit [Hubspot Breeze] | 80 req/min  |
| LinkedIn Scraping | Playwright, Stealth Mode             | 40 req/min  |
| Caching           | aiocache, Redis                      | 10K ops/sec |
| Monitoring        | Prometheus, Custom Metrics           | Real-time   |
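One common way to keep an AsyncIO crawler inside a throughput budget like those listed above is to cap the number of requests in flight. The sketch below shows that technique with `asyncio.Semaphore`; `crawl_all` and `fake_fetch` are hypothetical names, and this is not the server's actual scheduler.

```python
import asyncio


async def crawl_all(urls, fetch, max_concurrency=10):
    """Run fetch(url) for every URL with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url):
        async with semaphore:
            return await fetch(url)

    # gather() returns results in the same order as the input URLs
    return await asyncio.gather(*(bounded(url) for url in urls))


async def fake_fetch(url):
    # Stand-in for a real crawl: yield control once, then echo the URL
    await asyncio.sleep(0)
    return f"fetched {url}"


results = asyncio.run(
    crawl_all(["https://example.com/a", "https://example.com/b"], fake_fetch)
)
print(results)
```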

Architecture

graph TD
    A[Client] --> B[MCP Server]
    B --> C{Lead Manager}
    C --> D[Google CSE]
    C --> E[Crawl4AI]
    C --> F[Hunter.io]
    C --> G[Clearbit]
    C --> H[LinkedIn Scraper]
    C --> I[(Redis Cache)]
    C --> J[Lead Store]
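The fan-out from the Lead Manager to its sources can be sketched with `asyncio.gather`. This is a minimal illustration assuming each source is an async callable that takes a lead ID; `aggregate`, `hunter`, and `clearbit` here are hypothetical stand-ins, not the project's real integrations.

```python
import asyncio


async def aggregate(lead_id, sources):
    """Query every source concurrently; one failing source does not sink the rest."""
    names = list(sources)
    results = await asyncio.gather(
        *(sources[name](lead_id) for name in names),
        return_exceptions=True,  # failures come back as values instead of raising
    )
    merged, errors = {}, {}
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            errors[name] = str(result)
        else:
            merged[name] = result
    return merged, errors


async def hunter(lead_id):
    return {"emails": 2}  # fake payload


async def clearbit(lead_id):
    raise TimeoutError("upstream timed out")  # simulated failure


merged, errors = asyncio.run(
    aggregate(
        "550e8400-e29b-41d4-a716-446655440000",
        {"hunter": hunter, "clearbit": clearbit},
    )
)
print(merged, errors)
```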

Prerequisites

  • Python 3.10+
  • API Keys:
    export HUNTER_API_KEY="your_key"
    export CLEARBIT_API_KEY="your_key"
    export GOOGLE_CSE_ID="your_id"
    export GOOGLE_API_KEY="your_key"
  • LinkedIn Session Cookie (for scraping)
  • 4GB+ RAM (8GB recommended for heavy scraping)
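A small startup check can catch a missing key before the first request fails. The sketch below uses the four variable names exported above; the `missing_env` helper itself is hypothetical.

```python
import os

REQUIRED_VARS = (
    "HUNTER_API_KEY",
    "CLEARBIT_API_KEY",
    "GOOGLE_CSE_ID",
    "GOOGLE_API_KEY",
)


def missing_env(environ=os.environ, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not environ.get(name)]


missing = missing_env()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```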

Installation

Production Setup

# Create virtual environment
python -m venv .venv && source .venv/bin/activate

# Install with production dependencies
pip install mcp "crawl4ai[all]" aiocache aiohttp uvloop

# Set up browser dependencies
python -m playwright install chromium

Docker Deployment

FROM python:3.10-slim

RUN apt-get update && apt-get install -y \
    gcc \
    libpython3-dev \
    chromium \
    && rm -rf /var/lib/apt/lists/*

COPY . /app
WORKDIR /app

RUN pip install --no-cache-dir -r requirements.txt
CMD ["python", "-m", "mcp", "run", "lead_server.py"]

Configuration

config.yaml

services:
  hunter:
    api_key: ${HUNTER_API_KEY}
    rate_limit: 50/60s
    
  clearbit:
    api_key: ${CLEARBIT_API_KEY}
    cache_ttl: 86400

scraping:
  stealth_mode: true
  headless: true
  timeout: 30
  max_retries: 3

cache:
  backend: redis://localhost:6379/0
  default_ttl: 3600
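YAML does not expand `${VAR}` placeholders on its own, so whatever loads this file has to substitute them. The sketch below shows one way to do that, plus a parser for rate limits written like `50/60s`. Both helpers are hypothetical, and reading `50/60s` as "50 calls per 60 seconds" is an assumption about the format.

```python
import os
import re


def expand_placeholders(text, environ=os.environ):
    """Replace ${NAME} with environ[NAME]; raise KeyError if NAME is not set."""
    def substitute(match):
        name = match.group(1)
        if name not in environ:
            raise KeyError(f"environment variable {name} is not set")
        return environ[name]

    return re.sub(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}", substitute, text)


def parse_rate_limit(spec):
    """Parse 'count/seconds' strings such as '50/60s' into (count, seconds)."""
    match = re.fullmatch(r"(\d+)/(\d+)s", spec)
    if match is None:
        raise ValueError(f"unrecognized rate limit: {spec!r}")
    return int(match.group(1)), int(match.group(2))


print(expand_placeholders("api_key: ${HUNTER_API_KEY}", {"HUNTER_API_KEY": "abc123"}))
print(parse_rate_limit("50/60s"))
```

Expanding placeholders on the raw text before handing it to a YAML parser keeps the secrets out of the file while leaving the rest of the configuration untouched.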

Running the Server

Development Mode

mcp dev lead_server.py --reload --port 8080

Production

gunicorn -w 4 -k uvicorn.workers.UvicornWorker lead_server:app

Docker

docker build -t lead-server .
docker run -p 8080:8080 -e HUNTER_API_KEY=your_key lead-server

API Documentation

1. Generate Lead

POST /tools/lead_generation
Content-Type: application/json

{
  "search_terms": "OpenAI"
}

Response:

{
  "lead_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "estimated_time": 15
}
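Because the response above comes back with `"status": "pending"`, a client will usually need to poll until the lead is ready. A minimal sketch follows, assuming any status other than `"pending"` means the work has finished; `wait_for_lead`, `fake_status`, and the `"complete"` status value are hypothetical.

```python
import asyncio


async def wait_for_lead(fetch_status, lead_id, interval=1.0, timeout=30.0):
    """Poll fetch_status(lead_id) until the status is no longer 'pending'."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        lead = await fetch_status(lead_id)
        if lead.get("status") != "pending":
            return lead
        if loop.time() + interval > deadline:
            raise TimeoutError(f"lead {lead_id} still pending after {timeout}s")
        await asyncio.sleep(interval)


# Fake status source: pending twice, then done
_statuses = iter(["pending", "pending", "complete"])


async def fake_status(lead_id):
    return {"lead_id": lead_id, "status": next(_statuses)}


final = asyncio.run(
    wait_for_lead(fake_status, "550e8400-e29b-41d4-a716-446655440000", interval=0.01)
)
print(final["status"])
```

Passing the status lookup in as a function keeps the polling logic independent of the transport, so the same helper works whether the status comes from an MCP tool call or an HTTP request.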

2. Enrich Lead

POST /tools/data_enrichment
Content-Type: application/json

{
  "lead_id": "550e8400-e29b-41d4-a716-446655440000"
}

3. Monitor Leads

GET /tools/lead_maintenance

Examples

Python Client

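import asyncio
import json

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Assumption: the server is launched over stdio; adjust for your transport
server = StdioServerParameters(command="mcp", args=["run", "lead_server.py"])

def as_dict(result):
    # Assumption: each tool returns its payload as JSON text in the first content item
    return json.loads(result.content[0].text)

async def main():
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as client:
            await client.initialize()

            # Generate lead
            lead = as_dict(await client.call_tool(
                "lead_generation",
                {"search_terms": "Anthropic"}
            ))

            # Enrich with all services
            enriched = as_dict(await client.call_tool(
                "data_enrichment",
                {"lead_id": lead["lead_id"]}
            ))

            # Get full lead data
            status = as_dict(await client.call_tool(
                "lead_status",
                {"lead_id": lead["lead_id"]}
            ))
            print(status)

asyncio.run(main())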

cURL

# Generate lead
curl -X POST http://localhost:8080/tools/lead_generation \
  -H "Content-Type: application/json" \
  -d '{"search_terms": "Cohere AI"}'
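The same request can be built with Python's standard library. This sketch only constructs the request, so it runs without a server; `build_tool_request` is a hypothetical helper, and the base URL matches the cURL example above.

```python
import json
import urllib.request


def build_tool_request(base_url, tool, payload):
    """Build a JSON POST request for /tools/<tool> without sending it."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/tools/{tool}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_tool_request(
    "http://localhost:8080", "lead_generation", {"search_terms": "Cohere AI"}
)
print(request.get_method(), request.full_url)

# Sending it requires a running server:
# with urllib.request.urlopen(request, timeout=30) as response:
#     lead = json.load(response)
```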

Advanced Configuration

Caching Strategies

from aiocache import Cache

# Configure Redis cluster; keep the returned cache instance for later get/set calls
cache = Cache.from_url(
    "redis://cluster-node1:6379/0",
    timeout=10,
    retry=True,
    retry_timeout=2
)
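To make the TTL behaviour concrete without a Redis instance, here is a minimal in-memory sketch of time-based expiry. `TTLCache` is a hypothetical stand-in for illustration only; it is not aiocache's API, and the key naming is an assumption.

```python
import time


class TTLCache:
    """Minimal in-memory cache with per-entry expiry (stand-in for Redis)."""

    def __init__(self, default_ttl=3600, clock=time.monotonic):
        self.default_ttl = default_ttl
        self.clock = clock
        self._entries = {}

    def set(self, key, value, ttl=None):
        lifetime = self.default_ttl if ttl is None else ttl
        self._entries[key] = (value, self.clock() + lifetime)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        value, expires = entry
        if self.clock() >= expires:
            del self._entries[key]  # expired entries count as a miss
            return default
        return value


cache = TTLCache(default_ttl=3600)
cache.set("clearbit:example.com", {"name": "Example Inc."}, ttl=86400)
print(cache.get("clearbit:example.com"))
```

The two TTL values mirror `default_ttl: 3600` and `cache_ttl: 86400` from config.yaml: enrichment data changes slowly, so it can be cached longer than the default.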

Rate Limiting

from mcp.server.middleware import RateLimiter

mcp.add_middleware(
    RateLimiter(
        rules={
            "lead_generation": "100/1m",
            "data_enrichment": "50/1m"
        }
    )
)
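Independent of any middleware, the meaning of rules such as `"100/1m"` can be sketched as a sliding-window limiter. This assumes the format is `<max calls>/<window>` with a unit suffix of `s`, `m`, or `h`; `parse_rule` and `SlidingWindowLimiter` are hypothetical names.

```python
import time
from collections import deque

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_rule(rule):
    """Parse rules like '100/1m' into (max_calls, window_seconds)."""
    count, window = rule.split("/")
    return int(count), int(window[:-1]) * UNIT_SECONDS[window[-1]]


class SlidingWindowLimiter:
    """Allow at most max_calls per window for each tool name."""

    def __init__(self, rules, clock=time.monotonic):
        self.limits = {tool: parse_rule(rule) for tool, rule in rules.items()}
        self.calls = {tool: deque() for tool in rules}
        self.clock = clock

    def allow(self, tool):
        if tool not in self.limits:
            return True  # tools without a rule are not limited
        max_calls, window = self.limits[tool]
        now = self.clock()
        recent = self.calls[tool]
        # Drop timestamps that have fallen out of the window
        while recent and now - recent[0] >= window:
            recent.popleft()
        if len(recent) >= max_calls:
            return False
        recent.append(now)
        return True


limiter = SlidingWindowLimiter(
    {"lead_generation": "100/1m", "data_enrichment": "50/1m"}
)
print(limiter.allow("lead_generation"))
```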

Troubleshooting

| Error                     | Solution                                |
|---------------------------|-----------------------------------------|
| 403 Forbidden from Google | Rotate IPs or use official CSE API      |
| 429 Too Many Requests     | Implement exponential backoff           |
| Playwright Timeout        | Increase scraping.timeout in config     |
| Cache Miss                | Verify Redis connection and TTL settings |
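For the 429 case, exponential backoff can be sketched as below. This is a generic illustration with full jitter, not code from the project; `with_backoff` and the `TooManyRequests` exception are hypothetical, and the default of three retries mirrors `max_retries: 3` in config.yaml.

```python
import asyncio
import random


class TooManyRequests(Exception):
    """Hypothetical error a client would raise on HTTP 429."""


async def with_backoff(call, max_retries=3, base_delay=0.5, max_delay=30.0,
                       sleep=asyncio.sleep):
    """Retry call() on TooManyRequests with exponentially growing, jittered delays."""
    for attempt in range(max_retries + 1):
        try:
            return await call()
        except TooManyRequests:
            if attempt == max_retries:
                raise  # out of retries: surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            await sleep(random.uniform(0, delay))  # full jitter


calls = {"n": 0}


async def flaky():
    # Fails twice with a simulated 429, then succeeds
    calls["n"] += 1
    if calls["n"] < 3:
        raise TooManyRequests()
    return "ok"


result = asyncio.run(with_backoff(flaky, base_delay=0.01))
print(result, calls["n"])
```

Randomizing each delay spreads retries from many workers over time, which avoids all of them hitting the rate-limited service again at the same instant.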

Contributing

  1. Fork the repository
  2. Create feature branch: git checkout -b feature/new-enrichment
  3. Commit changes: git commit -am 'Add Clearbit alternative'
  4. Push to branch: git push origin feature/new-enrichment
  5. Submit pull request

License

Apache 2.0 - See LICENSE for details.


Roadmap

  • Q2 2025: AI-powered lead scoring
  • Q3 2025: Distributed crawling cluster support

Support

For enterprise support and custom integrations:
📧 Email: hi@kobotai.co
🐦 Twitter: @KobotAIco


# Run benchmark tests
pytest tests/ --benchmark-json=results.json


inbound-mcp FAQ

How do I install inbound-mcp?
Follow the Installation section of the README. You need Python 3.10+, the MCP Python SDK, and Crawl4AI.
What technologies does inbound-mcp use?
It uses MCP Python SDK for AI protocol compliance, Crawl4AI for web crawling, and AsyncIO for concurrency.
Can inbound-mcp handle high volumes of web crawling?
Yes, it uses AsyncIO to support high-concurrency operations for scalable lead scraping.
How do I configure inbound-mcp for my target websites?
Configuration details are in the README under 'Configuration' and 'Advanced Configuration' sections.
Is inbound-mcp compatible with multiple LLM providers?
Yes, it supports integration with models like OpenAI, Anthropic Claude, and Google Gemini via MCP.
How do I run the inbound-mcp server?
Use the provided commands in the 'Running the Server' section of the documentation.
Can inbound-mcp output structured lead data?
Yes, it generates structured lead information suitable for CRM ingestion or further AI processing.
How can I troubleshoot issues with inbound-mcp?
Refer to the 'Troubleshooting' section in the documentation for common problems and solutions.