## Table of Contents

- [Overview](#overview)
- [Features](#features)
- [Architecture](#architecture)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Configuration](#configuration)
- [Running the Server](#running-the-server)
- [API Documentation](#api-documentation)
- [Examples](#examples)
- [Advanced Configuration](#advanced-configuration)
- [Troubleshooting](#troubleshooting)
- [Contributing](#contributing)
- [License](#license)
- [Roadmap](#roadmap)
- [Support](#support)
## Overview

A production-grade lead generation system built on:

- MCP Python SDK for protocol-compliant AI services
- Crawl4AI for intelligent web crawling
- AsyncIO for high-concurrency operations

Implements a full lead lifecycle from discovery to enrichment (sketched below) with:

- UUID-based lead tracking
- Multi-source data aggregation
- Smart caching strategies
- Enterprise-grade error handling
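The lifecycle above implies a lead record shaped roughly like the following. This is a minimal sketch with hypothetical field and status names, not the server's actual schema:

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class LeadStatus(str, Enum):
    PENDING = "pending"      # created, awaiting crawl results
    ENRICHING = "enriching"  # enrichment calls in flight
    COMPLETE = "complete"    # all sources aggregated
    FAILED = "failed"        # unrecoverable error


@dataclass
class Lead:
    search_terms: str
    lead_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    status: LeadStatus = LeadStatus.PENDING
    sources: dict = field(default_factory=dict)  # per-service payloads
```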
## Features

| Feature | Tech Stack | Throughput |
|---|---|---|
| Lead Generation | Google CSE, Crawl4AI | 120 req/min |
| Data Enrichment | Hunter.io, Clearbit (HubSpot Breeze) | 80 req/min |
| LinkedIn Scraping | Playwright, Stealth Mode | 40 req/min |
| Caching | aiocache, Redis | 10K ops/sec |
| Monitoring | Prometheus, Custom Metrics | Real-time |
## Architecture

```mermaid
graph TD
    A[Client] --> B[MCP Server]
    B --> C{Lead Manager}
    C --> D[Google CSE]
    C --> E[Crawl4AI]
    C --> F[Hunter.io]
    C --> G[Clearbit]
    C --> H[LinkedIn Scraper]
    C --> I[(Redis Cache)]
    C --> J[Lead Store]
```
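The `Lead Manager` node fans out to the enrichment services concurrently, which is where the AsyncIO concurrency mentioned in the overview pays off. A minimal sketch of that pattern with `asyncio.gather`, using hypothetical `fetch_*` stand-ins for the real service clients:

```python
import asyncio


async def fetch_hunter(domain: str) -> dict:
    await asyncio.sleep(0.1)  # placeholder for the real Hunter.io call
    return {"source": "hunter", "domain": domain}


async def fetch_clearbit(domain: str) -> dict:
    await asyncio.sleep(0.1)  # placeholder for the real Clearbit call
    return {"source": "clearbit", "domain": domain}


async def enrich(domain: str) -> list[dict]:
    # return_exceptions=True keeps one failing source from sinking the rest
    results = await asyncio.gather(
        fetch_hunter(domain), fetch_clearbit(domain), return_exceptions=True
    )
    return [r for r in results if not isinstance(r, Exception)]


print(asyncio.run(enrich("example.com")))
```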
## Prerequisites

- Python 3.10+
- API keys, exported as environment variables (a quick check script follows this list):

  ```bash
  export HUNTER_API_KEY="your_key"
  export CLEARBIT_API_KEY="your_key"
  export GOOGLE_CSE_ID="your_id"
  export GOOGLE_API_KEY="your_key"
  ```

- LinkedIn session cookie (for scraping)
- 4GB+ RAM (8GB recommended for heavy scraping)
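A quick way to confirm the keys are set before starting the server; a minimal sketch using the variable names from the exports above:

```python
import os
import sys

# Names match the exports listed in the prerequisites
REQUIRED = ["HUNTER_API_KEY", "CLEARBIT_API_KEY", "GOOGLE_CSE_ID", "GOOGLE_API_KEY"]

missing = [name for name in REQUIRED if not os.environ.get(name)]
if missing:
    sys.exit(f"Missing environment variables: {', '.join(missing)}")
print("All required API keys are set.")
```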
## Installation

```bash
# Create virtual environment
python -m venv .venv && source .venv/bin/activate

# Install with production dependencies
pip install mcp "crawl4ai[all]" aiocache aiohttp uvloop

# Set up browser dependencies
python -m playwright install chromium
```

Alternatively, build with Docker:

```dockerfile
FROM python:3.10-slim

RUN apt-get update && apt-get install -y \
    gcc \
    libpython3-dev \
    chromium \
    && rm -rf /var/lib/apt/lists/*

COPY . /app
WORKDIR /app

RUN pip install --no-cache-dir -r requirements.txt

CMD ["python", "-m", "mcp", "run", "lead_server.py"]
```
## Configuration

Create a `config.yaml`:

```yaml
services:
  hunter:
    api_key: ${HUNTER_API_KEY}
    rate_limit: 50/60s
  clearbit:
    api_key: ${CLEARBIT_API_KEY}
    cache_ttl: 86400

scraping:
  stealth_mode: true
  headless: true
  timeout: 30
  max_retries: 3

cache:
  backend: redis://localhost:6379/0
  default_ttl: 3600
```
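Note that `${VAR}` placeholders are not expanded by YAML itself. A minimal loader sketch (assuming PyYAML) that substitutes environment variables at load time:

```python
import os
import re

import yaml  # pip install pyyaml

_ENV_PATTERN = re.compile(r"\$\{(\w+)\}")


def load_config(path: str = "config.yaml") -> dict:
    with open(path) as f:
        raw = f.read()
    # Replace each ${NAME} with os.environ["NAME"]; raises KeyError if unset
    expanded = _ENV_PATTERN.sub(lambda m: os.environ[m.group(1)], raw)
    return yaml.safe_load(expanded)


config = load_config()
print(config["services"]["hunter"]["rate_limit"])
```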
## Running the Server

Development (hot reload):

```bash
mcp dev lead_server.py --reload --port 8080
```

Production:

```bash
gunicorn -w 4 -k uvicorn.workers.UvicornWorker lead_server:app
```

Docker:

```bash
docker build -t lead-server .
docker run -p 8080:8080 -e HUNTER_API_KEY=your_key lead-server
```
## API Documentation

### Generate a lead

```http
POST /tools/lead_generation
Content-Type: application/json

{
  "search_terms": "OpenAI"
}
```

Response:

```json
{
  "lead_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "estimated_time": 15
}
```

### Enrich a lead

```http
POST /tools/data_enrichment
Content-Type: application/json

{
  "lead_id": "550e8400-e29b-41d4-a716-446655440000"
}
```

### Run lead maintenance

```http
GET /tools/lead_maintenance
```
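Since `lead_generation` returns immediately with `"status": "pending"`, clients typically poll until enrichment finishes. A sketch against the endpoints above; the `POST /tools/lead_status` route and the poll interval are assumptions, not documented behavior:

```python
import asyncio

import aiohttp

BASE = "http://localhost:8080"


async def wait_for_lead(search_terms: str) -> dict:
    async with aiohttp.ClientSession() as session:
        # Kick off lead generation and grab the tracking UUID
        async with session.post(f"{BASE}/tools/lead_generation",
                                json={"search_terms": search_terms}) as resp:
            lead_id = (await resp.json())["lead_id"]

        # Poll until the lead leaves the "pending" state
        while True:
            async with session.post(f"{BASE}/tools/lead_status",
                                    json={"lead_id": lead_id}) as resp:
                status = await resp.json()
            if status.get("status") != "pending":
                return status
            await asyncio.sleep(5)  # roughly in line with the estimated_time hint


print(asyncio.run(wait_for_lead("OpenAI")))
```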
## Examples

### Python client

```python
import asyncio

from mcp.client import Client


async def main():
    async with Client() as client:
        # Generate lead
        lead = await client.call_tool(
            "lead_generation",
            {"search_terms": "Anthropic"}
        )

        # Enrich with all services
        enriched = await client.call_tool(
            "data_enrichment",
            {"lead_id": lead["lead_id"]}
        )

        # Get full lead data
        status = await client.call_tool(
            "lead_status",
            {"lead_id": lead["lead_id"]}
        )


asyncio.run(main())
```
### cURL

```bash
# Generate lead
curl -X POST http://localhost:8080/tools/lead_generation \
  -H "Content-Type: application/json" \
  -d '{"search_terms": "Cohere AI"}'
```
## Advanced Configuration

### Redis cluster caching

```python
from aiocache import Cache

# Configure Redis cluster
cache = Cache.from_url(
    "redis://cluster-node1:6379/0",
    timeout=10,
    retry=True,
    retry_timeout=2
)
```
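For per-call caching with the same library, aiocache's `@cached` decorator keeps a coroutine's results for a TTL so repeat lookups skip the paid API. `fetch_company` here is a hypothetical stand-in, not the server's real client:

```python
import asyncio

from aiocache import cached


@cached(ttl=3600)  # in-memory by default; point at Redis in production
async def fetch_company(domain: str) -> dict:
    await asyncio.sleep(0.1)  # stand-in for the real Clearbit/Hunter request
    return {"domain": domain}


async def main():
    await fetch_company("example.com")  # performs the "API call"
    await fetch_company("example.com")  # served from cache within the TTL


asyncio.run(main())
```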
### Rate limiting

```python
from mcp.server.middleware import RateLimiter

# `mcp` is the server instance created in lead_server.py
mcp.add_middleware(
    RateLimiter(
        rules={
            "lead_generation": "100/1m",
            "data_enrichment": "50/1m"
        }
    )
)
```
## Troubleshooting

| Error | Solution |
|---|---|
| 403 Forbidden from Google | Rotate IPs or use the official CSE API |
| 429 Too Many Requests | Implement exponential backoff (see sketch below) |
| Playwright timeout | Increase `scraping.timeout` in config |
| Cache miss | Verify Redis connection and TTL settings |
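The exponential backoff suggested for 429s might look like this; a minimal sketch, not the server's built-in retry logic:

```python
import asyncio
import random


async def with_backoff(coro_factory, max_tries: int = 5, base: float = 1.0):
    """Retry an awaitable built by coro_factory with doubling delays plus jitter."""
    for attempt in range(max_tries):
        try:
            return await coro_factory()
        except Exception:  # narrow this to your HTTP client's 429 error in practice
            if attempt == max_tries - 1:
                raise
            delay = base * 2 ** attempt + random.uniform(0, 1)
            await asyncio.sleep(delay)
```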
## Contributing

1. Fork the repository
2. Create a feature branch: `git checkout -b feature/new-enrichment`
3. Commit your changes: `git commit -am 'Add Clearbit alternative'`
4. Push to the branch: `git push origin feature/new-enrichment`
5. Submit a pull request
## License

Apache 2.0 - See LICENSE for details.
## Roadmap

- Q2 2025: AI-powered lead scoring
- Q3 2025: Distributed crawling cluster support
## Support

For enterprise support and custom integrations:

- 📧 Email: hi@kobotai.co
- 🐦 Twitter:
## Benchmarks

```bash
# Run benchmark tests
pytest tests/ --benchmark-json=results.json
```