## Table of Contents

- [Overview](#overview)
- [Features](#features)
- [Architecture](#architecture)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Configuration](#configuration)
- [Running the Server](#running-the-server)
- [API Documentation](#api-documentation)
- [Examples](#examples)
- [Advanced Configuration](#advanced-configuration)
- [Troubleshooting](#troubleshooting)
- [Contributing](#contributing)
- [License](#license)
- [Roadmap](#roadmap)
- [Support](#support)
## Overview

A production-grade lead generation system built on:

- MCP Python SDK for protocol-compliant AI services
- Crawl4AI for intelligent web crawling
- AsyncIO for high-concurrency operations

Implements a full lead lifecycle from discovery to enrichment (sketched below) with:

- UUID-based lead tracking
- Multi-source data aggregation
- Smart caching strategies
- Enterprise-grade error handling
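The lifecycle above implies a lead record shaped roughly like the following. This is a minimal sketch with hypothetical field and status names, not the server's actual schema:

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class LeadStatus(str, Enum):
    PENDING = "pending"      # created, awaiting crawl results
    ENRICHING = "enriching"  # enrichment calls in flight
    COMPLETE = "complete"    # all sources aggregated
    FAILED = "failed"        # unrecoverable error


@dataclass
class Lead:
    search_terms: str
    lead_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    status: LeadStatus = LeadStatus.PENDING
    sources: dict = field(default_factory=dict)  # per-service payloads
```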
## Features

| Feature | Tech Stack | Throughput |
|---|---|---|
| Lead Generation | Google CSE, Crawl4AI | 120 req/min |
| Data Enrichment | Hunter.io, Clearbit (HubSpot Breeze) | 80 req/min |
| LinkedIn Scraping | Playwright, Stealth Mode | 40 req/min |
| Caching | aiocache, Redis | 10K ops/sec |
| Monitoring | Prometheus, Custom Metrics | Real-time |
## Architecture

```mermaid
graph TD
    A[Client] --> B[MCP Server]
    B --> C{Lead Manager}
    C --> D[Google CSE]
    C --> E[Crawl4AI]
    C --> F[Hunter.io]
    C --> G[Clearbit]
    C --> H[LinkedIn Scraper]
    C --> I[(Redis Cache)]
    C --> J[Lead Store]
```
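The `Lead Manager` node fans out to the enrichment services concurrently, which is where the AsyncIO concurrency mentioned in the overview pays off. A minimal sketch of that pattern with `asyncio.gather`, using hypothetical `fetch_*` stand-ins for the real service clients:

```python
import asyncio


async def fetch_hunter(domain: str) -> dict:
    await asyncio.sleep(0.1)  # placeholder for the real Hunter.io call
    return {"source": "hunter", "domain": domain}


async def fetch_clearbit(domain: str) -> dict:
    await asyncio.sleep(0.1)  # placeholder for the real Clearbit call
    return {"source": "clearbit", "domain": domain}


async def enrich(domain: str) -> list[dict]:
    # return_exceptions=True keeps one failing source from sinking the rest
    results = await asyncio.gather(
        fetch_hunter(domain), fetch_clearbit(domain), return_exceptions=True
    )
    return [r for r in results if not isinstance(r, Exception)]


print(asyncio.run(enrich("example.com")))
```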
## Prerequisites

- Python 3.10+
- API keys, exported as environment variables (a quick check script follows this list):

  ```bash
  export HUNTER_API_KEY="your_key"
  export CLEARBIT_API_KEY="your_key"
  export GOOGLE_CSE_ID="your_id"
  export GOOGLE_API_KEY="your_key"
  ```

- LinkedIn session cookie (for scraping)
- 4GB+ RAM (8GB recommended for heavy scraping)
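A quick way to confirm the keys are set before starting the server; a minimal sketch using the variable names from the exports above:

```python
import os
import sys

# Names match the exports listed in the prerequisites
REQUIRED = ["HUNTER_API_KEY", "CLEARBIT_API_KEY", "GOOGLE_CSE_ID", "GOOGLE_API_KEY"]

missing = [name for name in REQUIRED if not os.environ.get(name)]
if missing:
    sys.exit(f"Missing environment variables: {', '.join(missing)}")
print("All required API keys are set.")
```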
## Installation

```bash
# Create virtual environment
python -m venv .venv && source .venv/bin/activate

# Install with production dependencies
pip install mcp "crawl4ai[all]" aiocache aiohttp uvloop

# Set up browser dependencies
python -m playwright install chromium
```

Alternatively, build with Docker:

```dockerfile
FROM python:3.10-slim

RUN apt-get update && apt-get install -y \
    gcc \
    libpython3-dev \
    chromium \
    && rm -rf /var/lib/apt/lists/*

COPY . /app
WORKDIR /app

RUN pip install --no-cache-dir -r requirements.txt

CMD ["python", "-m", "mcp", "run", "lead_server.py"]
```
## Configuration

Create a `config.yaml`:

```yaml
services:
  hunter:
    api_key: ${HUNTER_API_KEY}
    rate_limit: 50/60s
  clearbit:
    api_key: ${CLEARBIT_API_KEY}
    cache_ttl: 86400

scraping:
  stealth_mode: true
  headless: true
  timeout: 30
  max_retries: 3

cache:
  backend: redis://localhost:6379/0
  default_ttl: 3600
```
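Note that `${VAR}` placeholders are not expanded by YAML itself. A minimal loader sketch (assuming PyYAML) that substitutes environment variables at load time:

```python
import os
import re

import yaml  # pip install pyyaml

_ENV_PATTERN = re.compile(r"\$\{(\w+)\}")


def load_config(path: str = "config.yaml") -> dict:
    with open(path) as f:
        raw = f.read()
    # Replace each ${NAME} with os.environ["NAME"]; raises KeyError if unset
    expanded = _ENV_PATTERN.sub(lambda m: os.environ[m.group(1)], raw)
    return yaml.safe_load(expanded)


config = load_config()
print(config["services"]["hunter"]["rate_limit"])
```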
## Running the Server

Development (hot reload):

```bash
mcp dev lead_server.py --reload --port 8080
```

Production:

```bash
gunicorn -w 4 -k uvicorn.workers.UvicornWorker lead_server:app
```

Docker:

```bash
docker build -t lead-server .
docker run -p 8080:8080 -e HUNTER_API_KEY=your_key lead-server
```
## API Documentation

### Generate a lead

```http
POST /tools/lead_generation
Content-Type: application/json

{
  "search_terms": "OpenAI"
}
```

Response:

```json
{
  "lead_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "estimated_time": 15
}
```

### Enrich a lead

```http
POST /tools/data_enrichment
Content-Type: application/json

{
  "lead_id": "550e8400-e29b-41d4-a716-446655440000"
}
```

### Run lead maintenance

```http
GET /tools/lead_maintenance
```
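Since `lead_generation` returns immediately with `"status": "pending"`, clients typically poll until enrichment finishes. A sketch against the endpoints above; the `POST /tools/lead_status` route and the poll interval are assumptions, not documented behavior:

```python
import asyncio

import aiohttp

BASE = "http://localhost:8080"


async def wait_for_lead(search_terms: str) -> dict:
    async with aiohttp.ClientSession() as session:
        # Kick off lead generation and grab the tracking UUID
        async with session.post(f"{BASE}/tools/lead_generation",
                                json={"search_terms": search_terms}) as resp:
            lead_id = (await resp.json())["lead_id"]

        # Poll until the lead leaves the "pending" state
        while True:
            async with session.post(f"{BASE}/tools/lead_status",
                                    json={"lead_id": lead_id}) as resp:
                status = await resp.json()
            if status.get("status") != "pending":
                return status
            await asyncio.sleep(5)  # roughly in line with the estimated_time hint


print(asyncio.run(wait_for_lead("OpenAI")))
```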
## Examples

### Python client

```python
import asyncio

from mcp.client import Client


async def main():
    async with Client() as client:
        # Generate lead
        lead = await client.call_tool(
            "lead_generation",
            {"search_terms": "Anthropic"}
        )

        # Enrich with all services
        enriched = await client.call_tool(
            "data_enrichment",
            {"lead_id": lead["lead_id"]}
        )

        # Get full lead data
        status = await client.call_tool(
            "lead_status",
            {"lead_id": lead["lead_id"]}
        )


asyncio.run(main())
```
### cURL

```bash
# Generate lead
curl -X POST http://localhost:8080/tools/lead_generation \
  -H "Content-Type: application/json" \
  -d '{"search_terms": "Cohere AI"}'
```
## Advanced Configuration

### Redis cluster caching

```python
from aiocache import Cache

# Configure Redis cluster
cache = Cache.from_url(
    "redis://cluster-node1:6379/0",
    timeout=10,
    retry=True,
    retry_timeout=2
)
```
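For per-call caching with the same library, aiocache's `@cached` decorator keeps a coroutine's results for a TTL so repeat lookups skip the paid API. `fetch_company` here is a hypothetical stand-in, not the server's real client:

```python
import asyncio

from aiocache import cached


@cached(ttl=3600)  # in-memory by default; point at Redis in production
async def fetch_company(domain: str) -> dict:
    await asyncio.sleep(0.1)  # stand-in for the real Clearbit/Hunter request
    return {"domain": domain}


async def main():
    await fetch_company("example.com")  # performs the "API call"
    await fetch_company("example.com")  # served from cache within the TTL


asyncio.run(main())
```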
### Rate limiting

```python
from mcp.server.middleware import RateLimiter

# `mcp` is the server instance created in lead_server.py
mcp.add_middleware(
    RateLimiter(
        rules={
            "lead_generation": "100/1m",
            "data_enrichment": "50/1m"
        }
    )
)
```
## Troubleshooting

| Error | Solution |
|---|---|
| 403 Forbidden from Google | Rotate IPs or use the official CSE API |
| 429 Too Many Requests | Implement exponential backoff (see sketch below) |
| Playwright timeout | Increase `scraping.timeout` in config |
| Cache miss | Verify Redis connection and TTL settings |
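The exponential backoff suggested for 429s might look like this; a minimal sketch, not the server's built-in retry logic:

```python
import asyncio
import random


async def with_backoff(coro_factory, max_tries: int = 5, base: float = 1.0):
    """Retry an awaitable built by coro_factory with doubling delays plus jitter."""
    for attempt in range(max_tries):
        try:
            return await coro_factory()
        except Exception:  # narrow this to your HTTP client's 429 error in practice
            if attempt == max_tries - 1:
                raise
            delay = base * 2 ** attempt + random.uniform(0, 1)
            await asyncio.sleep(delay)
```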
## Contributing

1. Fork the repository
2. Create a feature branch: `git checkout -b feature/new-enrichment`
3. Commit your changes: `git commit -am 'Add Clearbit alternative'`
4. Push to the branch: `git push origin feature/new-enrichment`
5. Submit a pull request
## License

Apache 2.0 - See LICENSE for details.
## Roadmap

- Q2 2025: AI-powered lead scoring
- Q3 2025: Distributed crawling cluster support
## Support

For enterprise support and custom integrations:

- 📧 Email: hi@kobotai.co
- 🐦 Twitter:
## Benchmarks

```bash
# Run benchmark tests
pytest tests/ --benchmark-json=results.json
```