nova-act MCP Server

Use This MCP server To

Automate multi-step web browsing workflows via MCP agents
Capture on-demand browser screenshots for visual context
Control browser sessions programmatically using Amazon Nova Act SDK
Reduce token usage by selectively capturing visual feedback
Integrate browser automation into AI-enhanced workflows
Improve agent performance with smaller response payloads

README

nova-act-mcp

nova-act-mcp-server is a zero-install Model Context Protocol (MCP) server that exposes Amazon Nova Act browser-automation tools.

What's New in v3.0.0

On-Demand Screenshots: New inspect_browser tool to explicitly request screenshots only when needed
Reduced Token Usage: Browser actions no longer automatically include screenshots, saving context space
More Efficient Workflows: Agents can now control when to get visual feedback
Better Performance: Smaller response payloads improve overall agent experience

New `inspect_browser` Tool Example

# Start a browser session
start_result = await control_browser(action="start", url="https://example.com")
session_id = start_result["session_id"]

# Execute an action without getting a screenshot
execute_result = await control_browser(
    action="execute",
    session_id=session_id,
    instruction="Click on the 'More information...' link"
)

# Now explicitly request a screenshot to see the result
inspect_result = await inspect_browser(session_id=session_id)

# Example output from inspect_browser:
{
  "session_id": "f8a53291-b3a7-4e1e-8c9d-9a12b3c45d67",
  "current_url": "https://www.iana.org/domains/reserved",
  "page_title": "IANA — IANA-managed Reserved Domains",
  "content": [
    {
      "type": "image_base64",
      "data": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQEASABIAAD/2wBDAAMCA...",
      "caption": "Current viewport"
    },
    {
      "type": "text",
      "text": "Current URL: https://www.iana.org/domains/reserved\nPage Title: IANA — IANA-managed Reserved Domains"
    }
  ],
  "agent_thinking": [],
  "success": true
}
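
The content array in the output above mixes image and text entries, so a client has to pick out the screenshot itself. A minimal sketch of one way to do that follows. The helper name extract_screenshot is hypothetical, and the sketch assumes the data field is either a data URI like the one shown or bare base64.

```python
import base64
from typing import Optional


def extract_screenshot(result: dict) -> Optional[bytes]:
    """Return the decoded bytes of the first image_base64 entry, or None."""
    for item in result.get("content", []):
        if item.get("type") != "image_base64":
            continue
        data = item.get("data", "")
        # Strip a "data:image/jpeg;base64," style prefix if one is present.
        if data.startswith("data:") and "," in data:
            data = data.split(",", 1)[1]
        return base64.b64decode(data)
    # No image entry was found in the response.
    return None
```

The returned bytes can be written to a .jpg file or handed to an image library. A None result means the response carried no image entry.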

What's New in v0.2.9

Improved Screenshot Reliability: More dependable screenshot delivery in responses
Enhanced Log Path Discovery: Smart, efficient path tracking for logs and screenshots
Better Agent Communication: Clear messaging when screenshots can't be embedded
Improved Performance: Eliminated inefficient directory scanning for faster responses

What's New in v0.2.8

Enhanced Inline Screenshots: Screenshots now appear directly in the response content array
Improved compatibility with vision-capable models like Claude
Screenshots include descriptive captions based on the executed instruction
Each screenshot is delivered as { type: "image_base64", data: "..." } in the content array

What's New in v0.2.7

Automatic Inline Screenshots: Every browser action now includes an optimized screenshot
Improved screenshot quality and reliability for AI agents
Added environment variables to customize screenshot quality and size limits
Comprehensive test coverage ensuring screenshots work in all scenarios

New Feature: Inline Screenshots

As of v0.2.7, every successful execute response contains inline_screenshot, a base64-encoded JPEG of the current viewport (v3.0.0 later made screenshots on-demand through inspect_browser):

Quality ≈ 45, hard-capped at 250 KB (configurable via the NOVA_MCP_MAX_INLINE_IMG environment variable)
If the raw JPEG is larger than the cap, the field is null
No extra API calls needed - screenshots are included automatically
For full-resolution images and HAR/HTML logs, use the compress_logs tool
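
The size cap described above amounts to a simple check. The following is a sketch of that rule, not the server's actual code. It assumes NOVA_MCP_MAX_INLINE_IMG holds a plain byte count and that 250 KB means 250 × 1024 bytes. The function name inline_or_none is made up for illustration.

```python
import base64
import os
from typing import Optional

DEFAULT_CAP = 250 * 1024  # assumed interpretation of "250 KB"


def inline_or_none(jpeg_bytes: bytes, cap: Optional[int] = None) -> Optional[str]:
    """Return the JPEG as a base64 string, or None if it exceeds the cap."""
    if cap is None:
        # Assumption: the env variable is an integer number of bytes.
        cap = int(os.environ.get("NOVA_MCP_MAX_INLINE_IMG", DEFAULT_CAP))
    if len(jpeg_bytes) > cap:
        # Oversized image: the field is null rather than truncated.
        return None
    return base64.b64encode(jpeg_bytes).decode("ascii")
```

A client reading such a response should therefore treat a null inline_screenshot as "image too large to embed" and not as a failed action.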

What's New in v0.2.6

Added compatibility with NovaAct SDK 0.9+ by normalizing log directory handling
Improved test organization with clear markers for unit, mock, smoke and e2e tests
Moved mock HTML creation logic from production code to test helpers
Fixed several syntax errors and incomplete code blocks
Added SCREENSHOT_QUALITY constant for consistent compression settings

Quick start (uvx)

Add it to your MCP client configuration:

{
  "mcpServers": {
    "nova-act-mcp-server": {
      "command": "uvx",
      "args": ["nova-act-mcp-server@latest"],
      "env": { "NOVA_ACT_API_KEY": "<your_api_key>" }
    }
  }
}

That's all you need to start controlling browsers from any MCP‑compatible client such as Claude Desktop or VS Code.
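
If you manage the client configuration from a script, the entry shown above can be merged in without disturbing other configured servers. The following is a minimal sketch. The function name add_nova_server is hypothetical, and the configuration is handled as an in-memory dict because the location of the config file depends on the client.

```python
def add_nova_server(config: dict, api_key: str) -> dict:
    """Return a copy of config with the nova-act-mcp-server entry added."""
    merged = dict(config)
    # Copy the server map so the caller's dict is left unmodified.
    servers = dict(merged.get("mcpServers", {}))
    servers["nova-act-mcp-server"] = {
        "command": "uvx",
        "args": ["nova-act-mcp-server@latest"],
        "env": {"NOVA_ACT_API_KEY": api_key},
    }
    merged["mcpServers"] = servers
    return merged
```

Load the client's existing JSON with json.load, pass it through this function, and write the result back with json.dump.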

Local development (optional)

git clone https://github.com/madtank/nova-act-mcp.git
cd nova-act-mcp
uv sync
uv run nova_mcp.py

License

MIT

nova-act-mcp FAQ

How does nova-act-mcp reduce token usage during browser automation?

It lets agents request screenshots on demand through inspect_browser instead of attaching one to every browser action, so image data enters the context only when the agent asks for it.

Can nova-act-mcp control multiple browser sessions simultaneously?

Yes, it supports managing multiple browser sessions via the Amazon Nova Act SDK, enabling complex multi-step workflows.

What is the inspect_browser tool in nova-act-mcp?

The inspect_browser tool returns the current URL, page title, and a screenshot of the viewport for a given session, so agents can request visual feedback only when they need it.

How does nova-act-mcp improve agent performance?

Browser actions return smaller payloads because screenshots are no longer attached automatically, and the agent decides when visual feedback is worth the extra tokens.

Is nova-act-mcp compatible with different LLM providers?

Yes, it is designed to work with various LLMs, including OpenAI models, Anthropic Claude, and Google Gemini, via MCP.

What programming languages or environments support nova-act-mcp?

It is primarily a Python-based MCP server but can be integrated into any environment supporting MCP clients and Python interoperability.

How do I start a browser session using nova-act-mcp?

Call the control_browser tool with action="start" and the target URL. The response includes a session_id that you pass to later calls, as in the inspect_browser example above.

Does nova-act-mcp require installation?

No separate installation step is needed. The MCP client launches the server through uvx, as shown in the quick start configuration, so the only prerequisite is having uv available on the machine.
