Building Reliable Automation Agents with MCP and CloudBrowser AI

Aug 11, 2025

Modern agents need reliable ways to interact with web apps that don’t expose APIs. The Model Context Protocol (MCP) lets you define clear tools and schemas for agents, while CloudBrowser AI executes robust, auditable browser actions.

Design principles

Minimal, composable tools: one tool per intent (login, extract, submit).
Deterministic outputs: strict JSON schemas and validation.
Observability first: structured logs, metrics, and evidence on errors.
Idempotence: safe retries and state checks to avoid double actions.

Reference tool design

Define a tool that navigates to a URL, waits for stability, extracts fields, and returns typed JSON with evidence links.

{
  "name": "browse_extract",
  "description": "Navigate to a page and extract fields by selectors",
  "inputSchema": {
    "type": "object",
    "properties": {
      "url": { "type": "string" },
      "fields": { "type": "object", "additionalProperties": { "type": "string" } },
      "assert": { "type": "string" }
    },
    "required": ["url", "fields"]
  }
}

The tool calls CloudBrowser AI’s execution API with a step list and returns validated data.
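
As a rough illustration of that translation, the sketch below turns a browse_extract input into an ordered step list. The step vocabulary (navigate, wait_for_stable, assert, extract) and the build_steps helper are hypothetical names chosen for this example; they are not taken from CloudBrowser AI's documentation, so check the real execution API before relying on them.

```python
from typing import Any


def build_steps(tool_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn a browse_extract input into an ordered step list.

    The step names used here are assumptions made for illustration,
    not CloudBrowser AI's actual API.
    """
    url = tool_input.get("url")
    fields = tool_input.get("fields")
    if not isinstance(url, str) or not url:
        raise ValueError("'url' must be a non-empty string")
    if not isinstance(fields, dict) or not fields:
        raise ValueError("'fields' must map field names to selectors")

    steps: list[dict[str, Any]] = [
        {"action": "navigate", "url": url},
        {"action": "wait_for_stable"},
    ]
    # The optional assertion runs before extraction so a wrong page fails fast.
    if "assert" in tool_input:
        steps.append({"action": "assert", "expression": tool_input["assert"]})
    for name, selector in fields.items():
        steps.append({"action": "extract", "field": name, "selector": selector})
    return steps
```

Keeping this mapping in a pure function makes it easy to unit-test the tool without a browser or network access.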

Error handling patterns

Distinguish transient errors (timeouts, navigation failures) from logical errors (missing field) and branch the agent’s plan accordingly.
Use a retry budget with exponential backoff.
Attach evidence to each failure for easy human triage.
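
The three patterns above can be sketched together as follows. This is a minimal illustration, not a prescribed implementation: the exception classes, the evidence attribute, and the default budget of four attempts are all assumptions made for the example.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class AgentError(Exception):
    """Base error carrying an optional evidence reference for human triage."""

    def __init__(self, message: str, evidence: Optional[str] = None):
        super().__init__(message)
        self.evidence = evidence


class TransientError(AgentError):
    """Timeouts and navigation failures: worth retrying."""


class LogicalError(AgentError):
    """Missing field or failed assertion: the plan must change instead."""


def run_with_retries(
    action: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry transient failures with exponential backoff and jitter.

    Logical errors are not caught, so they propagate on the first attempt.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except TransientError:
            if attempt == max_attempts:
                raise  # budget spent: surface the last error with its evidence
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("unreachable")
```

Passing sleep as a parameter is a design choice that lets tests run without real delays.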

State and sessions

Persist session cookies when the flow includes logins or MFA.
Create a session manager tool (get, set, clear) to isolate auth from business actions.
Use per-tenant storage when running multi-tenant agents.
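
A session manager along these lines might look like the sketch below. It is an in-memory stand-in under stated assumptions: the SessionStore class and the (tenant, site) key are hypothetical, and a real deployment would presumably back this with encrypted, persistent storage.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SessionStore:
    """In-memory cookie store keyed by (tenant, site) to keep tenants isolated."""

    _sessions: dict[tuple[str, str], dict[str, str]] = field(default_factory=dict)

    def get(self, tenant: str, site: str) -> Optional[dict[str, str]]:
        cookies = self._sessions.get((tenant, site))
        # Return a copy so callers cannot mutate stored state by accident.
        return dict(cookies) if cookies is not None else None

    def set(self, tenant: str, site: str, cookies: dict[str, str]) -> None:
        self._sessions[(tenant, site)] = dict(cookies)

    def clear(self, tenant: str, site: str) -> None:
        self._sessions.pop((tenant, site), None)
```

Exposing only get, set, and clear as tools keeps credentials handling out of the business tools such as browse_extract.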

Schema-driven prompts

Guide the LLM with concrete contracts:

{
  "type": "object",
  "properties": {
    "company": { "type": "string" },
    "price": { "type": "string" },
    "inStock": { "type": "boolean" }
  },
  "required": ["company"]
}

Ask the agent to return only valid JSON matching the schema. Reject and retry on validation failure.
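
The reject-and-retry loop can be sketched as below. The validator is deliberately tiny and only covers the subset of JSON Schema used in the contract above (required keys plus string and boolean types); a full validator library would be the better choice in practice. The ask callback and the function names are hypothetical.

```python
import json
from typing import Any, Callable

SCHEMA = {
    "type": "object",
    "properties": {
        "company": {"type": "string"},
        "price": {"type": "string"},
        "inStock": {"type": "boolean"},
    },
    "required": ["company"],
}

# Only the types that appear in the schema above are supported.
_PY_TYPES = {"string": str, "boolean": bool}


def validation_errors(raw: str, schema: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the output is valid."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be an object"]
    errors = [
        f"missing required field '{key}'"
        for key in schema.get("required", [])
        if key not in data
    ]
    for key, spec in schema.get("properties", {}).items():
        if key in data and not isinstance(data[key], _PY_TYPES[spec["type"]]):
            errors.append(f"field '{key}' must be of type {spec['type']}")
    return errors


def ask_until_valid(
    ask: Callable[[list[str]], str],
    schema: dict[str, Any],
    max_attempts: int = 3,
) -> dict[str, Any]:
    """Call the model until its output validates, feeding back the errors."""
    feedback: list[str] = []
    for _ in range(max_attempts):
        raw = ask(feedback)
        feedback = validation_errors(raw, schema)
        if not feedback:
            return json.loads(raw)
    raise ValueError(f"no valid output after {max_attempts} attempts: {feedback}")
```

Passing the previous errors back into ask gives the model something concrete to correct on the next attempt.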

Observability and SLOs

Track success rate, median and p95 latency, error categories, and evidence rate.
Set SLOs (e.g., 99% success on stable targets; p95 execution time under 10 s for extracts).
Alert on burn rates and regression spikes.
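
To make these metrics concrete, here is a small sketch that computes a nearest-rank percentile and a burn rate from raw run data. The function names are hypothetical, and burn rate is taken here to mean the observed error rate divided by the error budget (1 minus the SLO target), which is one common definition.

```python
import math
from statistics import median


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    if not values:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def success_rate(successes: int, total: int) -> float:
    return successes / total if total else 1.0


def burn_rate(successes: int, total: int, slo_target: float) -> float:
    """Observed error rate relative to the error budget.

    A value above 1.0 means the budget is being spent faster than the SLO allows.
    """
    if total == 0:
        return 0.0
    error_budget = 1.0 - slo_target
    return ((total - successes) / total) / error_budget


latencies = [float(n) for n in range(1, 21)]  # 20 sample runs, 1 s to 20 s
p50 = median(latencies)
p95 = percentile(latencies, 95)
```

With a 99% target, 97 successes out of 100 runs gives a burn rate of about 3, meaning the error budget is being consumed roughly three times too fast.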

Security and governance

Keep secrets in a vault; never print them in traces.
Respect robots.txt and terms of service.
Apply allowlists for target domains and rate limits per site.
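
The allowlist and per-site rate limit could be enforced by a small gate in front of every browser call, as in the sketch below. The TargetPolicy class and its minimum-interval rule are assumptions for illustration; a token bucket or a shared store would be more typical once several workers are involved.

```python
import time
from typing import Callable, Iterable
from urllib.parse import urlsplit


class TargetPolicy:
    """Allow a request only if the host is allowlisted and not called too recently."""

    def __init__(
        self,
        allowed_hosts: Iterable[str],
        min_interval_s: float,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._allowed = {host.lower() for host in allowed_hosts}
        self._min_interval = min_interval_s
        self._clock = clock
        self._last_request: dict[str, float] = {}

    def check(self, url: str) -> tuple[bool, str]:
        """Return (allowed, reason) and record the request time when allowed."""
        host = (urlsplit(url).hostname or "").lower()
        if host not in self._allowed:
            return False, "host not on allowlist"
        now = self._clock()
        last = self._last_request.get(host)
        if last is not None and now - last < self._min_interval:
            return False, "rate limit"
        self._last_request[host] = now
        return True, "ok"
```

Matching on the exact hostname is intentional here: subdomains must be listed explicitly rather than being allowed by a suffix match.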

Example end-to-end flow

Agent receives a task ("get current price from /pricing").
Calls session_get → login_if_needed → browse_extract.
Validates the result against the schema; if fields are missing, retries with a different selector strategy.
Returns structured JSON and attaches evidence.
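
The flow above can be sketched as a small orchestration function. The three tools are passed in as plain callables so the example stays self-contained; their signatures, the result shape with data and evidence keys, and the run_price_task name are all assumptions, not part of MCP or CloudBrowser AI.

```python
from typing import Any, Callable, Optional

Session = dict[str, str]
Fields = dict[str, str]


def run_price_task(
    url: str,
    session_get: Callable[[], Optional[Session]],
    login_if_needed: Callable[[Optional[Session]], Session],
    browse_extract: Callable[[str, Fields, Session], dict[str, Any]],
    selector_strategies: list[Fields],
) -> dict[str, Any]:
    """Run session_get, login_if_needed, then browse_extract with fallbacks."""
    session = login_if_needed(session_get())
    evidence = None
    for fields in selector_strategies:
        result = browse_extract(url, fields, session)
        evidence = result.get("evidence")
        data = result.get("data", {})
        # Accept the first strategy that yields a value for every requested field.
        if all(data.get(name) for name in fields):
            return {"data": data, "evidence": evidence}
    raise LookupError(f"all selector strategies failed; last evidence: {evidence}")
```

Each entry in selector_strategies is an alternative mapping of field names to selectors, tried in order until one returns every field.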

FAQs

Can MCP agents fill forms and upload files?

Yes. Use CloudBrowser AI steps for clicks, inputs, file uploads, and assertions.

How do I prevent loops?

Set a maximum plan depth and a retry budget. Persist breadcrumbs (a record of the actions already attempted) in the agent's memory, and define explicit abort conditions.
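
One way to combine these safeguards is a guard object the agent consults before every step, sketched below. The LoopGuard class, its default limits, and the choice to count any repeated (action, target) pair against the retry budget are assumptions for illustration.

```python
from dataclasses import dataclass, field


class AbortPlan(Exception):
    """Raised when the agent should stop and report instead of continuing."""


@dataclass
class LoopGuard:
    max_depth: int = 8
    retry_budget: int = 5
    breadcrumbs: set[tuple[str, str]] = field(default_factory=set)
    retries_used: int = 0

    def enter(self, depth: int, action: str, target: str) -> None:
        """Call before each step; raises AbortPlan when a limit is hit."""
        if depth > self.max_depth:
            raise AbortPlan("maximum plan depth exceeded")
        crumb = (action, target)
        if crumb in self.breadcrumbs:
            # Repeating an earlier step spends one unit of the retry budget.
            self.retries_used += 1
            if self.retries_used > self.retry_budget:
                raise AbortPlan("retry budget exhausted")
        self.breadcrumbs.add(crumb)
```

Raising a dedicated exception gives the agent runtime a single place to attach evidence and hand the task to a human.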

What about CAPTCHAs?

Design flows to avoid triggering them. If unavoidable, add a human-in-the-loop escalation.