Visualizing AI Agents with Showboat and Rodney

Authors

Nino, Senior Tech Editor

The evolution of Large Language Models (LLMs) has transitioned rapidly from simple chat interfaces to autonomous 'agents' capable of interacting with the physical and digital world. However, as these agents move from generating text to performing complex browser-based tasks—like booking flights, scraping data, or managing SaaS dashboards—a significant problem has emerged: observability. When an agent fails, or even when it succeeds, it often happens in a 'black box' of headless browser instances. Developers and stakeholders need a way to see what the agent is doing in real-time or via high-quality replays. This is where tools like Showboat and Rodney come into play, revolutionizing how we demo and debug agentic workflows.

The Observability Crisis in Agentic Workflows

Traditional LLM applications are easy to debug; you look at the prompt, the completion, and perhaps the RAG (Retrieval-Augmented Generation) context. But AI agents, particularly those using tools like Playwright or Selenium, operate across multiple stateful steps. If an agent clicks the wrong button because a CSS selector changed, a text log might simply say 'Element not found.' This is insufficient for modern enterprise-grade AI development.

To build reliable agents, developers are increasingly turning to high-performance API aggregators like n1n.ai. By utilizing n1n.ai, developers can switch between models like Claude 3.5 Sonnet (renowned for its computer use capabilities) and GPT-4o-mini to find the right balance of cost and visual reasoning. But even with the best models, the visual feedback loop remains the missing link.

Introducing Showboat: The Automated Demo Engine

Showboat is a specialized tool designed to solve the 'how do I show people what my agent did?' problem. Built on top of Playwright, Showboat allows developers to record the browser interactions of an AI agent and transform them into polished, shareable video demos.

Unlike standard screen recording, Showboat is 'agent-aware.' It can highlight the specific elements the AI is looking at, display the 'thoughts' or 'chain-of-thought' reasoning as subtitles, and handle the asynchronous nature of web navigation. For developers using n1n.ai to power their agents, Showboat provides the visual evidence needed to prove that the LLM is following instructions correctly across different domains.

Key Features of Showboat:

  1. Headless-to-Video Pipeline: Automatically converts Playwright traces into MP4 or GIF formats.
  2. Visual Annotations: Overlays the agent's intent and target elements directly on the video frame.
  3. State Persistence: Captures the exact state of the DOM at every interaction point.
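The subtitle-overlay idea behind Visual Annotations can be approximated with standard tooling even without Showboat itself, whose API isn't documented in this article. The sketch below (hypothetical helper names) converts timestamped agent "thoughts" into an SRT subtitle file, which common video tools can burn into the recorded MP4.

```python
def to_srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(seconds * 1000)
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def thoughts_to_srt(thoughts: list[tuple[float, float, str]]) -> str:
    """Convert (start_sec, end_sec, text) agent thoughts into SRT subtitles."""
    blocks = []
    for i, (start, end, text) in enumerate(thoughts, start=1):
        blocks.append(
            f"{i}\n{to_srt_timestamp(start)} --> {to_srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)

srt = thoughts_to_srt([
    (0.0, 2.5, "Navigating to the search page."),
    (2.5, 5.0, "Filling the query field with 'AI Agents'."),
])
print(srt)
```

Pairing each subtitle window with the timestamps Playwright reports for each action keeps the overlay in sync with the recording.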

Rodney: The Agent Who Shows His Work

While Showboat is the engine, Rodney represents the implementation pattern. Rodney is a reference agent that demonstrates how to integrate browser-use capabilities with a feedback-driven UI. Rodney doesn't just perform a task; he 'performs' for the user. He explains why he is navigating to a specific URL and what he expects to find there.

This level of transparency is critical for user trust. In a corporate environment, a user is unlikely to let an agent touch a production database or a financial tool unless they can audit the process. Rodney shows that by combining robust LLM APIs from n1n.ai with visual logging, we can create agents that are both powerful and accountable.
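Rodney's habit of announcing the "why" behind every action can be captured as a structured, auditable step log. The schema below is a hypothetical sketch (the article doesn't specify Rodney's internals): each step records the action, its target, and the model's stated rationale, which is surfaced to the user as it happens.

```python
from dataclasses import dataclass, field

@dataclass
class AgentStep:
    """One auditable action: what the agent did and why (hypothetical schema)."""
    action: str   # e.g. "goto", "click", "fill"
    target: str   # URL or selector
    reason: str   # the model's stated rationale, shown to the user

@dataclass
class AuditLog:
    steps: list[AgentStep] = field(default_factory=list)

    def record(self, action: str, target: str, reason: str) -> AgentStep:
        step = AgentStep(action, target, reason)
        self.steps.append(step)
        # Surface the rationale immediately, Rodney-style
        print(f"[{action}] {target} -- {reason}")
        return step

log = AuditLog()
log.record("goto", "https://example.com",
           "Starting at the landing page to locate the search box.")
log.record("fill", "input[name='q']",
           "Entering the query the user asked for.")
```

A log like this doubles as the data source for subtitle overlays and for post-hoc compliance review.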

Technical Implementation: Building a Demo-Ready Agent

To implement a system similar to Rodney using Showboat, you need a robust backend. Below is a conceptual implementation using Python, Playwright, and an LLM endpoint.

import asyncio
from playwright.async_api import async_playwright
# Ensure you have your API key from n1n.ai configured

async def run_agent_demo(prompt):
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        # Showboat hook: record_video_dir tells Playwright to capture the session
        context = await browser.new_context(record_video_dir="videos/")
        page = await context.new_page()

        # Example: Using a model via n1n.ai to decide the next action
        # response = await call_n1n_api(model="claude-3-5-sonnet", prompt=prompt)

        # Placeholder URL and selectors -- swap in your real target site
        await page.goto("https://example.com")
        await page.fill("input[name='q']", "AI Agents")
        await page.click("button[type='submit']")

        # Log the 'thought' process for Showboat to overlay
        print("Agent Thought: Searching for AI Agents to demonstrate capability.")

        # Closing the context flushes the recording to disk
        await context.close()
        await browser.close()

asyncio.run(run_agent_demo("Search for AI Agents"))

# Pro Tip: Use n1n.ai for low-latency model switching during development
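The `call_n1n_api` helper above is left as a stub, and n1n.ai's request schema isn't documented in this article. As an assumption, many API aggregators expose an OpenAI-compatible chat-completions route, so a payload builder for that common shape might look like this (endpoint path and field names are all assumptions, not confirmed n1n.ai API):

```python
import json

# Assumed OpenAI-compatible route -- check your provider's docs
N1N_BASE_URL = "https://n1n.ai/v1/chat/completions"

def build_agent_request(model: str, prompt: str, page_text: str) -> dict:
    """Build a chat-completion payload asking the model for its next browser action.

    The schema follows the common OpenAI-compatible shape; this article does
    not document n1n.ai's actual API, so treat every field as an assumption.
    """
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "You are a browser agent. Reply with one action and a one-line reason."},
            {"role": "user",
             "content": f"Task: {prompt}\n\nVisible page text:\n{page_text}"},
        ],
        "temperature": 0,
    }

payload = build_agent_request("claude-3-5-sonnet", "Search for AI Agents", "Example Domain")
print(json.dumps(payload, indent=2))
```

Keeping payload construction in one function makes model switching a one-argument change, which is the point of routing through an aggregator.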

Why Latency and Reliability Matter

When recording a demo with Showboat, the 'vibe' of the video is dictated by the speed of the agent's decisions. If your LLM API takes 10 seconds to generate the next click action, the resulting video will feel sluggish and unrefined. This is why infrastructure matters.

By leveraging the global edge network of n1n.ai, developers can reduce the Time To First Token (TTFT), ensuring that the agent's movements in the recorded video appear fluid and human-like. Furthermore, n1n.ai provides a unified interface for models like DeepSeek-V3 and OpenAI o3, allowing you to test which model produces the most 'demo-worthy' browser logic without rewriting your entire integration layer.

Comparison: Showboat vs. Traditional Logging

Feature         | Traditional Logging            | Showboat + Rodney
----------------|--------------------------------|------------------------------
Format          | Text/JSON                      | MP4/GIF/Interactive Replay
Audience        | Developers only                | Stakeholders, Clients, QA
Context         | Abstract (Selectors/Strings)   | Visual (Actual UI Rendering)
Debugging       | Hard (Requires reconstruction) | Easy (Visual confirmation)
API Integration | Manual tracing                 | Native Playwright integration

Pro Tips for High-Quality Agent Demos

  1. Slow Down for Humans: AI agents can interact with DOM elements faster than the human eye can follow. When using Showboat, add artificial delays (e.g., 500ms) between actions so the video remains watchable.
  2. Dynamic Subtitles: Pass the 'reasoning' field from your LLM (especially if using models with internal monologues like OpenAI o1 or o3 via n1n.ai) into the video overlay.
  3. Resolution Consistency: Always set a fixed viewport size (e.g., 1280x720) to ensure your demos look professional across different platforms.
  4. Failure Analysis: Configure Showboat to save recordings only when an agent fails. This saves storage while providing a 'black box recorder' for your AI's mistakes.
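Tips 1 and 3 map directly onto real Playwright options: `slow_mo` (a per-action pause in milliseconds, passed at launch) and a fixed `viewport` plus `record_video_size` on the context. The helper names below are hypothetical; they just bundle those options so every demo run is configured the same way.

```python
def demo_launch_kwargs(slow_mo_ms: int = 500) -> dict:
    """Browser-launch kwargs: Playwright's slow_mo inserts a pause (in ms)
    before each action so the recording stays watchable (Tip 1)."""
    return {"headless": True, "slow_mo": slow_mo_ms}

def demo_context_kwargs(width: int = 1280, height: int = 720,
                        video_dir: str = "videos/") -> dict:
    """Context kwargs: fixed viewport (Tip 3) and a matching recording size
    so the output video has consistent dimensions."""
    return {
        "viewport": {"width": width, "height": height},
        "record_video_dir": video_dir,
        "record_video_size": {"width": width, "height": height},
    }

# Usage inside an async_playwright block:
#   browser = await p.chromium.launch(**demo_launch_kwargs())
#   context = await browser.new_context(**demo_context_kwargs())
print(demo_launch_kwargs(), demo_context_kwargs())
```

For Tip 4, note that Playwright only finalizes the video when the context closes; one approach is to delete the saved file on success (the async API exposes `page.video` for exactly this kind of cleanup) so only failure recordings accumulate.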

The Future of Agent Observability

As we move toward 'Small Language Models' (SLMs) running locally and massive 'Frontier Models' running in the cloud, the orchestration of these tools will become the primary challenge for AI engineers. Tools like Showboat and Rodney aren't just 'cool to have'; they are foundational to the deployment of autonomous systems. They bridge the gap between 'the code works' and 'the user trusts the code.'

By combining these visualization tools with a stable, high-speed API backbone like n1n.ai, you can build agents that don't just work in the dark, but shine in the spotlight.

Get a free API key at n1n.ai