Agent Browser is an experimental package that lets AI agents control real browsers in a token-efficient way. Agents can navigate websites, click elements, type text, and read pages as ASCII wireframes, which consume far fewer tokens than parsing full HTML. The tool integrates with the Model Context Protocol (MCP) for use in clients like Cursor and Claude Desktop, and with the Vercel AI SDK for programmatic automation.

Key features include:
- ASCII Wireframe Visualization: Presents web pages as compact ASCII representations with numbered interactive elements, dramatically reducing token usage.
- MCP Integration: Provides a server that exposes browser automation tools over stdio, making it accessible to AI assistants in MCP-compatible environments.
- Vercel AI SDK Compatibility: Offers `createBrowserTools()` to generate a suite of tools (launch, navigate, getWireframe, click, type, etc.) for use with `generateText()`.
- Playwright Backend: Uses Playwright for reliable browser automation across modern web platforms.
- Multiple Usage Modes: Supports CLI for manual testing, MCP for AI-driven interaction, and SDK for custom applications.
Target users include developers building AI agents that require web interaction, researchers automating browser tasks, and teams integrating AI with web-based workflows.
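
To make the wireframe idea above concrete, here is a minimal TypeScript sketch of how a page's interactive elements might be flattened into a compact, numbered text view, and how a number chosen by the agent could be mapped back to an element. This is an illustration only, not Agent Browser's actual implementation or output format: `InteractiveElement`, `renderWireframe`, and `resolveElement` are hypothetical names introduced for this example.

```typescript
// Hypothetical element shape for illustration; not the package's real type.
interface InteractiveElement {
  kind: "link" | "button" | "input";
  label: string;
}

// Render one short line per interactive element, prefixed with a 1-based
// number the agent can refer to in later click or type actions.
function renderWireframe(title: string, elements: InteractiveElement[]): string {
  const lines = elements.map((el, i) => {
    const id = i + 1;
    switch (el.kind) {
      case "button":
        return `[${id}] <${el.label}>`;
      case "input":
        return `[${id}] ${el.label}: ____`;
      default:
        return `[${id}] ${el.label}`; // links are shown as plain labels
    }
  });
  return [`== ${title} ==`, ...lines].join("\n");
}

// Map a number from the wireframe back to its element, or undefined
// if the number is out of range.
function resolveElement(
  elements: InteractiveElement[],
  id: number
): InteractiveElement | undefined {
  return elements[id - 1];
}

const page: InteractiveElement[] = [
  { kind: "input", label: "Email" },
  { kind: "button", label: "Log in" },
  { kind: "link", label: "Forgot password?" },
];

console.log(renderWireframe("Sign in", page));
```

The point of the sketch is the design choice rather than the exact format: the model sees a few short labelled lines instead of the full markup, and acts by element number, so each step costs few tokens.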

