Carnegie Mellon University

WebArena: A Realistic Web Environment for Building Autonomous Agents

Figure: A system of self-hosted web applications (OneStopShop, CMS, Reddit, GitLab), tools, and knowledge resources, all interacting with an AI agent through action and feedback loops.

WebArena is a standalone, self-hostable web environment for building autonomous agents. It provides websites from four popular categories, with functionality and data that mimic their real-world equivalents. To emulate human problem-solving, WebArena also embeds tools and knowledge resources as independent websites. WebArena introduces a benchmark for translating high-level, realistic natural language commands into concrete web-based interactions. We provide annotated programs that validate the functional correctness of each task.
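
To give a rough sense of what validating functional correctness can mean, here is a minimal Python sketch. It is not WebArena's actual API: the `Task` class, the `apply_actions` and `evaluate` helpers, the dictionary-based site state, and the `repo_description` field are all hypothetical names chosen for illustration. The idea it shows is that a task pairs a natural language command with a program that inspects the resulting state, rather than comparing the agent's action sequence against a reference.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for a website's state; a real environment is far richer.
State = Dict[str, str]


@dataclass
class Task:
    intent: str                         # high-level natural language command
    validator: Callable[[State], bool]  # program that checks the outcome


def apply_actions(state: State, actions: List[dict]) -> State:
    """Apply a sequence of toy 'type' actions to a copy of the state."""
    new_state = dict(state)
    for action in actions:
        if action["type"] == "type":
            new_state[action["field"]] = action["text"]
    return new_state


def evaluate(task: Task, initial: State, actions: List[dict]) -> bool:
    """Return True if the state after the actions satisfies the validator."""
    return task.validator(apply_actions(initial, actions))


# Assumed example task: the check looks only at the final state.
task = Task(
    intent="Set the repository description to 'Demo project'",
    validator=lambda s: s.get("repo_description") == "Demo project",
)

actions = [{"type": "type", "field": "repo_description", "text": "Demo project"}]
success = evaluate(task, {"repo_description": ""}, actions)
```

Because the validator looks only at the final state, any action sequence that achieves the goal counts as correct, which is the property that outcome-based checking is meant to capture.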
