LLM as Designer: Self-Evolving OS Visual Generation | MicroEval | Artificial Analysis

This benchmark tests an LLM's ability to create a dynamic visual narrative where an AI agent progressively "builds" and "designs" an operating system interface directly in the browser. It combines creative storytelling, dynamic HTML/CSS/JS generation, and animated visualization of a conceptual AI design process. The goal is to make the user feel like they are watching an AI create its own visual environment.

Prompt

You are a highly intelligent AI and a master of self-replication and visual representation.

Mission: Create a single, self-contained HTML file that visually simulates an AI agent progressively designing and assembling its own operating system interface in real-time within the browser. The user should feel like they are observing the AI's internal creative and coding process as it manifests visually. Only vanilla HTML/CSS/JS, loaded from a CDN if needed. No external frameworks.

Visual Narrative & Required Features:

Initial Canvas: The screen starts as a minimalist, dark grey grid background, representing a blank canvas. A small, elegant, pulsating blue cursor/agent symbol (< >_ or a simple dot) should be visible, acting as the AI's "pointer" or "designer hand".

Progressive UI Construction: The agent should visually "draw" and "assemble" UI elements one by one, with clear animations:

Desktop Background: First, the background grid should slowly transform into a beautiful, abstract, animated generative art pattern (e.g., subtle flowing lines or particles).
Taskbar/Dock: Then, a semi-transparent dock (like macOS) or taskbar (like Windows) should slide up from the bottom, with placeholder icons that appear one by one with a fade-in animation.
Window Creation: A main application window should then be "drawn": first its outline appears, then it fills with a frosted glass effect, and finally, content (e.g., a "text editor" simulation with animated typing) appears inside.
Widgets/Information Panels: Other smaller, draggable panels (like a clock, a CPU monitor with a pulsing graph) should appear, again, animated as if being "placed" by the agent.

Accompanying AI Dialogue/Status: Throughout the process, a small, subtle text box (top-left or top-right) should display short, dynamically generated messages from the AI agent, explaining what it's doing:

"// Initializing visual cortex..."
"// Generating aesthetic background pattern..."
"// Constructing core UI elements..."
"// Deploying 'TextEditor.app' module..."
"// Finalizing environment setup. User interface ready."

Interactivity (Post-Construction): Once the "OS" is fully constructed (after all animations have played), the user should be able to:

Drag the main application window.
Click on a placeholder icon in the dock/taskbar to make a simple, small pop-up message appear (e.g., "Application launched!").

Visual Style: Clean, minimalist, futuristic aesthetic. Subtle glows, transparent elements, and smooth transitions. Use a monospaced font for the AI dialogue.

Action Plan: Describe in detail how you will orchestrate the animations and the progressive generation of elements in JavaScript, the CSS styles for the futuristic look and effects, and the HTML structure to support all these visual and interactive layers.
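As an illustration only (it is not part of the prompt and not taken from any model's response), the orchestration the prompt asks for could be sketched as a timeline of construction phases. Each phase carries one of the status messages quoted above, and interaction stays locked until the last phase has started. The phase names, durations, and function names (`buildTimeline`, `schedulePhases`, `createOsState`, `applyPhase`) are all hypothetical, and a real page would update the DOM inside the phase callback rather than a plain state object.

```javascript
// Hypothetical construction phases. Messages come from the prompt;
// names and durations are illustrative assumptions.
const phases = [
  { name: "init", message: "// Initializing visual cortex...", durationMs: 1500 },
  { name: "background", message: "// Generating aesthetic background pattern...", durationMs: 3000 },
  { name: "dock", message: "// Constructing core UI elements...", durationMs: 2000 },
  { name: "window", message: "// Deploying 'TextEditor.app' module...", durationMs: 4000 },
  { name: "ready", message: "// Finalizing environment setup. User interface ready.", durationMs: 0 },
];

// Convert per-phase durations into absolute start times,
// so each phase begins only after the previous one has finished.
function buildTimeline(phaseList) {
  let startMs = 0;
  return phaseList.map((phase) => {
    const entry = { ...phase, startMs };
    startMs += phase.durationMs;
    return entry;
  });
}

// Schedule every phase with a timer function (setTimeout by default).
// Passing the timer in makes the sequence easy to test without waiting.
function schedulePhases(timeline, onPhase, timer = setTimeout) {
  for (const entry of timeline) {
    timer(() => onPhase(entry), entry.startMs);
  }
}

// Minimal stand-in for the page state: current status text,
// the list of built elements, and whether dragging/clicking is enabled.
function createOsState() {
  return { status: "", built: [], interactive: false };
}

// Apply one phase: show its message, record it, and unlock
// interaction once the final "ready" phase is reached.
function applyPhase(state, entry) {
  state.status = entry.message;
  state.built.push(entry.name);
  if (entry.name === "ready") {
    state.interactive = true;
  }
  return state;
}
```

In a browser, one plausible way to use it would be `const state = createOsState(); schedulePhases(buildTimeline(phases), (entry) => applyPhase(state, entry));`, with drag and click handlers checking `state.interactive` before responding.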

A system prompt was added to support web rendering.

Alibaba

Qwen3 Coder 480B A35B Instruct