Understudy is an open-source, teachable desktop AI agent for macOS that operates your computer like a human colleague across GUI, browser, shell, and messaging interfaces. It uses a five-layer architecture: native computer operation, learning from demonstrations, crystallized memory of repeated workflows, route optimization (over time, preferring faster API/CLI paths to the GUI), and eventual proactive autonomy. Layers 1-2 are fully implemented today. You teach it a task once via screen recording; it extracts the intent (not just coordinates) and publishes a reusable SKILL.md. It is built on Node.js and is model-agnostic (supports Anthropic, OpenAI, Gemini, etc.), with 47 built-in skills and 8 messaging channel adapters.
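The route optimization layer described above can be pictured with a minimal sketch. This is not Understudy's actual code or API: the `Route` type, the `PREFERENCE` table, and `pickRoute` are all hypothetical names made up for illustration, and ranking API ahead of CLI is an assumption (the summary only says that both are preferred over the GUI).

```typescript
// Hypothetical sketch; none of these names come from Understudy itself.
type RouteKind = "api" | "cli" | "gui";

interface Route {
  kind: RouteKind;
  available: boolean; // whether this route is currently known to work for the task
}

// Lower number = more preferred. API before CLI is an assumption;
// the summary only states that API/CLI paths are preferred over the GUI.
const PREFERENCE: Record<RouteKind, number> = { api: 0, cli: 1, gui: 2 };

// Return the most preferred available route, or undefined if none is available.
function pickRoute(routes: Route[]): Route | undefined {
  return routes
    .filter((r) => r.available) // filter() copies, so the input array is not reordered
    .sort((a, b) => PREFERENCE[a.kind] - PREFERENCE[b.kind])[0];
}

const routes: Route[] = [
  { kind: "gui", available: true },
  { kind: "cli", available: true },
  { kind: "api", available: false },
];

console.log(pickRoute(routes)?.kind); // prints "cli"
```

In this sketch the GUI route acts as the fallback that always works, while faster routes take over as soon as they are marked available, which is one plausible reading of "preferring faster paths over time".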

17m read time · From github.com
Table of contents
The Five Layers · Showcase · What It Can Do Today · Architecture · Quick Start · Platform Support · Supported Models · macOS Permissions · Skills · Channels · Repository Layout · Development · Star History · Acknowledgments · Contributing · License
