MARVIN — AI Chief of Staff

The Problem

Managing multiple projects at once (CREOpsDesk, Burrfect, Woordjes, the overnight briefing pipeline, and more) creates relentless context-switching. Staying on top of overnight activity across repos, coordinating autonomous work, and catching up on what happened while I was asleep or with the kids was too much cognitive overhead for one person.

I needed something that could act as a chief of staff: handle the morning catch-up, know the state of every project, coordinate work across repos, and run autonomous agents with proper guardrails — so I could spend my limited focus time on product decisions instead of project management overhead.

My Role

I did not build MARVIN from scratch. I downloaded an open-source AI assistant and then heavily customized it — adding 17 Skills, production workflows, persistent memory integration, and the operational patterns that make it useful for daily work. The customization IS the project. Taking a generic tool and turning it into a domain-specific AI chief of staff is the interesting part.

What I Built

The core contribution is 17 Claude Code Skills — each one a structured workflow that MARVIN can invoke for specific tasks. These aren't simple prompts; they're multi-step procedures with guardrails, validation checkpoints, and domain-specific knowledge baked in.

Overnight Briefing Skill

Processes email overnight, translates Dutch communications, reconciles against my task list in Blitzit, and produces an interactive HTML briefing I can review on my phone before the day starts. This Skill orchestrates the full pipeline — triggering the local LLM for privacy-sensitive email reading, syncing task state, and publishing the result.
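
As a rough illustration of the reconciliation step, the sketch below splits action items pulled from email into ones already on the task list and ones that are new. The `ActionItem` shape, the `reconcile` function, and the title-matching rule are assumptions made for this example; they do not reflect the Skill's actual logic or Blitzit's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionItem:
    """An action extracted from an overnight email (hypothetical shape)."""
    subject: str
    source: str  # e.g. the sender or thread the item came from


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical titles compare equal."""
    return " ".join(text.lower().split())


def reconcile(items: list[ActionItem], open_tasks: list[str]) -> dict[str, list[ActionItem]]:
    """Split email action items into ones already tracked and ones that are new."""
    tracked_titles = {normalize(t) for t in open_tasks}
    result: dict[str, list[ActionItem]] = {"already_tracked": [], "new": []}
    for item in items:
        key = "already_tracked" if normalize(item.subject) in tracked_titles else "new"
        result[key].append(item)
    return result


# Invented sample data, only to exercise the function.
items = [
    ActionItem("Renew domain registration", "registrar"),
    ActionItem("Reply to school about Friday", "school"),
]
tasks = ["renew  domain registration", "Ship briefing fix"]
briefing = reconcile(items, tasks)
```

Exact title matching is the simplest possible rule; a real pipeline would probably need fuzzier matching, since email subjects rarely line up word for word with task names.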

Project Orchestration

Coordinates work across multiple repos. MARVIN understands the relationships between projects (e.g., CREOpsDesk's Django app, its n8n workflows, and its Puppeteer scripts live in three different repos but need to stay in sync). The orchestration Skill handles cross-repo awareness so changes in one repo don't break assumptions in another.
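
One way to picture cross-repo awareness is a small dependency map that answers "if this repo changes, which others should be checked?". The sketch below assumes hypothetical repo names and an assumed dependency direction (the n8n workflows and Puppeteer scripts depending on the Django app); the real relationships may differ.

```python
from collections import deque

# Hypothetical map: repo -> repos that depend on it. Names are illustrative.
DEPENDENTS = {
    "creopsdesk-django": ["creopsdesk-n8n", "creopsdesk-puppeteer"],
    "creopsdesk-n8n": ["creopsdesk-puppeteer"],
    "creopsdesk-puppeteer": [],
}


def affected_repos(changed: str, dependents: dict[str, list[str]]) -> list[str]:
    """Return every repo downstream of `changed`, in breadth-first order."""
    seen = {changed}
    queue = deque([changed])
    order: list[str] = []
    while queue:
        repo = queue.popleft()
        for dep in dependents.get(repo, []):
            if dep not in seen:  # visit each repo once, even with shared paths
                seen.add(dep)
                order.append(dep)
                queue.append(dep)
    return order


to_check = affected_repos("creopsdesk-django", DEPENDENTS)
```

Here `to_check` lists the repos whose assumptions should be re-verified after a change to the Django app.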

Autonomous Task Execution

Delegates work to sub-agents with proper guardrails. MARVIN can spin up agents to handle implementation tasks, but with structured review checkpoints rather than fire-and-forget delegation. The Skill enforces test-driven development, verification before completion, and explicit approval gates for destructive operations.
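
An approval gate for destructive operations can be sketched as a check that runs before any command an agent proposes. The deny-list and the `gate` function below are hypothetical and deliberately simplistic (prefix matching only); they illustrate the idea rather than the Skill's actual rules.

```python
import shlex

# Hypothetical deny-list of command prefixes that require explicit approval.
DESTRUCTIVE_PREFIXES = [
    ("rm",),
    ("git", "push", "--force"),
    ("git", "reset", "--hard"),
    ("git", "clean"),
]


def needs_approval(command: str) -> bool:
    """True if the command starts with one of the destructive prefixes."""
    tokens = tuple(shlex.split(command))
    return any(tokens[: len(prefix)] == prefix for prefix in DESTRUCTIVE_PREFIXES)


def gate(command: str, approved: bool = False) -> str:
    """Return 'blocked' for unapproved destructive commands, else 'allowed'."""
    if needs_approval(command) and not approved:
        return "blocked"
    return "allowed"
```

Because this matches prefixes only, a flag placed later in the command (for example `git push origin main --force`) would slip through. A production gate would need real argument parsing or an allow-list instead.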

Development Workflow Skills

A suite of Skills encoding best practices I've learned the hard way:

Systematic Debugging — structured diagnosis before proposing fixes, not shotgun troubleshooting
Test-Driven Development — write tests first, then implementation, enforced by the Skill workflow
Code Review — both requesting and receiving reviews with technical rigor, not performative agreement
Verification Before Completion — run the actual verification commands and confirm output before claiming anything is done
Plan Writing and Execution — break work into plans before coding, execute with review checkpoints
Brainstorming — explore intent, requirements, and design before jumping to implementation
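
The Verification Before Completion idea from the list above comes down to one rule: a task is only reported as done if the verification command actually ran and exited successfully. A minimal sketch, with `verify` and `complete_task` as hypothetical helper names:

```python
import subprocess
import sys


def verify(command: list[str]) -> tuple[bool, str]:
    """Run a verification command and return (passed, combined output)."""
    proc = subprocess.run(command, capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr


def complete_task(name: str, command: list[str]) -> str:
    """Report 'done' only when the verification command exits with code 0."""
    passed, _output = verify(command)
    status = "done" if passed else "not done"
    return f"{name}: {status}"


# Stand-in commands; a real check would be a test suite or linter run.
ok = complete_task("passing check", [sys.executable, "-c", "print('ok')"])
bad = complete_task("failing check", [sys.executable, "-c", "raise SystemExit(1)"])
```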

Persistent Memory via MCP

MARVIN connects to a shared memory server (Hindsight) via the Model Context Protocol (MCP). Context therefore persists across sessions: decisions made last week, architecture choices from last month, and project-specific knowledge are all recallable without me re-explaining them. Each session starts with MARVIN recalling relevant context instead of starting from zero.
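
To make the retain-then-recall pattern concrete, here is a toy in-memory stand-in. It is not Hindsight's API and does not speak MCP; `MemoryStore`, `retain`, `recall`, and the word-overlap ranking are all assumptions chosen to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    project: str
    text: str


class MemoryStore:
    """In-memory stand-in for a memory server (not Hindsight's actual API)."""

    def __init__(self) -> None:
        self._items: list[Memory] = []

    def retain(self, project: str, text: str) -> None:
        """Store one piece of context under a project."""
        self._items.append(Memory(project, text))

    def recall(self, project: str, query: str, limit: int = 3) -> list[str]:
        """Return up to `limit` memories for a project, ranked by shared words."""
        words = set(query.lower().split())
        scored = []
        for memory in self._items:
            if memory.project != project:
                continue
            overlap = len(words & set(memory.text.lower().split()))
            if overlap:  # skip memories with no words in common
                scored.append((overlap, memory.text))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:limit]]


# Invented sample memories, only to exercise the store.
store = MemoryStore()
store.retain("project-a", "decided to paginate the api responses")
store.retain("project-a", "logo colors were updated")
store.retain("project-b", "api uses token auth")
recalled = store.recall("project-a", "api pagination decision")
```

A real memory server would use something far better than word overlap for relevance, but the session-start flow is the same: query by project and topic, then load the top results into context.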

The Result

MARVIN has been in daily production use since January 2026. Every morning starts with a briefing that's already processed overnight email, translated Dutch communications, and reconciled my task list. Multi-repo work is coordinated without me manually tracking which changes affect which projects. Autonomous agents handle implementation tasks with proper guardrails while I focus on product decisions.

The 17 Skills encode operational patterns I've refined over months of daily use. They're not theoretical — each one exists because I hit a real problem (agents skipping tests, changes breaking other repos, debug sessions going in circles) and built a structured workflow to prevent it from happening again.

Tech Stack

Foundation: an open-source AI assistant running on Claude Code
Customization: 17 Claude Code Skills (structured multi-step workflows)
Integration: MCP (Model Context Protocol) servers — Hindsight memory, Gmail, Blitzit, Google Calendar, Google Drive
Orchestration: Python scripts, bash automation, macOS LaunchAgents
AI Models: Claude (cloud), Qwen 35B (local via LM Studio for privacy-sensitive tasks)
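
For the LaunchAgent piece of the orchestration layer, a scheduled job is described by a property list. The sketch below builds one with Python's standard `plistlib`; the label, script path, log paths, and 05:30 schedule are placeholders, not the actual configuration.

```python
import plistlib

# Hypothetical job definition; label, paths, and schedule are placeholders.
job = {
    "Label": "com.example.marvin.overnight-briefing",
    "ProgramArguments": ["/usr/bin/python3", "/path/to/briefing.py"],
    # launchd starts the job each day at 05:30 local time.
    "StartCalendarInterval": {"Hour": 5, "Minute": 30},
    "StandardOutPath": "/tmp/marvin-briefing.log",
    "StandardErrorPath": "/tmp/marvin-briefing.err",
}

# Serialize to the XML plist format that launchd reads.
plist_bytes = plistlib.dumps(job)
```

Writing `plist_bytes` to a `.plist` file under `~/Library/LaunchAgents/` and loading it with `launchctl` would register the job for the current user.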