EXP-0017 — calesthio/OpenMontage: 13 video pipelines + 115 SKILL.md files in one repo
David Olsson
Making a polished video from raw footage (cutting clips, layering music, adding subtitles, color-correcting, rendering) has always been a multi-hour task that requires either Adobe Premiere skills or hiring an editor. Recent AI models are good at the individual pieces (transcription, music generation, image generation, voice cloning), but you still need someone to glue them together into a pipeline.
OpenMontage is an open-source project that does the gluing. It's a Python toolkit with thirteen pre-built video pipelines (concept-to-clip, social shorts, talking-head explainers, demo reels, podcast highlights, etc.) and a swarm of AI "skills" that an AI coding agent — Claude, Cursor, Codex, Copilot — can orchestrate. You describe the video you want; the agent reads OpenMontage's instructions, picks the right pipeline, and produces the render. AGPLv3 licensed, 21.5k GitHub stars, built by a single developer on nights and weekends.
The detail that made this stand out to forge: the project ships 115 SKILL.md files — one per atomic capability the agent might invoke (extract audio from clip, generate a B-roll prompt, write a subtitle file, etc.). That's the densest concentration of the SKILL.md agent-instruction format we've seen in a single repo, and it's a working example of how to scale the convention to a real-world automation system rather than a toy.
Forge is our experiment harness. It cloned the project and installed its core dependencies cleanly in a Python 3.12 container. Producing an actual video render needs FFmpeg, GPU acceleration, and API keys for the provider models (Google Imagen for image generation, Vertex AI for TTS, etc.) — none of which the no-secrets sandbox carries. But the skeleton — install, imports, pipeline manifest, skill ecosystem — checks out.
Status: experimented, result strong. Install clean against the core requirements.txt. Repository ships 354 Python files, 13 production pipelines under pipeline_defs/, and 115 SKILL.md files — the densest SKILL.md-per-repo concentration forge has benched to date.
This is a forge writeup of calesthio/OpenMontage at commit 7ee36dd. The README opens: "The first open-source, agentic video production system."
TL;DR
- License: GNU AGPLv3 — a real copyleft license, distinct from the MIT/Apache stack forge usually benches.
- Stack: Python 3.10+ (89.5%), TypeScript (8.7%), JavaScript (1.4%). Requires Node 18+ and FFmpeg.
- Install: pip install -r requirements.txt in python:3.12, exit 0. Core deps (pyyaml, pydantic, jsonschema, python-dotenv, Pillow, requests, google-auth) all imported successfully.
- Skill density: 115 SKILL.md files. The README cites "500+ agent skills"; the SKILL.md file count is 115, and the rest of the count likely comes from sub-skills enumerated inside each SKILL.md.
- Pipelines: 13 production pipelines under pipeline_defs/, each describing a complete end-to-end video workflow.
- Multi-agent ready: ships agent-specific guides at AGENTS.md, CLAUDE.md, CODEX.md, COPILOT.md, CURSOR.md, meeting each major AI coding agent at its conventions.
- Stars: 21.5k, 120 commits, 75 open PRs, 48 GitHub discussions.
What it is
OpenMontage turns the video-production toolchain — clip ingestion, transcription, captioning, B-roll generation, music selection, render orchestration — into a library of skills that an AI coding agent can compose into pipelines. The model isn't "open a UI and click buttons" but "describe what you want, the agent picks the right pipeline, the skills execute in sequence, FFmpeg renders the final cut."
Each pipeline references skills, atomic units like "generate-thumbnail-prompt", "fetch-stock-footage", and "render-subtitles-overlay". Each skill is a SKILL.md file with frontmatter declaring its inputs and dependencies, plus the agent-facing instructions. This is the same SKILL.md convention used by forge, GitHub's Spec Kit, HKUDS's Vibe-Trading, and Karpathy's autoresearch.
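To make the shape of such a file concrete, here is a minimal sketch of reading a skill's frontmatter. The field names (name, inputs, depends_on) and the sample text are assumptions for illustration, not OpenMontage's actual schema, and the parser only understands flat key: value pairs and inline [a, b] lists rather than full YAML.

```python
from typing import Dict, List, Tuple, Union

Value = Union[str, List[str]]

# Hypothetical SKILL.md content; the field names are illustrative only.
SAMPLE_SKILL = """---
name: render-subtitles-overlay
inputs: [video_path, subtitle_path]
depends_on: [ffmpeg]
---
Burn the subtitle file into the video and write the result next to the input.
"""


def split_frontmatter(text: str) -> Tuple[Dict[str, Value], str]:
    """Split a SKILL.md-style document into (frontmatter, body).

    Only flat `key: value` lines and inline `[a, b]` lists are handled;
    a real project would more likely hand the block to a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block at all

    # Find the closing `---` delimiter after the opening one.
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
        None,
    )
    if end is None:
        return {}, text  # unterminated block: treat everything as body

    meta: Dict[str, Value] = {}
    for line in lines[1:end]:
        if ":" not in line:
            continue  # skip blank or malformed lines
        key, _, raw = line.partition(":")
        raw = raw.strip()
        if raw.startswith("[") and raw.endswith("]"):
            # Inline list: split on commas and drop empty items.
            meta[key.strip()] = [
                item.strip() for item in raw[1:-1].split(",") if item.strip()
            ]
        else:
            meta[key.strip()] = raw

    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


meta, body = split_frontmatter(SAMPLE_SKILL)
print(meta["name"], meta["inputs"], meta["depends_on"])
```

Keeping the machine-readable declarations in the frontmatter and the instructions in the body is what lets an agent index many skills cheaply and only read the full text of the ones it decides to invoke.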
What makes OpenMontage interesting at scale: 115 SKILL.md files in one repo. The convention has been demonstrated to scale beyond demo-quality use cases — this is a production agent system with the convention as its substrate.
How forge bench-tested it
```shell
git clone https://github.com/calesthio/OpenMontage.git
cd OpenMontage && git checkout 7ee36dd
# inside python:3.12
pip install -r requirements.txt
python -c "import yaml, pydantic, jsonschema; print('core deps imported')"
```
Install completed without conflict. Dependencies are small and clean — no surprise CUDA pulls, no transitively-broken packages. The repository's Makefile declares targets that go further (make setup, make install-gpu, make demo, make preflight, make hyperframes-warm), but those need the FFmpeg binary, GPU runtime, and provider credentials forge's sandbox doesn't carry.
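The one-line import check can also be written as a small script that reports each module separately, which is handy when a single dependency is missing. This is a sketch, not forge's actual harness: the module list mirrors the smoke probe above, and probe is a hypothetical helper name.

```python
import importlib
from typing import Dict, Iterable, Optional


def probe(modules: Iterable[str]) -> Dict[str, Optional[str]]:
    """Try to import each module.

    Returns a mapping of module name -> None on success,
    or the ImportError message on failure.
    """
    results: Dict[str, Optional[str]] = {}
    for name in modules:
        try:
            importlib.import_module(name)
            results[name] = None
        except ImportError as exc:
            results[name] = str(exc)
    return results


# Import names, not PyPI names: the pyyaml package is imported as `yaml`.
report = probe(["yaml", "pydantic", "jsonschema"])
for name, error in report.items():
    status = "ok" if error is None else f"FAILED: {error}"
    print(f"{name}: {status}")
```

Collecting results instead of stopping at the first failure gives one complete report per run, which is easier to archive alongside an install log.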
What forge could not bench:
- make demo end-to-end. Generates a real video render. Needs FFmpeg + credentialed video/audio provider.
- Pipeline execution. Same constraint.
- The actual quality of the output. Best evaluated by a human watching the rendered video.
The right forge bench is exactly what we did: confirm the install graph closes, the dependency stack is clean, the skill ecosystem matches the README claim, and the agent-onboarding files are present and well-formed.
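The structural part of that check amounts to counting files in the checkout. A minimal sketch follows. The pipeline_defs/ directory name comes from the repository, but treating each file directly under it as one pipeline is an assumption, and count_structure is a hypothetical helper, not part of OpenMontage or forge.

```python
from pathlib import Path
from typing import Dict


def count_structure(repo_root: str) -> Dict[str, int]:
    """Count Python files, SKILL.md files, and pipeline definitions in a checkout.

    Note that rglob walks every subdirectory, so a virtualenv created
    inside the checkout would inflate the Python file count.
    """
    root = Path(repo_root)
    pipeline_dir = root / "pipeline_defs"

    # Assumption: one file directly under pipeline_defs/ equals one pipeline.
    pipelines = (
        sum(1 for p in pipeline_dir.iterdir() if p.is_file())
        if pipeline_dir.is_dir()
        else 0
    )

    return {
        "python_files": sum(1 for p in root.rglob("*.py") if p.is_file()),
        "skill_files": sum(1 for p in root.rglob("SKILL.md") if p.is_file()),
        "pipelines": pipelines,
    }


if __name__ == "__main__":
    # Run from the root of the cloned repository.
    print(count_structure("."))
```

Comparing these numbers against the README's claims is a cheap consistency check that needs no FFmpeg, GPU, or credentials.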
Why this matters
Two things stand out:
- AGPLv3 in this space is unusual. Most agent-driven open-source projects forge benches are MIT or Apache. AGPL is strong copyleft: anyone who runs a modified version of OpenMontage as a network service must offer the modified source to that service's users. For a video toolkit, that's an explicit signal: "use this freely, but don't build a closed SaaS on top."
- The SKILL.md convention has scaled. Until now, forge has been benching skill-driven projects with 1-10 SKILL.md files. OpenMontage ships 115. The convention works at production scale; this is the existence proof.
Comparables
| Project | Posture |
|---|---|
| Runway / Pika Labs | Closed commercial agentic video. |
| stability-ai/sd3-video | Open-weight model only — no orchestration. |
| FFmpeg | The substrate OpenMontage drives. LGPL by default (GPL when built with GPL-only components). |
OpenMontage's contribution is the orchestration layer: it doesn't replace FFmpeg; it sits on top, letting agents compose multi-step video workflows.
Reproducibility
| Field | Value |
|---|---|
| upstream repo | https://github.com/calesthio/OpenMontage |
| commit pinned | 7ee36dd6b66c7dc0712da194786b77da1c2e7ed3 |
| license | GNU AGPLv3 |
| base image | python:3.12 |
| install | pip install -r requirements.txt (exit 0) |
| smoke probe | import yaml, pydantic, jsonschema (exit 0) |
| structural | 354 Python files, 115 SKILL.md, 13 pipelines, 21.5k stars |
Companion gist holds the install log, the env manifest, the upstream LICENSE, and the Makefile.
See also
- EXP-0015 — Vibe-Trading — another agent-driven domain toolkit shipped via PyPI, this time for finance.
- EXP-0012 — GitHub Spec Kit — the agent-integration convergence; OpenMontage extends it by hitting every major coding-agent surface.
- Meet forge — the operationalization rule.
Built and verified by forge. The install graph and skill ecosystem check out; producing an actual video render is the next-step bench that needs FFmpeg + provider keys outside the sandbox.