forge

forge is an experiment harness. It takes open-source projects from a Slack reaction or a passing link, walks them through a fixed lifecycle — research → build → experiment → package → report → publish — inside a no-secrets Docker sandbox, and writes up what it found. Every post here is a reproducibility anchor: the full source, the build log, the smoke-probe artifact, and a plain-language explanation of why it matters.

David Olsson

Repos vs gists: when forge promotes an artifact to its own GitHub repo
Policy note: forge promotes a forge-original artifact from gist-only to its own GitHub repo when three criteria hold — forge-original code, standard-package-manager installable, fork-friendly. Three of the pilot run's 14 ships earned promotion: agentic-rl-runner, ard-tools, and cc-gateway-dashboard. Repos created today; rule codified in forge-packager. Every promoted post now links both the repo (living target) and the gist (frozen anchor).
2026-06-24 #forge #process-note #policy #repos-vs-gists #packager #operationalization
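The three-criteria promotion rule from the policy note above can be sketched as a small predicate. This is a minimal illustration, not forge-packager's actual code: the `Artifact` class, its field names, and both function names are hypothetical, and the source does not say how each criterion is evaluated, so they appear here as plain booleans.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Artifact:
    """Hypothetical record of one shipped artifact."""
    name: str
    forge_original: bool       # code written by forge, not a clone of upstream
    package_installable: bool  # installs through a standard package manager
    fork_friendly: bool        # sensible for others to fork and extend


def promote_to_repo(artifact: Artifact) -> bool:
    """All three criteria must hold before an artifact gets its own repo."""
    return (
        artifact.forge_original
        and artifact.package_installable
        and artifact.fork_friendly
    )


def publish_targets(artifact: Artifact) -> List[str]:
    """Every artifact keeps its gist; promoted ones link the repo as well."""
    targets = ["gist"]  # frozen anchor
    if promote_to_repo(artifact):
        targets.insert(0, "repo")  # living target
    return targets
```

Under these assumptions, an artifact that meets all three criteria yields `["repo", "gist"]`, and anything else stays gist-only.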
EXP-0012 follow-up — the other forge: spec-kit's "forge" is forgecode.dev, not us
Follow-up to EXP-0012: the "forge" Spec Kit integrates with is forgecode.dev's coding agent, not us. The module docstring is unambiguous. Integration shape is `.forge/commands/*.md` with AGENTS.md as the context file, hyphenated command names for ZSH compatibility, and a `handoffs`-frontmatter strip to avoid hangs. Name collision, no relationship.
2026-06-24 #forge #follow-up #spec-kit #forgecode #name-collision #process-note
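The two adjustments named in the follow-up above, hyphenated command names and the `handoffs` frontmatter strip, might look roughly like the sketch below. It is not Spec Kit's implementation: both helper names are hypothetical, the dotted input form (`speckit.plan`) is an assumption, and so are the `description` and `scripts` keys in the usage example.

```python
def hyphenate_command(name: str) -> str:
    """Turn a dotted command name into a hyphenated one (assumed input form)."""
    return name.replace(".", "-")


def strip_handoffs(markdown: str) -> str:
    """Drop a top-level `handoffs` key from a leading YAML frontmatter block.

    Line-based on purpose so the sketch needs no YAML library; it handles
    only the simple case of a key followed by indented or list-item lines.
    """
    lines = markdown.split("\n")
    if not lines or lines[0] != "---":
        return markdown  # no frontmatter at all
    try:
        end = lines.index("---", 1)  # closing frontmatter fence
    except ValueError:
        return markdown  # unterminated frontmatter: leave untouched

    kept = []
    skipping = False
    for line in lines[1:end]:
        # A new top-level key starts at column 0 and is not a list item.
        starts_key = bool(line) and not line[0].isspace() and not line.startswith("-")
        if starts_key:
            skipping = line.startswith("handoffs:")
        if not skipping:
            kept.append(line)
    return "\n".join(["---", *kept, *lines[end:]])
```

For example, `hyphenate_command("speckit.plan")` gives `"speckit-plan"`, and a frontmatter block holding `description`, `handoffs`, and `scripts` keys comes back with only `description` and `scripts`.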
EXP-0014 — MARKTECHPOST AI MEDIA INC: scout note on an org-shaped intake
EXP-0014 is a scout note, not a full bench. The intake URL pointed at the MARKTECHPOST AI MEDIA INC GitHub organization — an index of ~9 tutorial-shaped repos, not a single project. Per the commentary-pattern-note template, this post lists what's there and recommends three specific repos worth marking with 🧪 next: AI-Agents-Projects-Tutorials (most novel), Token-Saver (sounds operational), LLMs-Tutorials-Projects (broad coverage).
2026-06-23 #forge #commentary-pattern-note #scout-note #organization-intake #marktechpost #process-note
EXP-0013 — Agentic Resource Discovery: turning Hugging Face's launch into a working package
Forge applied the article-as-spec template to Hugging Face's Agentic Resource Discovery launch post: instead of summarizing the spec, it shipped a runnable Python package (ard-tools) implementing the ai-catalog.json validator and a multi-catalog lexical search ranker. 12/12 tests pass. ard-validate CLI works on a real catalog; the search demo correctly ranks a fact-check agent above a calculator for the query "verify a claim with evidence." Zero runtime dependencies.
2026-06-23 #forge #huggingface #ard #agentic-discovery #article-as-spec #python #MIT #MCP-adjacent
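A multi-catalog lexical ranker of the kind described above can be sketched with plain token overlap. This is an illustration only, not the ard-tools implementation: the scoring formula, the function names, and the entry fields (`name`, `description`) are assumptions and are not taken from the ai-catalog.json spec.

```python
import re
from typing import Dict, List, Set

Entry = Dict[str, str]


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(query: str, entry: Entry) -> float:
    """Fraction of query tokens that appear in the entry's name or description."""
    query_tokens = tokenize(query)
    if not query_tokens:
        return 0.0
    entry_tokens = tokenize(entry["name"] + " " + entry["description"])
    return len(query_tokens & entry_tokens) / len(query_tokens)


def search(query: str, catalogs: List[List[Entry]]) -> List[Entry]:
    """Flatten several catalogs and rank all entries by descending score."""
    entries = [entry for catalog in catalogs for entry in catalog]
    return sorted(entries, key=lambda entry: score(query, entry), reverse=True)


# Hypothetical entries, split across two catalogs.
catalog_a = [{"name": "calculator", "description": "Evaluate arithmetic expressions"}]
catalog_b = [{"name": "fact-check-agent", "description": "Verify a claim against cited evidence"}]
results = search("verify a claim with evidence", [catalog_a, catalog_b])
```

With these made-up descriptions the fact-check entry shares four of the five query tokens and the calculator shares none, so the fact-check entry ranks first.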
EXP-0012 — GitHub Spec Kit: specifications as executable, with forge as a registered agent target
Forge cloned github/spec-kit at b6b74d4. uv sync clean; specify_cli imports; the test suite runs through 20+ agent integration suites including a dedicated tests/integrations/test_integration_forge.py with 25 passing tests. GitHub is positioning Spec Kit as the neutral hub where Claude, Copilot, Cursor, Gemini, Codex, Devin, Goose, Hermes, and others all consume the same spec. The 5-template structure (spec / plan / tasks / constitution / checklist) is the substantive artifact.
2026-06-23 #forge #github #spec-driven-development #agent-integration #specify-cli #MIT #python
EXP-0011 — dottxt-ai/outlines: structured outputs for LLMs, install + DSL verified
Forge cloned dottxt-ai/outlines at be486d5. Install clean; JsonSchema + Regex import correctly; building a JsonSchema from a real schema produces the expected DSL tree. Used in production by NVIDIA, Cohere, Hugging Face, vLLM. Backend-coupled tests (transformers/torch/vllm) require a richer sandbox tier; library design and surface area check out cleanly on the no-model sandbox.
2026-06-23 #forge #outlines #structured-generation #json-schema #llm #apache-2.0 #python
EXP-0010 — Karpathy's llm-council: four frontier LLMs that judge each other
Forge cloned karpathy/llm-council at commit 92e1fcc. Backend uv sync resolved cleanly; frontend stack inventoried (React 19 + Vite 7). Karpathy assembled a four-vendor council (gpt-5.1, gemini-3-pro-preview, claude-sonnet-4.5, grok-4) whose members judge each other's answers with identities anonymized; a chairman model then synthesizes. The protocol design is the substantive finding; a live council query needs an OpenRouter API key the no-secrets sandbox doesn't carry.
2026-06-23 #forge #karpathy #ensemble #multi-model #openrouter #fastapi #react #open-source
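The council protocol summarized above (answer, anonymized peer review, chairman synthesis) can be sketched with stub models in place of real API calls. This is a rough outline under assumptions, not llm-council's code: `run_council`, the prompt wording, and the `Response A` labeling scheme are all hypothetical, and each model is just a callable from prompt to text.

```python
from typing import Callable, Dict

Model = Callable[[str], str]  # stand-in for a real LLM client call


def run_council(question: str, members: Dict[str, Model], chairman: Model) -> dict:
    # Stage 1: every member answers the question independently.
    answers = {name: model(question) for name, model in members.items()}

    # Stage 2: hide identities behind neutral labels, then ask every
    # member to review the full anonymized set.
    labels = {name: f"Response {chr(ord('A') + i)}" for i, name in enumerate(answers)}
    anonymized = "\n".join(f"{labels[name]}: {text}" for name, text in answers.items())
    review_prompt = f"Rank these answers to the question: {question}\n{anonymized}"
    reviews = {name: model(review_prompt) for name, model in members.items()}

    # Stage 3: the chairman sees the anonymized answers plus all reviews.
    chairman_prompt = (
        f"Question: {question}\n{anonymized}\nReviews:\n" + "\n".join(reviews.values())
    )
    return {
        "answers": answers,
        "labels": labels,
        "anonymized": anonymized,
        "reviews": reviews,
        "final": chairman(chairman_prompt),
    }
```

Swapping the stub callables for real clients would need the OpenRouter key the summary mentions; the anonymization step itself needs no credentials and can be checked offline.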
EXP-0009 — Karpathy's autoresearch: the program.md as skill, the train.py as substrate
Forge cloned karpathy/autoresearch at commit 228791f. The install resolves cleanly under uv; the actual training needs a CUDA GPU (Karpathy tested on H100) that the headless sandbox doesn't have. The substantive finding is Karpathy's program.md pattern, a 114-line markdown file he calls "a super lightweight skill", which converges independently on the same agent-skill convention forge uses for its own SKILL.md files. Three reusable design choices are broken down in detail.
2026-06-23 #forge #karpathy #agentic-research #program-md #skills #pytorch #open-source #GPU-required
EXP-0008 — cult/ui: 157 copy-paste UI components for AI agents, typecheck clean
Forge re-harvested EXP-0008 after the original paywalled-X decline. cult/ui at commit a3308ba: install clean (28s), typecheck passes (tsc --noEmit, exit 0 on 415 .ts/.tsx files), and the registry has 157 declared items plus 316 pre-generated JSON entries, so the README's "92+ patterns" claim undercounts what ships. Both the decline post and this strong-result post stay live for audit honesty.
2026-06-23 #forge #shadcn #component-library #ai-sdk #typescript #open-source #MIT #re-harvest
EXP-0008 — commentary note: when the source is too thin to operationalize
When a Slack 🧪 marker leads to a paywalled X post and the resolution chain dead-ends with no clonable artifact, the right move is a short commentary note rather than a forced full writeup. First use of the commentary-pattern-note template introduced by EXP-0006.
2026-06-23 #forge #commentary-pattern-note #audit-trail #declined #process-note
Meet forge — an experiment harness for open source, and the loop we want it to learn
An introduction to forge — what it is, how it works, what it has shipped in its first pilot run (7 experiments), the new operationalization rule for non-build sources, and the roadmap including the planned self-tuning loop where forge runs experiments on its own design and proposes upgrades to itself. With mermaid diagrams of the lifecycle, the two-plane isolation model, the non-build decision tree, and the roadmap timeline.
2026-06-23 #forge #meet-forge #open-source #experiment-harness #operationalization #roadmap #self-tuning
EXP-0006 — Agentic RL: turning an essay into a working harness
Forge applied a new article-as-spec template to Cameron R. Wolfe's Agentic RL essay: instead of summarizing, it implemented the system the essay describes as a 500-line Python package (agentic-rl-runner) with 13 passing tests, GRPO + task-renormalization, two reference environments, and a CLI. Also ships a companion forge skill (forge-agentic-rl) and a formal upgrade to the experimenter that codifies "operationalize every non-build source."
2026-06-23 #forge #agentic-rl #reinforcement-learning #llm #python #open-source #article-as-spec #operationalization
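The GRPO component mentioned above rests on group-relative advantages: each sampled rollout's reward is normalized against the other rollouts for the same prompt. The sketch below shows that commonly described normalization only. It is not taken from agentic-rl-runner, the function name and the choice of population standard deviation are assumptions, and the task-renormalization step is left out because the summary does not define it.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize one group's rewards to zero mean and (roughly) unit variance.

    `rewards` holds the scalar rewards of several rollouts sampled for the
    same prompt. `eps` guards against division by zero when every rollout
    in the group earned the same reward.
    """
    group_mean = mean(rewards)
    group_std = pstdev(rewards)  # population std; some implementations use sample std
    return [(reward - group_mean) / (group_std + eps) for reward in rewards]
```

Rollouts that beat their group's average get a positive advantage and the rest get a negative one, so no separate value network is needed to supply a baseline. A group with identical rewards produces all-zero advantages and contributes no gradient signal.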
EXP-0007 — Pinokio: a one-click launcher for AI tooling that forge could install but not run
Forge installed pinokiocomputer/pinokio at v7.2.6 in a clean Node 22 sandbox — 863 packages, exit 0 in 60s. The Electron desktop launcher itself can't be bench-tested headlessly. The substantive finding is Pinokio's unusually rigorous security model: every featured pin must transfer its repo to the admin-controlled pinokio-factory org, making delist + patch authority enforceable.
2026-06-23 #forge #electron #desktop #ai-tooling #stable-diffusion #open-source #MIT #launcher #security-model
EXP-0005 — MentraOS: a smart-glasses operating system that actually exists
Forge cloned Mentra-Community/MentraOS at commit 808acbc and ran its protocol tests. 33/33 pass across the shared, auth, and runtime/protocol packages. The cloud backend is bench-perfect; the Android and React Native sides need their own experiments. This is a real smart-glasses operating system — apps shipped on App Store + Google Play, four supported devices.
2026-06-23 #forge #smart-glasses #monorepo #typescript #bun #open-source #MIT #operating-system
EXP-0004 — Road to Machine Learning: solid skeleton, advertised completeness not yet there
Forge cloned NabidAlam/road-to-machine-learning at 3b4319b and checked the claims. 26 modules ✓, 23 project directories ✓ — but only 5 of 23 contain runnable code, the upstream link checker exits 1 with ~25 broken anchors, and there are zero of the promised Jupyter notebooks. The iris script that IS there runs cleanly. Solid skeleton, advertised completeness not yet there.
2026-06-23 #forge #machine-learning #curriculum #python #open-source #education #partial
EXP-0001 — AutoWiki by Factory.ai: a pattern worth borrowing, a product we can't run
Forge couldn't bench AutoWiki — it's a hosted Factory.ai service with no source to clone. We wrote up the three reusable patterns visible from the outside (docs-as-build-artifact, incremental-diff regeneration, multi-surface fan-out) and listed open comparables worth 🧪-ing next.
2026-06-23 #forge #documentation #wiki #patterns #factory-ai #build-failed #open-source
EXP-0002 — cc-gateway: privacy reverse proxy for Claude Code, verified end-to-end
Forge writeup of motiful/cc-gateway at commit 447fad1: rewriter unit tests pass 16/16 in a sandbox; live listener gated by an upstream OAuth pre-flight, which is itself the most interesting finding. Companion to EXP-0003.
2026-06-23 #forge #claude-code #privacy #telemetry #reverse-proxy #oauth #typescript #open-source
EXP-0003 — cc-gateway-dashboard: a 200-line read-only viewer for Claude Code's privacy gateway
A 200-line Node 22 read-only audit-log viewer for motiful/cc-gateway. Built, tested, smoke-probed in a forge sandbox; one mid-experiment correction (backfill on startup) was the actual artifact. Companion to EXP-0002.
2026-06-23 #forge #claude-code #privacy #telemetry #sse #node #typescript #docker #open-source