The Local AI Studio · changelog
Source: CHANGELOG.md in the site repository (baked at build time)
Site: https://m5ai.ivanm.dev/

# CHANGELOG

All notable changes to The Local AI Studio are documented here, snapshot-by-snapshot. The leaderboard rotates monthly; this file is the audit trail.

Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). Adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html) for repo tags.

---

## [Unreleased]

### Added
- nothing pending

---

## [v1.12.0-portable] · 2026-05-25

### Status
**Post-launch substance pass complete (Sprints 9-20).** Page at 236.3 KB / 250 KB budget (13.7 KB headroom). Em-dash count 0. JSON-LD 6 blocks valid. All CI green on main. Production deploy auto-tracking main.

### Production audit (this release)
- Validate: PASS · em-dash=0 · 19 icons defined / 18 used · all tags balanced
- JSON-LD: 6 blocks all parse (TechArticle, BreadcrumbList, FAQPage, HowTo, SoftwareApplication graph, WebSite)
- Anchor resolution: 0 broken internal hrefs out of 34
- Meta + OG + canonical + twitter:card + theme-color all present
- Static assets all 200 on prod: `/`, `/sitemap.xml`, `/robots.txt`, `/llms.txt`, `/llms-full.txt`, `/manifest.webmanifest`, `/og-image.png`, `/favicon.svg`, `/apple-touch-icon.png`
- Security headers verified on prod: CSP strict (no inline scripts), HSTS preload max-age 31536000, COOP same-origin, X-Frame DENY, X-Content-Type-Options nosniff, Permissions-Policy locked down, Referrer-Policy strict-origin-when-cross-origin
- CI: 20 deploy-production runs all green; deploy median 30 s
- Git: working tree clean; 13 v1.x tags pushed; on main; up-to-date with origin

### Known issues
- Production `/robots.txt` returns 2,706 bytes (CF Content-Signal Policy prepended) vs local 942 bytes. Single owner action required: Cloudflare dash → Security → Bots → AI Crawl Control → Display Content Signals Policy → OFF. This lifts Lighthouse SEO 92 → 100.

### Sprint 9 · Substance pass (`v1.1.0-substance`)
- Brain-prompt cards · pipeline minute-estimates · LatentSync stage · MMLU/GPQA/GDPval/τ²-Bench bars · EXO Labs callout · OG image with Bricolage VF font

### Sprint 10 · Benchmark refresh 2026-05-25 (`v1.2.0-snapshot-0525`)
- BFCL v3 → v4 rotation (Qwen3.5 family entrants)
- TTS-Arena drift: Sonic 3.5 + Gemini 3.1 Flash TTS displace Chatterbox; ElevenLabs v3 recalibrated 1320 → 1184
- MiMo-V2.5-Pro now GDPval #1 at 1581
- Music-video pipeline + podcast pipeline diagrams
- Per-language ASR WER chart (Whisper-v3 across EN / ES / FR / DE / ZH / JP)

### Sprint 11 · Anatomy depth (`v1.3.0-depth`)
- Hermes delegation defaults (orchestrator vs delegation vs subagent vs eval-model)
- 12 browser tools enumerated · 10 RL tools enumerated · cron mode taxonomy
- Benchmark methodology callout (every cite has a primary-source link + capture date)
- Footer "Last refreshed" badge

### Sprint 12 · Objections + threat model (`v1.4.0-objections`)
- Image-LoRA tool chooser table (Diffusers vs Kohya vs ai-toolkit vs OneTrainer)
- Dataset prep checklist (caption density, BLIP2 auto-cap, regularization, repeats per epoch)
- Honest threat model: 4 surviving surfaces (malicious skills, prompt injection, MCP supply chain, physical access)
- 5 objection-handling FAQ entries

### Sprint 13 · Worked examples (`v1.5.0-examples`)
- "What a single agent turn actually looks like" — 4-turn dialogue (search_files → read_file → reason → answer) for summarise-README task
- Real SKILL.md from `~/.hermes/skills/release-it/SKILL.md`
- Voice-clone reference-clip 7-item checklist (5-15s · clean · prosody · WAV 16kHz mono · register · punctuation drives prosody · Chatterbox exaggeration param)

### Sprint 14 · Economics + multi-modal (`v1.6.0-economics`)
- Multi-modal 4-turn Hermes dialogue: brain → gen_image → write_file + clone_voice → tts_speak
- ACE-Step worked prompt: 47-second balkan-folk demo (tags + lyrics + render block + recovery recipes)
- Cost-to-rent comparison table: 10 capability rows · local stack vs paid subscriptions · $361/mo rented vs ~$4,499 one-time Mac · 12.5-month payback math · honest 3-case list of where renting still wins

### Sprint 15 · Reproducibility (`v1.7.0-bench`)
- "Reproduce the numbers in this guide" — 6-part hands-on benchmark guide
  - LLM decode tok/s (ollama + mlx_lm) with expected ranges for 5 model configs
  - Prefill sanity check vs M4 Max (3.3-4.0× expected)
  - Image wall clock (FLUX.2 dev/klein, Draw Things, Qwen-Image)
  - Whisper / faster-whisper RTF
  - ACE-Step music gen wall clock
  - 5-minute `bench-studio.sh` one-shot smoke test
- 6-item "if your numbers are low" debugging checklist
- Closing credibility note tying ranges to MLX team + r/LocalLLaMA reproductions

### Sprint 16 · Three high-credibility adds (`v1.8.0-honest`)
- Storage + cold-load times table (5 model configs cold vs warm) + 250-300 GB / 4 TB / 8 TB sizing rule
- "Swap brain in 30 seconds" recipe (`ollama pull → sed config.yaml → hermes restart`) + 3-case follow-up callout (reasoning flags, context window, tool-call format)
- "How this page stays honest as the field moves" — monthly-refresh.yml cron explainer + drift-PR threshold + closing reader-verifiability note

### Sprint 17 · Meta-loop (`v1.9.0-meta`)
- "How this page itself was built" — 8-row table walking the entire production pipeline through the studio's own components (research → outline → drafting → SVG diagrams → OG image → validation → deploy → monthly drift refresh) + closing key-call: studio's brain wrote the copy for its own field guide across 17 sections, 2 registers, 6 locales, 224 KB single-page, zero cloud round-trip

### Sprint 18 · Music depth (`v1.10.0-lyrics`)
- ACE-Step lyric structure templates: 5 durations (30s / 47s / 60s / 90s / 120s) with structure + lyric budget + per-section breakdown
- 5 genre conventions (pop AABA, ballad ABABCB, dance VCD, balkan/folk, hip-hop 16+8)
- Length-math rationale (20% overshoot truncates, 30% undershoot trails into instrumental)

### Sprint 19 · Operational reality (`v1.11.0-power`)
- Power, thermals, battery table: 6 workloads (idle / chat / sustained agent / image gen / video gen / music gen) with package power + fan profile + battery hours
- 16-inch vs 14-inch thermal envelope note (8-12 °C cooler)
- On-battery GPU caps at ~70% of plugged-in clocks (benchmark prerequisite)

### Sprint 20 · Vendor lock-in (`v1.12.0-portable`)
- "What happens if Hermes itself gets abandoned" — 8-row asset/survival table
- Design principle key-call: Hermes treats your stuff as data in standard formats, treats itself as the process running them today
- Concrete recovery recipe if last release shipped tomorrow

### Production audit fixes (FAQ drift)
- FAQ "How fast is the M5 Max actually?" — corrected stale claim: "gpt-oss-120B at Q8 MLX 64-88 tok/s" → "Q4 MLX 58-72 tok/s" (matches Sprint 15 bench table) · "Qwen3.5 27B dense at Q6 MLX 14-24" → "Qwen3.6-27B at Q4 MLX 21-26 (12-15 at Q8)"
- Fix applied in BOTH the FAQPage JSON-LD block and the visible `<details>` element

---

## [v1.0.0-launch] · 2026-05-25

### Status
**Site public. Repo public. Launch artifacts ready.** Sprints 1-8 complete.

### Lighthouse (production, mobile)
- Performance **97**
- Accessibility **100**
- Best Practices **100**
- SEO **92** (only Cloudflare-managed `Content-Signal` directive in /robots.txt blocks 100; no CF public API; single dashboard toggle: Security → Settings → "Instruct AI bot traffic with robots.txt" → OFF)

### Axe-core (WCAG 2.2 AA)
- **0 violations** across Simple register, Technical register, toggle keyboard test, FAQ accordion keyboard test

### Schema.org validator
- **54 nodes** detected (TechArticle, BreadcrumbList, FAQPage, HowTo, SoftwareApplication graph, WebSite, Organization, Person, all HowToStep/HowToTool/Question/Answer/ListItem children)
- **0 errors**, **0 warnings**

### Sprint 1 · Repo bootstrap (`v0.1.0-bootstrap`)
- Repo layout (public/, scripts/, tests/, docs/, .github/, public/locales/{en,sr,de,fr,zh,ar}, public/fonts)
- Baseline `public/index.html` (copy of original `local-ai-studio.html`)
- Static SEO/discoverability: robots.txt, humans.txt, sitemap.xml, llms.txt, manifest.webmanifest, favicon.svg
- `_headers` with strict CSP, HSTS preload (max-age 63072000, includeSubDomains, preload), COOP same-origin, X-Frame DENY, Permissions-Policy denying all
- `_redirects` with locale aliases + typo redirects + changelog 302
- `scripts/validate.mjs`: em-dash=0 gate, icon-resolve, tag balance, size budget (<250 KB), JSON-LD/canonical/aria presence, cliche scan
- `scripts/build-sitemap.mjs`: phased-locale-aware sitemap generator
- `scripts/build-llms-txt.mjs`: Markdown export with em-dash guard
- `scripts/build-og.mjs`: Resvg OG image renderer (en/sr/de/fr/zh/ar locale strings ready)
- `scripts/minify-html.mjs`: optional production-time size diagnostic
- GH Actions: `pr-checks.yml` (validate → wrangler-action preview → Lighthouse CI + axe-core), `deploy-production.yml` (push main → CF Pages production), `monthly-refresh.yml` (cron PR opener for benchmark rotation)
- `tests/a11y.spec.ts`: Playwright + axe-core WCAG 2.2 AA across Simple + Technical registers + keyboard nav
- `playwright.config.ts`, `lighthouserc.json` with preview-URL-friendly thresholds, `wrangler.toml`
- LICENSE (CC BY 4.0 content + MIT code), README.md, CHANGELOG.md, CLAUDE.md

### Sprint 2 · Content corrections + em-dash kill (`v0.2.0-corrections`)
- 15 critical FACTS.md corrections applied (Hermes tool/skill/gateway counts, Wan 2.7, FLUX.2 license carve-out, Seedream Elo, Suno anchor, wired-memory ceiling, MiMo-V2.5-Pro rename, DeepSeek V4 Flash size, Reddit-stats softening, CVE-2026-7396 addition)
- All 46 em-dashes eliminated (section headings → `·`, placeholder cells → `n/a`, prose → colons/separators)
- Reasoning-format triple-flag fix documented (`--reasoning off` + `--reasoning-budget 0` + `enable_thinking: false`)

### Sprint 3 · Content additions (`v0.3.0-additions`)
- Section 02: Neural Accelerator TFLOPS, SSD speeds, CPU split, chip topology, macOS Tahoe 26.4 prereq, M5 Max launch date, hardware comparison matrix (M4/M3 Ultra/RTX 5090/DGX Spark/Strix Halo/RTX PRO 6000/Jetson Thor), Mac-uniquely-enables vs cannot-do callout
- Section 03: sleeper-features card row (hermes proxy, /goal /subgoal, Kanban, Curator, LSP, prompt cache), OpenClaw/Claw Code/ClaudeClaw disambiguation
- Section 04: hybrid Mamba+Transformer+MoE arch row; Nemotron 3 Super + Qwen3-Coder-Next deep dives; cloud-only-leader card row (DeepSeek V4 Pro/Flash, GLM-5.1, Kimi K2.6, MiniMax M2.7, MiMo-V2.5-Pro); architecture-innovations vocabulary sidebar; license matrix; GRPO sidebar
- Section 05: UGI May 2026 leaders callout
- Section 06: Lane × server × quant matrix; reasoning-format YAML cheat-sheet; vllm-metal v0.2 callout
- Section 07: HiDream-O1 architecture callout (pixel-level UiT); Draw Things Lightning Draft callout; video comparison table (Wan 2.7 / LTX-Video / HunyuanVideo / Mochi 1 / Step-Video); open-vs-closed video gap callout; Qwen3-TTS / CosyVoice 3 / Sesame CSM-1B / Dia voice cards; TTS arena Elo bench bars; Canary-Qwen-2.5B card; ASR WER bench bars; 3 Apple-Silicon Whisper paths callout
- Section 13: 110 GB → 120 GB; "What fits my Mac?" interactive RAM-slider widget (DOM-construction, XSS-safe)
- Section 14: 4 install paths (brew/pip/curl/PowerShell); Step 7 reasoning-format; Step 8 hermes proxy; one-shot install script
- Section 15: 7 sandbox-backend cards; CVE-2026-7396 disclosure callout
- Section 17: 7 new FAQ entries (macOS 26.4, Apple Intelligence, scale-beyond-128 GB, FLUX commercial, M5 Max speed, model supersession, Strix Halo/DGX Spark)

### Sprint 4 · A11y + SEO + meta + JSON-LD + OG (`v0.4.0-seo-a11y`)
- Full meta block: tightened title with brand suffix; description trimmed <160; keywords, author, robots policy; canonical + hreflang; theme-color (dark+light); color-scheme; OG full set with absolute URLs; article:published_time / modified_time / author / section / tag; Twitter card with alt; favicon SVG + 192 + 512 + apple-touch + manifest link; format-detection; referrer policy
- 6 JSON-LD blocks: TechArticle (with proficiencyLevel Expert, timeRequired PT30M, citation array), BreadcrumbList, FAQPage, HowTo (8 steps + 3 tools), SoftwareApplication graph (Hermes, Ollama, MLX, Draw Things, ComfyUI, MLX-Audio), WebSite
- A11y: skip-to-content link; `<main id=page-content>` wrap; `aria-hidden` on `.bgfx`; toggle `role=tab` + `aria-selected` + `aria-controls`; `:focus-visible` outline; `prefers-reduced-motion` universal block; `content-visibility:auto` on sections; View Transitions API on toggle swap; `.vh` visually-hidden utility for screen-reader headings
- OG image: hand-crafted SVG (1200×630) rendered via @resvg/resvg-js (no satori; pivoted because bundled opentype.js can't parse woff2 or Bricolage variable font); 6-locale ready

### Sprint 5 · Motion polish (`v0.5.0-motion`)
- CSS scroll-driven parallax on `.bgfx` layers (`@supports (animation-timeline: scroll())` gated)
- View-timeline `cardRise` on `.mcard` entry
- `@property --angle` + conic-gradient rotating border on `.mcard:hover`
- SVG path draw-on `.draw-on` class
- Footer link hover underline scale animation
- Bench bar re-animate on viewport re-entry
- Hero mouse parallax (±8 px, JS-gated on reduced-motion)
- Copy-to-clipboard button on every `<pre>` block

### Sprint 6 · CI/CD + observability (`v0.6.0-ci`, `v0.6.1-ci-verified`)
- CSP-blocked inline scripts extracted to `public/app.js` (XSS-safe DOM construction throughout)
- Self-hosted fonts (Bricolage Grotesque 800 + Hanken Grotesk 400/700 + JetBrains Mono 400 woff2 latin subset, ~99 KB total) — dropped Google Fonts entirely; tightened CSP `style-src 'self' 'unsafe-inline'` and `font-src 'self'`
- Performance jump: 82 → 97 after self-host
- Heading-order fixes: silicon h4 → h3; footer h6 → h3; LoRA h4 → h3; mcard h5 → h4 (81 instances) with CSS rename; sections #prompts and #pipelines got `.vh` h3 to bridge h2→h4 hierarchy
- Color-contrast: `.seg .tip` bumped to var(--ink2) at 0.85 opacity → passes 4.5:1
- Pages Function `functions/robots.txt.js` documented as override attempt (CF's managed robots.txt layer intercepts at higher edge tier; documented as the one remaining blocker requiring 1-click dashboard action)
- CI: `pr-checks.yml` rewritten to use `cloudflare/wrangler-action@v3` for PR preview deploy + comment on PR; Lighthouse + axe-core run against preview URL; verified by PR #1 smoke test (4 jobs green)
- `deploy-production.yml`: push to main auto-deploys via wrangler-action

### Sprint 8 · Launch (`v1.0.0-launch`)
- Snapshot dates refreshed to 2026-05-25 (`article:modified_time`, JSON-LD `dateModified`, `sitemap.xml lastmod`, `humans.txt`)
- Repo public; topics added (local-ai, macbook-pro, m5-max, apple-silicon, hermes-agent, mlx, ollama, etc.)
- Full launch playbook in `docs/launch/`:
  - `00-checklist.md` (T-7 → T-0 → T+7 timeline + worst-case scenarios)
  - `01-hackernews.md` (title + URL + follow-up comment + reply tactics)
  - `01-reddit-localllama.md` (full post body with benchmark table)
  - `02-lobsters.md` (submission + comment)
  - `02-linkedin.md` (long-form post)
  - `02-newsletters.md` (Bensbites, AlphaSignal, Latent Space, Import AI, Algorithmic Bridge, TLDR AI + awesome-list PR template)
  - `03-x-thread.md` (15-tweet thread, tag list, reply tactics)
  - `04-pre-amp-dms.md` (10-name list with per-recipient personalisation)
  - `05-monitoring.md` (tabs to open, health-check curls, hotfix runbook)

### Repo at v1.0.0-launch
- 8 git tags · `main` clean · CI green · production live at https://m5ai.ivanm.dev
- Repo: https://github.com/KeGolk/m5ai-localai-studio
- License: CC BY 4.0 content · MIT code
- Total HTML: 170.6 KB (under 250 KB budget)
- Total deps: 0 runtime (vanilla HTML + CSS + JS); dev-only htmlhint, html-validate, lighthouse-ci, playwright, axe-core, resvg-js, prettier


## [v0.1.0-bootstrap] · 2026-05-DD

### Added
- Repo structure: `public/`, `scripts/`, `tests/`, `docs/`, `.github/`
- Baseline `public/index.html` (the existing local-ai-studio.html draft, untouched at this stage)
- Static SEO/discoverability files: `robots.txt`, `humans.txt`, `sitemap.xml`, `llms.txt`, `manifest.webmanifest`
- Cloudflare Pages `_headers` with strict CSP, HSTS, COOP, frame-deny, permissions-policy
- `scripts/validate.mjs` enforcing em-dash=0, icon resolve, tag balance, file-size budget, JSON-LD presence
- `scripts/build-sitemap.mjs`, `scripts/build-og.mjs`, `scripts/build-llms-txt.mjs`
- GitHub Actions: `pr-checks.yml` (validate + Lighthouse CI + axe-core), `monthly-refresh.yml`
- `tests/a11y.spec.ts` (Playwright + axe-core WCAG 2.2 AA)
- `lighthouserc.json` enforcing ≥95/98/100/100

### Decisions locked
- Subdomain: m5ai.ivanm.dev (`localai.ivanm.dev` reserved as 301 alias)
- Hosting: Cloudflare Pages free tier, GitHub auto-deploy from main
- Fonts: self-hosted woff2 (kills GDPR; simplifies CSP)
- i18n: phased en → +sr/de wk2 → +fr wk4 → +zh/ar wk6
- Motion: CSS scroll-driven + View Transitions. No GSAP/Lenis/Three.js
- Analytics: Cloudflare Web Analytics (cookieless, no consent banner)
- AI crawler policy: allow all (page exists to be cited)