The Local AI Studio · changelog Source: CHANGELOG.md in the site repository (baked at build time) Site: https://m5ai.ivanm.dev/ # CHANGELOG All notable changes to The Local AI Studio are documented here, snapshot-by-snapshot. The leaderboard rotates monthly; this file is the audit trail. Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). Adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html) for repo tags. --- ## [Unreleased] ### Added - nothing pending --- ## [v1.12.0-portable] · 2026-05-25 ### Status **Post-launch substance pass complete (Sprints 9-20).** Page at 236.3 KB / 250 KB budget (13.7 KB headroom). Em-dash count 0. JSON-LD 6 blocks valid. All CI green on main. Production deploy auto-tracking main. ### Production audit (this release) - Validate: PASS · em-dash=0 · 19 icons defined / 18 used · all tags balanced - JSON-LD: 6 blocks all parse (TechArticle, BreadcrumbList, FAQPage, HowTo, SoftwareApplication graph, WebSite) - Anchor resolution: 0 broken internal hrefs out of 34 - Meta + OG + canonical + twitter:card + theme-color all present - Static assets all 200 on prod: `/`, `/sitemap.xml`, `/robots.txt`, `/llms.txt`, `/llms-full.txt`, `/manifest.webmanifest`, `/og-image.png`, `/favicon.svg`, `/apple-touch-icon.png` - Security headers verified on prod: CSP strict (no inline scripts), HSTS preload max-age 31536000, COOP same-origin, X-Frame DENY, X-Content-Type-Options nosniff, Permissions-Policy locked down, Referrer-Policy strict-origin-when-cross-origin - CI: 20 deploy-production runs all green; deploy median 30 s - Git: working tree clean; 13 v1.x tags pushed; on main; up-to-date with origin ### Known issues - Production `/robots.txt` returns 2,706 bytes (CF Content-Signal Policy prepended) vs local 942 bytes. Single owner action required: Cloudflare dash → Security → Bots → AI Crawl Control → Display Content Signals Policy → OFF. This lifts Lighthouse SEO 92 → 100. ### Sprint 9 · Substance pass (`v1.1.0-substance`) - Brain-prompt cards · pipeline minute-estimates · LatentSync stage · MMLU/GPQA/GDPval/τ²-Bench bars · EXO Labs callout · OG image with Bricolage VF font ### Sprint 10 · Benchmark refresh 2026-05-25 (`v1.2.0-snapshot-0525`) - BFCL v3 → v4 rotation (Qwen3.5 family entrants) - TTS-Arena drift: Sonic 3.5 + Gemini 3.1 Flash TTS displace Chatterbox; ElevenLabs v3 recalibrated 1320 → 1184 - MiMo-V2.5-Pro now GDPval #1 at 1581 - Music-video pipeline + podcast pipeline diagrams - Per-language ASR WER chart (Whisper-v3 across EN / ES / FR / DE / ZH / JP) ### Sprint 11 · Anatomy depth (`v1.3.0-depth`) - Hermes delegation defaults (orchestrator vs delegation vs subagent vs eval-model) - 12 browser tools enumerated · 10 RL tools enumerated · cron mode taxonomy - Benchmark methodology callout (every cite has a primary-source link + capture date) - Footer "Last refreshed" badge ### Sprint 12 · Objections + threat model (`v1.4.0-objections`) - Image-LoRA tool chooser table (Diffusers vs Kohya vs ai-toolkit vs OneTrainer) - Dataset prep checklist (caption density, BLIP2 auto-cap, regularization, repeats per epoch) - Honest threat model: 4 surviving surfaces (malicious skills, prompt injection, MCP supply chain, physical access) - 5 objection-handling FAQ entries ### Sprint 13 · Worked examples (`v1.5.0-examples`) - "What a single agent turn actually looks like" — 4-turn dialogue (search_files → read_file → reason → answer) for summarise-README task - Real SKILL.md from `~/.hermes/skills/release-it/SKILL.md` - Voice-clone reference-clip 7-item checklist (5-15s · clean · prosody · WAV 16kHz mono · register · punctuation drives prosody · Chatterbox exaggeration param) ### Sprint 14 · Economics + multi-modal (`v1.6.0-economics`) - Multi-modal 4-turn Hermes dialogue: brain → gen_image → write_file + clone_voice → tts_speak - ACE-Step worked prompt: 47-second balkan-folk demo (tags + lyrics + render block + recovery recipes) - Cost-to-rent comparison table: 10 capability rows · local stack vs paid subscriptions · $361/mo rented vs ~$4,499 one-time Mac · 12.5-month payback math · honest 3-case list of where renting still wins ### Sprint 15 · Reproducibility (`v1.7.0-bench`) - "Reproduce the numbers in this guide" — 6-part hands-on benchmark guide - LLM decode tok/s (ollama + mlx_lm) with expected ranges for 5 model configs - Prefill sanity check vs M4 Max (3.3-4.0× expected) - Image wall clock (FLUX.2 dev/klein, Draw Things, Qwen-Image) - Whisper / faster-whisper RTF - ACE-Step music gen wall clock - 5-minute `bench-studio.sh` one-shot smoke test - 6-item "if your numbers are low" debugging checklist - Closing credibility note tying ranges to MLX team + r/LocalLLaMA reproductions ### Sprint 16 · Three high-credibility adds (`v1.8.0-honest`) - Storage + cold-load times table (5 model configs cold vs warm) + 250-300 GB / 4 TB / 8 TB sizing rule - "Swap brain in 30 seconds" recipe (`ollama pull → sed config.yaml → hermes restart`) + 3-case follow-up callout (reasoning flags, context window, tool-call format) - "How this page stays honest as the field moves" — monthly-refresh.yml cron explainer + drift-PR threshold + closing reader-verifiability note ### Sprint 17 · Meta-loop (`v1.9.0-meta`) - "How this page itself was built" — 8-row table walking the entire production pipeline through the studio's own components (research → outline → drafting → SVG diagrams → OG image → validation → deploy → monthly drift refresh) + closing key-call: studio's brain wrote the copy for its own field guide across 17 sections, 2 registers, 6 locales, 224 KB single-page, zero cloud round-trip ### Sprint 18 · Music depth (`v1.10.0-lyrics`) - ACE-Step lyric structure templates: 5 durations (30s / 47s / 60s / 90s / 120s) with structure + lyric budget + per-section breakdown - 5 genre conventions (pop AABA, ballad ABABCB, dance VCD, balkan/folk, hip-hop 16+8) - Length-math rationale (20% overshoot truncates, 30% undershoot trails into instrumental) ### Sprint 19 · Operational reality (`v1.11.0-power`) - Power, thermals, battery table: 6 workloads (idle / chat / sustained agent / image gen / video gen / music gen) with package power + fan profile + battery hours - 16-inch vs 14-inch thermal envelope note (8-12 °C cooler) - On-battery GPU caps at ~70% of plugged-in clocks (benchmark prerequisite) ### Sprint 20 · Vendor lock-in (`v1.12.0-portable`) - "What happens if Hermes itself gets abandoned" — 8-row asset/survival table - Design principle key-call: Hermes treats your stuff as data in standard formats, treats itself as the process running them today - Concrete recovery recipe if last release shipped tomorrow ### Production audit fixes (FAQ drift) - FAQ "How fast is the M5 Max actually?" — corrected stale claim: "gpt-oss-120B at Q8 MLX 64-88 tok/s" → "Q4 MLX 58-72 tok/s" (matches Sprint 15 bench table) · "Qwen3.5 27B dense at Q6 MLX 14-24" → "Qwen3.6-27B at Q4 MLX 21-26 (12-15 at Q8)" - Fix applied in BOTH the FAQPage JSON-LD block and the visible `
` element --- ## [v1.0.0-launch] · 2026-05-25 ### Status **Site public. Repo public. Launch artifacts ready.** Sprints 1-8 complete. ### Lighthouse (production, mobile) - Performance **97** - Accessibility **100** - Best Practices **100** - SEO **92** (only Cloudflare-managed `Content-Signal` directive in /robots.txt blocks 100; no CF public API; single dashboard toggle: Security → Settings → "Instruct AI bot traffic with robots.txt" → OFF) ### Axe-core (WCAG 2.2 AA) - **0 violations** across Simple register, Technical register, toggle keyboard test, FAQ accordion keyboard test ### Schema.org validator - **54 nodes** detected (TechArticle, BreadcrumbList, FAQPage, HowTo, SoftwareApplication graph, WebSite, Organization, Person, all HowToStep/HowToTool/Question/Answer/ListItem children) - **0 errors**, **0 warnings** ### Sprint 1 · Repo bootstrap (`v0.1.0-bootstrap`) - Repo layout (public/, scripts/, tests/, docs/, .github/, public/locales/{en,sr,de,fr,zh,ar}, public/fonts) - Baseline `public/index.html` (copy of original `local-ai-studio.html`) - Static SEO/discoverability: robots.txt, humans.txt, sitemap.xml, llms.txt, manifest.webmanifest, favicon.svg - `_headers` with strict CSP, HSTS preload (max-age 63072000, includeSubDomains, preload), COOP same-origin, X-Frame DENY, Permissions-Policy denying all - `_redirects` with locale aliases + typo redirects + changelog 302 - `scripts/validate.mjs`: em-dash=0 gate, icon-resolve, tag balance, size budget (<250 KB), JSON-LD/canonical/aria presence, cliche scan - `scripts/build-sitemap.mjs`: phased-locale-aware sitemap generator - `scripts/build-llms-txt.mjs`: Markdown export with em-dash guard - `scripts/build-og.mjs`: Resvg OG image renderer (en/sr/de/fr/zh/ar locale strings ready) - `scripts/minify-html.mjs`: optional production-time size diagnostic - GH Actions: `pr-checks.yml` (validate → wrangler-action preview → Lighthouse CI + axe-core), `deploy-production.yml` (push main → CF Pages production), `monthly-refresh.yml` (cron PR opener for benchmark rotation) - `tests/a11y.spec.ts`: Playwright + axe-core WCAG 2.2 AA across Simple + Technical registers + keyboard nav - `playwright.config.ts`, `lighthouserc.json` with preview-URL-friendly thresholds, `wrangler.toml` - LICENSE (CC BY 4.0 content + MIT code), README.md, CHANGELOG.md, CLAUDE.md ### Sprint 2 · Content corrections + em-dash kill (`v0.2.0-corrections`) - 15 critical FACTS.md corrections applied (Hermes tool/skill/gateway counts, Wan 2.7, FLUX.2 license carve-out, Seedream Elo, Suno anchor, wired-memory ceiling, MiMo-V2.5-Pro rename, DeepSeek V4 Flash size, Reddit-stats softening, CVE-2026-7396 addition) - All 46 em-dashes eliminated (section headings → `·`, placeholder cells → `n/a`, prose → colons/separators) - Reasoning-format triple-flag fix documented (`--reasoning off` + `--reasoning-budget 0` + `enable_thinking: false`) ### Sprint 3 · Content additions (`v0.3.0-additions`) - Section 02: Neural Accelerator TFLOPS, SSD speeds, CPU split, chip topology, macOS Tahoe 26.4 prereq, M5 Max launch date, hardware comparison matrix (M4/M3 Ultra/RTX 5090/DGX Spark/Strix Halo/RTX PRO 6000/Jetson Thor), Mac-uniquely-enables vs cannot-do callout - Section 03: sleeper-features card row (hermes proxy, /goal /subgoal, Kanban, Curator, LSP, prompt cache), OpenClaw/Claw Code/ClaudeClaw disambiguation - Section 04: hybrid Mamba+Transformer+MoE arch row; Nemotron 3 Super + Qwen3-Coder-Next deep dives; cloud-only-leader card row (DeepSeek V4 Pro/Flash, GLM-5.1, Kimi K2.6, MiniMax M2.7, MiMo-V2.5-Pro); architecture-innovations vocabulary sidebar; license matrix; GRPO sidebar - Section 05: UGI May 2026 leaders callout - Section 06: Lane × server × quant matrix; reasoning-format YAML cheat-sheet; vllm-metal v0.2 callout - Section 07: HiDream-O1 architecture callout (pixel-level UiT); Draw Things Lightning Draft callout; video comparison table (Wan 2.7 / LTX-Video / HunyuanVideo / Mochi 1 / Step-Video); open-vs-closed video gap callout; Qwen3-TTS / CosyVoice 3 / Sesame CSM-1B / Dia voice cards; TTS arena Elo bench bars; Canary-Qwen-2.5B card; ASR WER bench bars; 3 Apple-Silicon Whisper paths callout - Section 13: 110 GB → 120 GB; "What fits my Mac?" interactive RAM-slider widget (DOM-construction, XSS-safe) - Section 14: 4 install paths (brew/pip/curl/PowerShell); Step 7 reasoning-format; Step 8 hermes proxy; one-shot install script - Section 15: 7 sandbox-backend cards; CVE-2026-7396 disclosure callout - Section 17: 7 new FAQ entries (macOS 26.4, Apple Intelligence, scale-beyond-128 GB, FLUX commercial, M5 Max speed, model supersession, Strix Halo/DGX Spark) ### Sprint 4 · A11y + SEO + meta + JSON-LD + OG (`v0.4.0-seo-a11y`) - Full meta block: tightened title with brand suffix; description trimmed <160; keywords, author, robots policy; canonical + hreflang; theme-color (dark+light); color-scheme; OG full set with absolute URLs; article:published_time / modified_time / author / section / tag; Twitter card with alt; favicon SVG + 192 + 512 + apple-touch + manifest link; format-detection; referrer policy - 6 JSON-LD blocks: TechArticle (with proficiencyLevel Expert, timeRequired PT30M, citation array), BreadcrumbList, FAQPage, HowTo (8 steps + 3 tools), SoftwareApplication graph (Hermes, Ollama, MLX, Draw Things, ComfyUI, MLX-Audio), WebSite - A11y: skip-to-content link; `
` wrap; `aria-hidden` on `.bgfx`; toggle `role=tab` + `aria-selected` + `aria-controls`; `:focus-visible` outline; `prefers-reduced-motion` universal block; `content-visibility:auto` on sections; View Transitions API on toggle swap; `.vh` visually-hidden utility for screen-reader headings - OG image: hand-crafted SVG (1200×630) rendered via @resvg/resvg-js (no satori; pivoted because bundled opentype.js can't parse woff2 or Bricolage variable font); 6-locale ready ### Sprint 5 · Motion polish (`v0.5.0-motion`) - CSS scroll-driven parallax on `.bgfx` layers (`@supports (animation-timeline: scroll())` gated) - View-timeline `cardRise` on `.mcard` entry - `@property --angle` + conic-gradient rotating border on `.mcard:hover` - SVG path draw-on `.draw-on` class - Footer link hover underline scale animation - Bench bar re-animate on viewport re-entry - Hero mouse parallax (±8 px, JS-gated on reduced-motion) - Copy-to-clipboard button on every `
` block

### Sprint 6 · CI/CD + observability (`v0.6.0-ci`, `v0.6.1-ci-verified`)
- CSP-blocked inline scripts extracted to `public/app.js` (XSS-safe DOM construction throughout)
- Self-hosted fonts (Bricolage Grotesque 800 + Hanken Grotesk 400/700 + JetBrains Mono 400 woff2 latin subset, ~99 KB total) — dropped Google Fonts entirely; tightened CSP `style-src 'self' 'unsafe-inline'` and `font-src 'self'`
- Performance jump: 82 → 97 after self-host
- Heading-order fixes: silicon h4 → h3; footer h6 → h3; LoRA h4 → h3; mcard h5 → h4 (81 instances) with CSS rename; sections #prompts and #pipelines got `.vh` h3 to bridge h2→h4 hierarchy
- Color-contrast: `.seg .tip` bumped to var(--ink2) at 0.85 opacity → passes 4.5:1
- Pages Function `functions/robots.txt.js` documented as override attempt (CF's managed robots.txt layer intercepts at higher edge tier; documented as the one remaining blocker requiring 1-click dashboard action)
- CI: `pr-checks.yml` rewritten to use `cloudflare/wrangler-action@v3` for PR preview deploy + comment on PR; Lighthouse + axe-core run against preview URL; verified by PR #1 smoke test (4 jobs green)
- `deploy-production.yml`: push to main auto-deploys via wrangler-action

### Sprint 8 · Launch (`v1.0.0-launch`)
- Snapshot dates refreshed to 2026-05-25 (`article:modified_time`, JSON-LD `dateModified`, `sitemap.xml lastmod`, `humans.txt`)
- Repo public; topics added (local-ai, macbook-pro, m5-max, apple-silicon, hermes-agent, mlx, ollama, etc.)
- Full launch playbook in `docs/launch/`:
  - `00-checklist.md` (T-7 → T-0 → T+7 timeline + worst-case scenarios)
  - `01-hackernews.md` (title + URL + follow-up comment + reply tactics)
  - `01-reddit-localllama.md` (full post body with benchmark table)
  - `02-lobsters.md` (submission + comment)
  - `02-linkedin.md` (long-form post)
  - `02-newsletters.md` (Bensbites, AlphaSignal, Latent Space, Import AI, Algorithmic Bridge, TLDR AI + awesome-list PR template)
  - `03-x-thread.md` (15-tweet thread, tag list, reply tactics)
  - `04-pre-amp-dms.md` (10-name list with per-recipient personalisation)
  - `05-monitoring.md` (tabs to open, health-check curls, hotfix runbook)

### Repo at v1.0.0-launch
- 8 git tags · `main` clean · CI green · production live at https://m5ai.ivanm.dev
- Repo: https://github.com/KeGolk/m5ai-localai-studio
- License: CC BY 4.0 content · MIT code
- Total HTML: 170.6 KB (under 250 KB budget)
- Total deps: 0 runtime (vanilla HTML + CSS + JS); dev-only htmlhint, html-validate, lighthouse-ci, playwright, axe-core, resvg-js, prettier


## [v0.1.0-bootstrap] · 2026-05-DD

### Added
- Repo structure: `public/`, `scripts/`, `tests/`, `docs/`, `.github/`
- Baseline `public/index.html` (the existing local-ai-studio.html draft, untouched at this stage)
- Static SEO/discoverability files: `robots.txt`, `humans.txt`, `sitemap.xml`, `llms.txt`, `manifest.webmanifest`
- Cloudflare Pages `_headers` with strict CSP, HSTS, COOP, frame-deny, permissions-policy
- `scripts/validate.mjs` enforcing em-dash=0, icon resolve, tag balance, file-size budget, JSON-LD presence
- `scripts/build-sitemap.mjs`, `scripts/build-og.mjs`, `scripts/build-llms-txt.mjs`
- GitHub Actions: `pr-checks.yml` (validate + Lighthouse CI + axe-core), `monthly-refresh.yml`
- `tests/a11y.spec.ts` (Playwright + axe-core WCAG 2.2 AA)
- `lighthouserc.json` enforcing ≥95/98/100/100

### Decisions locked
- Subdomain: m5ai.ivanm.dev (`localai.ivanm.dev` reserved as 301 alias)
- Hosting: Cloudflare Pages free tier, GitHub auto-deploy from main
- Fonts: self-hosted woff2 (kills GDPR; simplifies CSP)
- i18n: phased en → +sr/de wk2 → +fr wk4 → +zh/ar wk6
- Motion: CSS scroll-driven + View Transitions. No GSAP/Lenis/Three.js
- Analytics: Cloudflare Web Analytics (cookieless, no consent banner)
- AI crawler policy: allow all (page exists to be cited)