MoreSalamander StudioLabs Productions
presents: The my-AI suite.

The Thesis Methodology of the
Deterministic Scaffold.

One methodology. Five tools. Each one a different expression of the same instinct — human-owned constraints, AI-powered synthesis, and verification at every boundary. The model proposes. Python disposes.

The Thesis

The Deterministic Scaffold

Every MoreSalamander project is built on the same underlying principle: a well-fenced model inside a deterministic scaffold becomes reliable as a system, because the unreliable component is wrapped in reliable ones that decide whether to trust each output — when to retry, when to skip, when to score, when to commit.

This isn't a preference picked up from a book. It was earned through two specific lineages: the pipeline-shape discipline (from the Build It Publisher — named stages, explicit data flow, atomic responsibility), and the verification discipline (from MyMaestro's failure — a study tool that hallucinated, proving that verification must be a system property, not a soft warning at the UI).

"The model proposes. Python disposes."
01
Explain

Write the constraints before any AI synthesis happens. Constraint documents, schemas, scoring rubrics, series style guides. These are the doctrine. The model works inside them — it never edits them.

02
Synthesize

Let the model do the high-volume work — drafting prose, generating images, writing scripts, composing narration. The model is one component in the system, not the whole system.

03
Verify

Wrap every model output in pure Python that decides whether to trust the result. The grader can never be an LLM — because the grader cannot be the thing it grades. Reject what fails. Score what passes. Persist only what survives.
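
A minimal sketch of the verify step, assuming the model returns JSON. The required field names (`title`, `beats`) and the `GateResult` type are hypothetical, chosen only for illustration; a real gate would read its schema from the constraint document. The point is that the decision to trust the output is made by plain Python, with reasons attached.

```python
import json
from dataclasses import dataclass, field

# Hypothetical required fields; a real schema would come from the constraint document.
REQUIRED_FIELDS = ("title", "beats")


@dataclass
class GateResult:
    passed: bool
    reasons: list[str] = field(default_factory=list)


def verify_output(raw: str) -> GateResult:
    """Decide in pure Python whether a model's raw output can be trusted."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return GateResult(False, [f"not valid JSON: {exc.msg}"])
    if not isinstance(data, dict):
        return GateResult(False, ["top-level value must be an object"])
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        return GateResult(False, [f"missing field: {name}" for name in missing])
    return GateResult(True)
```

Only outputs for which `verify_output(...).passed` is true would be persisted; the reasons list is what a retry would feed back to the model.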

The Tools

my-AI-script
StudioLabs
pronounced: my-script

The universal writer for the whole suite. An LLM conducts a creative interview — asking, proposing, reflecting back — until it has enough to write a complete, detailed spec. One engine, two targets: a ProductionSpec for my-AI-scene, or a SongSpec for my-AI-beats. The LLM decides when the interview ends. Python scores the result.

Explain

The interview is the constraint-gathering phase. The chosen target's schema defines every field that must be filled — the LLM's questions serve the schema. The scoring rubric is in the system prompt before the LLM writes a word, so it knows the bar it's aiming for.

Synthesize

Once the LLM is confident it can score well, it generates the full spec from the conversation — every beat and narration line for video, or every section, prompt and energy value for a song. Reference-quality detail, on demand.

Verify

A target-specific pure-Python judge scores out of 100 — video on narration/footage/grade, music on prompt specificity and the energy arc. Below 75, the breakdown feeds back and the LLM revises. The scorer is never an LLM.
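
The score-then-revise decision can be sketched as follows. Only the 100-point scale and the threshold of 75 come from the text above. The two criteria, their 50/50 weighting, the word-count and energy-range cutoffs, and the function names are assumptions for illustration, not the suite's actual rubric.

```python
PASS_THRESHOLD = 75  # from the text: below 75, the spec is revised


def score_song_spec(sections: list[dict]) -> tuple[int, dict[str, int]]:
    """Score a list of {"prompt": str, "energy": float} sections out of 100.

    Hypothetical rubric: 50 points for prompt specificity (word count),
    50 points for the width of the energy arc.
    """
    if not sections:
        return 0, {"prompt_specificity": 0, "energy_arc": 0}
    # Specificity: a section earns full marks at 8 or more words in its prompt.
    per_section = [min(len(s["prompt"].split()), 8) / 8 for s in sections]
    specificity = round(50 * sum(per_section) / len(per_section))
    # Energy arc: full marks when energies span at least 0.5 of the 0-1 range.
    energies = [s["energy"] for s in sections]
    arc = round(50 * min((max(energies) - min(energies)) / 0.5, 1.0))
    breakdown = {"prompt_specificity": specificity, "energy_arc": arc}
    return specificity + arc, breakdown


def needs_revision(total: int) -> bool:
    """True when the score falls below the pass threshold."""
    return total < PASS_THRESHOLD
```

Returning the breakdown alongside the total is what lets a failed attempt feed back something more useful than a bare number.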

Does best

Collaborative spec authoring where quality is enforced, not hoped for. The scoring incentive — knowing Python will judge the output before it generates — shapes what the LLM produces. It won't settle for vague prompts or a flat energy arc because it knows they score low. Proven live: a synthwave conversation produced a SongSpec that scored 98/100.

my-AI-scene
Productions
pronounced: my-scene

A local-first video production engine. It takes a ProductionSpec — the machine-readable form of a production guide — and renders it into a graded, scored, assembled MP4. Narration via Kokoro TTS, visuals via SDXL-turbo, music via MusicGen, final assembly via ffmpeg. Entirely local, entirely free.

Explain

The ProductionSpec is the constraint document. Every beat declares its narration, footage prompt, music keywords, and color grade before any model runs. The spec is the law.

Synthesize

Kokoro speaks. SDXL-turbo generates stills. MusicGen composes beds. ffmpeg applies a Ken Burns pan-and-zoom to turn the stills into motion. Each model fills one stage of the pipeline and no other.

Verify

Trust separation baked in: Whisper transcribes Kokoro's output for narration_verify. CLIP scores SDXL's output for footage_verify. ffprobe measures the final MP4 for assembly_verify. The generator never self-grades.
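
One way the narration check could compare script and transcript, sketched with the standard library only. It assumes the transcript has already been produced by the speech-to-text model and arrives as a string; no Whisper API is called here. The function names and the 0.9 threshold are illustrative assumptions.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> list[str]:
    """Lowercase and strip punctuation so only the spoken words are compared."""
    return re.findall(r"[a-z0-9']+", text.lower())


def narration_verify(script: str, transcript: str, threshold: float = 0.9) -> bool:
    """Pass when the transcript's word sequence is close enough to the script.

    The 0.9 threshold is an assumed value for illustration.
    """
    ratio = SequenceMatcher(None, normalize(script), normalize(transcript)).ratio()
    return ratio >= threshold
```

Comparing word sequences rather than raw characters keeps punctuation and casing differences in the transcript from triggering needless retries.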

Does best

Deterministic video production from a defined spec. Because every stage is verified independently of the model that produced it, the pipeline degrades gracefully — footage fallbacks replace failed stills, music drops without failing the episode, narration retries on transcript mismatch. The worst case is a slightly simpler video, never a corrupt one.

my-AI-beats
Productions
pronounced: my-beats

A local-first music generator. It takes a SongSpec — section structure, key, tempo, and an energy value per section — and renders a complete, mastered song. MusicGen synthesizes each section, CLAP scores it, ffmpeg crossfades and loudness-normalizes the master. Entirely local, entirely free.

Explain

The SongSpec is the constraint document. Every section declares its prompt, bar count, and energy (0–1) before any model runs. The energy arc — quiet intro, towering chorus, soft outro — is the song's emotional shape, authored up front.
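
A sketch of how such a spec might be modeled so that invalid values are rejected before any model runs. The fields mirror the ones named above (prompt, bar count, energy, key, tempo), but the class layout and attribute names are assumptions, not the tool's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    prompt: str
    bars: int
    energy: float  # 0.0 (quiet) to 1.0 (towering)

    def __post_init__(self) -> None:
        if not self.prompt.strip():
            raise ValueError("section prompt must not be empty")
        if self.bars <= 0:
            raise ValueError("bar count must be positive")
        if not 0.0 <= self.energy <= 1.0:
            raise ValueError("energy must be between 0 and 1")


@dataclass(frozen=True)
class SongSpec:
    key: str
    tempo_bpm: int
    sections: tuple[Section, ...]

    def energy_arc(self) -> list[float]:
        """The song's emotional shape: one energy value per section, in order."""
        return [s.energy for s in self.sections]
```

Freezing the dataclasses reflects the idea that synthesis works inside the spec and never edits it.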

Synthesize

MusicGen renders each section. The hidden gem: every section after the first is conditioned on the last few seconds of the previous one — audio continuation. Sections don't stitch, they flow. Bark adds optional sung vocals on the side.

Verify

CLAP scores each section against its prompt (section_verify); duration and final-stitch gates are blocking. Vocals are non-blocking — a failed vocal is dropped, the instrumental ships. CLAP is the judge; MusicGen never grades itself.

Does best

Complete instrumental songs with a real emotional arc, built section by section and flowing as one piece thanks to audio continuation. The same blocking/non-blocking doctrine as my-AI-stories applies: the instrumental is the premise, vocals are enhancement. Proven live: an 8-section indie folk track, 3:48, mastered to −14 LUFS.

my-AI-stories
StudioLabs
pronounced: my-stories

A long-form, multi-episode serial fiction engine with multi-voice local TTS. You provide the series bible — characters, theme, plot direction. The system generates full audio episodes, verified beat by beat, with distinct voices per character and a sound design layer that enhances without ever failing the episode.

Explain

The series bible is the constraint document. Characters, world rules, plot arcs — written before any synthesis. The LLM fills episodes inside the bible. It never edits the bible.

Synthesize

Ollama generates episode drafts scene by scene. Piper TTS renders each line in the character's assigned voice. Sound cues are placed at narrative anchors by a non-blocking mixer.

Verify

Three blocking gates: continuity (characters consistent with bible), structure (beats well-formed), speaker (every line assigned). One non-blocking gate: cue_verify — unresolvable sound cues are dropped, never fail the episode.

Does best

Episodic narrative audio that stays internally consistent across episodes. The continuity gate means characters don't contradict themselves. The non-blocking sound doctrine means "continuity is the premise, sound is enhancement" — a principle that keeps quality high without making the pipeline fragile.

my-AI-stro
StudioLabs  ·  The Crowning Jewel
pronounced: my-stro  ·  like Maestro — but inside Python, the AI is silent (in hallucination)

A local-first, self-improving personal knowledge system. Paste a lesson from anywhere — a course, an article, a video transcript — and the system summarizes it, validates it, and stores it in a Source of Truth. Downstream agents (quizzer, advisor, classroom, general chat) reason over the SOT instead of re-summarizing on every query. Four local Ollama models, each assigned a specific role — the model that owns the SOT never handles ungrounded chat. The system gets better the more it is used.

Explain

Four Ollama models with deliberate role separation: llama3:8b summarizes (owns the SOT); llama3.1:8b advises (128K context, study guides); llama3.2 quizzes, chats, and teaches; mistral judges. No model ever scores content it generated.

Synthesize

The LLM extracts structured summaries, code snippets, and key concepts; long lessons are processed in chunks. The Advisor generates full course-wide study plans from SOT context. The Classroom builds interactive lesson sessions.

Verify

Grounding gates at every persistence boundary — SOT write, Notebook save, Classroom plan persist, raise-hand runtime answers. The audit loop's Judge is a deterministic Python formula, not an LLM. It runs continuously and rotates the SOT toward more-grounded versions over time.

Does best — and why it matters most

my-AI-stro is the proof of concept for both lineages of the Deterministic Scaffold thesis — simultaneously, in a single working system.

Lineage one: the pipeline-shape discipline. The Build It Publisher's n8n workflow — named stages, atomic responsibilities, explicit data flow — was internalized once and then re-encoded in every project that followed. my-AI-stro is where that discipline runs at full scale: three coexisting named pipelines (ingestion, advisor, classroom), all sharing the same NDJSON event vocabulary (step_start / step_complete / gate_pass / gate_fail / retry / done) that every tool in the suite now speaks.

Lineage two: the verification discipline. MyMaestro hallucinated. The lesson wasn't to write a better prompt — it was that verification must be a system property, not a soft warning at the UI. my-AI-stro was rebuilt from that failure: grounding gates at every persistence boundary, a deterministic Python judge in the audit loop (never an LLM), and a self-improving SOT that rotates toward more-grounded versions over time without human intervention.

The architectural blueprint for the suite. Every tool that came after inherited its DNA from here. The blocking vs. non-blocking gate distinction, first encoded in my-AI-stro's hard validation gates, became my-AI-stories' cue_verify doctrine and my-AI-scene's music_cue_verify. The trust isolation principle (the model that owns the SOT never handles ungrounded chat) became the pattern for every verification boundary in the suite: Whisper verifies Kokoro, CLIP scores SDXL, Python scores the LLM's spec. The NDJSON vocabulary, the swappable backend protocol, the offline-provable scaffold: all formalized here first, then carried forward. my-AI-stro is not the most complex tool in the suite. It is the tool the rest of the suite was built from.

The Pipelines

One shape. Every tool.

Every tool in the suite runs the same underlying pipeline pattern: named stages with explicit data flow, a shared NDJSON event vocabulary, and a gate at every boundary that decides whether to trust the model's output. The domain changes. The shape does not.

The shared event vocabulary

Every stage in every pipeline emits the same events. Any tool's output can be observed, logged, and streamed to a UI with the same listener — because the vocabulary never changes across tools.

step_start step_complete gate_pass gate_fail retry fallback skip token done error

Blocking gates

Hard pass/fail. A blocking failure stops the pipeline at that stage — the model retries within a bounded limit, then falls back or halts. These protect continuity, structure, and correctness. The premise cannot proceed without them.

Non-blocking gates

Soft failures. A non-blocking failure drops the enhancement and continues — a missing music bed, an unresolved sound cue, a low-scoring visual that falls back to neutral. Sound enhances. Continuity is the premise.
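
The two gate classes can be sketched as a single runner that treats failures differently depending on a flag. The `Gate` type, the exception name, and the artifact shape are hypothetical; only the behavior (a blocking failure stops, a non-blocking failure drops the enhancement and continues) is taken from the descriptions above.

```python
from dataclasses import dataclass
from typing import Callable


class BlockingGateFailure(Exception):
    """Raised when a gate that protects the premise fails."""


@dataclass(frozen=True)
class Gate:
    name: str
    check: Callable[[dict], bool]
    blocking: bool


def run_gates(artifact: dict, gates: list[Gate]) -> list[str]:
    """Run gates in order; return the names of non-blocking gates that failed."""
    dropped = []
    for gate in gates:
        if gate.check(artifact):
            continue
        if gate.blocking:
            raise BlockingGateFailure(gate.name)
        dropped.append(gate.name)  # enhancement is dropped, pipeline continues
    return dropped
```

In a fuller pipeline the caller would catch `BlockingGateFailure` to trigger a bounded retry or a halt, and would use the returned list to strip the dropped enhancements from the final artifact.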

my-AI-script

interview llm_generate spec_validate ● spec_score ● retry ↺ spec_ready

The LLM interviews until it can score well, then generates a spec for the chosen target: a ProductionSpec (video) or a SongSpec (music). Two blocking gates run in sequence: spec_validate parses the spec and checks structural validity; spec_score applies the target-specific rubric to check quality. Both must pass. A failure feeds back as context for the next attempt.

my-AI-scene

spec_load narration_verify ● duration_verify ● footage_verify ● music_cue_verify ○ grade assembly_verify ● episode.mp4

Per beat, per stage. Blocking ●: Whisper transcribes Kokoro's narration and compares it to the script; CLIP scores SDXL's still against the footage prompt; ffprobe verifies the final MP4. Non-blocking ○: a music bed that fails its gate is dropped — the episode continues without it. The worst case is a slightly simpler video, never a corrupt one.

my-AI-beats

spec_load section_verify ● duration_verify ● continuation ↝ vocal_verify ○ stitch_verify ● song.wav

Per section. The hidden gem ↝: each section is conditioned on the tail of the previous one, so the song flows. Blocking ●: CLAP scores the audio against the section prompt; duration must match the bar count; ffprobe verifies the final master. Non-blocking ○: a failed Bark vocal is dropped and the instrumental ships. A retry, then a tone-pad fallback, keeps a bad section from corrupting the song.
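
The "duration must match the bar count" check reduces to simple arithmetic: bars times beats per bar times seconds per beat. A sketch, assuming 4/4 time by default and a tolerance of half a second; both values and the function names are illustrative assumptions.

```python
def expected_seconds(bars: int, tempo_bpm: float, beats_per_bar: int = 4) -> float:
    """Duration implied by a bar count: bars x beats per bar x seconds per beat."""
    return bars * beats_per_bar * 60.0 / tempo_bpm


def duration_verify(
    actual_seconds: float,
    bars: int,
    tempo_bpm: float,
    beats_per_bar: int = 4,
    tolerance: float = 0.5,
) -> bool:
    """Pass when rendered audio is within `tolerance` seconds of the target.

    The 0.5 s tolerance and the 4/4 default are assumptions for illustration.
    """
    target = expected_seconds(bars, tempo_bpm, beats_per_bar)
    return abs(actual_seconds - target) <= tolerance
```

For example, 8 bars of 4/4 at 120 BPM is 32 beats at half a second each, so the target is 16 seconds.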

my-AI-stories

bible episode_draft continuity_verify ● structure_verify ● speaker_verify ● tts_render cue_verify ○ mixed WAV

Three blocking gates protect the story's integrity: continuity (characters consistent with the series bible), structure (beats well-formed), speaker (every line assigned to a character). One non-blocking gate: cue_verify — unresolvable sound cues are dropped, never fail the episode. Continuity is the premise; sound is enhancement.

my-AI-stro — three coexisting pipelines

Ingestion
graph_entry retrieval summarization validation ● memory_write
Advisor
retrieval arc section ×N recap assembly
Classroom
picker plan_generation plan_verify ● beat_playback session_end

Three named pipelines running in the same application, all sharing the same NDJSON event vocabulary. The ingestion pipeline's validation gate uses the deterministic Python judge — not the summarizer, not mistral — to decide whether a lesson's summary meets the grounding threshold before it is written to the SOT. The audit loop runs continuously in the background, independently re-scoring and rotating canonical entries toward more-grounded versions over time.
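
To make "a deterministic Python judge" concrete, here is one possible shape for a grounding score. This is a stand-in formula, not the judge my-AI-stro uses: the word-overlap measure, the stopword list, and the 0.8 threshold are all assumptions chosen to keep the sketch short.

```python
import re

# Small illustrative stopword list; a real judge would likely use a fuller one.
STOPWORDS = frozenset({"a", "an", "the", "is", "are", "of", "to", "in", "and", "it", "that"})


def content_words(text: str) -> set[str]:
    """Lowercased words with stopwords removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def grounding_score(summary: str, source: str) -> float:
    """Fraction of the summary's content words that also appear in the source."""
    claimed = content_words(summary)
    if not claimed:
        return 0.0
    return len(claimed & content_words(source)) / len(claimed)


def passes_grounding(summary: str, source: str, threshold: float = 0.8) -> bool:
    """Gate decision: only summaries at or above the threshold may be written."""
    return grounding_score(summary, source) >= threshold
```

Whatever the real formula is, the property that matters is the same: given the same summary and source, the score never changes, so the gate's verdict is reproducible.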

Principles

What the methodology encodes

The grader is never the generator

The model that produces content cannot evaluate its own output. Whisper verifies Kokoro. CLIP scores SDXL. Mistral judges llama3:8b. Python scores the spec the LLM just wrote. Trust separation at every verification boundary.

Doctrine before code

Every project starts with CONSTITUTION.md and ARCHITECTURE.md before a single line of code. The constraints are written down first. When the code needs to change, the doctrine changes first. The constraint document is the source of truth.

Blocking vs. non-blocking gates

Not all failures are equal. Continuity failures are blocking — a story with inconsistent characters fails. Sound cue failures are non-blocking — a missing ambient bed degrades gracefully. Every gate is classified by whether its failure invalidates the artifact.

Bounded retry, defined fallback

No unbounded loops. Every retry path has a maximum, and every maximum has a defined fallback — a neutral clip, a silence, a hard stop. Thrashing is a bug, not a strategy. The system fails predictably or not at all.
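
A minimal sketch of a bounded retry with a defined fallback. The function name, the default of three attempts, and the returned status strings are assumptions; the pattern itself (a hard maximum, then a known fallback) is the one described above.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def generate_with_retry(
    generate: Callable[[int], T],
    verify: Callable[[T], bool],
    fallback: T,
    max_attempts: int = 3,
) -> tuple[T, str]:
    """Try up to `max_attempts` times, then return the defined fallback.

    Returns the accepted value and how it was obtained ("verified" or "fallback").
    """
    for attempt in range(1, max_attempts + 1):
        candidate = generate(attempt)
        if verify(candidate):
            return candidate, "verified"
    return fallback, "fallback"
```

Because the loop is a `for` over a fixed range, there is no code path that can spin forever; the worst case is exactly `max_attempts` calls followed by the fallback.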

Offline-provable before online-expensive

Every pipeline is proven with deterministic fakes — ScriptedLLM, ScriptedRenderer, scripted TTS — before any model weights download. The scaffold is the system. The models are one implementation of it. Tests run in seconds on zero dependencies.
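
A sketch of what a deterministic fake could look like. The text names `ScriptedLLM` but does not show its interface, so the `complete` method, the recorded `prompts` list, and the small `draft_until_valid` pipeline are assumptions made for illustration.

```python
class ScriptedLLM:
    """A deterministic stand-in for a model: replays canned responses in order."""

    def __init__(self, responses: list[str]) -> None:
        self._responses = list(responses)
        self.prompts: list[str] = []  # recorded so tests can inspect what was asked

    def complete(self, prompt: str) -> str:
        self.prompts.append(prompt)
        if not self._responses:
            raise RuntimeError("ScriptedLLM ran out of scripted responses")
        return self._responses.pop(0)


def draft_until_valid(llm, is_valid, max_attempts: int = 2):
    """Tiny pipeline under test: ask, verify, retry once with feedback."""
    prompt = "Write the spec."
    for _ in range(max_attempts):
        draft = llm.complete(prompt)
        if is_valid(draft):
            return draft
        prompt = "Write the spec. The previous draft failed verification."
    return None
```

With a fake like this, a test can script a bad first draft and a good second one, then check that the scaffold retried and passed the failure back, all without loading any model weights.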

One observable pipeline

Every tool emits the same NDJSON event vocabulary: step_start / step_complete / gate_pass / gate_fail / retry / done. Named stages, explicit data flow, observable end-to-end — the Build It Publisher lineage in every project.
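
A sketch of an emitter and a listener for this vocabulary, using one JSON object per line. The event names are the ones the suite lists; the record fields (`event`, `stage`, plus free-form payload keys) and the function names are assumptions, since the actual record layout is not shown.

```python
import io
import json
from typing import Any, TextIO

# Event names as listed in the suite's shared vocabulary.
EVENTS = frozenset({
    "step_start", "step_complete", "gate_pass", "gate_fail",
    "retry", "fallback", "skip", "token", "done", "error",
})


def emit(stream: TextIO, event: str, stage: str, **payload: Any) -> None:
    """Write one event as a single JSON line (NDJSON)."""
    if event not in EVENTS:
        raise ValueError(f"unknown event: {event}")
    record = {"event": event, "stage": stage, **payload}
    stream.write(json.dumps(record) + "\n")


def read_events(text: str) -> list[dict]:
    """Parse NDJSON back into records; the same reader works for any tool."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Usage: write two events to an in-memory stream and read them back.
buffer = io.StringIO()
emit(buffer, "step_start", "narration")
emit(buffer, "gate_fail", "narration_verify", attempt=1)
events = read_events(buffer.getvalue())
```

Rejecting unknown event names at the emitter is one way to keep the vocabulary from drifting between tools.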

Scoring as incentive, not decoration

In my-AI-stro's audit loop, the summarizer produces better entries because the judge exists and it knows the criteria. In my-AI-script, the LLM produces more detailed specs because the rubric is in its system prompt. Knowing you will be scored changes what you produce.

Local-first, free-first

All synthesis runs on local models — Ollama, Kokoro, SDXL-turbo, MusicGen, faster-whisper, CLIP — on your own hardware. No paid APIs in the render path. The methodology doesn't require a cloud budget. It requires a scaffold.