How one AI agent builds and ships my YouTube Shorts every day
A daily, fully branded YouTube Short with no filming, no manual editing, and no manual upload. A scheduled agent drives VEED.ai to build the video, generates the hook card in Gemini, composites it with ffmpeg, and ships it to YouTube on a schedule. This class teaches the whole system, step by step, with every prompt and script you need.
$ agent run --short next --publish-at 2026-06-20T18:00:00Z
Quick answer
A scheduled AI agent produces one finished YouTube Short per run. It drives VEED.ai to build a 9:16 avatar video on a code-rain background with gold karaoke captions, exports 4K, generates a hook card in Gemini, composites it over the first two seconds with ffmpeg, then uploads to YouTube on a schedule. The agent only schedules, never publishes immediately.
Key takeaways
- Our own 28-day analytics window read 60.4% swipe-away, which triggered the v3 hook-first rebuild. Rewriting words did nothing; changing frame one did.
- One agent run drives six stages and a condensed 12-step build, end to end, unattended.
- Re-encode only the first ~8.333s GOP (a field measurement) so a 4K composite never times out.
- In this setup, browser file inputs cap near 10 MB; a 4K Short is ~250-360 MB, so the YouTube Data API is mandatory.
- A Testing-mode OAuth token expires in 7 days; move the consent screen to In production.
- The durable lesson: when a third-party tool has an intermittent bug, design a verify, ship-and-flag, and patch path so one flaky render never blocks the cadence.
Watch the system build a Short
This is the whole pipeline as one looping storyboard: a script types in, the avatar generates, the code-rain background drops in behind it, gold karaoke captions burn on, the hook card flashes over the first two seconds, and the finished Short is scheduled. The six steps on the right light up in sync.
1. Script in: paste the script body
2. Avatar generates: Dan voice, visual match
3. Code-rain background: full length, no black tail
4. Gold karaoke captions: top band, burned in
5. Hook card overlay: first 2s, Gemini image
6. Scheduled: publish-at, never public now
1 · The problem and the bet
Daily short-form is effectively an algorithm requirement, but filming, editing, and uploading every day is not sustainable for a solo creator. The bet behind this system is simple: an AI agent plus an AI avatar can run the whole pipeline so one person ships daily without filming daily. This is the production machine, stated honestly.
This class teaches a real, in-progress system. It makes no view-count or revenue claims. The one measured number it leans on comes from first-party analytics, labeled as such.
Who this is for
- Solo creators, founders, and marketers who want daily short-form output without a daily film-edit-upload grind.
- Vibe coders who want a real agentic automation end to end: a scheduled agent that drives a browser, calls an image model, runs ffmpeg, and hits the YouTube Data API.
The honest framing matters: the avatar is the creator's own likeness reading scripted content, so this is not a hidden-identity channel. AI simply removes the daily film step.
2 · The architecture
One scheduled agent run equals one finished, scheduled Short. The pipeline has six stages: pick the next Short, build it in VEED.ai, retrieve the export, overlay a Gemini hook card with ffmpeg, upload and schedule through the YouTube Data API, then log the run and schedule the pinned comment.
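The six stages above can be pictured as an ordered list of callables that share one context and stop at the first failure. This is only a sketch of that shape: the stage names mirror the list above, while `RunRecord`, `run_pipeline`, and the stage functions you would plug in are hypothetical names, not the author's code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Stage names follow the six stages described above. The functions wired to
# them are placeholders (hypothetical); each real stage would do the work.
STAGES = ["pick", "build", "retrieve", "composite", "upload", "log"]


@dataclass
class RunRecord:
    completed: List[str] = field(default_factory=list)
    failed_stage: str = ""
    error: str = ""

    @property
    def ok(self) -> bool:
        return not self.failed_stage


def run_pipeline(stages: List[Tuple[str, Callable[[Dict], None]]], ctx: Dict) -> RunRecord:
    """Run stages in order with a shared context dict; stop at the first failure."""
    record = RunRecord()
    for name, fn in stages:
        try:
            fn(ctx)
        except Exception as exc:  # record the failure so the scheduler can log it
            record.failed_stage = name
            record.error = str(exc)
            break
        record.completed.append(name)
    return record
```

Returning a record rather than raising keeps one bad run from crashing the scheduled task, which fits the one-run-equals-one-Short framing.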
Why an agent and not a SaaS auto-poster
| Dimension | SaaS auto-poster | Scheduled AI agent |
|---|---|---|
| Scope | One upload call | Browser build, image model, ffmpeg, YouTube API |
| Brand rules | Generic templates | Reads your scripts; enforces matte, captions, hook card, cadence |
| Verification | None | Matte check and channel-ID check every run |
| Recovery | Fails the post | Fresh project each run; ship-and-flag on the known matte bug |
| Publishing | Often publishes now | Only ever schedules with publish-at |
3 · Step-by-step setup
Stand up five things once: a VEED talking avatar and a standing voice, a reusable code-rain background, Gemini for hook cards, the YouTube uploader (Google Cloud OAuth plus the two scripts below), and a connected folder with a scheduled task. Every line you need to paste is here. After this, every run is the same short procedure.
3.1 — Create your VEED avatar and lock one voice
- Make an account at veed.io.
- Create a Talking Avatar of yourself. Open AI Avatars, record a short clip or upload one, and approve the generated avatar.
- Save a reference thumbnail. VEED labels every avatar tile "Character" with no custom name, so the agent picks yours by look. Screenshot your avatar's tile and keep it on disk.
- Lock one standing voice for every Short. This system uses AI Voices → Dan → English (US). Do not switch voices between Shorts.
- Top up render credits so unattended runs never stall.
Avatar match (the agent picks by look, not name — VEED labels every tile "Character"):
- Bald presenter
- Blue Design Delight Studio tee with the copper DDS chest logo
- Light background
Keep a screenshot of this tile on disk as your reference thumbnail.
3.2 — Make the code-rain background once (reuse forever)
- You need one clip: a looping vertical 1080×1920 green falling-glyph "code-rain", about 44 seconds.
- Generate it once with any AI video or motion tool using the prompt below. If the tool offers model tiers, choose the higher-quality tier — the fast tier often renders an all-black clip.
- Store and reuse it. Save it to disk and upload it to VEED → Your assets. Reuse it on every build; never regenerate. The avatar's Remove Background makes it show behind him.
Create a seamless, looping vertical 9:16 (1080x1920) video, about 44 seconds, of green
falling-glyph "code-rain": thin columns of dim cyan-and-green monospace glyphs
(0 1 / \ : ; . + * [ ] $ #) descending on a near-black #04090c background. Keep a
denser brighter band through the middle third. Subtle, even, hypnotic; no text,
no logos, no faces, no UI. If your tool offers model tiers, pick the higher-quality
tier (the fast tier can render an all-black clip).
3.3 — Set up Gemini for the hook card
- Open Gemini in your browser.
- Switch to the right Google account. If you keep several, select the one you will use for image generation — the wrong default account is a common trap.
- Generate the hook card with the prompt below, filling the three claim lines per Short, then download the image to a known folder.
Create a vertical 9:16 hook card for a YouTube Short. Background: near-black #04090c
with a subtle dark-green falling-glyph code-rain texture. Foreground: a big, bold,
ALL-CAPS headline in bright gold #FFE069 -- three short stacked lines:
"<LINE 1>", "<LINE 2>", "<LINE 3>". Heavy condensed sans, high contrast, slight glow
so it reads on mobile. Add one small glowing gold assistant icon. No watermark, no UI,
no clutter, no extra text. Mobile-legible, premium, cinematic.
3.4 — Stand up the YouTube uploader
- Create a Google Cloud project at the Google Cloud Console.
- Enable "YouTube Data API v3" under APIs & Services → Library.
- Configure the OAuth consent screen. User type External, fill in the app name and your email, then publish it / set it to "In production." Testing-mode refresh tokens expire in 7 days and will silently break scheduled uploads.
- Create the credential. Credentials → Create → OAuth client ID → Application type Desktop app. Download the JSON and save it as client_secret.json.
- Install the libraries (one line, below).
- Mint your token once. Run get_refresh_token.py; a browser opens. Sign in and consent as your brand-channel Google account. It writes yt_oauth.json.
- Find your channel ID (YouTube Studio → Settings → Channel → Advanced). You pass it to the uploader as the guard.
- Upload + schedule with yt_upload.py. It verifies the channel ID and schedules with --publish-at on every run.
google-api-python-client
google-auth-oauthlib
google-auth-httplib2

pip install -r requirements.txt

#!/usr/bin/env python3
"""
get_refresh_token.py - one-time OAuth consent. Writes yt_oauth.json.

Mint your OWN credentials:
  1) In Google Cloud, enable "YouTube Data API v3".
  2) Configure the OAuth consent screen as External and set it to "In production"
     (Testing-mode refresh tokens expire in 7 days and will silently break the
     scheduled uploads after a week).
  3) Create an OAuth client of type "Desktop app" and download its JSON as
     client_secret.json into this folder.
  4) Run this script. A browser opens; sign in and consent AS THE BRAND ACCOUNT
     (the channel you want the Shorts on, not your personal account).

Result: yt_oauth.json, holding the refresh token. Keep it secret.
"""
from google_auth_oauthlib.flow import InstalledAppFlow

SCOPES = [
    "https://www.googleapis.com/auth/youtube.upload",
    "https://www.googleapis.com/auth/youtube.readonly",
]


def main():
    flow = InstalledAppFlow.from_client_secrets_file("client_secret.json", SCOPES)
    # Opens a local browser for consent. Sign in as the BRAND Google account.
    creds = flow.run_local_server(port=0)
    with open("yt_oauth.json", "w", encoding="utf-8") as fh:
        fh.write(creds.to_json())
    print("Wrote yt_oauth.json. Keep it secret.")
    print("If uploads start failing after ~7 days, your consent screen is still in "
          "Testing mode. Move it to 'In production' and re-run this once.")


if __name__ == "__main__":
    main()

#!/usr/bin/env python3
"""
yt_upload.py - resumable YouTube upload + native scheduling.

Mint your OWN credentials. Never reuse anyone else's yt_oauth.json or
client_secret.json. Run get_refresh_token.py once to create yt_oauth.json.

What it does:
  - refreshes your token,
  - VERIFIES the token is bound to the channel you expect (the guard),
  - resumable-uploads the file (no browser size cap),
  - sets status.publishAt so the video is PRIVATE until the scheduled time
    (scheduled, never public-now).

The youtube.upload scope can insert + schedule, but cannot set playlists or
thumbnails. Do those in YouTube Studio after the upload.

Usage:
  python3 yt_upload.py SHORT.mp4 \
    --title "My title" --desc-file desc.txt --tags "a,b,c" \
    --category 28 --publish-at 2026-06-20T18:00:00Z \
    --creds yt_oauth.json --channel-id UCxxxxxxxxxxxxxxxxxxxxxx
"""
import argparse
import os
import random
import sys
import time

import googleapiclient.discovery
import googleapiclient.errors
from googleapiclient.http import MediaFileUpload
from google.oauth2.credentials import Credentials
from google.auth.transport.requests import Request

# youtube.upload inserts + schedules; youtube.readonly lets us read the channel id for the guard.
SCOPES = [
    "https://www.googleapis.com/auth/youtube.upload",
    "https://www.googleapis.com/auth/youtube.readonly",
]
RETRIABLE_STATUS = (500, 502, 503, 504)
MAX_RETRIES = 8


def get_service(creds_path):
    if not os.path.exists(creds_path):
        sys.exit(f"Missing {creds_path}. Run get_refresh_token.py first.")
    creds = Credentials.from_authorized_user_file(creds_path, SCOPES)
    if not creds.valid:
        if creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            sys.exit("Token invalid or expired with no refresh token. Re-run get_refresh_token.py.")
    return googleapiclient.discovery.build("youtube", "v3", credentials=creds)


def verify_channel(youtube, expected_id):
    """The guard: stop the run if the token is bound to the wrong channel."""
    resp = youtube.channels().list(part="id,snippet", mine=True).execute()
    items = resp.get("items", [])
    if not items:
        sys.exit("This token controls no channel. Re-consent as the brand account.")
    cid = items[0]["id"]
    title = items[0]["snippet"]["title"]
    print(f"[channel] {title} ({cid})")
    if expected_id and cid != expected_id:
        sys.exit(
            f"STOP: token is bound to {cid}, but you expected {expected_id}.\n"
            "Re-run get_refresh_token.py and consent as the CORRECT Google account."
        )
    return cid


def resumable_upload(request, filename):
    response = None
    error = None
    retry = 0
    print(f"Uploading {filename} ...")
    while response is None:
        try:
            status, response = request.next_chunk()
            if status:
                print(f"  {int(status.progress() * 100)}%")
            if response is not None and "id" not in response:
                sys.exit(f"Upload failed, unexpected response: {response}")
        except googleapiclient.errors.HttpError as e:
            if e.resp.status in RETRIABLE_STATUS:
                error = e
            else:
                raise
        except (IOError, OSError) as e:
            error = e
        if error is not None:
            retry += 1
            if retry > MAX_RETRIES:
                sys.exit("Too many retries. Aborting.")
            sleep = min((2 ** retry) + random.random(), 60)
            print(f"  retriable error; retry {retry} in {sleep:.1f}s")
            time.sleep(sleep)
            error = None
    return response


def main():
    ap = argparse.ArgumentParser(description="Resumable YouTube upload + schedule.")
    ap.add_argument("file", help="Path to the finished .mp4")
    ap.add_argument("--title", required=True)
    ap.add_argument("--desc-file", help="UTF-8 text file with the description")
    ap.add_argument("--tags", default="", help="Comma-separated tags")
    ap.add_argument("--category", default="28", help="28 = Science & Technology")
    ap.add_argument("--publish-at", required=True,
                    help="RFC 3339, e.g. 2026-06-20T18:00:00Z (UTC). Makes it SCHEDULED.")
    ap.add_argument("--creds", default="yt_oauth.json")
    ap.add_argument("--channel-id", default=os.environ.get("YT_CHANNEL_ID", ""),
                    help="Your brand channel id; the run stops if the token does not match.")
    args = ap.parse_args()

    if not os.path.exists(args.file):
        sys.exit(f"File not found: {args.file}")

    description = ""
    if args.desc_file:
        if not os.path.exists(args.desc_file):
            sys.exit(f"Description file not found: {args.desc_file}")
        with open(args.desc_file, encoding="utf-8") as fh:
            description = fh.read()

    youtube = get_service(args.creds)
    verify_channel(youtube, args.channel_id)

    body = {
        "snippet": {
            "title": args.title,
            "description": description,
            "tags": [t.strip() for t in args.tags.split(",") if t.strip()],
            "categoryId": args.category,
        },
        "status": {
            "privacyStatus": "private",    # private + publishAt == SCHEDULED
            "publishAt": args.publish_at,  # do NOT also set 'public' anywhere
            "selfDeclaredMadeForKids": False,
        },
    }
    media = MediaFileUpload(args.file, chunksize=8 * 1024 * 1024, resumable=True)
    request = youtube.videos().insert(part="snippet,status", body=body, media_body=media)
    resp = resumable_upload(request, args.file)
    video_id = resp["id"]
    print(f"DONE: https://youtu.be/{video_id}")
    print(f"Scheduled for {args.publish_at}. Set playlists + thumbnail in Studio.")


if __name__ == "__main__":
    main()

# mint your OWN credentials first (get_refresh_token.py). category 28 = Science & Technology.
python3 yt_upload.py rendered_shorts/<SHORT_ID>_<date>.mp4 \
--title "<title>" --desc-file desc.txt --tags "veed ai,ai video automation" \
--category 28 --publish-at 2026-06-20T18:00:00Z \
  --creds yt_oauth.json --channel-id <YOUR_CHANNEL_ID>

Treat yt_oauth.json and client_secret.json as private. The upload scope cannot set playlists or thumbnails, so do those in Studio.
3.5 — Connect a folder and schedule the daily run
- Keep one working folder with your exports and a production log, and connect it to your agent runner.
- Schedule one daily task (build daily, publish three times a week, keep the queue 4–7 deep). Use the command for your OS below.
- Give the agent its standing rules (the ruleset below) so it never asks questions, never publishes now, and never changes the voice or quality.
REM Windows: run the daily build at 1:00 AM. run_daily.bat triggers your agent + ship.sh.
schtasks /Create /TN "ddsboston-veed-auto-build-daily" /SC DAILY /ST 01:00 /F ^
/TR "C:\YouTube\03_Content_Pipeline\run_daily.bat"
# macOS / Linux equivalent (crontab -e): 1:00 AM daily
0 1 * * * /home/you/youtube/03_Content_Pipeline/run_daily.sh >> ~/veed.log 2>&1

You are the daily Shorts build agent. On each run, with no questions:
1. Read the active build order and production_log.md. Pick the FIRST Short with no successful row.
2. Check the queue depth and the next open publish slot in Studio. Skip long-form days and any booked date.
3. Build the Short in VEED (fresh AI-Avatars project every run; 9:16; avatar by visual match;
voice = Dan, US English; Eye Contact + Remove Background; code-rain background full length, no black tail;
auto B-roll; gold karaoke captions; export YouTube 4K, burn subtitles).
4. Verify the matte ON THE EXPORT. If Remove Background reverted, re-export once; if still bad,
SHIP ANYWAY, log "matte reverted - shipped per override", and create a one-time patch task.
5. Retrieve the .mp4 (curl), generate the Gemini hook card, composite the first GOP with ffmpeg.
6. Upload with yt_upload.py using --publish-at (SCHEDULED, never public now). Verify the channel id.
7. Add playlists in Studio, append production_log.md, and schedule the pinned-comment follow-up.
HARD RULES: never publish immediately (always --publish-at); never change the voice (Dan) or quality (4K);
never invent results in the description.

#!/usr/bin/env bash
# ship.sh - the deterministic half of the pipeline (retrieve -> composite -> upload).
# The VEED build (browser) is driven by the agent; this chains the steps that ARE scriptable.
# Usage: ./ship.sh <SHORT_ID> <YYYY-MM-DD> <MP4_URL> <HOOK_IMAGE> <PUBLISH_AT> <CHANNEL_ID>
set -euo pipefail
ID="$1"; DATE="$2"; MP4_URL="$3"; HOOK="$4"; PUBAT="$5"; CHAN="$6"
BASE="rendered_shorts/${ID}_${DATE}.mp4"
OUT="${ID}_${DATE}.mp4"
# 1) retrieve from the VEED CDN
curl -sL "$MP4_URL" -o "$BASE"
SIZE=$(wc -c < "$BASE")
if [ "$SIZE" -lt 1000000 ]; then echo "Download too small ($SIZE bytes) - URL likely expired"; exit 1; fi
# 2) GOP-aligned hook composite (re-encode first GOP only, stream-copy the tail)
ffmpeg -y -t 8.333333 -i "$BASE" -i "$HOOK" -filter_complex \
"[1:v]scale=2160:3840:force_original_aspect_ratio=increase,crop=2160:3840,setsar=1[hk];[0:v][hk]overlay=x=0:y=0:enable='lt(t,2)'[v]" \
-map "[v]" -map 0:a -c:v libx264 -preset veryfast -crf 18 -pix_fmt yuv420p -c:a aac -b:a 192k head.mp4
ffmpeg -y -ss 8.333333 -i "$BASE" -c copy tail.mp4
printf "file 'head.mp4'\nfile 'tail.mp4'\n" > concat.txt
ffmpeg -y -f concat -safe 0 -i concat.txt -c copy "$OUT"
# 3) upload + schedule (verifies channel id, never publishes now)
python3 yt_upload.py "$OUT" --title "$ID" --tags "veed ai,ai video automation" \
--category 28 --publish-at "$PUBAT" --creds yt_oauth.json --channel-id "$CHAN"
echo "Shipped $OUT (scheduled $PUBAT)."
| short_id | build_date | publish_at (UTC) | youtube_id | matte | notes |
|-----------------|------------|-----------------------|-------------|-------|---------------------|
| VA_cost_hook_01 | 2026-06-13 | 2026-06-20T18:00:00Z | t2qWz2M5QTM | ok | - |

Free masterclass (full walkthrough): https://youtu.be/<LONGFORM_ID>
This Short's class page: https://ddsboston.com/pages/<CLASS_SLUG>
All 60 free classes (no signup): https://ddsboston.com/pages/dds-vibe-academy
4 · The content system
The format decides whether the videos perform. The rule that moved the needle: the first second must stop a sound-off scroll, so frame one is on-screen text or motion, never the face. Every Short follows three visual beats, a five-beat script, a locked gold karaoke caption spec, and a defined B-roll philosophy.
Format v3 — hook-first
Rewriting the script did nothing; rewriting the visual format did. Frame one used to be a face on black, which reads as "AI" and gives a scroller nothing to stop for. The three beats:
- Beat 0 — hook card (0.0-2.0s): full-frame, one to four bold words per line, motion behind it, never a flat card. The spoken hook plays underneath.
- Beat 1 — avatar delivers (~1.8s to end-2s): cut to the cut-out avatar on the code-rain; captions are the hero in the top third, gold, karaoke, one phrase at a time; a pattern interrupt every two to three seconds.
- Beat 2 — CTA and loop (last ~2s): "Free. No signup. Link's pinned." The last frame echoes the first so the Short loops.
Hook-text formula: number plus contrast, scannable in under a second. Real examples used on the channel: "11 AI AGENTS / $8 A DAY", "800ms -> 90ms / ONE CHANGE", "STOP USING / ONE AI MODEL".
v3 · hook-first: same script, different frame one. The format change is what moved the metric, not the words.
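The "one to four bold words per line" rule from Beat 0 is easy to check before a hook is sent to the image model. The sketch below assumes hooks are written with " / " between lines, as in the channel examples above; `split_hook` and `hook_problems` are hypothetical helper names.

```python
from typing import List

MAX_WORDS_PER_LINE = 4  # "one to four bold words per line" (Beat 0)


def split_hook(hook: str) -> List[str]:
    """Split 'LINE ONE / LINE TWO' into stripped lines, dropping empty ones."""
    return [part.strip() for part in hook.split(" / ") if part.strip()]


def hook_problems(hook: str) -> List[str]:
    """Return human-readable problems; an empty list means the hook passes."""
    lines = split_hook(hook)
    if not lines:
        return ["hook is empty"]
    problems = []
    for line in lines:
        count = len(line.split())
        if count > MAX_WORDS_PER_LINE:
            problems.append(f"{count} words on one line (max {MAX_WORDS_PER_LINE}): {line!r}")
    return problems
```

All three example hooks above pass this check. It only counts words; whether a hook actually stops a scroll still has to be measured.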
The script structure (Hook / Method / Payoff)
Each long-form video spawns three Shorts: a Hook Short, a Method Short, and a Payoff Short. Each script is a five-beat read of roughly 40-60 seconds: Hook (the swipe-stopping claim), Stakes (why it costs you), Cliffhanger (the named method), Sting (the payoff plus "free masterclass walks you through it"), and CTA. Every Short uses the same avatar, wardrobe, and backdrop.
Caption spec (locked)
Gold #FFE069, top band, big, a Dynamic (AI) karaoke preset that highlights the spoken word with a solid filled box, Text-Behind-Person off so captions stay in front, burned in at export.
B-roll philosophy
- Hook Shorts stay on face-cam; the hook lands on face and voice.
- Method Shorts cut to screen-cap or stock per step and return to face at transitions.
- Payoff Shorts lean on before/after visuals; the last five seconds are always face-cam for the CTA.
- Never insert a black frame or a missing-asset placeholder; fall back to face-cam with a typographic overlay.
5 · Build A — the VEED build and retrieve
Build A is what the agent does inside VEED on every run: open a fresh project, set 9:16, select the avatar by visual match, paste the script, generate, polish, add the code-rain background full length, add B-roll and gold karaoke captions, export 4K, verify the matte on the export, and pull the file down with curl.
- Read the Short's content from the pack: script, overlay schedule, title, description, tags, playlists, pinned comment, post date.
- Open a fresh VEED AI-Avatars project. A reused project corrupts; the script field rejects input and generation stalls near 95%.
- Switch to 9:16 (the default loads 16:9; Shorts need 1080×1920).
- Select the avatar by visual match, not by name.
- Paste the script body only (no timestamps) via a contenteditable insert.
- Verify voice = Dan (US English) and the 9:16 canvas still hold.
- Generate and poll to completion (up to ~10 minutes).
- Polish: Eye Contact on, Remove Background on, fade audio in and out.
- Code-rain background, full length. Copy, paste, and trim a second copy so the background exactly equals the Short. No black tail.
- Add auto B-roll, then gold karaoke subtitles; verify accuracy and spot-fix key terms.
- Export YouTube 4K with Burn Subtitles on.
- Verify the matte on the export, not the editor preview, then retrieve the .mp4 with curl.
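The retrieval check in the last step (a real MP4, not an expired-link HTML page) can be automated. This is a minimal sketch: the 1 MB floor comes from the class, while `looks_like_mp4` is a hypothetical helper and the `ftyp` test is a heuristic based on how MP4 files normally begin, not a full validation.

```python
import os

MIN_BYTES = 1_000_000  # below ~1 MB the download is treated as an expired URL


def looks_like_mp4(path: str) -> bool:
    """True if the file is large enough and starts with an ISO-BMFF 'ftyp' box.

    MP4 files normally begin with a box whose type field (bytes 4..8) is
    b'ftyp'; an HTML error page will not.
    """
    if os.path.getsize(path) < MIN_BYTES:
        return False
    with open(path, "rb") as fh:
        header = fh.read(8)
    return len(header) == 8 and header[4:8] == b"ftyp"
```

A failed check is a signal to re-grab the URL from the Share panel rather than to retry the same link.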
# Get the direct MP4 URL from the VEED Share panel (download anchor href), or read the
# network requests for a *.mp4 on the VEED CDN, then pull it into the connected folder:
curl -sL "<MP4_URL>" -o "rendered_shorts/<SHORT_ID>_<YYYY-MM-DD>.mp4"
# Verify: a 40-60s 4K Short is ~250-360 MB and ffprobe/file reports an MP4.
# If the download is HTML or under 1 MB, the URL expired -- re-grab it.
6 · Build B — post-process and ship
Build B turns the raw export into the finished, scheduled Short: generate the Frame-1 hook card in Gemini, composite it over the first GOP with ffmpeg without re-encoding the whole 4K file, upload through the YouTube Data API with a publish-at timestamp, set playlists in Studio, log the run, and schedule the pinned comment.
GOP-aligned ffmpeg composite
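VEED 4K exports use a GOP (group of pictures, the run of frames between keyframes) of approximately 8.333 seconds (a field measurement). Re-encoding the whole 4K file can overrun a short sandbox execution window, so re-encode only the first GOP with the hook overlaid for the first two seconds, then stream-copy the rest losslessly and concatenate. The cut has to land on a keyframe because a stream copy can only start cleanly there.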
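Because the 8.333-second figure is a field measurement, it may be worth confirming the keyframe boundary on each export before cutting. The sketch below asks ffprobe for keyframe timestamps and picks the first one at or after the two-second hook window. The function names are hypothetical, and it assumes an ffprobe build that reports `pts_time` for frames.

```python
import subprocess
from typing import List


def parse_keyframe_times(ffprobe_csv: str) -> List[float]:
    """Parse one timestamp per line (ffprobe '-of csv=p=0' output) into sorted floats."""
    times = []
    for line in ffprobe_csv.splitlines():
        value = line.strip().rstrip(",")  # some builds append a trailing comma
        if value and value != "N/A":
            times.append(float(value))
    return sorted(times)


def first_cut_point(keyframes: List[float], hook_seconds: float = 2.0) -> float:
    """Earliest keyframe at or after the end of the hook overlay."""
    for t in keyframes:
        if t >= hook_seconds:
            return t
    raise ValueError("no keyframe after the hook window; a full re-encode would be needed")


def probe_keyframes(path: str) -> List[float]:
    """List keyframe timestamps of the first video stream using ffprobe."""
    cmd = ["ffprobe", "-v", "error", "-select_streams", "v:0",
           "-skip_frame", "nokey", "-show_entries", "frame=pts_time",
           "-of", "csv=p=0", path]
    out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    return parse_keyframe_times(out)
```

If `first_cut_point(probe_keyframes(path))` differs from 8.333333 on a given export, that value is the safer number to pass to `-t` and `-ss`.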
HOOK="<newest Gemini hook-card image in your controlled download path>"
BASE="rendered_shorts/<SHORT_ID>_<date>.mp4"
# head: first GOP, hook card overlaid full-frame for the first 2s (audio kept)
ffmpeg -y -t 8.333333 -i "$BASE" -i "$HOOK" -filter_complex \
"[1:v]scale=2160:3840:force_original_aspect_ratio=increase,crop=2160:3840,setsar=1[hk];\
[0:v][hk]overlay=x=0:y=0:enable='lt(t,2)'[v]" \
-map "[v]" -map 0:a -c:v libx264 -preset veryfast -crf 18 -pix_fmt yuv420p \
-c:a aac -b:a 192k head.mp4
# tail: stream-copy from the 8.333s keyframe (no re-encode -> fast, lossless)
ffmpeg -y -ss 8.333333 -i "$BASE" -c copy tail.mp4
# concat head + tail
printf "file 'head.mp4'\nfile 'tail.mp4'\n" > concat.txt
ffmpeg -y -f concat -safe 0 -i concat.txt -c copy "<SHORT_ID>_<date>.mp4"
Upload and schedule
Browser file inputs cap near 10 MB and a 4K Short is far larger, so the API path is mandatory. Use the same yt_upload.py from setup; --publish-at keeps the video private until the scheduled time. The upload scope cannot set playlists or thumbnails, so finish those in Studio, append the log, and schedule the pinned comment.
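Since every upload depends on the publish-at value, a small pre-flight check can catch a malformed or past timestamp before the upload starts. This is a sketch, not part of yt_upload.py: `parse_publish_at` is a hypothetical helper, and the 15-minute minimum lead time is an assumption, not an API rule.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_publish_at(value: str,
                     now: Optional[datetime] = None,
                     min_lead: timedelta = timedelta(minutes=15)) -> datetime:
    """Parse 'YYYY-MM-DDTHH:MM:SSZ' as UTC and require it to be in the future.

    min_lead is an assumed safety margin so the upload finishes before the slot.
    """
    try:
        when = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    except ValueError as exc:
        raise ValueError(f"publish-at must look like 2026-06-20T18:00:00Z, got {value!r}") from exc
    now = now or datetime.now(timezone.utc)
    if when < now + min_lead:
        raise ValueError(f"publish-at {value} is not far enough in the future")
    return when
```

Calling this at the top of a wrapper script and exiting on `ValueError` would stop the run before the upload begins.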
7 · Prompts and assets (set once, reuse)
Three assets stay fixed across the daily cadence: the Gemini hook-card prompt (filled per Short), the code-rain background (reused, never regenerated), and the gold karaoke subtitle style. The prompts live in section 3; the subtitle recipe is below.
Subtitle style (set once, reused)
Subtitles → Style → Dynamic (AI) → the karaoke preset that puts a solid gold box behind the spoken word; gold #FFE069 text; top band; Text-Behind-Person off; burn on export.
8 · Failure modes and the ship-and-flag mindset
These are production failures that were actually hit and solved. The meta-lesson runs through all of them: when a third-party tool has an intermittent bug you cannot fix, design a verify, ship-and-flag, and patch path so the daily cadence never stalls.
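The verify, ship-and-flag, and patch path can be written down as one small decision function. In this sketch `matte_ok` and `re_export` are hypothetical callables standing in for the real export check and the VEED re-export; the log strings follow the agent ruleset's wording.

```python
from typing import Callable, List, Tuple


def ship_decision(matte_ok: Callable[[], bool],
                  re_export: Callable[[], None],
                  max_re_exports: int = 1) -> Tuple[str, List[str]]:
    """Verify the export, re-export a bounded number of times, then ship regardless.

    Returns ("ship", notes) when the matte is clean, or ("ship-and-flag", notes)
    when it is still bad after the allowed re-exports.
    """
    notes: List[str] = []
    if matte_ok():
        return "ship", notes
    for _ in range(max_re_exports):
        re_export()
        notes.append("re-exported once")
        if matte_ok():
            return "ship", notes
    notes.append("matte reverted - shipped per override")
    notes.append("one-time patch task created")
    return "ship-and-flag", notes
```

The design point is that no branch returns "stop": an intermittent third-party bug changes what gets logged, not whether the Short ships.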
| Failure mode | Symptom | Fix |
|---|---|---|
| Reused VEED project corrupts | Script field rejects input; generation stalls near 95% | Create a fresh AI-Avatars project every run; never reuse. |
| Dual voiceover | Export plays two voices at once | A reused project adds a track instead of replacing it; clear the timeline first, or use a fresh project. |
| Remove-Background reverts at export | Preview clean, export shows the room | Intermittent VEED-side bug: verify the export, re-export once, else ship-and-flag and schedule a patch. |
| Black tail on background | Code-rain ends before the avatar | Copy, paste, and trim the background to exactly match the Short length. |
| Voice reads faster than planned | A 145-word "60s" script renders to ~40s | The voice reads ~3.5 words/second (field measurement); rely on auto-subtitles synced to the actual audio. |
| Browser upload size cap | Cannot push a 250+ MB 4K Short via the browser | Switch to the YouTube Data API resumable uploader. No size cap; schedules natively. |
| Token bound to the wrong channel | Upload lands on a different channel | Verify the channel ID every run; re-consent as the correct Google account. |
| Testing-mode token expires | Uploads fail after a week | Move the OAuth consent screen to In production. |
| Flat hook card looks cheap | Static text-on-black rejected | Generate the hook card as an image with code-rain motion; never a flat card. |
| Rebuild the format, not the script | Rewriting words did not cut swipe-away | Change the visual format: text or motion in frame one, never the face first. |
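Two rows in the table above are plain arithmetic: how long a script will actually run, and how to lay out the background so there is no black tail. A minimal sketch using the quoted field measurements (about 3.5 words per second, a roughly 44-second background clip); the function names are hypothetical.

```python
from typing import List

WORDS_PER_SECOND = 3.5    # field measurement quoted in the table above
BACKGROUND_SECONDS = 44   # length of the reusable code-rain clip (about 44 s)


def estimate_duration_s(word_count: int, wps: float = WORDS_PER_SECOND) -> float:
    """Estimated spoken length of a script in seconds."""
    return word_count / wps


def background_segments(short_s: float, clip_s: float = BACKGROUND_SECONDS) -> List[float]:
    """Lengths of each background copy so their total equals the Short exactly."""
    full, rest = divmod(short_s, clip_s)
    segments = [float(clip_s)] * int(full)
    if rest > 1e-9:
        segments.append(rest)  # the trimmed final copy
    return segments
```

For the 145-word example, `estimate_duration_s(145)` gives about 41.4 seconds, which matches the ~40 s render in the table. A 52-second Short would need one full background copy plus a second copy trimmed to 8 seconds.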
9 · Automate it
Wrap the whole chain in one scheduled task. It reads the active build order and the production log, picks the first unbuilt Short, verifies the next open publish date, runs Build A and Build B end to end, logs the result, and schedules the pinned comment. It never auto-publishes and never changes the standing voice or 4K quality.
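Picking the first unbuilt Short is a lookup against the production log. The sketch below parses a log table in the format shown in section 3.5 and assumes that a "successful row" means a row with a non-empty youtube_id. That definition and the function names are assumptions.

```python
from typing import Dict, List, Optional


def parse_log_table(markdown: str) -> List[Dict[str, str]]:
    """Parse a pipe-delimited log table into one dict per data row."""
    rows: List[Dict[str, str]] = []
    header: Optional[List[str]] = None
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        if header is None:
            header = cells
            continue
        if all(set(c) <= set("-: ") for c in cells):
            continue  # the |---|---| separator row
        rows.append(dict(zip(header, cells)))
    return rows


def next_unbuilt(build_order: List[str], rows: List[Dict[str, str]]) -> Optional[str]:
    """First Short id in build order with no row that carries a youtube_id."""
    done = {r["short_id"] for r in rows if r.get("youtube_id", "") not in ("", "-")}
    for short_id in build_order:
        if short_id not in done:
            return short_id
    return None
```

Returning `None` when everything is built gives the scheduled task a clean no-op path instead of rebuilding a Short that already shipped.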
The orchestration command and the agent's standing rules are the two copy blocks in section 3.5. Two guardrails are non-negotiable: always use --publish-at so nothing goes public immediately, and never change the voice or quality, which keeps every Short consistent.
Cadence in practice: build daily, publish three times a week on fixed days, and keep the queue roughly 4-7 deep so a single failed render is absorbed by buffer rather than a missed post.
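The cadence rule reduces to finding the next fixed publish day that is not already booked. A minimal sketch, assuming Tuesday, Thursday, and Saturday at 18:00 UTC as the fixed slots. The class does not name its days, so that choice is hypothetical and was picked only so the 2026-06-20T18:00:00Z example lands on a slot.

```python
from datetime import date, datetime, time, timedelta, timezone
from typing import Iterable, Set

PUBLISH_WEEKDAYS = (1, 3, 5)      # Tue, Thu, Sat (assumed; Monday is 0)
PUBLISH_TIME_UTC = time(18, 0)    # matches the 18:00Z example timestamp


def next_open_slot(after: date, booked: Set[date],
                   weekdays: Iterable[int] = PUBLISH_WEEKDAYS) -> datetime:
    """First publish slot strictly after `after` on an allowed weekday that is not booked."""
    allowed = set(weekdays)
    day = after + timedelta(days=1)
    for _ in range(366):
        if day.weekday() in allowed and day not in booked:
            return datetime.combine(day, PUBLISH_TIME_UTC, tzinfo=timezone.utc)
        day += timedelta(days=1)
    raise RuntimeError("no open slot within a year")


def to_rfc3339(slot: datetime) -> str:
    """Format a UTC slot the way --publish-at expects it."""
    return slot.strftime("%Y-%m-%dT%H:%M:%SZ")
```

Long-form days and any other blocked dates can simply be added to `booked`.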
10 · Measure and iterate
The metric that drove every decision is swipe-away: viewed versus swiped on Shorts. Our own 28-day window read 60.4 percent, which triggered the hook-first rebuild. Watch this number on your own analytics rather than trusting any external benchmark, and treat the format as a hypothesis to A/B test.
Secondary signals worth watching: average view duration, loop or replay rate (the last-frame-echoes-first-frame trick targets this), and click-through from the pinned comment to the long-form video.
The honest caveat: this is mid-experiment. The system is a production machine paired with a format hypothesis, not a finished result. A/B your own hook cards and measure.
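For the A/B step, swipe-away can be computed from the viewed and swiped counts, and two hook variants can be compared with a standard two-proportion z-score. This is a rough sketch under simplifying assumptions (independent impressions, reasonably large counts). Any counts you feed it are your own analytics, and the function names are hypothetical.

```python
import math


def swipe_away_rate(viewed: int, swiped: int) -> float:
    """Share of impressions that swiped away: swiped / (viewed + swiped)."""
    total = viewed + swiped
    if total <= 0:
        raise ValueError("need at least one impression")
    return swiped / total


def two_proportion_z(swiped_a: int, shown_a: int, swiped_b: int, shown_b: int) -> float:
    """z-score for the difference in swipe-away rate between variants A and B."""
    p_a = swiped_a / shown_a
    p_b = swiped_b / shown_b
    pooled = (swiped_a + swiped_b) / (shown_a + shown_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    if std_err == 0:
        raise ValueError("variants are identical and constant; z is undefined")
    return (p_a - p_b) / std_err
```

As a rule of thumb, an absolute z-score above roughly 2 suggests the difference is unlikely to be noise. With small samples, treat the result as a hint rather than a verdict.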
Frequently asked questions
What is the VEED automation pipeline in one sentence?
A scheduled AI agent reads the next script, drives VEED.ai to build a 9:16 avatar video on a code-rain background with gold karaoke captions, exports 4K, generates a hook card in Gemini, composites it with ffmpeg, and uploads it to YouTube on a schedule. One run equals one finished, scheduled Short.
Can one person produce a daily YouTube Short without filming every day?
Yes. The system uses an AI avatar reading a scripted line so the creator does not film daily. It is not a hidden-identity channel; it is one person producing daily short-form without a daily film-edit-upload grind. The avatar is the creator's own likeness reading scripted content.
What tools does the pipeline use?
VEED.ai for the avatar video and captions, Google Gemini for the Frame-1 hook card image, ffmpeg for the GOP-aligned composite, the YouTube Data API v3 for resumable upload and scheduling, curl to retrieve the export, and a scheduled agent runner to orchestrate the whole chain unattended.
Why use an AI agent instead of a SaaS auto-poster?
A SaaS auto-poster makes one API call. The agent reads first-party scripts, enforces brand rules, runs a real multi-tool workflow across a browser, an image model, ffmpeg, and the YouTube API, verifies its own output with a matte check and a channel-ID check, and self-heals by starting a fresh project each run.
What is the hook-first format?
Hook-first means the first second is on-screen text or motion, never the face. Frame one is a full-frame hook card with motion behind it and the spoken hook underneath; the avatar appears second. A sound-off scroller needs something to stop for, and a face on black does not provide it.
What metric triggered the format rebuild?
Swipe-away rate. Our own 28-day YouTube analytics window read 60.4 percent swipe-away, which triggered the v3 hook-first rebuild. Rewriting the script words had not moved the number; changing the visual format in frame one did. Treat swipe-away as the metric to watch on your own analytics.
How does the agent select the right VEED avatar if avatars cannot be named?
VEED labels every avatar tile Character with no custom name. The agent selects by visual match using a reference thumbnail kept on disk, identifying the avatar by appearance rather than by name. This selector-by-purpose approach survives UI changes better than a fixed position or label.
Why create a fresh VEED project on every run?
A reused or shared project corrupts: the script field rejects input and avatar generation stalls near 95 percent. A fresh AI-Avatars project generates first-try and leaves no stale timeline. It also avoids the dual-voiceover bug, where a reused project adds an audio track instead of replacing it.
What causes the dual-voiceover bug and how is it fixed?
A reused project adds a new voiceover track instead of replacing the old one, so the export plays two voices at once. The fix is to clear the entire timeline first, or better, use a fresh project every run so there is no leftover track to conflict with.
What is the remove-background matte bug and what is ship-and-flag?
VEED's Remove Background can silently revert at export even when the editor preview looks clean, exposing the original room behind the avatar. The policy is verify the export, re-export once because it is intermittent, and if it still reverts, ship anyway, log the override, and schedule a patch task so one flaky render never blocks the cadence.
Why must the hook card be a generated image rather than a flat text card?
Flat text-on-black cards read as clip-art and were rejected in testing. The hook card is generated as an image with code-rain motion and a glowing accent so frame one looks premium and stops a sound-off scroll. The card is overlaid only for the first two seconds.
How do you overlay the hook card without re-encoding the whole 4K file?
Re-encode only the first GOP. VEED 4K exports use an approximately 8.333-second GOP, a field measurement, so ffmpeg re-encodes just the first GOP with the hook overlaid for the first two seconds and stream-copies the rest losslessly. Concatenating head and tail avoids a full 4K re-encode that would time out a short sandbox window.
Why upload through the YouTube Data API instead of the browser?
Browser file inputs cap around 10 MB, and a 4K Short is roughly 250 to 360 MB, so a browser upload fails. The YouTube Data API v3 resumable uploader has no practical size cap and schedules natively with a publish-at timestamp.
How do you guarantee uploads land on the correct channel?
The uploader prints and verifies the channel ID on every run. If the OAuth token is bound to the wrong account, the run stops. The fix is to re-consent as the correct Google account so the token is bound to the brand channel, not a personal one. This was a real bug, caught by the guard.
Why do uploads start failing after about a week?
An OAuth consent screen left in Testing mode issues refresh tokens that expire in seven days, so scheduled uploads break after a week. Move the consent screen to In production so the refresh token persists.
How is the agent prevented from publishing immediately?
Every upload uses publish-at, which keeps the video private until the scheduled time. The agent only ever schedules and never publishes now. The scheduled task also never changes the standing voice or the 4K quality, which keeps every Short consistent.
Is this a deceptive or hidden-identity channel?
No. The avatar is the creator's own likeness reading scripted content. The honest frame is that AI lets one person produce daily short-form without filming daily. The class does not present the approach as deception.
What does it cost to run, and is this a finished case study?
Costs and quotas for VEED render credits, Gemini, and the YouTube Data API change and are not verified here; confirm current pricing and the daily upload quota with each vendor before relying on a number. The system is a working production machine paired with a format hypothesis, not a finished case study. Measure your own results.
The bottom line: a scheduled agent plus an AI avatar lets one person ship a daily, branded YouTube Short without filming or manual editing. The engineering that makes it durable is the verification and the ship-and-flag recovery, not the avatar. Build the pipeline, then measure your own swipe-away and iterate the hook.
Free. No paywall. No signup. No certificate. Just the work.
Explore the DDS Vibe Academy
This is an independent educational class. VEED, Gemini, and YouTube are trademarks of their respective owners and are referenced for identification only; no affiliation or endorsement is implied. Pricing and quotas change; confirm current figures with each vendor.
