GPT Image 2 for Shopify Stores
OpenAI shipped gpt-image-2 on April 21, 2026 with ~99% character-level text accuracy, native 2K resolution, 16-image reference inputs, and reasoning that plans before it generates. This is the complete master class on using it to produce every visual a Shopify store will ever need — from product photos to packaging to OG images. 66 paste-ready prompts. Verified API code. Files API integration. No hype, no invented benchmarks, all sources cited. Part of the free DDS Vibe Academy curriculum — the 25-node constellation of AI coding masterclasses behind the DDS Sovereign AGI Suite.
GPT Image 2 (model ID: gpt-image-2, branded "ChatGPT Images 2.0") is OpenAI's flagship image generator released April 21, 2026. It produces near-perfect text inside images, 2K native resolution, supports 16 reference images for brand consistency, and reasons before generating. For Shopify operators it's the strongest single tool for product photos, hero banners, packaging mockups, and ad creative — especially anything with text. Access via ChatGPT Plus ($20/mo, no code) or the OpenAI API ($0.04–$0.35 per image, paid tier required). Commercial use is explicitly granted by OpenAI. Migrate from DALL-E before May 12, 2026.
- ~99% text-in-image accuracy across Latin, Chinese, Japanese, Korean, Hindi, Bengali, and Arabic scripts — the headline feature that changes what's practical for Shopify.
- 16 reference images per call + Thinking Mode generates 8 coherent images with character/product continuity in one call.
- Token-based pricing: $8/M image input, $30/M image output, $2/M cached. Per-image cost: $0.04–$0.35.
- API requires paid tier + Org Verification. Free tier is explicitly NOT supported for gpt-image-2.
- DALL-E 2 + DALL-E 3 retire May 12, 2026. Migrate code that calls those endpoints now.
- OpenAI grants you full commercial rights to outputs — usable on products, ads, packaging, merch.
- Brand logos still need real-vector composite post-generation. The model understands "logo" but doesn't reliably reproduce exact vector shapes.
- Transparent backgrounds NOT supported via Responses API. Use gpt-image-1.5 for PNG-with-alpha, or key out in Photoshop.
- Knowledge cutoff: December 2025. Post-cutoff brands and events may render inaccurately unless Thinking Mode pulls from live web search.
- Counterintuitive: the model performs strongest with simpler prompts. Stop tag-stuffing.
Who this master class serves
This class is structured to deliver value at four distinct expertise levels. Pick the lane that matches you, but read the others — the cross-pollination is where the real leverage lives. If you've completed the Multi-Model AI Image Generation Routing master class or the Multi-Model AI Video Routing master class, this class extends both with deep OpenAI-specific tactics for Shopify production work.
Solo Shopify Operator
You run the store, take the photos, write the copy. ChatGPT Plus path. No code. $20/mo replaces $300/shoot.
DTC Brand Marketing Lead
You ship campaigns weekly. Custom GPT setup, brand consistency at scale, hero/social/ad multi-format.
Agency / Freelancer
You deliver visuals across multiple client brands. Per-brand Custom GPTs, white-label workflows, retainer math.
Architect-CEO / Vibe Coder
You automate. API code paths, Shopify Files API GraphQL, batch generation, programmatic pipelines.
Why GPT Image 2 changes Shopify visual production
For three years, AI image generators had a single embarrassing failure: they couldn't reliably spell. Ask DALL-E 3 for a Mexican restaurant menu and you'd get "enchuita" and "burrto." Ask Midjourney for a poster headline and you'd get plausible-looking arrangements of the wrong letters. Every model — DALL-E, Midjourney, Stable Diffusion, Nano Banana, gpt-image-1.5 — failed at this to some degree.
gpt-image-2 broke that. ~99% character-level text accuracy across Latin, Chinese, Japanese, Korean, Hindi, Bengali, and Arabic scripts. Mixed-script layouts work — a Japanese poster with Latin product names, a Chinese subtitle layered over an English title. For the first time, a model can carry real reading load inside a generated image.
If you build for Shopify — packaging mockups, sale banners, infographics, product labels, social ads with copy, multilingual marketing assets — this is the feature you've been waiting for. It changes what's practical.
The five capability shifts that matter for Shopify
Text rendering you can ship
Posters, packaging, UI mockups, social ads with copy, multilingual marketing assets — all render readably on the first try. You stop rebuilding layouts in Photoshop because the AI couldn't spell.
Reasoning before pixels
Thinking Mode plans the image structure, can web-search for references, and self-checks outputs before rendering. Complex prompts with conflicting constraints ("square infographic, title centered, three columns, small CTA at bottom") resolve sensibly on the first attempt instead of arriving as four columns with no title.
Brand consistency at scale
Up to 16 reference images per call. The model reasons about them as a set — product photo plus brand-style references plus competitor packshot. n=8 with Thinking Mode generates eight coherent images with character and product continuity in one call. Lookbooks and storyboards become a one-prompt operation.
Native 2K + 100+ object scenes
Output up to 2048×2048 native, with 4K experimental beta. Aspect ratios from 3:1 ultra-wide to 1:3 ultra-tall. Scene complexity that previously broke models — 100 distinct objects in one frame — now holds together with maintained count integrity.
One model, two surfaces
The same model powers ChatGPT chat (no code, $20/mo Plus) and the OpenAI API (programmatic automation). Prototype in chat, scale via API. The visual output is identical between surfaces. This is the architectural decision that makes vibe-coder workflows compose: start where you can think, ship where you need throughput. The DDS Vibe Coding methodology in Claude Code Part 2 covers this dual-surface pattern in depth.
This is not a "look how cool AI is" tour. It is a working Shopify operator's reference. Every spec is verified against OpenAI's official model card, dev docs, and announcement page. Every limitation is documented. Where we don't know something exactly (Thinking Mode token overhead, exact tier dollar thresholds), we say so. Use the dossier as a source you can cite, not a hype piece.
How GPT Image 2 compares to Midjourney V8, Nano Banana 2, and FLUX 2
No single model wins every use case. The honest answer for Shopify operators is hybrid: gpt-image-2 for text-heavy and brand-consistency work, Midjourney V8 for editorial mood, Nano Banana 2 for high-volume cheap backgrounds, FLUX 2 for photorealistic illustration via Replicate. Here's the matrix.
| Capability | gpt-image-2 | Midjourney V8 | Nano Banana 2 | FLUX 2 Pro |
|---|---|---|---|---|
| Text in image | ~99% accuracy, multilingual ✓ | Weak, often gibberish | Improved but behind | Good but inconsistent |
| Photorealism | Strong, neutral color | Painterly bias | Cinematic ✓ | Strong photoreal ✓ |
| Reference images | Up to 16 per call ✓ | Style references only | Limited | Single reference |
| Reasoning / planning | Thinking Mode ✓ | No | No | No |
| Multi-image consistency | 8 coherent / call ✓ | Manual seed work | Limited | LoRA required |
| Generation speed | Medium (Thinking adds 30-60s) | Medium | Fast (<10s) ✓ | Fast ✓ |
| Cost per image | $0.04–$0.35 | ~$0.10/img (relax mode) | ~$0.04 ✓ | ~$0.03 (Replicate) ✓ |
| Brand logo accuracy | Unreliable | Unreliable | Unreliable | Unreliable |
| Transparent BG (PNG-α) | Not via Responses API | Manual key-out | Limited | Native ✓ |
| Aesthetic / cinematic mood | Strong, neutral | Champion ✓ | Good | Good |
| Knowledge cutoff | Dec 2025 + web search | Pre-training only | Pre-training only | Pre-training only |
| Free tier access | ChatGPT Free: Instant Mode only. API: NOT supported. | Paid only | Free in Gemini ✓ | Pay-per-use |
Decision framework — when to pick each
Pick gpt-image-2 when text appears inside the image (posters, packaging, UI mockups, sale banners, infographics, multilingual marketing), when brand-consistent product variants from one source photo are required, when multi-frame storyboards or carousel sequences need character continuity, or when conversational iterative editing matters more than raw speed.
Pick Midjourney V8 when editorial fashion mood, painterly illustration, or specific film-stock aesthetic precision is the deliverable; when the job is cinematic concept art for lookbook covers where text isn't critical; or when style references demand artistic interpretation over instruction-following.
Pick Nano Banana 2 when you need fast, cheap, high-volume background generation or simple lifestyle shots with no embedded text requirements. It's free in Gemini for exploratory work and roughly 3x cheaper at scale.
Pick FLUX 2 when you need transparent backgrounds natively, photorealistic illustration, or programmatic Replicate workflows at $0.03/image scale. It's strong for ghost mannequin and isolated product shots.
Use gpt-image-2 for text-heavy work and brand assets. Use Midjourney V8 for hero campaign mood. Use Nano Banana 2 or FLUX for high-volume backgrounds. Composite real vector logos in Photoshop or Figma post-generation. Upscale finals through Topaz Gigapixel for print resolution. This stack costs roughly $40–$80/month combined and replaces traditional photography for 90% of Shopify visual needs.
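The hybrid-stack rule above can be sketched as a tiny decision function. Everything here is illustrative: the string labels are informal names rather than real API model IDs, and the decision keys simply mirror this section's prose.

```python
# Toy router encoding the hybrid-stack rule from this section.
# Illustrative sketch only: the string labels are informal names,
# not real API model IDs.

def pick_model(has_text: bool = False,
               needs_brand_consistency: bool = False,
               editorial_mood: bool = False,
               high_volume: bool = False,
               needs_transparent_bg: bool = False) -> str:
    if has_text or needs_brand_consistency:
        return "gpt-image-2"      # text rendering + 16 reference images
    if needs_transparent_bg:
        return "flux-2"           # native PNG-with-alpha
    if editorial_mood:
        return "midjourney-v8"    # aesthetic / cinematic champion
    if high_volume:
        return "nano-banana-2"    # cheap, fast backgrounds
    return "gpt-image-2"          # sensible default

print(pick_model(has_text=True))        # gpt-image-2
print(pick_model(editorial_mood=True))  # midjourney-v8
```

The ordering matters: text and brand consistency trump everything else, because those are the jobs no other model in the stack does well.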
Image Arena leaderboard context
gpt-image-2 debuted on the Image Arena Text-to-Image leaderboard at 1512, leading second-place Nano Banana 2 by +242 points — the largest single-model lead ever recorded on that leaderboard. The lead is concentrated in: text rendering accuracy, complex composition handling, multi-image consistency, and instruction following. The lead is not in pure aesthetic photorealism — Midjourney and Nano Banana still hold ground there.
Three ways to access gpt-image-2 — pick your lane
Before any prompt or code, decide which surface you'll use. The model is identical across all three paths. The economics, control level, and automation ceiling are not.
| Path | Cost | Code Required | Thinking Mode | Best For |
|---|---|---|---|---|
| ChatGPT Free | $0 | No | No (Instant only) | Exploration, occasional use, learning the model |
| ChatGPT Plus | $20/mo | No | Yes ✓ | Solo Shopify operators, daily use, no automation needed |
| ChatGPT Pro | $200/mo | No | Yes ✓ | Heavy users, "unlimited subject to abuse guardrails" |
| ChatGPT Business | ~$25/seat/mo | No | Yes ✓ | Marketing teams sharing brand context |
| OpenAI API (paid tier) | $0.04–$0.35 / image | Yes | Yes (variable cost) ✓ | Bulk variants, automation, Shopify integration pipelines |
The OpenAI API does NOT support a free tier for gpt-image-2 — the official model card explicitly lists Free as "Not Supported." You must be on a paid usage tier and complete Organization Verification in your OpenAI developer console before API calls work. This is a one-time setup that takes ~10 minutes but is non-negotiable.
Recommended path for serious Shopify operators: both ChatGPT Plus AND the OpenAI API. ChatGPT Plus is your daily driver for thinking, exploring, and iterative editing. The API is your throughput layer when a workflow becomes repeatable. The same prompts work on both surfaces.
Mastering ChatGPT chat for Shopify imagery
The chat surface is more capable than most developers give it credit for. Used well, it replaces 80% of what a serious DTC brand needs visually — without writing a line of code. Here's the complete operator's playbook.
Thinking Mode vs Instant Mode — when to use each
Instant Mode (default for Free, available to all)
Standard fast generation, ~10–20 seconds per image. Use for: simple product shots, single subject scenes, quick exploration, drafts. Quality is significantly improved over gpt-image-1.5 even without reasoning. This is the right call for 80% of solo-operator Shopify use cases.
Thinking Mode (Plus / Pro / Business / Enterprise only)
Reasoning + web search + self-check before generation. ~30–60 seconds per image. Use for: complex layouts with multiple constraints, multilingual posters, infographics with structured data, multi-frame storyboards, packaging with verbatim text, brand-consistency batches via n=8. The latency cost buys first-try quality on the work where rerolls are expensive.
Custom GPTs — the brand consistency unlock
A Custom GPT lets you encode your entire brand context into a persistent assistant. Every chat with that Custom GPT inherits the system prompt, uploaded brand assets, and behavioral rules. This is how you stop typing "use forest green and gold, sans-serif body, serif display, photographed in soft natural light" into every prompt. If you've followed the Claude Code Part 1 master class, the Custom GPT pattern is structurally identical to Claude Code's CLAUDE.md + Skills architecture — same persistent-context principle, different platform.
How to build a Shopify-grade Custom GPT (10 minutes)
Open the Custom GPT builder
In ChatGPT (Plus/Pro), click your name → "My GPTs" → "Create a GPT". Use the Configure tab, not the Create wizard, for control.
Write the system prompt
Document your brand exhaustively: brand name, mission sentence, target customer, brand colors with hex codes, font stack, photography style, lighting recipes, do's and don'ts. Paste a template (provided below).
Upload knowledge files
Drop in a PDF brand guidelines doc, your top 5 hero product photos, your logo as PNG, color swatches, a font specimen sheet. The Custom GPT can reference these in every conversation.
Enable image generation capability
In the Capabilities checklist, ensure "DALL-E Image Generation" is checked (this surfaces the gpt-image-2 model). Web Browsing optional but recommended for Thinking Mode use.
Test, refine, lock
Generate 5 test images across categories (product, lifestyle, social, packaging). If the brand voice drifts, tighten the system prompt. Save and pin the GPT to your sidebar. This is now your daily driver.
Custom GPT system prompt template (paste-ready)
You are the [BRAND NAME] visual director. You generate production-quality images for [BRAND NAME], a [BRAND CATEGORY] brand based in [CITY]. Every image you generate must match these brand specifications:

BRAND IDENTITY
- Name: [BRAND NAME]
- Mission: [ONE-SENTENCE MISSION]
- Target customer: [DEMOGRAPHIC + PSYCHOGRAPHIC]
- Brand voice: [3 ADJECTIVES — e.g., honest, refined, never preachy]

COLOR PALETTE (use these hex codes only)
- [PRIMARY DARK]: #XXXXXX
- [PRIMARY MID]: #XXXXXX
- [ACCENT METALLIC]: #XXXXXX
- [BACKGROUND CREAM]: #XXXXXX
- [TEXT WHITE]: #XXXXXX

TYPOGRAPHY
- Display: [SERIF FONT NAME]
- Body: [SANS-SERIF FONT NAME]
- All in-image text: render in quotation marks, verbatim, no extras

PHOTOGRAPHY STYLE
- Lighting: [e.g., soft natural daylight from window, slight directional]
- Color grading: [e.g., warm neutral, slight green undertone]
- Mood: [e.g., quiet confidence, lived-in, never sterile]
- Models: [diversity statement and how to specify]
- Backgrounds: [preferred surfaces and contexts]

DO'S
- Render exact text verbatim with "no duplicate text" / "no watermark"
- Use shallow depth of field for product detail shots
- Composite real logos post-generation; do not request brand logo reproduction
- Default to 4:5 vertical for social and product, 16:9 for hero, 1:1 for IG

DON'TS
- Never reference fake certifications
- Never generate copyrighted IP (Disney/Marvel/named brand logos)
- Never generate real-celebrity likenesses
- Never use empty terms like "premium feel" or "viral quality"

WORKFLOW
- Default to Instant Mode for exploration
- Switch to Thinking Mode when prompt has 5+ constraints or text-heavy
- For brand consistency batches, use n=8 with explicit character anchor
- Always end prompts with: "No extra text, no duplicate text, no watermark"

When user asks for an image, ask one clarifying question if intent is ambiguous (aspect ratio, primary use case). Otherwise generate.
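For agencies running one Custom GPT per client brand, the bracketed placeholders in the template can be filled programmatically before pasting. A minimal sketch; `fill_template` is a hypothetical helper, not part of any OpenAI tooling.

```python
# Fill the bracketed [PLACEHOLDER] slots in a system prompt template.
# Hypothetical convenience helper for agencies maintaining one Custom
# GPT per client brand; not part of any OpenAI API or tooling.

def fill_template(template: str, values: dict) -> str:
    for key, val in values.items():
        template = template.replace(f"[{key}]", val)
    return template

template = ("You are the [BRAND NAME] visual director for a "
            "[BRAND CATEGORY] brand based in [CITY].")
prompt = fill_template(template, {
    "BRAND NAME": "Evergreen Supply",        # example values, not real brands
    "BRAND CATEGORY": "sustainable apparel",
    "CITY": "Portland",
})
print(prompt)
```

Keep the filled output under version control per client, so the Custom GPT system prompt is reproducible when you rebuild it.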
ChatGPT Projects — persistent context for repeated work
Projects are ChatGPT's persistent workspace feature. Unlike a single chat, a Project remembers files, instructions, and context across every conversation inside it. For Shopify operators, this is the right home for per-brand or per-campaign work.
Project setup pattern for Shopify brands
- Create a Project named [BRAND NAME] · Visual Library
- Upload to Project files: brand guidelines PDF, hero product photos (3-5), logo PNG/SVG, color reference image, font specimen
- Set Project instructions to your Custom GPT system prompt (above)
- Start chats inside the Project for: Spring 2026 campaign, new product launch, holiday Q4 ads, etc.
- Each chat inherits the brand context — you skip the "remind me of your brand colors" preamble every session
Memory + Custom Instructions
Even without a Custom GPT, ChatGPT's Memory and Custom Instructions features can encode your brand context globally. Settings → Personalization → Custom Instructions. Drop your brand specifications there and every chat will reference them.
This is the lightweight option. Custom GPTs are stronger for serious work because they have file knowledge bases. Use Memory + Custom Instructions if you're operating a single brand and don't want to build a full Custom GPT.
The reverse-the-prompt trick
One of the most useful patterns: feed ChatGPT an existing brand photo you love and ask it to generate the prompt that would create something similar.
[Upload your reference image to the chat]

Analyze this image as a generation prompt. Describe in detail:
- The lighting (source, direction, quality, color temperature)
- The camera (lens focal length, aperture, height, angle)
- The composition (subject placement, framing, negative space)
- The color palette (dominant hues, accent colors, grading)
- The mood (emotional register, time of day, season)
- The style (photography genre, era, aesthetic reference)

Then write a single paragraph prompt that would produce a NEW image in the same style but with different subject matter. Format the prompt ready to paste into gpt-image-2. Use quotation marks around any text that should appear in-image.
Multi-turn conversational editing
The most underrated capability of ChatGPT chat is iterative refinement. Instead of writing one perfect prompt, generate a base image and refine in conversation.
Generate base image → "Make the background slightly warmer" → "Move the product 10% to the left" → "Replace the wooden surface with marble" → "Add subtle morning light from the upper left." Each follow-up is a single change. The model preserves everything else — face, identity, pose, lighting recipe, framing — automatically. This is dramatically more reliable than trying to specify all changes in one prompt.
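The same single-change discipline applies when you drive refinements through the API instead of chat. A sketch of the pattern; `refinement_prompt` is a hypothetical helper, and the preservation clause wording is illustrative.

```python
# Wrap one requested change with an explicit "keep everything else"
# clause, per the single-change editing pattern above. Hypothetical
# helper; the preservation wording is illustrative.

def refinement_prompt(change: str) -> str:
    return (
        f"{change.rstrip('.')}. "
        "Keep everything else exactly as is: same subject, same pose, "
        "same lighting recipe, same framing, same color grading. "
        "No extra text, no duplicate text, no watermark."
    )

print(refinement_prompt("Make the background slightly warmer"))
```

Feed each result's output image back as the input of the next edit call, one change per call, and you get the chat surface's multi-turn behavior programmatically.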
Mobile workflow — camera roll → reference → generate
The ChatGPT mobile app supports image upload from camera roll. The complete on-the-go workflow: photograph your physical product or sample → upload to ChatGPT → "Generate this product on a marble countertop with morning light" → download result → upload directly to Shopify Admin app. End-to-end in under 90 seconds. This is the workflow that makes solo Shopify operators 10x faster than they were six months ago.
OpenAI API setup — Python, Node, curl, edit endpoint
The API is where automation lives. ChatGPT chat is for thinking; the API is for shipping at scale. If you're generating 50 product variants, batch-creating social ads from a content calendar, or running gpt-image-2 inside a Shopify product creation hook, this is your surface. For builds where you need to run image-related models on hardware you own with zero recurring cost, see the Ollama for Windows complete guide — note that as of April 2026, Ollama's open-source image-generation models trail gpt-image-2 substantially in text rendering and instruction following, so the right pattern is gpt-image-2 for production visuals plus Ollama for sovereign reasoning fallback.
Pre-flight checklist (one-time setup)
Create OpenAI account + add payment method
Visit platform.openai.com. Add credit card. Free tier is NOT supported for gpt-image-2 — you must be on a paid usage tier.
Complete Organization Verification
Settings → Organization → Verify. Required before any GPT Image model API calls work. Takes ~10 minutes.
Create API key
API keys → Create new secret key. Name it descriptively (e.g., shopify-image-pipeline-prod). Copy immediately — you cannot view it again.
Set environment variable
In your shell config (.bashrc, .zshrc, .env): export OPENAI_API_KEY="sk-...". Never hardcode keys in committed files.
Install SDK
Python: pip install "openai>=1.50.0" (quote the version spec so your shell doesn't treat > as a redirect). Node: npm install openai. The 1.50.0+ version is required for native gpt-image-2 support.
Verify access
Run the smallest test call (below) to confirm your tier supports gpt-image-2. If you get a 429 immediately, you're rate-limited at Tier 1 (5 IPM). If you get an org-verification error, return to the Organization Verification step above.
Python — text-to-image (verified working)
from openai import OpenAI
import base64

client = OpenAI()  # reads OPENAI_API_KEY from env

response = client.images.generate(
    model="gpt-image-2",
    prompt=(
        "A clean studio product photograph of a forest-green organic cotton "
        "hoodie centered on a pure white seamless background. Soft, even "
        "lighting from the upper left, subtle shadow grounding the garment. "
        "Sharp focus, true-to-life color, no reflections, no clutter. "
        "Professional e-commerce style. No text, no watermark, no logo overlay."
    ),
    size="1024x1024",  # also supports 1024x1536, 1536x1024, custom
    quality="high",    # "low" | "medium" | "high"
    n=1                # up to 10 per call
)

# Default response is base64-encoded PNG
image_b64 = response.data[0].b64_json
image_bytes = base64.b64decode(image_b64)

# Save to disk
output_path = "hoodie_hero.png"
with open(output_path, "wb") as f:
    f.write(image_bytes)

print(f"Image saved to {output_path}")
print(f"Tokens used: {response.usage}")
Node / TypeScript — text-to-image (verified working)
import OpenAI from "openai";
import fs from "fs";

const openai = new OpenAI(); // reads OPENAI_API_KEY from env

const result = await openai.images.generate({
  model: "gpt-image-2",
  prompt: `A clean studio product photograph of a forest-green organic cotton
hoodie centered on a pure white seamless background. Soft, even lighting
from the upper left, subtle shadow grounding the garment. Sharp focus,
true-to-life color, no reflections, no clutter. Professional e-commerce
style. No text, no watermark, no logo overlay.`,
  size: "1024x1024",
  quality: "high",
  n: 1
});

const imageBase64 = result.data[0].b64_json;
const imageBytes = Buffer.from(imageBase64, "base64");
fs.writeFileSync("hoodie_hero.png", imageBytes);
console.log("Image saved to hoodie_hero.png");
curl — text-to-image (any language fallback)
curl https://api.openai.com/v1/images/generations \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-image-2",
"prompt": "A clean studio product photograph of a forest-green organic cotton hoodie centered on a pure white seamless background. Soft, even lighting from upper left, subtle shadow. Professional e-commerce style. No text, no watermark.",
"size": "1024x1024",
"n": 1,
"quality": "high"
}'
Image editing with mask (inpainting / outpainting)
The images.edit endpoint lets you make surgical changes to an existing image — change the background, remove an unwanted element, swap a product color, extend a square into a 16:9 hero. Mask the region you want changed; everything outside the mask stays locked.
from openai import OpenAI
import base64

client = OpenAI()

# source.png: your existing product photo
# mask.png: same dimensions as source. Transparent pixels mark the region
# to edit. Opaque pixels are preserved.
result = client.images.edit(
    model="gpt-image-2",
    image=open("source.png", "rb"),
    mask=open("mask.png", "rb"),
    prompt=(
        "Replace the white seamless background with a marble countertop. "
        "Keep the product, lighting, shadow, and product color exactly as is. "
        "Marble should be cool gray with subtle veining, soft natural light. "
        "No text, no watermark."
    ),
    quality="high",
    n=1
)

image_bytes = base64.b64decode(result.data[0].b64_json)
with open("background_swap.png", "wb") as f:
    f.write(image_bytes)
gpt-image-2 processes image inputs at maximum quality regardless of your quality parameter setting. If your workflow involves uploading reference images and iterating on them — common for product mockups, character consistency, ad variations — your real cost per asset runs higher than the baseline per-image number suggests. Cached image inputs cost 75% less ($2/M vs $8/M), so design iterative workflows to reuse the same source image where possible.
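Back-of-envelope math shows why the cached rate matters for iterative workflows. The rates come from this section; the 2,000-token input size per iteration is an illustrative assumption, not a published figure.

```python
# Cost of N edit iterations on the same source image, uncached vs
# cached input, at the rates above ($8/M vs $2/M cached). The
# 2,000-token input size per iteration is an ILLUSTRATIVE ASSUMPTION.

RATE_INPUT = 8 / 1_000_000    # $ per uncached image-input token
RATE_CACHED = 2 / 1_000_000   # $ per cached image-input token

def input_cost(iterations: int, input_tokens: int, cached: bool) -> float:
    rate = RATE_CACHED if cached else RATE_INPUT
    return iterations * input_tokens * rate

uncached = input_cost(20, 2_000, cached=False)  # roughly $0.32
cached = input_cost(20, 2_000, cached=True)     # roughly $0.08
print(f"20 iterations: ${uncached:.2f} uncached vs ${cached:.2f} cached")
```

Reusing the same source image across a 20-step refinement session cuts the input side of the bill by 75%, per the pricing above.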
Multi-image input — reference-driven generation
Pass up to 16 reference images per call. Label each by index in your prompt. Use this for: virtual try-on, style transfer, brand-consistent product variants, character consistency.
from openai import OpenAI
import base64

client = OpenAI()

result = client.images.edit(
    model="gpt-image-2",
    image=[
        open("base_product.jpg", "rb"),     # Image 1: source
        open("brand_style_ref.jpg", "rb"),  # Image 2: style reference
        open("color_swatch.jpg", "rb"),     # Image 3: color palette
    ],
    prompt=(
        "Image 1 is the source product. Preserve product geometry, materials, "
        "and proportions exactly. Image 2 is the brand style reference — "
        "match its lighting recipe, depth of field, and editorial mood. "
        "Image 3 is the brand color palette — apply these colors only to "
        "the background and prop styling, never to the product itself. "
        "Generate a new lifestyle product photograph combining all three. "
        "4:5 vertical aspect ratio. No text, no watermark, no logo drift."
    ),
    quality="high",
    n=1
)

image_bytes = base64.b64decode(result.data[0].b64_json)
with open("brand_consistent_variant.png", "wb") as f:
    f.write(image_bytes)
Batch generation with n parameter
Generate up to 10 variants in a single call. Combined with Thinking Mode, this produces 8 coherent images with character/object continuity — the workflow that makes lookbooks and storyboards practical.
from openai import OpenAI
import base64

client = OpenAI()

# Establish character anchor in the prompt itself.
# Then describe 8 distinct scenes that share the character.
response = client.images.generate(
    model="gpt-image-2",
    prompt=(
        "CHARACTER ANCHOR: Mara, 28-year-old, wavy auburn shoulder-length "
        "hair, warm olive skin, light freckles. Wearing the brand's signature "
        "forest-green hoodie. Editorial lifestyle photography style. "
        "Soft natural daylight. Same character across every frame.\n\n"
        "Generate 8 connected lifestyle frames showing Mara's morning routine:\n"
        "1. Waking up in bed, soft window light\n"
        "2. Stretching at the window, looking out\n"
        "3. Making coffee in the kitchen\n"
        "4. Walking out of her apartment building\n"
        "5. Morning walk on urban street, coffee in hand\n"
        "6. Arriving at a corner coffee shop\n"
        "7. Working on a laptop at the cafe\n"
        "8. Smiling, looking up at someone off-frame\n\n"
        "Maintain identical character appearance across all 8 frames. "
        "Identical color grading: warm morning light fading to cool cafe interior. "
        "4:5 per frame. No text, no watermark."
    ),
    size="1024x1536",
    quality="high",
    n=8  # up to 10 per call; n=8 with Thinking is sweet spot
)

for idx, img in enumerate(response.data):
    image_bytes = base64.b64decode(img.b64_json)
    with open(f"mara_morning_{idx+1:02d}.png", "wb") as f:
        f.write(image_bytes)

print(f"Generated {len(response.data)} frames")
Production retry pattern with exponential backoff
The API returns HTTP 429 when you exceed rate limits. In production you must implement retry logic with exponential backoff; a tight retry loop without delays just hammers the endpoint and times out. Use the Tenacity library (shown below) or the SDK's built-in max_retries option.
from openai import OpenAI, RateLimitError, APIError
from tenacity import (
    retry,
    stop_after_attempt,
    wait_random_exponential,
    retry_if_exception_type,
)
import base64

client = OpenAI()

@retry(
    wait=wait_random_exponential(min=1, max=60),
    stop=stop_after_attempt(6),
    retry=retry_if_exception_type((RateLimitError, APIError)),
)
def generate_with_backoff(**kwargs):
    return client.images.generate(**kwargs)

# Usage
response = generate_with_backoff(
    model="gpt-image-2",
    prompt="...",
    size="1024x1024",
    quality="high",
    n=1
)

image_bytes = base64.b64decode(response.data[0].b64_json)
with open("output.png", "wb") as f:
    f.write(image_bytes)
Bulk variant generation pipeline
The pattern that replaces a $300/product photoshoot with a $2/product API call. Read SKUs from a CSV or your Shopify catalog, generate 5 visual variants per product, save each to disk for review.
import csv
import base64
import os
import time

from openai import OpenAI
from tenacity import retry, wait_random_exponential, stop_after_attempt

client = OpenAI()

@retry(wait=wait_random_exponential(min=2, max=60), stop=stop_after_attempt(5))
def generate(prompt, size="1024x1024", quality="high"):
    return client.images.generate(
        model="gpt-image-2",
        prompt=prompt,
        size=size,
        quality=quality,
        n=1
    )

# Read products.csv with columns: sku, name, color, material, category
def bulk_generate_variants(csv_path, output_dir="output"):
    os.makedirs(output_dir, exist_ok=True)
    with open(csv_path, "r") as f:
        reader = csv.DictReader(f)
        for row in reader:
            sku = row["sku"]
            name = row["name"]
            color = row["color"]
            material = row["material"]
            category = row["category"]
            variants = [
                ("white_bg", f"Clean white background product photo of {color} {material} {name}, soft upper-left lighting, e-commerce style. No text."),
                ("lifestyle", f"Lifestyle product photo of {color} {material} {name} on light oak surface, warm window light, shallow depth of field. No text."),
                ("flat_lay", f"Top-down flat lay of {color} {material} {name} with complementary props on cream linen, soft overhead light. No text."),
                ("detail", f"Macro close-up of {color} {material} {name} showing texture and material detail, shallow DOF. No text."),
                ("ghost_mannequin", f"Ghost mannequin product photo of {color} {material} {name}, white background, soft lighting. No text."),
            ]
            for variant_name, prompt in variants:
                try:
                    response = generate(prompt)
                    image_bytes = base64.b64decode(response.data[0].b64_json)
                    output_path = os.path.join(output_dir, f"{sku}_{variant_name}.png")
                    with open(output_path, "wb") as out:
                        out.write(image_bytes)
                    print(f"✓ {sku} {variant_name}")
                    time.sleep(0.5)  # simple pacing; tune to your tier's IPM ceiling
                except Exception as e:
                    print(f"✗ {sku} {variant_name}: {e}")

if __name__ == "__main__":
    bulk_generate_variants("products.csv")
Rate limits — what to expect at each tier
| Tier | TPM | IPM (images/min) | To Reach | 1,000 Image Batch Time |
|---|---|---|---|---|
| Free | — | — | NOT SUPPORTED | — |
| Tier 1 | 100K | 5 | Org verified, payment method | ~3 hours 20 min |
| Tier 2 | 250K | 20 | $50+ spent | ~50 min |
| Tier 3 | 800K | 50 | $100+, 7+ day account | ~20 min |
| Tier 4 | 3M | 150 | $250+, 14+ day account | ~7 min |
| Tier 5 | 8M | 250 | $1,000+, 30+ day account | ~4 min |
Failed prompts (content policy refusals) STILL consume your quota. Limits are enforced at organization + project level, not per user. If you're running multiple API keys under one org, they share the same IPM ceiling. Plan tier ramp before launch — don't discover Tier 1's 5 IPM ceiling during a Black Friday push.
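Because limits are shared at the organization level, a cheap defense is a client-side pacer that spaces requests to stay under your tier's IPM ceiling. A minimal sketch; the tier numbers mirror the table above, and `IpmPacer` is a hypothetical helper, not part of the OpenAI SDK.

```python
import time

# Minimal client-side pacer: sleep just enough between requests to stay
# under a tier's images-per-minute ceiling (IPM numbers from the table
# above). Illustrative sketch; OpenAI enforces the real limit server-side.

TIER_IPM = {1: 5, 2: 20, 3: 50, 4: 150, 5: 250}

class IpmPacer:
    def __init__(self, tier: int):
        self.interval = 60.0 / TIER_IPM[tier]  # seconds between requests
        self._last = 0.0

    def wait(self):
        """Block until it is safe to send the next request."""
        now = time.monotonic()
        sleep_for = self.interval - (now - self._last)
        if sleep_for > 0:
            time.sleep(sleep_for)
        self._last = time.monotonic()

pacer = IpmPacer(tier=1)  # 5 IPM means one request every 12 seconds
print(f"{pacer.interval:.0f} s between requests at Tier 1")
```

Call `pacer.wait()` before each `images.generate` call in a batch loop; combined with the backoff decorator above, 429s become rare instead of routine.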
The prompt engineering principles that work
OpenAI published an official prompting guide in the Cookbook. Combined with community testing across thousands of generations since launch, these are the rules that consistently produce production-quality output. Skip these and you'll waste credits.
The master structure (per OpenAI Cookbook)
Write prompts in a consistent order: Background or scene first to set the world. Subject second — the who or what. Key details third — materials, lighting, camera, composition. Constraints last — what NOT to include, text rules, "no watermark / no extra text / no duplicate text." For complex requests, use short labeled segments or line breaks instead of one long paragraph.
The six-block formula (community-validated)
Scene / Background
Where the image takes place. Sunlit cafe, dark studio, abstract gradient, crowded city street. Anchors everything that follows.
Subject
The who or what — described with specificity. Demographic for people, materials and form for products.
Composition
Layout and spatial relationships. Centered, rule-of-thirds, split layout, grid, off-frame elements.
Lighting
Source, direction, mood. "Soft window light from camera left," "neon glow," "dramatic rim light," "golden hour backlight."
Style
Medium, aesthetic, era. "35mm film photography," "editorial magazine," "flat illustration," "watercolor."
Format / Constraints
Aspect ratio, output use, things to exclude. "4:5 vertical for IG story," "no extra text, no duplicate text, no watermark."
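The six blocks can be composed mechanically, which keeps every prompt in the master order: scene first, constraints last. A small sketch (Python; `build_prompt` is an illustrative helper, not an official API):

```python
def build_prompt(scene, subject, composition, lighting, style, constraints):
    """Assemble a prompt in the six-block order: scene first, constraints last."""
    blocks = [scene, subject, composition, lighting, style, constraints]
    # normalize each block to end with exactly one period, skip empty blocks
    return " ".join(b.strip().rstrip(".") + "." for b in blocks if b)

prompt = build_prompt(
    scene="Sunlit minimalist studio with a pale oak table",
    subject="A forest-green organic cotton hoodie, folded",
    composition="Centered, generous negative space",
    lighting="Soft window light from camera left",
    style="Editorial e-commerce photography",
    constraints="No extra text, no duplicate text, no watermark",
)
```

Templating the order this way means every generated prompt in a batch follows the same structure, so failures are attributable to content, not formatting drift.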
The text-in-image rules (HARD constraints)
1. Place exact text in "quotation marks" or ALL CAPS.
2. Specify font style, weight, color, placement explicitly: "Bold sans-serif, white, centered at the bottom third."
3. Spell out tricky brand names letter-by-letter if they have unusual spelling.
4. Add "verbatim — no extra characters, no substitutions" for accuracy-critical text.
5. Add "no extra text" and "no duplicate text" to prevent watermark or repetition artifacts.
6. Use quality: medium or high for dense text panels.
7. For multilingual, state language and script explicitly.
Multi-image input rules
When passing reference images, label each by index and describe the role: "Image 1 is the product photo. Image 2 is the style reference. Apply Image 2's color palette and texture to Image 1." Be explicit about preservation: "Same character, same lighting, same pose — only change the jacket."
The iteration rules
- Don't overload single prompts. Start with a clean base, refine with single-change follow-ups.
- State what stays the same when editing. Without this, the model treats everything as fair game.
- Specify framing/angle/distance. "Close-up" and "wide shot" of the same subject are completely different images.
- Don't mix more than one dominant style. "Watercolor meets cyberpunk vintage" confuses the model.
- Don't use empty aesthetic terms. "Premium feel," "viral quality," "stunning" actually dilute prompts.
- Use natural language scene description. Avoid tag-stuffing like "8K masterpiece cinematic."
Counterintuitive findings (post-launch testing)
Promptolis and ImagineArt both report: gpt-image-2 "performs strongest with simpler prompts" and "becomes less reliable when the creative demand becomes too layered." This runs opposite to Midjourney where complex prompt stacking often improves output. With gpt-image-2, describe ONE clear intent per prompt rather than stacking multiple style modifiers. The model handles 200-word dense prompts well — but only when each clause adds new information rather than restating the same idea differently.
The edit prompt pattern (per fal.ai)
Change: [exactly what should change]
Preserve: [face, identity, pose, lighting, framing, background,
geometry, text, layout — list everything that stays locked]
Constraints: [no extra objects, no redesign, no logo drift, no watermark]
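The Change/Preserve/Constraints pattern is easy to template so edit prompts stay disciplined across a team. A sketch (Python; `edit_prompt` is an illustrative helper):

```python
def edit_prompt(change, preserve,
                constraints="no extra objects, no redesign, no logo drift, no watermark"):
    """Render the Change/Preserve/Constraints edit-prompt pattern."""
    return (
        f"Change: {change}\n"
        f"Preserve: {', '.join(preserve)}\n"
        f"Constraints: {constraints}"
    )

p = edit_prompt(
    change="swap the hoodie color to burnt orange",
    preserve=["face", "pose", "lighting", "framing", "background", "text", "layout"],
)
```

Making the preserve list explicit, rather than implied, is the point: anything not listed is fair game for the model to change.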
The 70-category Shopify prompt library
Every visual a Shopify store will ever need, organized by category, with a paste-ready prompt template for each. Replace the bracketed placeholders with your specifics. All prompts follow the master structure and apply the text-in-image rules.
Each prompt has placeholders like [PRODUCT], [BRAND COLOR], [DEMOGRAPHIC]. Replace them with your real values. The structural patterns are validated — only the placeholders change. Click COPY on any prompt to put it on your clipboard, then paste into ChatGPT or your API call.
Product Visuals
The core visuals every product page needs — white background, lifestyle, flat-lay, variants, on-model.
Marketing Visuals
Hero banners, sale graphics, email headers, popups — the on-site assets driving conversion.
Social Ads
Platform-native ad creative for IG, FB, Pinterest, YouTube, and carousel sequences — every aspect ratio. (TikTok, X, LinkedIn variants follow the same patterns — adapt aspect ratio.)
Brand Assets
Logos, wordmarks, hangtags, business cards — the foundational identity assets.
Product Packaging
Boxes, mailers, labels, hangtag backs, insert cards — the unboxing experience.
Editorial / Lookbook
Lookbook covers, editorial spreads, seasonal heroes, behind-the-brand portraits, studio environments.
Infographics + Content
Sustainability charts, sizing guides, care instructions, how-it's-made, comparison graphics — the trust-building visuals.
UGC-Style
Authentic-feeling user-generated visuals — OOTDs, mirror selfies, unboxing moments. (Coffee-shop and desk flat-lay variants follow Category A flat-lay pattern.)
OG + Meta Images
Open Graph share image — the universal pattern. (Twitter Cards and Pinterest Rich Pins use the same composition with adjusted aspect ratios.)
Account / Storefront
Favicons, app icons, About hero, reviews mood, 404 page, empty cart — the storefront polish layer.
Locking brand consistency at scale
One-off images are easy. The hard problem is producing 50 lifestyle shots that all look like they came from the same brand, the same photographer, the same season. This is the section that separates serious operators from prompt tourists. The pattern parallels the multi-agent orchestration covered in the Google Antigravity S-Tier master class — coordinated specialized contexts producing coherent output across many calls.
gpt-image-2 ships four native tools for consistency: 16-image reference inputs, the character anchor pattern, Custom GPTs, and ChatGPT Projects. Used together, they replace what used to require a $5K-per-day photographer and a week of post-production color grading.
Tool 1 — The 16-image reference set
gpt-image-2 accepts up to 16 reference images per call and reasons about them as a set. Not as separate inputs to be averaged, but as a coherent system the model interprets together.
The pattern that works: pass your source product photo + 3-4 brand-style references + 1-2 competitor packshots that capture the energy you want. The model picks up the lighting recipe, color grading, depth of field, and editorial mood from the references and applies them to your product.
Always label each input by index in your prompt: "Image 1 is the source product. Image 2 is the lighting reference. Image 3 is the color palette reference. Image 4 is the composition reference." The model uses these labels to know which aspect of which image to apply where. Without labels, it averages them — and averaging is how you get muddy, generic-looking output.
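In code, the labeling convention is just a generated preamble prepended to the instruction. A sketch (Python; it assumes the `images.edit` endpoint accepts a list of input images, as it does for gpt-image-1, and that `client` is an `OpenAI()` instance; the helper names are illustrative):

```python
def label_references(roles):
    """Build the index-labeled preamble: 'Image 1 is the source product photo. ...'"""
    return " ".join(f"Image {i} is the {role}." for i, role in enumerate(roles, start=1))

preamble = label_references([
    "source product photo",
    "lighting reference",
    "color palette reference",
    "composition reference",
])

def stylize(client, image_files, instruction):
    """Pass up to 16 reference images; the preamble maps each index to a role."""
    return client.images.edit(
        model="gpt-image-2",   # model id per this guide
        image=image_files,     # list of open file handles, source first
        prompt=f"{preamble} {instruction}",
    )
```

Generating the preamble from a list guarantees the index order in the prompt matches the upload order, which is exactly the mismatch that causes muddy averaged output.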
Tool 2 — The character anchor pattern
For human subjects (models, founders, mascots), establish a complete description in the prompt itself — and reuse that exact description in every subsequent prompt. Then add scene-specific details after the anchor.
CHARACTER ANCHOR (paste at the top of every prompt for this character):
A young woman named Mara. She has short dark hair with blunt bangs, warm brown skin, light freckles across her nose, and dark brown eyes. She is wearing the brand's signature forest-green oversized hoodie and dark indigo straight-leg jeans. Illustrated in flat modern character design style with clean lines and a muted warm palette. This is her character reference — do not redesign her appearance.

SCENE PROMPT 1:
[Anchor above] Mara is sitting cross-legged on a bedroom floor surrounded by open books, studying late at night. A desk lamp is the only light source. Same character, do not change her appearance, outfit, or illustration style.

SCENE PROMPT 2:
[Anchor above] Mara is walking through a rainy street at night, hood pulled up over her sweater, holding a dripping umbrella. Same character, do not change her appearance, outfit, or illustration style.
The anchor adds tokens to every prompt — but it's the difference between getting "a character that looks vaguely like Mara" and "Mara, every time." For brand mascots, founder illustrations, recurring lifestyle models, this pattern is non-negotiable.
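Operationally, the anchor is just a constant you prepend to every scene, plus a preservation lock appended at the end. A sketch (Python; names illustrative):

```python
MARA_ANCHOR = (
    "A young woman named Mara. She has short dark hair with blunt bangs, "
    "warm brown skin, light freckles across her nose, and dark brown eyes. "
    "She is wearing the brand's signature forest-green oversized hoodie and "
    "dark indigo straight-leg jeans. Illustrated in flat modern character "
    "design style with clean lines and a muted warm palette. "
    "This is her character reference — do not redesign her appearance."
)

LOCK = "Same character, do not change her appearance, outfit, or illustration style."

def scene_prompt(anchor, scene, lock=LOCK):
    """Anchor first, scene second, preservation lock last."""
    return f"{anchor} {scene} {lock}"

p1 = scene_prompt(
    MARA_ANCHOR,
    "Mara is walking through a rainy street at night, hood up, holding a dripping umbrella.",
)
```

Storing the anchor as one constant means a wardrobe or palette update happens in exactly one place and propagates to every future generation.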
Tool 3 — The Custom GPT route (deep dive)
You already saw the basic Custom GPT setup in Section 04. Here's how to take it further for industrial brand consistency.
The brand specification document
Don't paste loose notes into your Custom GPT system prompt. Build a structured brand spec doc that you upload to the GPT's knowledge base. The GPT will reference it in every conversation.
# [BRAND NAME] — Visual Specification v1.0

## IDENTITY
- Founded: [YEAR]
- Location: [CITY]
- Mission: [ONE SENTENCE]
- Customer: [DEMOGRAPHIC + PSYCHOGRAPHIC]

## COLOR PALETTE
| Token | Hex | Usage |
|-------|-----|-------|
| Deep | #XXXXXX | Page backgrounds, headers |
| Forest | #XXXXXX | Section backgrounds, depth |
| Teal | #XXXXXX | Mid-tone accents |
| Emerald | #XXXXXX | Active states, highlights |
| Gold | #XXXXXX | Primary accent, CTAs |
| Cream | #XXXXXX | Body text on dark, paper backgrounds |
| White | #XXXXXX | Pure highlights only |

## TYPOGRAPHY
- Display: [FONT NAME] (serif/sans, weight)
- Body: [FONT NAME]
- Mono: [FONT NAME] for code or technical labels
- Hierarchy: Display 32-48px, H2 24-32px, body 16-18px

## PHOTOGRAPHY RECIPE
- Camera: 85mm or 50mm lens equivalent
- Aperture: f/2.0 to f/2.8 (shallow DOF)
- Lighting: Soft natural daylight from window, camera left
- Color grade: Warm neutral, slight green undertone in shadows
- Mood: Quiet confidence, lived-in, never sterile or aspirational
- Skin tones: Natural, real pores, no heavy retouching
- Backgrounds: Concrete, light oak, cream linen, soft gray seamless

## ILLUSTRATION RECIPE
- Style: Flat with subtle paper texture, hand-drawn quality
- Line weight: Medium, slightly imperfect
- Color: Brand palette only, no gradients
- Mood: Warm, inviting, never childish

## DO'S
- Render exact text in quotation marks, verbatim, no extras
- Use 4:5 for product, 16:9 for hero, 1:1 for IG, 9:16 for stories
- Include "no extra text, no duplicate text, no watermark" in every prompt
- Composite real logos post-generation, never request brand logo reproduction
- Use shallow DOF for product detail shots
- Default to soft natural light over studio strobes

## DON'TS
- Never reference fake certifications
- Never generate copyrighted IP (Disney/Marvel/named brand logos)
- Never generate real-celebrity likenesses
- Never use empty terms like "premium feel" or "viral quality" or "stunning"
- Never use HDR look or oversaturated colors
- Never include text in foreign scripts unless explicitly specified

## DEFAULT OUTPUT SETTINGS
- Aspect ratio: ask if ambiguous; otherwise 4:5
- Quality: high for finals, medium for drafts, low for exploration
- Format: PNG (compress to AVIF/WebP for Shopify upload)
- Resolution: 1024x1536 default, 2K for hero use only
Iterating on the Custom GPT
Treat your Custom GPT like a piece of code. After every 50 generations, audit: where is the brand voice drifting? Tighten the system prompt. Update the brand spec doc. Re-test. The GPT only stays useful if you maintain it.
Tool 4 — ChatGPT Projects deep dive
Projects are where you organize campaigns, not just brand context. Per-brand Custom GPT handles the brand voice; per-campaign Project handles the specific seasonal mood, hero products, target demographics for this drop.
Create a Project per campaign
"Spring 2026 — Forest Collection" / "Holiday Q4 Drop" / "Year One Anniversary." Not per brand, per campaign. Each gets its own folder.
Upload campaign-specific files
Hero product photos for this drop, color reference for this season's mood, model casting references, music or vibe references that capture the energy.
Set Project instructions
Layer on top of the brand-level Custom GPT: "This campaign emphasizes [SPECIFIC MOOD]. Use [PRIMARY HERO COLOR]. The hero product is [SKU]. Target tone is [SPRING ENERGY / HOLIDAY WARMTH / ANNIVERSARY GRATITUDE]."
Start chats inside the Project
"Hero banner desktop." "IG carousel 5 frames." "Email header for launch day." Each chat inherits the campaign context. You skip the preamble.
Tool 5 — The reverse-the-prompt trick (revisited)
Already shown in Section 04, but worth emphasizing as a brand-consistency tool. When you have one image that perfectly captures your brand voice, reverse-engineer it into a reusable prompt — then reuse that prompt skeleton across every new asset.
Workflow: Generate 100 candidate images for a campaign. Pick the 3 that nailed the brand voice. Feed each to ChatGPT and ask for the reverse prompt. Compare the 3 reverse prompts — the patterns that appear in all 3 are your real brand recipe. Codify those patterns into your Custom GPT.
The aesthetic-drift mitigation system
Even with all four tools, brand voice drifts over hundreds of generations. Build a scoring system to catch it before it ships.
| Score Dimension | What to Check | Pass Threshold |
|---|---|---|
| Headline legibility | Read at mobile 400px? At 100px thumbnail? | Yes at both |
| Color contrast | WCAG AA on text? Brand color match within 5%? | Both pass |
| Logo clearance | Logo (composited post-gen) has min 1x its height clearance? | Yes |
| Lighting consistency | Same direction, same temperature as last 5 brand images? | Yes |
| Crop resilience | Survives crop to 1:1, 4:5, 16:9 without losing subject? | 2 of 3 |
| "Brand DNA" gut check | Side-by-side with hero photo from launch — same world? | Subjective yes |
Reject promising visuals that break the rules. You're designing a system, not a one-off image. Regenerate with tighter constraints — "preserve layout; change only background hue ±5%" — instead of accepting close-but-wrong outputs.
Editing, variations, and upscaling for production
Generation is half the workflow. The other half is the post-generation polish that takes a 1024×1024 AI image and ships it as a 2550×3300 print-ready flyer, a 4K hero banner, or a perfectly-cropped product variant. Here's the toolkit.
Native gpt-image-2 editing — mask inpainting
The images.edit endpoint with a mask is your scalpel. Mask the region you want changed, leave the rest opaque, and gpt-image-2 will modify only the masked area while preserving everything else.
The mask is a PNG with the same dimensions as your source image. Transparent pixels mark the region to edit. Opaque pixels are preserved. For precise control, use Photoshop or Figma to paint the mask manually. For quick masks, you can ask gpt-image-2 itself to generate the mask — pass the source image with prompt "create a mask isolating just the [OBJECT]."
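Wired up, a masked edit is one call to `images.edit` with the mask file attached. A sketch (Python; it assumes the same `image`/`mask` parameter shape as gpt-image-1's edit endpoint, and that `client` is an `OpenAI()` instance):

```python
def edit_with_mask(client, source_path, mask_path, prompt):
    """Edit only where the mask is transparent; opaque pixels are preserved."""
    with open(source_path, "rb") as src, open(mask_path, "rb") as mask:
        return client.images.edit(
            model="gpt-image-2",  # model id per this guide
            image=src,
            mask=mask,            # PNG, same dimensions as the source image
            prompt=prompt,
        )

# Example prompt, following the background-swap recipe
BACKGROUND_SWAP = "Replace background with [NEW SCENE]. Keep product exactly as is."
```

The mask and source must match pixel-for-pixel in dimensions; a mismatched mask is the most common cause of a rejected edit call.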
Common edit recipes
| Edit Goal | Mask | Prompt Pattern |
|---|---|---|
| Background swap | Transparent everywhere except product | "Replace background with [NEW SCENE]. Keep product exactly as is." |
| Product color swap | Transparent on product only | "Change the [PRODUCT] color to [NEW COLOR]. Keep material, lighting, shadow." |
| Element removal | Transparent on element | "Remove the [ELEMENT]. Reconstruct the background naturally." |
| Outpainting (extend canvas) | Transparent in extension area | "Extend the scene naturally. Match lighting, color, perspective." |
| Add element | Transparent in target area | "Add [ELEMENT] at [POSITION]. Match the existing lighting and style." |
Upscaling — Topaz vs Magnific decision tree
gpt-image-2 outputs at 1K to 2K natively. A letter-size flyer at 300 DPI requires 2550×3300 pixels. A retina-ready 4K hero banner requires 3840×2160. You need an upscaler.
The two best-in-class options are Topaz Gigapixel and Magnific. They are not the same tool. Picking the wrong one will ruin output.
Topaz preserves the original image character. It looks at low-res input and asks "what did this originally look like?" — then reconstructs detail without altering identity. Conservative. Safe. Faithful. Use for: portraits where face must stay the same, product photos where logo and label must be preserved, photography where authenticity matters more than embellishment. One-time ~$99 + cloud rendering credits.
Magnific generates new pixels based on semantic understanding. It looks at low-res input and asks "what could this look like?" — then invents skin pores, fabric weave, atmospheric depth. Aggressive. Creative. Will alter facial features at high "Creativity" slider settings. Use for: AI-generated illustrations where embellishment is wanted, dreamscapes, creative concept art. NEVER for portraits where identity must be preserved. Subscription from $39/month.
Decision tree
Is the input a portrait or has identifiable face?
Yes → Topaz only. Magnific will alter the face at Creativity > 3.
No → either tool works.
Is the input a product with brand logo, text, or labels?
Yes → Topaz. Magnific may distort small text and logo details.
No → either tool works.
Is this for print at 300 DPI or above?
Yes → Topaz with the "Standard MAX" or "Recover v3" model. Output at exact print dimensions.
No → either tool, choose by aesthetic preference.
Is the input AI-generated illustration / concept art?
Yes → Magnific shines here. Use Creativity 3-5 for added skin texture and fabric detail. Above 6 starts changing the image.
No → Topaz preserves photographic feel better.
Print resolution math
Print needs density, not just size. At 300 DPI, you need 300 pixels per inch of final print. Always upscale to the exact pixel count for your physical print size.
| Print Asset | Physical Size | Required Pixels (300 DPI) | Upscale Factor from 2K |
|---|---|---|---|
| Hangtag | 3.5" × 2" | 1050 × 600 | 0.5× (downsize) |
| Business card | 3.5" × 2" | 1050 × 600 | 0.5× (downsize) |
| Postcard insert | 5" × 7" | 1500 × 2100 | 1× (use 2K direct) |
| Letter / flyer | 8.5" × 11" | 2550 × 3300 | 1.7× |
| Magazine spread | 17" × 11" | 5100 × 3300 | 2.5× |
| 11x17 poster | 11" × 17" | 3300 × 5100 | 2.5× |
| 18x24 poster | 18" × 24" | 5400 × 7200 | 3.5× |
| Bus stop poster | 4' × 6' | 14400 × 21600 | 10× |
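Every row in the table reduces to one formula: pixels = inches × DPI, and the upscale factor is the target's long edge divided by the 2048px native long edge. A quick checker (Python; helper names are illustrative):

```python
def print_pixels(width_in, height_in, dpi=300):
    """Pixels required for a physical print at the given density."""
    return round(width_in * dpi), round(height_in * dpi)

def upscale_factor(target_px, source_long_edge=2048):
    """Linear upscale factor from a native 2K render (2048px long edge)."""
    return max(target_px) / source_long_edge

letter = print_pixels(8.5, 11)    # (2550, 3300)
factor = upscale_factor(letter)   # 3300 / 2048 ≈ 1.61
```

Run it once against your print vendor's spec sheet before ordering; upscaling to an approximate size and letting the printer resample is how soft edges sneak into finals.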
The film grain trick (mask AI smoothness for print)
AI images often look "too smooth" — like melted plastic when printed. The fix is adding a controlled grain layer that mimics real photographic noise.
1. Create a new layer above your image, fill with 50% gray.
2. Filter → Noise → Add Noise. Amount: 5-8%. Distribution: Gaussian. Monochromatic: checked.
3. Set layer blend mode to Soft Light or Overlay.
4. Reduce layer opacity to 10-15%.
5. Save and convert to CMYK if printing.
Why this works: ink binds to paper differently than light hits a screen. Real photography has grain. Adding subtle grain bridges the gap and visually hides upscaling artifacts.
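The same recipe can be scripted for batch work. A rough sketch (Python with numpy; the additive blend is a cheap stand-in for Photoshop's Soft Light mode, and the default percentages mirror the steps above):

```python
import numpy as np

def add_film_grain(img, amount=0.06, opacity=0.12, seed=0):
    """Approximate the Photoshop recipe: monochromatic Gaussian noise
    blended at low opacity. img is float RGB in [0, 1], shape (H, W, 3)."""
    rng = np.random.default_rng(seed)
    h, w = img.shape[:2]
    # one noise channel broadcast across RGB = "Monochromatic: checked"
    noise = rng.normal(0.0, amount, size=(h, w, 1))
    grained = img + opacity * noise  # additive stand-in for Soft Light
    return np.clip(grained, 0.0, 1.0)

flat = np.full((64, 64, 3), 0.5)
textured = add_film_grain(flat)
```

For finals headed to a print vendor, do the grain pass in Photoshop as described above; the scripted version is for bulk web assets where per-image manual work doesn't scale.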
RGB to CMYK conversion (print-ready output)
Screens use light (RGB). Printers use ink (CMYK). Converting RGB to CMYK without an ICC profile makes neon colors look dull. Use the right ICC profile for your print partner.
- Get the ICC profile from your print vendor (e.g., GRACoL 2013 for US sheetfed, FOGRA39 for European offset).
- In Photoshop: Edit → Convert to Profile. Destination: vendor's CMYK profile. Intent: Perceptual (for photography) or Relative Colorimetric (for graphics).
- Soft-proof in Photoshop: View → Proof Colors. Spot-check saturated colors — reds and blues shift most.
- Re-save as TIFF or PDF with embedded profile.
The hybrid workflow — putting it together
The workflow that ships production-quality assets:
Generate base in gpt-image-2
1024×1536 high quality. Iterate in chat until brand voice is locked.
Inpaint problem areas
Use images.edit with mask for any single-element fixes — wrong color, awkward position, missing detail.
Composite real brand elements
Open in Photoshop or Affinity. Add real logo from vector source. Add real product photos for hero shots if available. Composite real cert badges.
Upscale to target resolution
Topaz Gigapixel for portraits and products. Magnific for AI illustrations. Output at exact print pixel dimensions.
Add film grain layer
5-8% Gaussian noise on Soft Light overlay at 10-15% opacity. Hides AI smoothness and binds for print.
Color + sharpening pass
Final color correction in Lightroom or Photoshop. Output sharpening (Photoshop: Filter → Sharpen → Smart Sharpen) sized for delivery medium.
Export for delivery
Web: AVIF or WebP at 85% quality, sRGB. Print: TIFF or PDF at exact size, CMYK with embedded ICC profile.
This is the workflow that turns a $0.21 API call into a $300 photoshoot replacement. Every step matters. Skip the upscale and your poster prints muddy. Skip the grain and it looks like AI. Skip the CMYK conversion and your brand colors print wrong.
Shopify Files API — automated image upload pipeline
Generating images is one half of the workflow. Getting them into Shopify cleanly, with proper alt text and optimized formats, is the other half. The Shopify Admin GraphQL API provides a two-step staged upload flow that handles this at scale — and it's the right path for any operator generating more than 10 images at a time. For the full case study of building a production Shopify store with AI from scratch (including image asset pipelines), see How I Built My Shopify Store With Claude AI.
The two-step staged upload flow
Don't try to upload directly. Don't pass binary data through your app server. Use Shopify's staged upload pattern: get a temporary URL, push the file directly to Shopify's storage, then create the file asset.
Call stagedUploadsCreate mutation
Returns: a temporary upload URL + a list of authentication parameters (Shopify uses Google Cloud Storage under the hood). You'll POST to this URL.
POST file with FormData
Append parameters first (in returned order), then append the file last. Order matters — getting it wrong returns "Cannot create buckets" errors. Don't manually set Content-Type; let FormData generate the boundary.
Call fileCreate mutation
Pass the staged URL as originalSource. Specify contentType: IMAGE and alt text. Returns a file ID that can be referenced everywhere.
Poll fileStatus until READY
Files process asynchronously. Poll the file's status field. Once READY, the file is on Shopify's CDN and reference-able from products, variants, collections, themes.
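Step 4 in code is a bounded poll against `fileStatus`. A sketch (Python; `fetch_status` abstracts the GraphQL round trip so the loop stays transport-agnostic, and the query shape assumes the Admin API's `node` lookup on a `MediaImage`):

```python
import time

FILE_STATUS_QUERY = """
query fileStatus($id: ID!) {
  node(id: $id) {
    ... on MediaImage { fileStatus }
  }
}
"""

def wait_until_ready(fetch_status, file_id, timeout=60, interval=2):
    """Poll until the file leaves async processing. fetch_status is any
    callable that runs the query above and returns the fileStatus string."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status(file_id)
        if status == "READY":
            return True
        if status == "FAILED":
            raise RuntimeError(f"File {file_id} failed processing")
        time.sleep(interval)
    raise TimeoutError(f"File {file_id} not READY after {timeout}s")
```

Always bound the poll with a timeout: a file stuck in processing should fail your pipeline loudly rather than hang a nightly job.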
Constraints to know before you build
- 20MB file size limit per upload. Compress to AVIF or WebP before staging.
- 250 files per fileCreate batch. Loop in chunks for larger jobs.
- Files process asynchronously. Don't assume immediate availability — poll fileStatus.
- One file ID can be referenced from multiple resources. Don't re-upload the same image for different products.
- fileDelete is permanent. Any product referencing the deleted file will have broken media.
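The 250-file cap means larger jobs must loop in batches. A one-liner sketch (Python):

```python
def chunk(items, size=250):
    """Yield fileCreate-sized batches (Shopify caps fileCreate at 250 files per call)."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# 600 staged files -> three fileCreate calls: 250, 250, 100
batches = list(chunk(list(range(600))))
```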
The complete GraphQL — stagedUploadsCreate
mutation stagedUploadsCreate($input: [StagedUploadInput!]!) {
  stagedUploadsCreate(input: $input) {
    stagedTargets {
      resourceUrl
      url
      parameters {
        name
        value
      }
    }
    userErrors {
      field
      message
    }
  }
}

# Variables:
{
  "input": [
    {
      "resource": "IMAGE",
      "filename": "hero-shot-v1.png",
      "mimeType": "image/png",
      "fileSize": "1842301",
      "httpMethod": "POST"
    }
  ]
}
The complete GraphQL — fileCreate
mutation fileCreate($files: [FileCreateInput!]!) {
  fileCreate(files: $files) {
    files {
      id
      fileStatus
      alt
      createdAt
      ... on MediaImage {
        image {
          url
          width
          height
        }
      }
    }
    userErrors {
      field
      message
    }
  }
}

# Variables:
{
  "files": [
    {
      "alt": "Sustainable streetwear hoodie in forest green on model in urban setting",
      "contentType": "IMAGE",
      "originalSource": "https://shopify-staged-uploads.storage.googleapis.com/tmp/..."
    }
  ]
}
End-to-end Node.js pipeline (gpt-image-2 → Shopify)
The complete operational pipeline. Generate via OpenAI, optimize with Sharp, upload via staged flow, register with fileCreate. This is the code that runs your nightly catalog refresh.
import OpenAI from "openai";
import sharp from "sharp";
import { GraphQLClient, gql } from "graphql-request";

const openai = new OpenAI();
const shopify = new GraphQLClient(
  `https://${process.env.SHOPIFY_STORE}.myshopify.com/admin/api/2026-04/graphql.json`,
  {
    headers: { "X-Shopify-Access-Token": process.env.SHOPIFY_ADMIN_TOKEN }
  }
);

// Step 1: Generate image with gpt-image-2
async function generateImage(prompt) {
  const result = await openai.images.generate({
    model: "gpt-image-2",
    prompt,
    size: "1024x1536",
    quality: "high",
    n: 1
  });
  return Buffer.from(result.data[0].b64_json, "base64");
}

// Step 2: Optimize for Shopify upload (AVIF, ~85% quality).
// sharp strips EXIF/location metadata by default — only call
// .withMetadata() if you want to KEEP it.
async function optimizeForShopify(pngBuffer) {
  return await sharp(pngBuffer)
    .avif({ quality: 85, effort: 6 })
    .toBuffer();
}

// Step 3: Get staged upload URL
async function getStagedUpload(filename, mimeType, fileSize) {
  const STAGED_UPLOADS_CREATE = gql`
    mutation stagedUploadsCreate($input: [StagedUploadInput!]!) {
      stagedUploadsCreate(input: $input) {
        stagedTargets {
          resourceUrl
          url
          parameters { name value }
        }
        userErrors { field message }
      }
    }
  `;
  const result = await shopify.request(STAGED_UPLOADS_CREATE, {
    input: [{
      resource: "IMAGE",
      filename,
      mimeType,
      fileSize: fileSize.toString(),
      httpMethod: "POST"
    }]
  });
  return result.stagedUploadsCreate.stagedTargets[0];
}

// Step 4: Push file to staged URL
async function pushToStaged(target, buffer, filename) {
  const formData = new FormData();
  // Parameters first, file last — order matters
  for (const param of target.parameters) {
    formData.append(param.name, param.value);
  }
  formData.append("file", new Blob([buffer]), filename);
  const response = await fetch(target.url, {
    method: "POST",
    body: formData
  });
  if (!response.ok) {
    throw new Error(`Staged upload failed: ${response.status}`);
  }
  return target.resourceUrl;
}

// Step 5: Create Shopify file asset
async function createShopifyFile(stagedUrl, altText) {
  const FILE_CREATE = gql`
    mutation fileCreate($files: [FileCreateInput!]!) {
      fileCreate(files: $files) {
        files {
          id
          fileStatus
          alt
          ... on MediaImage {
            image { url width height }
          }
        }
        userErrors { field message }
      }
    }
  `;
  const result = await shopify.request(FILE_CREATE, {
    files: [{
      alt: altText,
      contentType: "IMAGE",
      originalSource: stagedUrl
    }]
  });
  return result.fileCreate.files[0];
}

// Step 6: Generate alt text via gpt-4o vision.
// Pass the PNG, not the AVIF — OpenAI vision accepts PNG/JPEG/WebP/GIF,
// and AVIF is not on the supported-formats list.
async function generateAltText(imageBuffer, productContext) {
  const base64 = imageBuffer.toString("base64");
  const response = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [{
      role: "user",
      content: [
        {
          type: "text",
          text: `Generate SEO + accessibility alt text for this product image. Context: ${productContext}. Keep under 125 characters. Describe what's actually visible plus the product name. No keyword stuffing.`
        },
        {
          type: "image_url",
          image_url: { url: `data:image/png;base64,${base64}` }
        }
      ]
    }],
    max_tokens: 80
  });
  return response.choices[0].message.content.trim();
}

// Full pipeline — orchestrate it all
async function generateAndUpload(prompt, productContext, filename) {
  console.log(`Generating: ${filename}`);
  const pngBuffer = await generateImage(prompt);
  const avifBuffer = await optimizeForShopify(pngBuffer);
  const altText = await generateAltText(pngBuffer, productContext);
  const target = await getStagedUpload(filename, "image/avif", avifBuffer.length);
  const stagedUrl = await pushToStaged(target, avifBuffer, filename);
  const file = await createShopifyFile(stagedUrl, altText);
  console.log(`✓ Uploaded: ${file.id} — ${altText}`);
  return file;
}

// Usage
generateAndUpload(
  "A clean studio product photograph of a forest-green organic cotton hoodie centered on white seamless background, soft upper-left lighting, e-commerce style, no text",
  "Forest green organic cotton hoodie, [BRAND], unisex",
  "hoodie-forest-green-hero.avif"
);
CDN behavior — Shopify auto-serves AVIF/WebP
Once your file is in Shopify, the platform's CDN handles modern format delivery automatically. AVIF for browsers that support it, WebP for fallback, JPEG/PNG for legacy. You don't manage this — but you do need to use the right Liquid filters.
<img src="Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2868): invalid url input"
alt=""
loading="lazy"><img src="Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2873): invalid url input"
srcset="Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2874): invalid url input 400w,
Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2875): invalid url input 800w,
Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2876): invalid url input 1200w,
Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2877): invalid url input 1600w"
sizes="(max-width: 768px) 100vw, 50vw"
alt=""
loading="lazy"
width="800" height="1200"><picture>
<source srcset="Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2885): invalid url input"
type="image/avif">
<source srcset="Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2887): invalid url input"
type="image/webp">
<img src="Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2889): invalid url input"
alt=""
loading="lazy"
width="1200" height="1500">
</picture>
Alt text generation — apps + AI vision pipeline
Every uploaded image needs alt text for SEO and accessibility. AI search engines (ChatGPT, Claude, Perplexity) and Google all use alt text to understand image content. Missing alt text means missing visibility.
Shopify alt text apps (compared)
| App | Free Credits | Paid Plans | Languages | Notable |
|---|---|---|---|---|
| AltText.ai | 25 | $49–$489 | 130+ | Includes product data in alt; auto-translate sync to Shopify multi-language |
| Caseo.ai | 50 | Tiered credits | 8 (EN, ES, FR, PT, IT, DE, NL, LT) | Vision + product data + WCAG 2.1 AA compliance; meta tags + descriptions too |
| Squirai AI SEO | — | Tiered | — | Alt text + image minification + page speed optimization combined |
| SEO HERO AI Alt Text | — | One-time credit packs | Custom | Brand-aware, custom tone, custom keywords, no subscription required |
| Alt Text Generator AI | — | Tiered | EN, ES, DE+ | Supports JPEG, PNG, GIF, SVG, AVIF, WEBP — full format coverage |
Direct gpt-4o vision pipeline (no app required)
If you're already running the gpt-image-2 generation pipeline programmatically, generating alt text via gpt-4o vision adds maybe $0.001 per image and gives you total control. The code is in the full pipeline example above — it's the generateAltText function.
Alt text best practices for Shopify SEO
Do:
- Describe what's actually visible.
- Include product name + key attribute (color, material).
- Keep it under 125 characters.
- Use natural language.
- Mention context if relevant ("worn by model in urban setting").

Don't:
- Stuff keywords.
- Open with "image of" or "picture of" (screen readers already announce it's an image).
- Repeat the product title verbatim.
- Leave it blank or use filename garbage.
Example good: "Forest green organic cotton hoodie on model walking through urban street at golden hour"
Example bad: "image_001_final_v2.jpg" or "best hoodie sustainable eco-friendly Boston DTC streetwear"
File size optimization (pre-upload)
- AVIF for hero and lifestyle images — best compression, 90%+ browser support in 2026, ~50% smaller than WebP at equivalent quality.
- WebP for product variants and gallery shots — universal browser support, ~30% smaller than JPEG.
- 85% quality for product photos (preserve detail), 75% for lifestyle (forgiving on noise).
- Strip EXIF metadata before upload — reduces size and removes camera/location data leaks.
- Resize to delivery dimensions. Don't upload 4K source if Shopify will only render 1200px on PDP. Shopify handles size variants but you save CDN bandwidth and storage.
Done right, this pipeline ingests dozens of generated images per night, optimizes them, alt-texts them, uploads to Shopify CDN, and surfaces them ready to attach to products. The whole loop runs unattended on a schedule — the architecture that turns one operator into a content-production team of ten.
Cost & budget math — what gpt-image-2 actually costs to run
The real question isn't "how much does an image cost?" It's "what does this replace, and what's the breakeven point?" Here's the honest math. For the broader economics framework — how to model AI labor replacement across an entire creative function — see the Synthetic Director case study, which documents a full autonomous creative agency built with the same constraint-first methodology applied here.
Token-based pricing (official OpenAI)
| Token Type | Price per 1M tokens | What It Charges |
|---|---|---|
| Image input | $8.00 | Reference images you upload (full quality always) |
| Image cached input | $2.00 | Repeat reference images (75% discount on cache hits) |
| Image output | $30.00 | Pixels the model generates (varies with size and quality) |
| Text input | $5.00 | Your prompt text |
| Text cached input | $1.25 | System prompt repeated across calls |
| Text output | $10.00 | Reasoning tokens (Thinking Mode adds these) |
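Because billing is purely token-metered, a call's cost is a dot product of token counts and the rates in this table. A minimal estimator — the 7,000-token figure is back-derived from the ~$0.21 per-image cost in the next table ($0.21 ÷ $30/M), not an official spec:

```python
# Rates per 1M tokens, copied from the pricing table above (USD).
PRICE_PER_M = {
    "image_input": 8.00,
    "image_input_cached": 2.00,
    "image_output": 30.00,
    "text_input": 5.00,
    "text_input_cached": 1.25,
    "text_output": 10.00,
}

def call_cost(usage: dict) -> float:
    """Estimate one call's cost from token counts keyed like PRICE_PER_M."""
    return sum(PRICE_PER_M[kind] * count / 1_000_000
               for kind, count in usage.items())

# ~7,000 image-output tokens reproduces the ~$0.21 1024x1024-high figure.
print(round(call_cost({"text_input": 200, "image_output": 7_000}), 3))  # 0.211
```

Feed it the usage object the API returns per call and you get per-asset cost tracking for free.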
Per-image cost ranges
| Resolution × Quality | Approx. Cost | Use Case |
|---|---|---|
| 1024×768 · low | ~$0.01 | Drafts, exploration, batch thumbnails |
| 1024×1024 · medium | ~$0.05 | Standard social post, IG feed |
| 1024×1024 · high | ~$0.21 | Production e-commerce product photo |
| 1024×1536 · high | ~$0.18 | Vertical hero, IG story, lookbook |
| 2048×2048 · high | ~$0.35 | Hero banner, large print at native res |
| 4K (via fal.ai) · high | ~$0.41 | Print-bound assets, billboard |
gpt-image-2 processes image inputs at maximum quality regardless of your quality parameter. If your workflow involves uploading reference images and iterating, your actual per-asset cost runs higher than the baseline table suggests. Cached image tokens cost 75% less ($2/M vs $8/M) — design iterative workflows to reuse the same source image to capture this discount.
Real-world batch math
Scenario A: Solo Shopify operator, 50 SKUs, 5 images per product
- 50 SKUs × 5 images = 250 images
- 250 × $0.21 (1024×1024 high) = $52.50 total
- Time at Tier 2 (20 IPM): ~13 minutes generation + ~1 hour curation
- Replaces: Traditional product photography at $300–500 per product = $15,000–$25,000
- Savings: ~99.7%
Scenario B: Agency client, 500 SKUs full catalog refresh, 5 images each
- 500 × 5 = 2,500 images
- 2,500 × $0.21 = $525 total
- Time at Tier 3 (50 IPM): ~50 minutes generation + ~6 hours curation
- Bills client at: $25,000–$75,000 for the full catalog refresh deliverable
- Margin: 98%+ before labor and Photoshop polish
Scenario C: Weekly social ad rotation, 4 platforms, 20 ads per week
- 20 ads × 52 weeks × ~$0.18 average = $187/year
- Time at Tier 2: ~10 minutes per week
- Replaces: Freelance designer at $50/ad × 20 × 52 = $52,000/year
Scenario D: Full-resolution print campaign (8 hangtags + 4 hero posters)
- 12 images × $0.35 (2K high) = $4.20 in generation
- + ~$15 in Topaz Gigapixel cloud render credits
- + ~$5 in print proofing
- Total: ~$25
- Replaces: Photographer + designer for print campaign at $5,000–$15,000
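Every scenario above is the same two-line computation. A sketch that reproduces Scenario A — the function name is illustrative, and the IPM figures come from the rate-limit tiers discussed earlier in the class:

```python
def batch_plan(n_images: int, cost_per_image: float, images_per_minute: int) -> dict:
    """Generation cost and unattended wall-clock time for one batch run."""
    return {
        "cost_usd": round(n_images * cost_per_image, 2),
        "gen_minutes": round(n_images / images_per_minute, 1),
    }

# Scenario A: 50 SKUs x 5 images at ~$0.21 each, Tier 2 throughput (20 IPM).
print(batch_plan(250, 0.21, 20))  # {'cost_usd': 52.5, 'gen_minutes': 12.5}
```

Swap in the Scenario B numbers (2,500 images, 50 IPM) and you get the $525 / 50-minute figures the same way.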
AI vs traditional — the real comparison matrix
| Approach | Cost per Product (5 images) | Time | Flexibility | Brand Consistency |
|---|---|---|---|---|
| Traditional photoshoot | $300–$500 | 1-2 weeks | Low (reshoot for changes) | High (same photographer) |
| Freelance designer + stock | $80–$150 | 3-5 days | Medium | Medium |
| Shopify Magic (built-in) | Free with plan | Minutes | Low (1MP cap) | Low |
| Photoroom / Pebblely apps | ~$0.10–$0.50/image | Minutes | Medium | Medium |
| gpt-image-2 (this master class) | $1.05 ($0.21 × 5) | Minutes | Total (re-prompt anything) | High (Custom GPT + 16 refs) |
ChatGPT subscription decision matrix
If you're operating a single brand without automation, ChatGPT subscription often beats API for cost-per-image. If you're running automation, API wins for control. Here's the breakpoint math.
| Monthly Volume | ChatGPT Plus ($20/mo) | API @ $0.21 avg | Recommendation |
|---|---|---|---|
| 0–95 images | $20/mo flat | $0–$20 | API (only if automation needed) |
| 96–500 images | $20/mo flat | $20–$105 | ChatGPT Plus (sweet spot) |
| 500+ images | Plus may rate-limit | $105+ but predictable | API + Tier 2/3 + ChatGPT Plus for exploration |
| 2,000+ images | — | $420+ | API only, Tier 3+ recommended |
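The breakpoint in this matrix falls out of simple division: $20 ÷ $0.21 ≈ 95 images, which is where the first row ends. A sketch of the decision — the thresholds and return strings are illustrative, not a formal rule:

```python
def plan_recommendation(images_per_month: int, needs_automation: bool,
                        api_cost_per_image: float = 0.21,
                        plus_price: float = 20.0) -> str:
    """Rough Plus-vs-API call mirroring the matrix above."""
    if needs_automation:
        return "API"  # scripted pipelines need API access regardless of cost
    metered_cost = images_per_month * api_cost_per_image
    if metered_cost < plus_price:
        return "API or Plus (cost similar)"  # below the ~95-image breakeven
    if images_per_month <= 500:
        return "ChatGPT Plus"  # flat $20 beats metered cost in this band
    return "API + Plus for exploration"  # volume where Plus may rate-limit

print(plan_recommendation(250, needs_automation=False))  # ChatGPT Plus
```

The point of writing it down: the recommendation changes if your average image is a $0.05 medium draft instead of a $0.21 high, so keep the cost-per-image parameter honest.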
Cost optimization tactics
1. Use quality: low for iteration drafts. Promote to high only when composition is locked.
2. Cache reference images. Same source image across 10 edits = 75% discount on subsequent calls.
3. Right-size resolution. 400px mobile display doesn't need 2K input. Generate at 1024 for web, 2K only for print.
4. Batch via n parameter. n=4 is more token-efficient than 4 separate calls.
5. Use Instant Mode by default. Reserve Thinking Mode for prompts where reasoning measurably improves output.
6. Skip the Responses API for simple image gen. Direct images.generate avoids mainline-model overhead.
7. Use gpt-image-1-mini for high-volume drafts at ~25% of gpt-image-2 cost.
8. Hybrid stack: Nano Banana 2 for cheap backgrounds at $0.04, gpt-image-2 for hero work where text matters.
Risk, compliance, and the legal reality of AI imagery on Shopify
This section is not legal advice. It is operator-grounded risk awareness — what OpenAI's terms grant, what they don't, where US copyright law actually stands in 2026, and how to protect a Shopify store from preventable mistakes. For specific legal questions consult a lawyer.
Commercial use rights — what OpenAI grants you
OpenAI's Terms of Use (current as of April 2026) are unambiguous on output ownership. The relevant clause:
"As between you and OpenAI, and to the extent permitted by applicable law, you (a) retain your ownership rights in Input and (b) own the Output. We hereby assign to you all our right, title, and interest, if any, in and to Output."
Translation:
- You own what you generate. Sell it, modify it, license it, distribute it.
- OpenAI does not claim copyright on outputs.
- You can use gpt-image-2 outputs on Shopify product pages, ads, packaging, social media, and physical merchandise.
- No mandatory AI disclosure attribution to OpenAI.
- No exclusivity guarantee — other users may receive similar outputs from similar prompts.
What you remain responsible for
OpenAI grants you ownership of their rights in the output. They cannot grant you rights they don't have. You remain on the hook for third-party rights.
1. Don't generate copyrighted IP. No Disney, Marvel, Pixar, Nintendo, named-brand logos, recognizable game characters, or trademarked designs.
2. Don't generate real-celebrity likenesses without consent. OpenAI's Visual Capabilities terms explicitly prohibit using the model to "reproduce the likeness of any person without express consent and all necessary rights."
4. Don't deceive consumers. FTC concern: depicting a product in a way that misleads consumers about what they're buying. AI-rendered "hand-stitched" detail when the actual product is machine-stitched is a problem.
4. Don't generate fake reviews or testimonials. Showing AI-generated customer faces alongside fake quotes is FTC violation territory regardless of how the image was made.
5. Don't generate medical, legal, or financial claims imagery. A "before/after" weight loss image generated in AI could be deceptive advertising.
US copyright legal status (2026)
Here's where it gets nuanced. OpenAI assigning rights to you doesn't mean US copyright law actually grants traditional copyright protection to the image.
US Copyright Office guidance has held that works generated entirely by AI without meaningful human authorship may not be eligible for copyright protection. Multiple court rulings since 2023 have reinforced this.
What this means practically: you can use, sell, and modify AI-generated images commercially, but you may not be able to stop others from copying them. There is no exclusive ownership in the traditional sense.
To strengthen your copyright claim, add meaningful human contribution beyond "typing a prompt": composite editing, art direction decisions, layered compositing of AI elements with photography, color grading, retouching. The bar is "meaningful human authorship" — not just clicking generate.
Disclosure best practices (2026 regulatory landscape)
- FTC (US): Be transparent if AI imagery would mislead the consumer about a material product attribute. If your "lifestyle" shot shows the product in a setting that misrepresents how it actually performs, disclose.
- EU AI Act 2026: AI-generated content used in commercial advertising may require labeling depending on member-state implementation. Watch this space — rules are evolving.
- Shopify policy: AI-generated product imagery is currently allowed. Verify before launch via Shopify's Acceptable Use Policy.
- Industry shift: Voluntary "AI-assisted" labels are becoming common for trust in DTC. Some brands include a small line in About Us: "Some imagery on this site is AI-assisted. Real product photography is also used."
- Marketplaces: Stricter rules. Shutterstock, Adobe Stock, and Etsy have specific AI-content policies. Verify before listing AI-generated assets there.
Content moderation reality
gpt-image-2 has a separate output review model that is strict on copyrighted IP, real-person likenesses, NSFW, violence, and politically sensitive content. This affects practical workflow.
Refusals consume your quota. A blocked generation still counts against your IPM rate limit and burns the API call cost. False positives happen — the model occasionally refuses benign prompts that pattern-match to restricted categories.
Workarounds within policy: Be specific about safety-relevant elements. "A fictional character" instead of letting the model assume a real person. "Abstract symbol" instead of a generic logo placeholder. "Inspired by [public domain reference]" instead of riffing on a copyrighted character.
If you hit unjustified refusals, file feedback through OpenAI's developer console. The moderation model evolves, but unfounded refusals don't auto-correct without reports.
The DDS-style defensible policy
For any Shopify brand using gpt-image-2 at scale, document your AI usage policy as a brand asset. This protects you with customers, regulators, and platforms. Here's the template that holds up under scrutiny.
# [BRAND] AI Imagery Policy v1.0

## What we do
- We use AI image generation tools (including OpenAI gpt-image-2) to create marketing imagery, lifestyle scenes, packaging mockups, and brand assets for [BRAND].
- We always composite real product photography for hero product pages.
- We always composite our real logo from vector source — we do not rely on AI to reproduce it.
- We use AI imagery primarily for: lifestyle context, social ad variants, seasonal mood, infographics, packaging concepts.

## What we don't do
- We do not generate fake customer photos or testimonials.
- We do not generate likenesses of real people without their consent.
- We do not generate fake certification logos or compliance badges.
- We do not represent AI-rendered features as actual product attributes if those features are not accurate to the physical product.
- We do not generate copyrighted characters, logos, or trademarked IP.

## What we maintain
- All product page hero images are real product photography (not AI).
- All certification badges are sourced from official certification bodies.
- All claims (sustainability, materials, sizing) are independently verifiable through documentation we maintain.
- We retain at least one real photograph per product for transparency.

## Our position on AI disclosure
- We voluntarily disclose AI-assisted imagery on our About page.
- We use the term "AI-assisted" not "AI-generated" because all final imagery passes through human editorial review.
- We comply with FTC, EU AI Act, and platform-specific rules as they evolve.

## Our customer commitment
- If any image on our site materially misrepresents the physical product you receive, we offer free returns + a refund.
- Questions about how a specific image was made: customersupport@[brand-domain].com
This policy is genuinely defensible because it limits the surface area where you can be challenged. It also signals trustworthiness to customers — paradoxically, transparent disclosure of AI usage builds more trust than hiding it.
The hangtag problem (a worked example)
You generated a beautiful hangtag with gpt-image-2, including verbatim text and 5 certification icons. The icons are visually similar to GOTS, GRS, OCS, etc. — but they're AI-rendered, not real.
The mistake: shipping that hangtag with the AI-rendered icons. Even if the certifications are real and you have the documentation, displaying AI-rendered versions of certification logos likely violates the certification body's trademark and brand guidelines. Most certifications require use of their official logo files.
The fix: composite the real official certification logos (downloaded from the certifying body) over the AI-rendered hangtag in Photoshop or Affinity. Or generate the hangtag with empty placeholder areas and add real logos in post.
The lesson: AI generates the canvas. Real assets fill the slots that matter for compliance.
Hero showcase — 5 prompts that demonstrate peak capability
The 70-category catalog covers what you'll use every day. This section is different. These are the prompts that demonstrate just how far this model has moved — the ones that make the audience say "the model can really do that?" For coverage of every other AI model that shipped recently — Sora 2.5, Veo 3.1, Nano Banana Pro, Gemini 3.1 Ultra, and the rest of the model avalanche — see the Vibe Coder's Haven March 2026 edition.
Each one stress-tests a specific gpt-image-2 superpower: mixed-script multilingual rendering, multi-frame character continuity, UI mockups with verbatim small text, print-ready packaging with barcode, mixed-language storefront signage.
Run them in Thinking Mode at quality: high. Be patient — these prompts can take 60+ seconds. The output is shareable.
These prompts demonstrate capability, not production-readiness. The model invents statistics on infographics. Logo reproduction is unreliable. Verify every number, composite real logos post-generation, and never publish data charts without confirming the figures externally.
B1. Multilingual editorial magazine cover (mixed Latin + Japanese)
Stress test: ~99% text accuracy across mixed scripts in a layout-heavy editorial composition. Demonstrates the headline feature — text-in-image at production quality with cultural typography awareness.
B3. 8-panel storyboard with character continuity
Stress test: the n=8 Thinking Mode multi-frame consistency feature. Same character, same wardrobe, same stylistic voice across 8 distinct scenes — the workflow that makes lookbooks practical.
B4. UI mockup with realistic small text rendering
Stress test: dense small text at multiple type sizes inside a recognizable iOS layout. Tab labels, product names, prices, headlines — all verbatim, all readable.
B5. Print-ready packaging label with verbatim copy + barcode
Stress test: precise structured text inside a wrapped 3D label, with barcode and small fine print. Demonstrates the model's ability to render production-ready packaging concepts.
B7. Mixed-language storefront signage
Stress test: realistic environmental rendering with multiple text elements in different scripts at varying scales — main signage, window decals, sandwich board chalk script. Cultural typography awareness.
Don't just paste and run. Use them as capability probes — generate, study what worked, study what failed, then adapt the patterns into your own prompts. The structure (verbatim text rules, layout specifications, mood anchors) is more valuable than the specific subjects. These are templates for your own showcase work.
Commercial-intent FAQ
The 12 questions Shopify operators actually ask before adopting gpt-image-2. Answered without hype, with sources where applicable.
What is OpenAI's newest image model?
gpt-image-2. It replaces DALL-E 3 (which retires May 12, 2026) and introduces native reasoning ("Thinking Mode") that plans, web-searches, and self-checks images before generation. Key wins over DALL-E: ~99% character-level text accuracy across Latin, CJK, Hindi, and Bengali scripts; native 2K resolution; 16 reference images for brand consistency; 8 coherent images per call with character continuity.
How do I upload generated images to Shopify programmatically?
Call stagedUploadsCreate to get a temporary upload URL, POST your file to that URL with the returned auth parameters, then call fileCreate with the staged URL as originalSource. Files process asynchronously — poll fileStatus until READY. Maximum 20MB per file, 250 files per fileCreate batch. The complete Node.js end-to-end pipeline is in Section 10.
The bottom line
gpt-image-2 is the first AI image generator that ships production-ready Shopify imagery on the first try — text legible, brand consistent, scaled to need. The combination of ChatGPT Plus for daily exploration ($20/mo) and the OpenAI API for automated pipelines ($0.04–$0.35/image) replaces what used to require photographers, designers, and stock subscriptions adding up to thousands per month.
This master class is your reference. Bookmark it. Reference the prompt catalog. Build your Custom GPT. Run the pipeline. Ship better visuals than your competitors at a fraction of the cost. The tooling has changed faster than most operators have noticed. The window to gain advantage is now.
Continue the DDS Vibe Academy curriculum
This master class is one node in a 25-node constellation. Below is the recommended next-step reading by ring — Foundation, Development, Application, and Mastery. Every link goes to a free, full-length DDS Vibe Academy master class. No paywall. No email gate.
Multi-Model AI Image Generation Routing
Route across all 6 Gemini image models — Nano Banana Pro, Gemini 3.1 Flash Image, Imagen 4 Ultra. Live pricing, character-consistency chain, paste-ready Python SDK. Pairs directly with this gpt-image-2 master class for full multi-provider coverage.
APPLICATION · SISTER CLASS
Multi-Model AI Video Routing
The AGI-CORE-Pro pattern, end to end. Architect a constraint-aware router across Veo 3.1 Lite, Fast, and Standard with Nano Banana Pro reference frames. Live Gemini API pricing, real SDK code.
FOUNDATION · START HERE
Shopify Sidekick Masterclass
Complete AI prompting guide for Shopify's most powerful assistant. 12 modules covering theme customization, SEO, Flow automation, analytics, and CRO. The Foundation ring of the curriculum.
DEVELOPMENT · FLAGSHIP
Claude Code Masterclass — Part 1
The Foundation & Mastery class. 13 modules covering installation, CLAUDE.md, Skills, Subagents, Agent Teams, Hooks, MCP, and the full Ollama sovereign fallback. The starting point for all serious Claude work.
DEVELOPMENT · FLAGSHIP
Claude Code Masterclass — Part 2: Production Playbook
12 modules on git workflows, the DDS Vibe Coding methodology, React and Shopify build-alongs, multi-agent orchestration, cost control, and three DDS case studies with receipts.
DEVELOPMENT · S-TIER
Google Antigravity Masterclass: S-Tier Edition
22 modules, 81 paste-ready prompts. The complete S-Tier playbook for Google Antigravity including agent swarms, ocean logic, and self-healing systems. Pairs with this image-gen class for full multi-tool coverage.
DEVELOPMENT · DEEP DIVE
Gemini 3.1 Pro Definitive Vibe Coding Guide
Deep-dive on Google's flagship reasoning model. Multi-agent orchestration, thinking levels, 1M token strategies. Direct counterpart to the OpenAI workflows taught here.
APPLICATION · APP BUILDER
One Prompt App Library for Shopify Store Owners
Ship functional Shopify apps from a single prompt. Library of paste-ready app blueprints covering everything from custom inventory dashboards to AI-powered product recommendations.
APPLICATION · CASE STUDY
How I Built My Shopify Store With Claude AI
The full DDS Boston build journey. Real receipts, real timelines, real architecture decisions. The case study that anchors the entire methodology.
APPLICATION · CASE STUDY
The Synthetic Director — Autonomous Creative Agency
Production AI case study: a fully autonomous creative agency replacing $525K/yr of human labor. The economics framework that makes the cost math in this master class actionable.
MASTERY · SOVEREIGN STACK
Ollama for Windows Complete Guide
Run AI locally. RTX 3060 hardware benchmarks. 20+ downloadable models. API setup with code examples. Zero hosting cost. The Mastery-ring foundation for sovereign-stack operators.
MASTERY · CASE STUDY
Atelier OS — Multi-Agent System Case Study
The $15.5M multi-agent AI system built in 52 hours. Production architecture, agent coordination, real outputs. The blueprint for industrial-scale vibe coding.
MASTERY · MONTHLY
The Vibe Coder's Haven — March 2026
The Model Avalanche edition. GPT-5.4, Cursor Composer 2, Gemini 3.1 Ultra, NVIDIA Nemotron 3 Super, the Claude Mythos leak. Monthly coverage of every model and tool that shipped.
MASTERY · WHITEBOARD
AGI Nexus V9 — Autonomous Digital Office
Technical whiteboard for the $1M autonomous digital office architecture. The Mastery-ring engineering reference for operators building toward sovereign AI infrastructure.
PORTFOLIO · ARCHITECT
Robert McCullock Architect Portfolio 2026
The full DDS Sovereign AGI Suite. 11 synthetic employees automating $10.9M+/yr of labor. The production stack the entire DDS Vibe Academy curriculum teaches you to build.
HUB · ALL 25 NODES
DDS Vibe Academy — Full Constellation
The complete map. 25 master classes, guides, case studies, apps, and games organized into four orbital rings: Foundation, Development, Application, Mastery. Everything indexed and filterable.
Every page above is a verified live DDS Vibe Academy master class. The curriculum is interconnected because the methodology is interconnected — Custom GPT setup mirrors Claude Code Skills, prompt engineering compounds across providers, and the production economics scale across image, video, and agent work. Treating each tool as an isolated skill is how operators stay stuck. The architects who pull ahead read across the constellation.
Sources cited in this master class
Every fact, spec, and capability claim in this master class is verifiable against these sources. Tier 1 sources are official OpenAI and Shopify documentation. Tier 2 are major press from launch week. Tier 3 are verified community references. Where any claim could not be verified, it is marked "Not verified" inline.
Tier 1 — Official OpenAI
- openai.com/index/introducing-chatgpt-images-2-0/ — official launch announcement
- developers.openai.com/api/docs/models/gpt-image-2 — model card + snapshots
- developers.openai.com/api/docs/guides/image-generation — official API guide
- developers.openai.com/cookbook/examples/multimodal/image-gen-models-prompting-guide — prompting cookbook
- developers.openai.com/api/docs/pricing — current token pricing
- developers.openai.com/api/docs/guides/rate-limits — rate-limit documentation
- community.openai.com/t/introducing-gpt-image-2-available-today-in-the-api-and-codex/1379479 — dev forum announcement
- openai.com/policies/row-terms-of-use/ — Terms of Use (commercial rights)
- openai.com/policies/service-terms/ — Service Terms (Visual Capabilities)
- help.openai.com/en/articles/5008634 — copyright assignment confirmation
Tier 2 — Launch press
- TechCrunch — "ChatGPT's new Images 2.0 model is surprisingly good at generating text" (April 21, 2026)
- VentureBeat — Multilingual + infographic + manga capability review (April 21, 2026)
- 9to5Mac — Launch coverage and ChatGPT integration details (April 21, 2026)
Tier 3 — Shopify official
- shopify.dev/docs/api/admin-graphql/latest/mutations/stagedUploadsCreate — staged upload mutation
- shopify.dev/docs/api/admin-graphql/latest/mutations/fileCreate — file create mutation
- shopify.dev/docs/apps/build/online-store/product-media — media management guide
- shopify.com/blog/ai-image-generator — Shopify's overview of AI tools
Tier 4 — Verified community + 3rd-party
- fal.ai prompting guide + model page (openai/gpt-image-2 endpoint)
- Replicate model docs (openai/gpt-image-2)
- WaveSpeedAI builder review (production integration patterns)
- ImagineArt prompt guide (70 prompts catalog)
- Promptolis honest 25-prompts review (limitations + workarounds)
- ZeroLu/awesome-gpt-image GitHub (X-sourced viral prompts)
- PixVerse review + prompt guide (5-scenario stress tests)
- MindWiredAI breakdown (capability shifts)
- Lushbinary developer guide (token pricing breakdown)
- Findskill.ai (Etsy/Shopify solo seller workflow)
- Apidog API testing guide (parameter reference)
- chasejarvis.com Topaz vs Magnific comparison
- aiweiweiseeds.com print-resolution upscale guide
- AltText.ai, Caseo.ai, SEO HERO AI, Squirai AI, Alt Text Generator AI app pages
This master class was compiled April 29, 2026 against sources verified that day. AI products evolve quickly — pricing, rate limits, and policies may change. Before basing major business decisions on any spec, verify against the live OpenAI dashboard, current rate-limit page, and current pricing page. Where this master class is wrong, please surface it through customer support so the document can be corrected.
