GPT Image 2 for Shopify Stores
OpenAI shipped gpt-image-2 on April 21, 2026 with ~99% character-level text accuracy, native 2K resolution, 16-image reference inputs, and reasoning that plans before it generates. This is the complete master class on using it to produce every visual a Shopify store will ever need — from product photos to packaging to OG images. 66 paste-ready prompts. Verified API code. Files API integration. No hype, no invented benchmarks, all sources cited. Part of the free DDS Vibe Academy curriculum — the 25-node constellation of AI coding masterclasses behind the DDS Sovereign AGI Suite.
GPT Image 2 (model ID: gpt-image-2, branded "ChatGPT Images 2.0") is OpenAI's flagship image generator released April 21, 2026. It produces near-perfect text inside images, 2K native resolution, supports 16 reference images for brand consistency, and reasons before generating. For Shopify operators it's the strongest single tool for product photos, hero banners, packaging mockups, and ad creative — especially anything with text. Access via ChatGPT Plus ($20/mo, no code) or the OpenAI API ($0.04–$0.35 per image, paid tier required). Commercial use is explicitly granted by OpenAI. Migrate from DALL-E before May 12, 2026.
- ~99% text-in-image accuracy across Latin, Chinese, Japanese, Korean, Hindi, Bengali, and Arabic scripts — the headline feature that changes what's practical for Shopify.
- 16 reference images per call + Thinking Mode generates 8 coherent images with character/product continuity in one call.
- Token-based pricing: $8/M image input, $30/M image output, $2/M cached. Per-image cost: $0.04–$0.35.
- API requires paid tier + Org Verification. Free tier is explicitly NOT supported for gpt-image-2.
- DALL-E 2 + DALL-E 3 retire May 12, 2026. Migrate code that calls those endpoints now.
- OpenAI grants you full commercial rights to outputs — usable on products, ads, packaging, merch.
- Brand logos still need real-vector composite post-generation. The model understands "logo" but doesn't reliably reproduce exact vector shapes.
- Transparent backgrounds NOT supported via Responses API. Use gpt-image-1.5 for PNG-with-alpha, or key out in Photoshop.
- Knowledge cutoff: December 2025. Post-cutoff brands and events may render inaccurately unless Thinking Mode pulls from live web search.
- Counterintuitive: the model performs strongest with simpler prompts. Stop tag-stuffing.
Who this master class serves
This class is structured to deliver value at four distinct expertise levels. Pick the lane that matches you, but read the others — the cross-pollination is where the real leverage lives. If you've completed the Multi-Model AI Image Generation Routing master class or the Multi-Model AI Video Routing master class, this class extends both with deep OpenAI-specific tactics for Shopify production work.
Solo Shopify Operator
You run the store, take the photos, write the copy. ChatGPT Plus path. No code. $20/mo replaces $300/shoot.
DTC Brand Marketing Lead
You ship campaigns weekly. Custom GPT setup, brand consistency at scale, hero/social/ad multi-format.
Agency / Freelancer
You deliver visuals across multiple client brands. Per-brand Custom GPTs, white-label workflows, retainer math.
Architect-CEO / Vibe Coder
You automate. API code paths, Shopify Files API GraphQL, batch generation, programmatic pipelines.
Why GPT Image 2 changes Shopify visual production
For three years, AI image generators had a single embarrassing failure: they couldn't reliably spell. Ask DALL-E 3 for a Mexican restaurant menu and you'd get "enchuita" and "burrto." Ask Midjourney for a poster headline and you'd get plausible-looking arrangements of the wrong letters. Every model — DALL-E, Midjourney, Stable Diffusion, Nano Banana, gpt-image-1.5 — failed at this to some degree.
gpt-image-2 broke that. ~99% character-level text accuracy across Latin, Chinese, Japanese, Korean, Hindi, Bengali, and Arabic scripts. Mixed-script layouts work — a Japanese poster with Latin product names, a Chinese subtitle layered over an English title. For the first time, a model can carry real reading load inside a generated image.
If you build for Shopify — packaging mockups, sale banners, infographics, product labels, social ads with copy, multilingual marketing assets — this is the feature you've been waiting for. It changes what's practical.
The five capability shifts that matter for Shopify
Text rendering you can ship
Posters, packaging, UI mockups, social ads with copy, multilingual marketing assets — all render readably on the first try. You stop rebuilding layouts in Photoshop because the AI couldn't spell.
Reasoning before pixels
Thinking Mode plans the image structure, can web-search for references, and self-checks outputs before rendering. Complex prompts with conflicting constraints ("square infographic, title centered, three columns, small CTA at bottom") resolve sensibly on the first attempt instead of arriving as four columns with no title.
Brand consistency at scale
Up to 16 reference images per call. The model reasons about them as a set — product photo plus brand-style references plus competitor packshot. n=8 with Thinking Mode generates eight coherent images with character and product continuity in one call. Lookbooks and storyboards become a one-prompt operation.
Native 2K + 100+ object scenes
Output up to 2048×2048 native, with 4K experimental beta. Aspect ratios from 3:1 ultra-wide to 1:3 ultra-tall. Scene complexity that previously broke models — 100 distinct objects in one frame — now holds together with maintained count integrity.
One model, two surfaces
The same model powers ChatGPT chat (no code, $20/mo Plus) and the OpenAI API (programmatic automation). Prototype in chat, scale via API. The visual output is identical between surfaces. This is the architectural decision that makes vibe-coder workflows compose: start where you can think, ship where you need throughput. The DDS Vibe Coding methodology in Claude Code Part 2 covers this dual-surface pattern in depth.
This is not a "look how cool AI is" tour. It is a working Shopify operator's reference. Every spec is verified against OpenAI's official model card, dev docs, and announcement page. Every limitation is documented. Where we don't know something exactly (Thinking Mode token overhead, exact tier dollar thresholds), we say so. Use the dossier as a source you can cite, not a hype piece.
How GPT Image 2 compares to Midjourney V8, Nano Banana 2, and FLUX 2
No single model wins every use case. The honest answer for Shopify operators is hybrid: gpt-image-2 for text-heavy and brand-consistency work, Midjourney V8 for editorial mood, Nano Banana 2 for high-volume cheap backgrounds, FLUX 2 for photorealistic illustration via Replicate. Here's the matrix.
| Capability | gpt-image-2 | Midjourney V8 | Nano Banana 2 | FLUX 2 Pro |
|---|---|---|---|---|
| Text in image | ~99% accuracy, multilingual ✓ | Weak, often gibberish | Improved but behind | Good but inconsistent |
| Photorealism | Strong, neutral color | Painterly bias | Cinematic ✓ | Strong photoreal ✓ |
| Reference images | Up to 16 per call ✓ | Style references only | Limited | Single reference |
| Reasoning / planning | Thinking Mode ✓ | No | No | No |
| Multi-image consistency | 8 coherent / call ✓ | Manual seed work | Limited | LoRA required |
| Generation speed | Medium (Thinking adds 30-60s) | Medium | Fast (<10s) ✓ | Fast ✓ |
| Cost per image | $0.04–$0.35 | ~$0.10/img (relax mode) | ~$0.04 ✓ | ~$0.03 (Replicate) ✓ |
| Brand logo accuracy | Unreliable | Unreliable | Unreliable | Unreliable |
| Transparent BG (PNG-α) | Not via Responses API | Manual key-out | Limited | Native ✓ |
| Aesthetic / cinematic mood | Strong, neutral | Champion ✓ | Good | Good |
| Knowledge cutoff | Dec 2025 + web search | Pre-training only | Pre-training only | Pre-training only |
| Free tier access | ChatGPT Free: Instant Mode only. API: NOT supported. | Paid only | Free in Gemini ✓ | Pay-per-use |
Decision framework — when to pick each
Pick gpt-image-2 when text appears inside the image (posters, packaging, UI mockups, sale banners, infographics, multilingual marketing), when brand-consistent product variants from one source photo are required, when multi-frame storyboards or carousel sequences need character continuity, or when conversational iterative editing matters more than raw speed.
Pick Midjourney V8 when editorial fashion mood, painterly illustration, or specific film-stock aesthetic precision is the deliverable; when the job is cinematic concept art for lookbook covers where text isn't critical; or when style references demand artistic interpretation over instruction-following.
Pick Nano Banana 2 when you need fast, cheap, high-volume background generation or simple lifestyle shots with no embedded text requirements. It's free in Gemini for exploratory work and roughly 3x cheaper at scale.
Pick FLUX 2 when you need transparent backgrounds natively, photorealistic illustration, or programmatic Replicate workflows at $0.03/image scale. It's strong for ghost mannequin and isolated product shots.
Use gpt-image-2 for text-heavy work and brand assets. Use Midjourney V8 for hero campaign mood. Use Nano Banana 2 or FLUX for high-volume backgrounds. Composite real vector logos in Photoshop or Figma post-generation. Upscale finals through Topaz Gigapixel for print resolution. This stack costs roughly $40–$80/month combined and replaces traditional photography for 90% of Shopify visual needs.
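The hybrid-stack rule above can be sketched as a tiny decision function. Everything here is illustrative: the string labels are informal names rather than real API model IDs, and the decision keys simply mirror this section's prose.

```python
# Toy router encoding the hybrid-stack rule from this section.
# Illustrative sketch only: the string labels are informal names,
# not real API model IDs.

def pick_model(has_text: bool = False,
               needs_brand_consistency: bool = False,
               editorial_mood: bool = False,
               high_volume: bool = False,
               needs_transparent_bg: bool = False) -> str:
    if has_text or needs_brand_consistency:
        return "gpt-image-2"      # text rendering + 16 reference images
    if needs_transparent_bg:
        return "flux-2"           # native PNG-with-alpha
    if editorial_mood:
        return "midjourney-v8"    # aesthetic / cinematic champion
    if high_volume:
        return "nano-banana-2"    # cheap, fast backgrounds
    return "gpt-image-2"          # sensible default

print(pick_model(has_text=True))        # gpt-image-2
print(pick_model(editorial_mood=True))  # midjourney-v8
```

The ordering matters: text and brand consistency trump everything else, because those are the jobs no other model in the stack does well.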
Image Arena leaderboard context
gpt-image-2 debuted on the Image Arena Text-to-Image leaderboard at 1512, leading second-place Nano Banana 2 by +242 points — the largest single-model lead ever recorded on that leaderboard. The lead is concentrated in: text rendering accuracy, complex composition handling, multi-image consistency, and instruction following. The lead is not in pure aesthetic photorealism — Midjourney and Nano Banana still hold ground there.
Three ways to access gpt-image-2 — pick your lane
Before any prompt or code, decide which surface you'll use. The model is identical across all three paths. The economics, control level, and automation ceiling are not.
| Path | Cost | Code Required | Thinking Mode | Best For |
|---|---|---|---|---|
| ChatGPT Free | $0 | No | No (Instant only) | Exploration, occasional use, learning the model |
| ChatGPT Plus | $20/mo | No | Yes ✓ | Solo Shopify operators, daily use, no automation needed |
| ChatGPT Pro | $200/mo | No | Yes ✓ | Heavy users, "unlimited subject to abuse guardrails" |
| ChatGPT Business | ~$25/seat/mo | No | Yes ✓ | Marketing teams sharing brand context |
| OpenAI API (paid tier) | $0.04–$0.35 / image | Yes | Yes (variable cost) ✓ | Bulk variants, automation, Shopify integration pipelines |
The OpenAI API does NOT support a free tier for gpt-image-2 — the official model card explicitly lists Free as "Not Supported." You must be on a paid usage tier and complete Organization Verification in your OpenAI developer console before API calls work. This is a one-time setup that takes ~10 minutes but is non-negotiable.
Recommended path for serious Shopify operators: both ChatGPT Plus AND the OpenAI API. ChatGPT Plus is your daily driver for thinking, exploring, and iterative editing. The API is your throughput layer when a workflow becomes repeatable. The same prompts work on both surfaces.
Mastering ChatGPT chat for Shopify imagery
The chat surface is more capable than most developers give it credit for. Used well, it replaces 80% of what a serious DTC brand needs visually — without writing a line of code. Here's the complete operator's playbook.
Thinking Mode vs Instant Mode — when to use each
Instant Mode (default for Free, available to all)
Standard fast generation, ~10–20 seconds per image. Use for: simple product shots, single subject scenes, quick exploration, drafts. Quality is significantly improved over gpt-image-1.5 even without reasoning. This is the right call for 80% of solo-operator Shopify use cases.
Thinking Mode (Plus / Pro / Business / Enterprise only)
Reasoning + web search + self-check before generation. ~30–60 seconds per image. Use for: complex layouts with multiple constraints, multilingual posters, infographics with structured data, multi-frame storyboards, packaging with verbatim text, brand-consistency batches via n=8. The latency cost buys first-try quality on the work where rerolls are expensive.
Custom GPTs — the brand consistency unlock
A Custom GPT lets you encode your entire brand context into a persistent assistant. Every chat with that Custom GPT inherits the system prompt, uploaded brand assets, and behavioral rules. This is how you stop typing "use forest green and gold, sans-serif body, serif display, photographed in soft natural light" into every prompt. If you've followed the Claude Code Part 1 master class, the Custom GPT pattern is structurally identical to Claude Code's CLAUDE.md + Skills architecture — same persistent-context principle, different platform.
How to build a Shopify-grade Custom GPT (10 minutes)
Open the Custom GPT builder
In ChatGPT (Plus/Pro), click your name → "My GPTs" → "Create a GPT". Use the Configure tab, not the Create wizard, for control.
Write the system prompt
Document your brand exhaustively: brand name, mission sentence, target customer, brand colors with hex codes, font stack, photography style, lighting recipes, do's and don'ts. Paste a template (provided below).
Upload knowledge files
Drop in a PDF brand guidelines doc, your top 5 hero product photos, your logo as PNG, color swatches, a font specimen sheet. The Custom GPT can reference these in every conversation.
Enable image generation capability
In the Capabilities checklist, ensure "DALL-E Image Generation" is checked (this surfaces the gpt-image-2 model). Web Browsing optional but recommended for Thinking Mode use.
Test, refine, lock
Generate 5 test images across categories (product, lifestyle, social, packaging). If the brand voice drifts, tighten the system prompt. Save and pin the GPT to your sidebar. This is now your daily driver.
Custom GPT system prompt template (paste-ready)
You are the [BRAND NAME] visual director. You generate production-quality images for [BRAND NAME], a [BRAND CATEGORY] brand based in [CITY]. Every image you generate must match these brand specifications:

BRAND IDENTITY
- Name: [BRAND NAME]
- Mission: [ONE-SENTENCE MISSION]
- Target customer: [DEMOGRAPHIC + PSYCHOGRAPHIC]
- Brand voice: [3 ADJECTIVES — e.g., honest, refined, never preachy]

COLOR PALETTE (use these hex codes only)
- [PRIMARY DARK]: #XXXXXX
- [PRIMARY MID]: #XXXXXX
- [ACCENT METALLIC]: #XXXXXX
- [BACKGROUND CREAM]: #XXXXXX
- [TEXT WHITE]: #XXXXXX

TYPOGRAPHY
- Display: [SERIF FONT NAME]
- Body: [SANS-SERIF FONT NAME]
- All in-image text: render in quotation marks, verbatim, no extras

PHOTOGRAPHY STYLE
- Lighting: [e.g., soft natural daylight from window, slight directional]
- Color grading: [e.g., warm neutral, slight green undertone]
- Mood: [e.g., quiet confidence, lived-in, never sterile]
- Models: [diversity statement and how to specify]
- Backgrounds: [preferred surfaces and contexts]

DO'S
- Render exact text verbatim with "no duplicate text" / "no watermark"
- Use shallow depth of field for product detail shots
- Composite real logos post-generation; do not request brand logo reproduction
- Default to 4:5 vertical for social and product, 16:9 for hero, 1:1 for IG

DON'TS
- Never reference fake certifications
- Never generate copyrighted IP (Disney/Marvel/named brand logos)
- Never generate real-celebrity likenesses
- Never use empty terms like "premium feel" or "viral quality"

WORKFLOW
- Default to Instant Mode for exploration
- Switch to Thinking Mode when prompt has 5+ constraints or text-heavy
- For brand consistency batches, use n=8 with explicit character anchor
- Always end prompts with: "No extra text, no duplicate text, no watermark"

When user asks for an image, ask one clarifying question if intent is ambiguous (aspect ratio, primary use case). Otherwise generate.
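For agencies running one Custom GPT per client brand, the bracketed placeholders in the template can be filled programmatically before pasting. A minimal sketch; `fill_template` is a hypothetical helper, not part of any OpenAI tooling.

```python
# Fill the bracketed [PLACEHOLDER] slots in a system prompt template.
# Hypothetical convenience helper for agencies maintaining one Custom
# GPT per client brand; not part of any OpenAI API or tooling.

def fill_template(template: str, values: dict) -> str:
    for key, val in values.items():
        template = template.replace(f"[{key}]", val)
    return template

template = ("You are the [BRAND NAME] visual director for a "
            "[BRAND CATEGORY] brand based in [CITY].")
prompt = fill_template(template, {
    "BRAND NAME": "Evergreen Supply",        # example values, not real brands
    "BRAND CATEGORY": "sustainable apparel",
    "CITY": "Portland",
})
print(prompt)
```

Keep the filled output under version control per client, so the Custom GPT system prompt is reproducible when you rebuild it.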
ChatGPT Projects — persistent context for repeated work
Projects are ChatGPT's persistent workspace feature. Unlike a single chat, a Project remembers files, instructions, and context across every conversation inside it. For Shopify operators, this is the right home for per-brand or per-campaign work.
Project setup pattern for Shopify brands
- Create a Project named [BRAND NAME] · Visual Library
- Upload to Project files: brand guidelines PDF, hero product photos (3-5), logo PNG/SVG, color reference image, font specimen
- Set Project instructions to your Custom GPT system prompt (above)
- Start chats inside the Project for: Spring 2026 campaign, new product launch, holiday Q4 ads, etc.
- Each chat inherits the brand context — you skip the "remind me of your brand colors" preamble every session
Memory + Custom Instructions
Even without a Custom GPT, ChatGPT's Memory and Custom Instructions features can encode your brand context globally. Settings → Personalization → Custom Instructions. Drop your brand specifications there and every chat will reference them.
This is the lightweight option. Custom GPTs are stronger for serious work because they have file knowledge bases. Use Memory + Custom Instructions if you're operating a single brand and don't want to build a full Custom GPT.
The reverse-the-prompt trick
One of the most useful patterns: feed ChatGPT an existing brand photo you love and ask it to generate the prompt that would create something similar.
[Upload your reference image to the chat]

Analyze this image as a generation prompt. Describe in detail:
- The lighting (source, direction, quality, color temperature)
- The camera (lens focal length, aperture, height, angle)
- The composition (subject placement, framing, negative space)
- The color palette (dominant hues, accent colors, grading)
- The mood (emotional register, time of day, season)
- The style (photography genre, era, aesthetic reference)

Then write a single paragraph prompt that would produce a NEW image in the same style but with different subject matter. Format the prompt ready to paste into gpt-image-2. Use quotation marks around any text that should appear in-image.
Multi-turn conversational editing
The most underrated capability of ChatGPT chat is iterative refinement. Instead of writing one perfect prompt, generate a base image and refine in conversation.
Generate base image → "Make the background slightly warmer" → "Move the product 10% to the left" → "Replace the wooden surface with marble" → "Add subtle morning light from the upper left." Each follow-up is a single change. The model preserves everything else — face, identity, pose, lighting recipe, framing — automatically. This is dramatically more reliable than trying to specify all changes in one prompt.
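The same single-change discipline applies when you drive refinements through the API instead of chat. A sketch of the pattern; `refinement_prompt` is a hypothetical helper, and the preservation clause wording is illustrative.

```python
# Wrap one requested change with an explicit "keep everything else"
# clause, per the single-change editing pattern above. Hypothetical
# helper; the preservation wording is illustrative.

def refinement_prompt(change: str) -> str:
    return (
        f"{change.rstrip('.')}. "
        "Keep everything else exactly as is: same subject, same pose, "
        "same lighting recipe, same framing, same color grading. "
        "No extra text, no duplicate text, no watermark."
    )

print(refinement_prompt("Make the background slightly warmer"))
```

Feed each result's output image back as the input of the next edit call, one change per call, and you get the chat surface's multi-turn behavior programmatically.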
Mobile workflow — camera roll → reference → generate
The ChatGPT mobile app supports image upload from camera roll. The complete on-the-go workflow: photograph your physical product or sample → upload to ChatGPT → "Generate this product on a marble countertop with morning light" → download result → upload directly to Shopify Admin app. End-to-end in under 90 seconds. This is the workflow that makes solo Shopify operators 10x faster than they were six months ago.
OpenAI API setup — Python, Node, curl, edit endpoint
The API is where automation lives. ChatGPT chat is for thinking; the API is for shipping at scale. If you're generating 50 product variants, batch-creating social ads from a content calendar, or running gpt-image-2 inside a Shopify product creation hook, this is your surface. For builds where you need to run image-related models on hardware you own with zero recurring cost, see the Ollama for Windows complete guide — note that as of April 2026, Ollama's open-source image-generation models trail gpt-image-2 substantially in text rendering and instruction following, so the right pattern is gpt-image-2 for production visuals plus Ollama for sovereign reasoning fallback.
Pre-flight checklist (one-time setup)
Create OpenAI account + add payment method
Visit platform.openai.com. Add credit card. Free tier is NOT supported for gpt-image-2 — you must be on a paid usage tier.
Complete Organization Verification
Settings → Organization → Verify. Required before any GPT Image model API calls work. Takes ~10 minutes.
Create API key
API keys → Create new secret key. Name it descriptively (e.g., shopify-image-pipeline-prod). Copy immediately — you cannot view it again.
Set environment variable
In your shell config (.bashrc, .zshrc, .env): export OPENAI_API_KEY="sk-...". Never hardcode keys in committed files.
Install SDK
Python: pip install "openai>=1.50.0" (quote the version spec so your shell doesn't treat > as a redirect). Node: npm install openai. The 1.50.0+ version is required for native gpt-image-2 support.
Verify access
Run the smallest test call (below) to confirm your tier supports gpt-image-2. If you get a 429 immediately, you're rate-limited at Tier 1 (5 IPM). If you get an org-verification error, return to the Organization Verification step above.
Python — text-to-image (verified working)
from openai import OpenAI
import base64

client = OpenAI()  # reads OPENAI_API_KEY from env

response = client.images.generate(
    model="gpt-image-2",
    prompt=(
        "A clean studio product photograph of a forest-green organic cotton "
        "hoodie centered on a pure white seamless background. Soft, even "
        "lighting from the upper left, subtle shadow grounding the garment. "
        "Sharp focus, true-to-life color, no reflections, no clutter. "
        "Professional e-commerce style. No text, no watermark, no logo overlay."
    ),
    size="1024x1024",  # also supports 1024x1536, 1536x1024, custom
    quality="high",    # "low" | "medium" | "high"
    n=1                # up to 10 per call
)

# Default response is base64-encoded PNG
image_b64 = response.data[0].b64_json
image_bytes = base64.b64decode(image_b64)

# Save to disk
output_path = "hoodie_hero.png"
with open(output_path, "wb") as f:
    f.write(image_bytes)

print(f"Image saved to {output_path}")
print(f"Tokens used: {response.usage}")
Node / TypeScript — text-to-image (verified working)
import OpenAI from "openai";
import fs from "fs";

const openai = new OpenAI(); // reads OPENAI_API_KEY from env

const result = await openai.images.generate({
  model: "gpt-image-2",
  prompt: `A clean studio product photograph of a forest-green organic cotton
hoodie centered on a pure white seamless background. Soft, even lighting
from the upper left, subtle shadow grounding the garment. Sharp focus,
true-to-life color, no reflections, no clutter. Professional e-commerce
style. No text, no watermark, no logo overlay.`,
  size: "1024x1024",
  quality: "high",
  n: 1
});

const imageBase64 = result.data[0].b64_json;
const imageBytes = Buffer.from(imageBase64, "base64");
fs.writeFileSync("hoodie_hero.png", imageBytes);
console.log("Image saved to hoodie_hero.png");
curl — text-to-image (any language fallback)
curl https://api.openai.com/v1/images/generations \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-image-2",
"prompt": "A clean studio product photograph of a forest-green organic cotton hoodie centered on a pure white seamless background. Soft, even lighting from upper left, subtle shadow. Professional e-commerce style. No text, no watermark.",
"size": "1024x1024",
"n": 1,
"quality": "high"
}'
Image editing with mask (inpainting / outpainting)
The images.edit endpoint lets you make surgical changes to an existing image — change the background, remove an unwanted element, swap a product color, extend a square into a 16:9 hero. Mask the region you want changed; everything outside the mask stays locked.
from openai import OpenAI
import base64

client = OpenAI()

# source.png: your existing product photo
# mask.png: same dimensions as source. Transparent pixels mark the region
# to edit. Opaque pixels are preserved.
result = client.images.edit(
    model="gpt-image-2",
    image=open("source.png", "rb"),
    mask=open("mask.png", "rb"),
    prompt=(
        "Replace the white seamless background with a marble countertop. "
        "Keep the product, lighting, shadow, and product color exactly as is. "
        "Marble should be cool gray with subtle veining, soft natural light. "
        "No text, no watermark."
    ),
    quality="high",
    n=1
)

image_bytes = base64.b64decode(result.data[0].b64_json)
with open("background_swap.png", "wb") as f:
    f.write(image_bytes)
gpt-image-2 processes image inputs at maximum quality regardless of your quality parameter setting. If your workflow involves uploading reference images and iterating on them — common for product mockups, character consistency, ad variations — your real cost per asset runs higher than the baseline per-image number suggests. Cached image inputs cost 75% less ($2/M vs $8/M), so design iterative workflows to reuse the same source image where possible.
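Back-of-envelope math shows why the cached rate matters for iterative workflows. The rates come from this section; the 2,000-token input size per iteration is an illustrative assumption, not a published figure.

```python
# Cost of N edit iterations on the same source image, uncached vs
# cached input, at the rates above ($8/M vs $2/M cached). The
# 2,000-token input size per iteration is an ILLUSTRATIVE ASSUMPTION.

RATE_INPUT = 8 / 1_000_000    # $ per uncached image-input token
RATE_CACHED = 2 / 1_000_000   # $ per cached image-input token

def input_cost(iterations: int, input_tokens: int, cached: bool) -> float:
    rate = RATE_CACHED if cached else RATE_INPUT
    return iterations * input_tokens * rate

uncached = input_cost(20, 2_000, cached=False)  # roughly $0.32
cached = input_cost(20, 2_000, cached=True)     # roughly $0.08
print(f"20 iterations: ${uncached:.2f} uncached vs ${cached:.2f} cached")
```

Reusing the same source image across a 20-step refinement session cuts the input side of the bill by 75%, per the pricing above.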
Multi-image input — reference-driven generation
Pass up to 16 reference images per call. Label each by index in your prompt. Use this for: virtual try-on, style transfer, brand-consistent product variants, character consistency.
from openai import OpenAI
import base64

client = OpenAI()

result = client.images.edit(
    model="gpt-image-2",
    image=[
        open("base_product.jpg", "rb"),     # Image 1: source
        open("brand_style_ref.jpg", "rb"),  # Image 2: style reference
        open("color_swatch.jpg", "rb"),     # Image 3: color palette
    ],
    prompt=(
        "Image 1 is the source product. Preserve product geometry, materials, "
        "and proportions exactly. Image 2 is the brand style reference — "
        "match its lighting recipe, depth of field, and editorial mood. "
        "Image 3 is the brand color palette — apply these colors only to "
        "the background and prop styling, never to the product itself. "
        "Generate a new lifestyle product photograph combining all three. "
        "4:5 vertical aspect ratio. No text, no watermark, no logo drift."
    ),
    quality="high",
    n=1
)

image_bytes = base64.b64decode(result.data[0].b64_json)
with open("brand_consistent_variant.png", "wb") as f:
    f.write(image_bytes)
Batch generation with n parameter
Generate up to 10 variants in a single call. Combined with Thinking Mode, this produces 8 coherent images with character/object continuity — the workflow that makes lookbooks and storyboards practical.
from openai import OpenAI
import base64

client = OpenAI()

# Establish character anchor in the prompt itself.
# Then describe 8 distinct scenes that share the character.
response = client.images.generate(
    model="gpt-image-2",
    prompt=(
        "CHARACTER ANCHOR: Mara, 28-year-old, wavy auburn shoulder-length "
        "hair, warm olive skin, light freckles. Wearing the brand's signature "
        "forest-green hoodie. Editorial lifestyle photography style. "
        "Soft natural daylight. Same character across every frame.\n\n"
        "Generate 8 connected lifestyle frames showing Mara's morning routine:\n"
        "1. Waking up in bed, soft window light\n"
        "2. Stretching at the window, looking out\n"
        "3. Making coffee in the kitchen\n"
        "4. Walking out of her apartment building\n"
        "5. Morning walk on urban street, coffee in hand\n"
        "6. Arriving at a corner coffee shop\n"
        "7. Working on a laptop at the cafe\n"
        "8. Smiling, looking up at someone off-frame\n\n"
        "Maintain identical character appearance across all 8 frames. "
        "Identical color grading: warm morning light fading to cool cafe interior. "
        "4:5 per frame. No text, no watermark."
    ),
    size="1024x1536",
    quality="high",
    n=8  # up to 10 per call; n=8 with Thinking is sweet spot
)

for idx, img in enumerate(response.data):
    image_bytes = base64.b64decode(img.b64_json)
    with open(f"mara_morning_{idx+1:02d}.png", "wb") as f:
        f.write(image_bytes)

print(f"Generated {len(response.data)} frames")
Production retry pattern with exponential backoff
The API returns HTTP 429 when you exceed rate limits. In production you must implement retry logic with exponential backoff; a tight retry loop without delays just hammers the endpoint and times out. Use the Tenacity library (shown below) or the SDK's built-in max_retries option.
from openai import OpenAI, RateLimitError, APIError
from tenacity import (
    retry,
    stop_after_attempt,
    wait_random_exponential,
    retry_if_exception_type,
)
import base64

client = OpenAI()

@retry(
    wait=wait_random_exponential(min=1, max=60),
    stop=stop_after_attempt(6),
    retry=retry_if_exception_type((RateLimitError, APIError)),
)
def generate_with_backoff(**kwargs):
    return client.images.generate(**kwargs)

# Usage
response = generate_with_backoff(
    model="gpt-image-2",
    prompt="...",
    size="1024x1024",
    quality="high",
    n=1
)

image_bytes = base64.b64decode(response.data[0].b64_json)
with open("output.png", "wb") as f:
    f.write(image_bytes)
Bulk variant generation pipeline
The pattern that replaces a $300/product photoshoot with a $2/product API call. Read SKUs from a CSV or your Shopify catalog, generate 5 visual variants per product, save each to disk for review.
import csv
import base64
import os
import time

from openai import OpenAI
from tenacity import retry, wait_random_exponential, stop_after_attempt

client = OpenAI()

@retry(wait=wait_random_exponential(min=2, max=60), stop=stop_after_attempt(5))
def generate(prompt, size="1024x1024", quality="high"):
    return client.images.generate(
        model="gpt-image-2",
        prompt=prompt,
        size=size,
        quality=quality,
        n=1
    )

# Read products.csv with columns: sku, name, color, material, category
def bulk_generate_variants(csv_path, output_dir="output"):
    os.makedirs(output_dir, exist_ok=True)
    with open(csv_path, "r") as f:
        reader = csv.DictReader(f)
        for row in reader:
            sku = row["sku"]
            name = row["name"]
            color = row["color"]
            material = row["material"]
            category = row["category"]
            variants = [
                ("white_bg", f"Clean white background product photo of {color} {material} {name}, soft upper-left lighting, e-commerce style. No text."),
                ("lifestyle", f"Lifestyle product photo of {color} {material} {name} on light oak surface, warm window light, shallow depth of field. No text."),
                ("flat_lay", f"Top-down flat lay of {color} {material} {name} with complementary props on cream linen, soft overhead light. No text."),
                ("detail", f"Macro close-up of {color} {material} {name} showing texture and material detail, shallow DOF. No text."),
                ("ghost_mannequin", f"Ghost mannequin product photo of {color} {material} {name}, white background, soft lighting. No text."),
            ]
            for variant_name, prompt in variants:
                try:
                    response = generate(prompt)
                    image_bytes = base64.b64decode(response.data[0].b64_json)
                    output_path = os.path.join(output_dir, f"{sku}_{variant_name}.png")
                    with open(output_path, "wb") as out:
                        out.write(image_bytes)
                    print(f"✓ {sku} {variant_name}")
                    time.sleep(0.5)  # simple pacing; tune to your tier's IPM ceiling
                except Exception as e:
                    print(f"✗ {sku} {variant_name}: {e}")

if __name__ == "__main__":
    bulk_generate_variants("products.csv")
Rate limits — what to expect at each tier
| Tier | TPM | IPM (images/min) | To Reach | 1,000 Image Batch Time |
|---|---|---|---|---|
| Free | — | — | NOT SUPPORTED | — |
| Tier 1 | 100K | 5 | Org verified, payment method | ~3 hours 20 min |
| Tier 2 | 250K | 20 | $50+ spent | ~50 min |
| Tier 3 | 800K | 50 | $100+, 7+ day account | ~20 min |
| Tier 4 | 3M | 150 | $250+, 14+ day account | ~7 min |
| Tier 5 | 8M | 250 | $1,000+, 30+ day account | ~4 min |
Failed prompts (content policy refusals) STILL consume your quota. Limits are enforced at organization + project level, not per user. If you're running multiple API keys under one org, they share the same IPM ceiling. Plan tier ramp before launch — don't discover Tier 1's 5 IPM ceiling during a Black Friday push.
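Because limits are shared at the organization level, a cheap defense is a client-side pacer that spaces requests to stay under your tier's IPM ceiling. A minimal sketch; the tier numbers mirror the table above, and `IpmPacer` is a hypothetical helper, not part of the OpenAI SDK.

```python
import time

# Minimal client-side pacer: sleep just enough between requests to stay
# under a tier's images-per-minute ceiling (IPM numbers from the table
# above). Illustrative sketch; OpenAI enforces the real limit server-side.

TIER_IPM = {1: 5, 2: 20, 3: 50, 4: 150, 5: 250}

class IpmPacer:
    def __init__(self, tier: int):
        self.interval = 60.0 / TIER_IPM[tier]  # seconds between requests
        self._last = 0.0

    def wait(self):
        """Block until it is safe to send the next request."""
        now = time.monotonic()
        sleep_for = self.interval - (now - self._last)
        if sleep_for > 0:
            time.sleep(sleep_for)
        self._last = time.monotonic()

pacer = IpmPacer(tier=1)  # 5 IPM means one request every 12 seconds
print(f"{pacer.interval:.0f} s between requests at Tier 1")
```

Call `pacer.wait()` before each `images.generate` call in a batch loop; combined with the backoff decorator above, 429s become rare instead of routine.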
The prompt engineering principles that work
OpenAI published an official prompting guide in the Cookbook. Combined with community testing across thousands of generations since launch, these are the rules that consistently produce production-quality output. Skip these and you'll waste credits.
The master structure (per OpenAI Cookbook)
Write prompts in a consistent order: Background or scene first to set the world. Subject second — the who or what. Key details third — materials, lighting, camera, composition. Constraints last — what NOT to include, text rules, "no watermark / no extra text / no duplicate text." For complex requests, use short labeled segments or line breaks instead of one long paragraph.
The six-block formula (community-validated)
Scene / Background
Where the image takes place. Sunlit cafe, dark studio, abstract gradient, crowded city street. Anchors everything that follows.
Subject
The who or what — described with specificity. Demographic for people, materials and form for products.
Composition
Layout and spatial relationships. Centered, rule-of-thirds, split layout, grid, off-frame elements.
Lighting
Source, direction, mood. "Soft window light from camera left," "neon glow," "dramatic rim light," "golden hour backlight."
Style
Medium, aesthetic, era. "35mm film photography," "editorial magazine," "flat illustration," "watercolor."
Format / Constraints
Aspect ratio, output use, things to exclude. "4:5 vertical for IG story," "no extra text, no duplicate text, no watermark."
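The six blocks can be composed mechanically, which keeps every prompt in the master order: scene first, constraints last. A small sketch (Python; `build_prompt` is an illustrative helper, not an official API):

```python
def build_prompt(scene, subject, composition, lighting, style, constraints):
    """Assemble a prompt in the six-block order: scene first, constraints last."""
    blocks = [scene, subject, composition, lighting, style, constraints]
    # normalize each block to end with exactly one period, skip empty blocks
    return " ".join(b.strip().rstrip(".") + "." for b in blocks if b)

prompt = build_prompt(
    scene="Sunlit minimalist studio with a pale oak table",
    subject="A forest-green organic cotton hoodie, folded",
    composition="Centered, generous negative space",
    lighting="Soft window light from camera left",
    style="Editorial e-commerce photography",
    constraints="No extra text, no duplicate text, no watermark",
)
```

Templating the order this way means every generated prompt in a batch follows the same structure, so failures are attributable to content, not formatting drift.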
The text-in-image rules (HARD constraints)
1. Place exact text in "quotation marks" or ALL CAPS.
2. Specify font style, weight, color, placement explicitly: "Bold sans-serif, white, centered at the bottom third."
3. Spell out tricky brand names letter-by-letter if they have unusual spelling.
4. Add "verbatim — no extra characters, no substitutions" for accuracy-critical text.
5. Add "no extra text" and "no duplicate text" to prevent watermark or repetition artifacts.
6. Use quality: medium or high for dense text panels.
7. For multilingual, state language and script explicitly.
Multi-image input rules
When passing reference images, label each by index and describe the role: "Image 1 is the product photo. Image 2 is the style reference. Apply Image 2's color palette and texture to Image 1." Be explicit about preservation: "Same character, same lighting, same pose — only change the jacket."
The iteration rules
- Don't overload single prompts. Start with a clean base, refine with single-change follow-ups.
- State what stays the same when editing. Without this, the model treats everything as fair game.
- Specify framing/angle/distance. "Close-up" and "wide shot" of the same subject are completely different images.
- Don't mix more than one dominant style. "Watercolor meets cyberpunk vintage" confuses the model.
- Don't use empty aesthetic terms. "Premium feel," "viral quality," "stunning" actually dilute prompts.
- Use natural language scene description. Avoid tag-stuffing like "8K masterpiece cinematic."
Counterintuitive findings (post-launch testing)
Promptolis and ImagineArt both report: gpt-image-2 "performs strongest with simpler prompts" and "becomes less reliable when the creative demand becomes too layered." This runs opposite to Midjourney where complex prompt stacking often improves output. With gpt-image-2, describe ONE clear intent per prompt rather than stacking multiple style modifiers. The model handles 200-word dense prompts well — but only when each clause adds new information rather than restating the same idea differently.
The edit prompt pattern (per fal.ai)
Change: [exactly what should change]
Preserve: [face, identity, pose, lighting, framing, background,
geometry, text, layout — list everything that stays locked]
Constraints: [no extra objects, no redesign, no logo drift, no watermark]
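The Change/Preserve/Constraints pattern is easy to template so edit prompts stay disciplined across a team. A sketch (Python; `edit_prompt` is an illustrative helper):

```python
def edit_prompt(change, preserve,
                constraints="no extra objects, no redesign, no logo drift, no watermark"):
    """Render the Change/Preserve/Constraints edit-prompt pattern."""
    return (
        f"Change: {change}\n"
        f"Preserve: {', '.join(preserve)}\n"
        f"Constraints: {constraints}"
    )

p = edit_prompt(
    change="swap the hoodie color to burnt orange",
    preserve=["face", "pose", "lighting", "framing", "background", "text", "layout"],
)
```

Making the preserve list explicit, rather than implied, is the point: anything not listed is fair game for the model to change.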
The 70-category Shopify prompt library
Every visual a Shopify store will ever need, organized by category, with a paste-ready prompt template for each. Replace the bracketed placeholders with your specifics. All prompts follow the master structure and apply the text-in-image rules.
Each prompt has placeholders like [PRODUCT], [BRAND COLOR], [DEMOGRAPHIC]. Replace them with your real values. The structural patterns are validated — only the placeholders change. Click COPY on any prompt to put it on your clipboard, then paste into ChatGPT or your API call.
Product Visuals
The core visuals every product page needs — white background, lifestyle, flat-lay, variants, on-model.
Marketing Visuals
Hero banners, sale graphics, email headers, popups — the on-site assets driving conversion.
Social Ads
Platform-native ad creative for IG, FB, Pinterest, YouTube, and carousel sequences — every aspect ratio. (TikTok, X, LinkedIn variants follow the same patterns — adapt aspect ratio.)
Brand Assets
Logos, wordmarks, hangtags, business cards — the foundational identity assets.
Product Packaging
Boxes, mailers, labels, hangtag backs, insert cards — the unboxing experience.
Editorial / Lookbook
Lookbook covers, editorial spreads, seasonal heroes, behind-the-brand portraits, studio environments.
Infographics + Content
Sustainability charts, sizing guides, care instructions, how-it's-made, comparison graphics — the trust-building visuals.
UGC-Style
Authentic-feeling user-generated visuals — OOTDs, mirror selfies, unboxing moments. (Coffee-shop and desk flat-lay variants follow Category A flat-lay pattern.)
OG + Meta Images
Open Graph share image — the universal pattern. (Twitter Cards and Pinterest Rich Pins use the same composition with adjusted aspect ratios.)
Account / Storefront
Favicons, app icons, About hero, reviews mood, 404 page, empty cart — the storefront polish layer.
Locking brand consistency at scale
One-off images are easy. The hard problem is producing 50 lifestyle shots that all look like they came from the same brand, the same photographer, the same season. This is the section that separates serious operators from prompt tourists. The pattern parallels the multi-agent orchestration covered in the Google Antigravity S-Tier master class — coordinated specialized contexts producing coherent output across many calls.
gpt-image-2 ships four native tools for consistency: 16-image reference inputs, the character anchor pattern, Custom GPTs, and ChatGPT Projects. Used together, they replace what used to require a $5K-per-day photographer and a week of post-production color grading.
Tool 1 — The 16-image reference set
gpt-image-2 accepts up to 16 reference images per call and reasons about them as a set. Not as separate inputs to be averaged, but as a coherent system the model interprets together.
The pattern that works: pass your source product photo + 3-4 brand-style references + 1-2 competitor packshots that capture the energy you want. The model picks up the lighting recipe, color grading, depth of field, and editorial mood from the references and applies them to your product.
Always label each input by index in your prompt: "Image 1 is the source product. Image 2 is the lighting reference. Image 3 is the color palette reference. Image 4 is the composition reference." The model uses these labels to know which aspect of which image to apply where. Without labels, it averages them — and averaging is how you get muddy, generic-looking output.
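In code, the labeling convention is just a generated preamble prepended to the instruction. A sketch (Python; it assumes the `images.edit` endpoint accepts a list of input images, as it does for gpt-image-1, and that `client` is an `OpenAI()` instance; the helper names are illustrative):

```python
def label_references(roles):
    """Build the index-labeled preamble: 'Image 1 is the source product photo. ...'"""
    return " ".join(f"Image {i} is the {role}." for i, role in enumerate(roles, start=1))

preamble = label_references([
    "source product photo",
    "lighting reference",
    "color palette reference",
    "composition reference",
])

def stylize(client, image_files, instruction):
    """Pass up to 16 reference images; the preamble maps each index to a role."""
    return client.images.edit(
        model="gpt-image-2",   # model id per this guide
        image=image_files,     # list of open file handles, source first
        prompt=f"{preamble} {instruction}",
    )
```

Generating the preamble from a list guarantees the index order in the prompt matches the upload order, which is exactly the mismatch that causes muddy averaged output.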
Tool 2 — The character anchor pattern
For human subjects (models, founders, mascots), establish a complete description in the prompt itself — and reuse that exact description in every subsequent prompt. Then add scene-specific details after the anchor.
CHARACTER ANCHOR (paste at the top of every prompt for this character):
A young woman named Mara. She has short dark hair with blunt bangs, warm brown skin, light freckles across her nose, and dark brown eyes. She is wearing the brand's signature forest-green oversized hoodie and dark indigo straight-leg jeans. Illustrated in flat modern character design style with clean lines and a muted warm palette. This is her character reference — do not redesign her appearance.

SCENE PROMPT 1:
[Anchor above] Mara is sitting cross-legged on a bedroom floor surrounded by open books, studying late at night. A desk lamp is the only light source. Same character, do not change her appearance, outfit, or illustration style.

SCENE PROMPT 2:
[Anchor above] Mara is walking through a rainy street at night, hood pulled up over her sweater, holding a dripping umbrella. Same character, do not change her appearance, outfit, or illustration style.
The anchor adds tokens to every prompt — but it's the difference between getting "a character that looks vaguely like Mara" and "Mara, every time." For brand mascots, founder illustrations, recurring lifestyle models, this pattern is non-negotiable.
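Operationally, the anchor is just a constant you prepend to every scene, plus a preservation lock appended at the end. A sketch (Python; names illustrative):

```python
MARA_ANCHOR = (
    "A young woman named Mara. She has short dark hair with blunt bangs, "
    "warm brown skin, light freckles across her nose, and dark brown eyes. "
    "She is wearing the brand's signature forest-green oversized hoodie and "
    "dark indigo straight-leg jeans. Illustrated in flat modern character "
    "design style with clean lines and a muted warm palette. "
    "This is her character reference — do not redesign her appearance."
)

LOCK = "Same character, do not change her appearance, outfit, or illustration style."

def scene_prompt(anchor, scene, lock=LOCK):
    """Anchor first, scene second, preservation lock last."""
    return f"{anchor} {scene} {lock}"

p1 = scene_prompt(
    MARA_ANCHOR,
    "Mara is walking through a rainy street at night, hood up, holding a dripping umbrella.",
)
```

Storing the anchor as one constant means a wardrobe or palette update happens in exactly one place and propagates to every future generation.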
Tool 3 — The Custom GPT route (deep dive)
You already saw the basic Custom GPT setup in Section 04. Here's how to take it further for industrial brand consistency.
The brand specification document
Don't paste loose notes into your Custom GPT system prompt. Build a structured brand spec doc that you upload to the GPT's knowledge base. The GPT will reference it in every conversation.
# [BRAND NAME] — Visual Specification v1.0

## IDENTITY
- Founded: [YEAR]
- Location: [CITY]
- Mission: [ONE SENTENCE]
- Customer: [DEMOGRAPHIC + PSYCHOGRAPHIC]

## COLOR PALETTE
| Token | Hex | Usage |
|-------|-----|-------|
| Deep | #XXXXXX | Page backgrounds, headers |
| Forest | #XXXXXX | Section backgrounds, depth |
| Teal | #XXXXXX | Mid-tone accents |
| Emerald | #XXXXXX | Active states, highlights |
| Gold | #XXXXXX | Primary accent, CTAs |
| Cream | #XXXXXX | Body text on dark, paper backgrounds |
| White | #XXXXXX | Pure highlights only |

## TYPOGRAPHY
- Display: [FONT NAME] (serif/sans, weight)
- Body: [FONT NAME]
- Mono: [FONT NAME] for code or technical labels
- Hierarchy: Display 32-48px, H2 24-32px, body 16-18px

## PHOTOGRAPHY RECIPE
- Camera: 85mm or 50mm lens equivalent
- Aperture: f/2.0 to f/2.8 (shallow DOF)
- Lighting: Soft natural daylight from window, camera left
- Color grade: Warm neutral, slight green undertone in shadows
- Mood: Quiet confidence, lived-in, never sterile or aspirational
- Skin tones: Natural, real pores, no heavy retouching
- Backgrounds: Concrete, light oak, cream linen, soft gray seamless

## ILLUSTRATION RECIPE
- Style: Flat with subtle paper texture, hand-drawn quality
- Line weight: Medium, slightly imperfect
- Color: Brand palette only, no gradients
- Mood: Warm, inviting, never childish

## DO'S
- Render exact text in quotation marks, verbatim, no extras
- Use 4:5 for product, 16:9 for hero, 1:1 for IG, 9:16 for stories
- Include "no extra text, no duplicate text, no watermark" in every prompt
- Composite real logos post-generation, never request brand logo reproduction
- Use shallow DOF for product detail shots
- Default to soft natural light over studio strobes

## DON'TS
- Never reference fake certifications
- Never generate copyrighted IP (Disney/Marvel/named brand logos)
- Never generate real-celebrity likenesses
- Never use empty terms like "premium feel" or "viral quality" or "stunning"
- Never use HDR look or oversaturated colors
- Never include text in foreign scripts unless explicitly specified

## DEFAULT OUTPUT SETTINGS
- Aspect ratio: ask if ambiguous; otherwise 4:5
- Quality: high for finals, medium for drafts, low for exploration
- Format: PNG (compress to AVIF/WebP for Shopify upload)
- Resolution: 1024x1536 default, 2K for hero use only
Iterating on the Custom GPT
Treat your Custom GPT like a piece of code. After every 50 generations, audit: where is the brand voice drifting? Tighten the system prompt. Update the brand spec doc. Re-test. The GPT only stays useful if you maintain it.
Tool 4 — ChatGPT Projects deep dive
Projects are where you organize campaigns, not just brand context. Per-brand Custom GPT handles the brand voice; per-campaign Project handles the specific seasonal mood, hero products, target demographics for this drop.
Create a Project per campaign
"Spring 2026 — Forest Collection" / "Holiday Q4 Drop" / "Year One Anniversary." Not per brand, per campaign. Each gets its own folder.
Upload campaign-specific files
Hero product photos for this drop, color reference for this season's mood, model casting references, music or vibe references that capture the energy.
Set Project instructions
Layer on top of the brand-level Custom GPT: "This campaign emphasizes [SPECIFIC MOOD]. Use [PRIMARY HERO COLOR]. The hero product is [SKU]. Target tone is [SPRING ENERGY / HOLIDAY WARMTH / ANNIVERSARY GRATITUDE]."
Start chats inside the Project
"Hero banner desktop." "IG carousel 5 frames." "Email header for launch day." Each chat inherits the campaign context. You skip the preamble.
Tool 5 — The reverse-the-prompt trick (revisited)
Already shown in Section 04, but worth emphasizing as a brand-consistency tool. When you have one image that perfectly captures your brand voice, reverse-engineer it into a reusable prompt — then reuse that prompt skeleton across every new asset.
Workflow: Generate 100 candidate images for a campaign. Pick the 3 that nailed the brand voice. Feed each to ChatGPT and ask for the reverse prompt. Compare the 3 reverse prompts — the patterns that appear in all 3 are your real brand recipe. Codify those patterns into your Custom GPT.
The aesthetic-drift mitigation system
Even with all four tools, brand voice drifts over hundreds of generations. Build a scoring system to catch it before it ships.
| Score Dimension | What to Check | Pass Threshold |
|---|---|---|
| Headline legibility | Read at mobile 400px? At 100px thumbnail? | Yes at both |
| Color contrast | WCAG AA on text? Brand color match within 5%? | Both pass |
| Logo clearance | Logo (composited post-gen) has min 1x its height clearance? | Yes |
| Lighting consistency | Same direction, same temperature as last 5 brand images? | Yes |
| Crop resilience | Survives crop to 1:1, 4:5, 16:9 without losing subject? | 2 of 3 |
| "Brand DNA" gut check | Side-by-side with hero photo from launch — same world? | Subjective yes |
Reject promising visuals that break the rules. You're designing a system, not a one-off image. Regenerate with tighter constraints — "preserve layout; change only background hue ±5%" — instead of accepting close-but-wrong outputs.
Editing, variations, and upscaling for production
Generation is half the workflow. The other half is the post-generation polish that takes a 1024×1024 AI image and ships it as a 2550×3300 print-ready flyer, a 4K hero banner, or a perfectly-cropped product variant. Here's the toolkit.
Native gpt-image-2 editing — mask inpainting
The images.edit endpoint with a mask is your scalpel. Mask the region you want changed, leave the rest opaque, and gpt-image-2 will modify only the masked area while preserving everything else.
The mask is a PNG with the same dimensions as your source image. Transparent pixels mark the region to edit. Opaque pixels are preserved. For precise control, use Photoshop or Figma to paint the mask manually. For quick masks, you can ask gpt-image-2 itself to generate the mask — pass the source image with prompt "create a mask isolating just the [OBJECT]."
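Wired up, a masked edit is one call to `images.edit` with the mask file attached. A sketch (Python; it assumes the same `image`/`mask` parameter shape as gpt-image-1's edit endpoint, and that `client` is an `OpenAI()` instance):

```python
def edit_with_mask(client, source_path, mask_path, prompt):
    """Edit only where the mask is transparent; opaque pixels are preserved."""
    with open(source_path, "rb") as src, open(mask_path, "rb") as mask:
        return client.images.edit(
            model="gpt-image-2",  # model id per this guide
            image=src,
            mask=mask,            # PNG, same dimensions as the source image
            prompt=prompt,
        )

# Example prompt, following the background-swap recipe
BACKGROUND_SWAP = "Replace background with [NEW SCENE]. Keep product exactly as is."
```

The mask and source must match pixel-for-pixel in dimensions; a mismatched mask is the most common cause of a rejected edit call.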
Common edit recipes
| Edit Goal | Mask | Prompt Pattern |
|---|---|---|
| Background swap | Transparent everywhere except product | "Replace background with [NEW SCENE]. Keep product exactly as is." |
| Product color swap | Transparent on product only | "Change the [PRODUCT] color to [NEW COLOR]. Keep material, lighting, shadow." |
| Element removal | Transparent on element | "Remove the [ELEMENT]. Reconstruct the background naturally." |
| Outpainting (extend canvas) | Transparent in extension area | "Extend the scene naturally. Match lighting, color, perspective." |
| Add element | Transparent in target area | "Add [ELEMENT] at [POSITION]. Match the existing lighting and style." |
Upscaling — Topaz vs Magnific decision tree
gpt-image-2 outputs at 1K to 2K natively. A letter-size flyer at 300 DPI requires 2550×3300 pixels. A retina-ready 4K hero banner requires 3840×2160. You need an upscaler.
The two best-in-class options are Topaz Gigapixel and Magnific. They are not the same tool. Picking the wrong one will ruin output.
Topaz preserves the original image character. It looks at low-res input and asks "what did this originally look like?" — then reconstructs detail without altering identity. Conservative. Safe. Faithful. Use for: portraits where face must stay the same, product photos where logo and label must be preserved, photography where authenticity matters more than embellishment. One-time ~$99 + cloud rendering credits.
Magnific generates new pixels based on semantic understanding. It looks at low-res input and asks "what could this look like?" — then invents skin pores, fabric weave, atmospheric depth. Aggressive. Creative. Will alter facial features at high "Creativity" slider settings. Use for: AI-generated illustrations where embellishment is wanted, dreamscapes, creative concept art. NEVER for portraits where identity must be preserved. Subscription from $39/month.
Decision tree
Is the input a portrait or has identifiable face?
Yes → Topaz only. Magnific will alter the face at Creativity > 3.
No → either tool works.
Is the input a product with brand logo, text, or labels?
Yes → Topaz. Magnific may distort small text and logo details.
No → either tool works.
Is this for print at 300 DPI or above?
Yes → Topaz with the "Standard MAX" or "Recover v3" model. Output at exact print dimensions.
No → either tool, choose by aesthetic preference.
Is the input AI-generated illustration / concept art?
Yes → Magnific shines here. Use Creativity 3-5 for added skin texture and fabric detail. Above 6 starts changing the image.
No → Topaz preserves photographic feel better.
Print resolution math
Print needs density, not just size. At 300 DPI, you need 300 pixels per inch of final print. Always upscale to the exact pixel count for your physical print size.
| Print Asset | Physical Size | Required Pixels (300 DPI) | Upscale Factor from 2K |
|---|---|---|---|
| Hangtag | 3.5" × 2" | 1050 × 600 | 0.5× (downsize) |
| Business card | 3.5" × 2" | 1050 × 600 | 0.5× (downsize) |
| Postcard insert | 5" × 7" | 1500 × 2100 | 1× (use 2K direct) |
| Letter / flyer | 8.5" × 11" | 2550 × 3300 | 1.7× |
| Magazine spread | 17" × 11" | 5100 × 3300 | 2.5× |
| 11x17 poster | 11" × 17" | 3300 × 5100 | 2.5× |
| 18x24 poster | 18" × 24" | 5400 × 7200 | 3.5× |
| Bus stop poster | 4' × 6' | 14400 × 21600 | 10× |
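Every row in the table reduces to one formula: pixels = inches × DPI, and the upscale factor is the target's long edge divided by the 2048px native long edge. A quick checker (Python; helper names are illustrative):

```python
def print_pixels(width_in, height_in, dpi=300):
    """Pixels required for a physical print at the given density."""
    return round(width_in * dpi), round(height_in * dpi)

def upscale_factor(target_px, source_long_edge=2048):
    """Linear upscale factor from a native 2K render (2048px long edge)."""
    return max(target_px) / source_long_edge

letter = print_pixels(8.5, 11)    # (2550, 3300)
factor = upscale_factor(letter)   # 3300 / 2048 ≈ 1.61
```

Run it once against your print vendor's spec sheet before ordering; upscaling to an approximate size and letting the printer resample is how soft edges sneak into finals.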
The film grain trick (mask AI smoothness for print)
AI images often look "too smooth" — like melted plastic when printed. The fix is adding a controlled grain layer that mimics real photographic noise.
1. Create a new layer above your image, fill with 50% gray.
2. Filter → Noise → Add Noise. Amount: 5-8%. Distribution: Gaussian. Monochromatic: checked.
3. Set layer blend mode to Soft Light or Overlay.
4. Reduce layer opacity to 10-15%.
5. Save and convert to CMYK if printing.
Why this works: ink binds to paper differently than light hits a screen. Real photography has grain. Adding subtle grain bridges the gap and visually hides upscaling artifacts.
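The same recipe can be scripted for batch work. A rough sketch (Python with numpy; the additive blend is a cheap stand-in for Photoshop's Soft Light mode, and the default percentages mirror the steps above):

```python
import numpy as np

def add_film_grain(img, amount=0.06, opacity=0.12, seed=0):
    """Approximate the Photoshop recipe: monochromatic Gaussian noise
    blended at low opacity. img is float RGB in [0, 1], shape (H, W, 3)."""
    rng = np.random.default_rng(seed)
    h, w = img.shape[:2]
    # one noise channel broadcast across RGB = "Monochromatic: checked"
    noise = rng.normal(0.0, amount, size=(h, w, 1))
    grained = img + opacity * noise  # additive stand-in for Soft Light
    return np.clip(grained, 0.0, 1.0)

flat = np.full((64, 64, 3), 0.5)
textured = add_film_grain(flat)
```

For finals headed to a print vendor, do the grain pass in Photoshop as described above; the scripted version is for bulk web assets where per-image manual work doesn't scale.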
RGB to CMYK conversion (print-ready output)
Screens use light (RGB). Printers use ink (CMYK). Converting RGB to CMYK without an ICC profile makes neon colors look dull. Use the right ICC profile for your print partner.
- Get the ICC profile from your print vendor (e.g., GRACoL 2013 for US sheetfed, FOGRA39 for European offset).
- In Photoshop: Edit → Convert to Profile. Destination: vendor's CMYK profile. Intent: Perceptual (for photography) or Relative Colorimetric (for graphics).
- Soft-proof in Photoshop: View → Proof Colors. Spot-check saturated colors — reds and blues shift most.
- Re-save as TIFF or PDF with embedded profile.
The hybrid workflow — putting it together
The workflow that ships production-quality assets:
Generate base in gpt-image-2
1024×1536 high quality. Iterate in chat until brand voice is locked.
Inpaint problem areas
Use images.edit with mask for any single-element fixes — wrong color, awkward position, missing detail.
Composite real brand elements
Open in Photoshop or Affinity. Add real logo from vector source. Add real product photos for hero shots if available. Composite real cert badges.
Upscale to target resolution
Topaz Gigapixel for portraits and products. Magnific for AI illustrations. Output at exact print pixel dimensions.
Add film grain layer
5-8% Gaussian noise on Soft Light overlay at 10-15% opacity. Hides AI smoothness and binds for print.
Color + sharpening pass
Final color correction in Lightroom or Photoshop. Output sharpening (Photoshop: Filter → Sharpen → Smart Sharpen) sized for delivery medium.
Export for delivery
Web: AVIF or WebP at 85% quality, sRGB. Print: TIFF or PDF at exact size, CMYK with embedded ICC profile.
This is the workflow that turns a $0.21 API call into a $300 photoshoot replacement. Every step matters. Skip the upscale and your poster prints muddy. Skip the grain and it looks like AI. Skip the CMYK conversion and your brand colors print wrong.
Shopify Files API — automated image upload pipeline
Generating images is one half of the workflow. Getting them into Shopify cleanly, with proper alt text and optimized formats, is the other half. The Shopify Admin GraphQL API provides a two-step staged upload flow that handles this at scale — and it's the right path for any operator generating more than 10 images at a time. For the full case study of building a production Shopify store with AI from scratch (including image asset pipelines), see How I Built My Shopify Store With Claude AI.
The two-step staged upload flow
Don't try to upload directly. Don't pass binary data through your app server. Use Shopify's staged upload pattern: get a temporary URL, push the file directly to Shopify's storage, then create the file asset.
Call stagedUploadsCreate mutation
Returns: a temporary upload URL + a list of authentication parameters (Shopify uses Google Cloud Storage under the hood). You'll POST to this URL.
POST file with FormData
Append parameters first (in returned order), then append the file last. Order matters — getting it wrong returns "Cannot create buckets" errors. Don't manually set Content-Type; let FormData generate the boundary.
Call fileCreate mutation
Pass the staged URL as originalSource. Specify contentType: IMAGE and alt text. Returns a file ID that can be referenced everywhere.
Poll fileStatus until READY
Files process asynchronously. Poll the file's status field. Once READY, the file is on Shopify's CDN and reference-able from products, variants, collections, themes.
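Step 4 in code is a bounded poll against `fileStatus`. A sketch (Python; `fetch_status` abstracts the GraphQL round trip so the loop stays transport-agnostic, and the query shape assumes the Admin API's `node` lookup on a `MediaImage`):

```python
import time

FILE_STATUS_QUERY = """
query fileStatus($id: ID!) {
  node(id: $id) {
    ... on MediaImage { fileStatus }
  }
}
"""

def wait_until_ready(fetch_status, file_id, timeout=60, interval=2):
    """Poll until the file leaves async processing. fetch_status is any
    callable that runs the query above and returns the fileStatus string."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status(file_id)
        if status == "READY":
            return True
        if status == "FAILED":
            raise RuntimeError(f"File {file_id} failed processing")
        time.sleep(interval)
    raise TimeoutError(f"File {file_id} not READY after {timeout}s")
```

Always bound the poll with a timeout: a file stuck in processing should fail your pipeline loudly rather than hang a nightly job.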
Constraints to know before you build
- 20MB file size limit per upload. Compress to AVIF or WebP before staging.
- 250 files per fileCreate batch. Loop in chunks for larger jobs.
- Files process asynchronously. Don't assume immediate availability — poll fileStatus.
- One file ID can be referenced from multiple resources. Don't re-upload the same image for different products.
- fileDelete is permanent. Any product referencing the deleted file will have broken media.
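The 250-file cap means larger jobs must loop in batches. A one-liner sketch (Python):

```python
def chunk(items, size=250):
    """Yield fileCreate-sized batches (Shopify caps fileCreate at 250 files per call)."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# 600 staged files -> three fileCreate calls: 250, 250, 100
batches = list(chunk(list(range(600))))
```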
The complete GraphQL — stagedUploadsCreate
mutation stagedUploadsCreate($input: [StagedUploadInput!]!) {
  stagedUploadsCreate(input: $input) {
    stagedTargets {
      resourceUrl
      url
      parameters {
        name
        value
      }
    }
    userErrors {
      field
      message
    }
  }
}

# Variables:
{
  "input": [
    {
      "resource": "IMAGE",
      "filename": "hero-shot-v1.png",
      "mimeType": "image/png",
      "fileSize": "1842301",
      "httpMethod": "POST"
    }
  ]
}
The complete GraphQL — fileCreate
mutation fileCreate($files: [FileCreateInput!]!) {
  fileCreate(files: $files) {
    files {
      id
      fileStatus
      alt
      createdAt
      ... on MediaImage {
        image {
          url
          width
          height
        }
      }
    }
    userErrors {
      field
      message
    }
  }
}

# Variables:
{
  "files": [
    {
      "alt": "Sustainable streetwear hoodie in forest green on model in urban setting",
      "contentType": "IMAGE",
      "originalSource": "https://shopify-staged-uploads.storage.googleapis.com/tmp/..."
    }
  ]
}
End-to-end Node.js pipeline (gpt-image-2 → Shopify)
The complete operational pipeline. Generate via OpenAI, optimize with Sharp, upload via staged flow, register with fileCreate. This is the code that runs your nightly catalog refresh.
import OpenAI from "openai";
import sharp from "sharp";
import { GraphQLClient, gql } from "graphql-request";

const openai = new OpenAI();
const shopify = new GraphQLClient(
  `https://${process.env.SHOPIFY_STORE}.myshopify.com/admin/api/2026-04/graphql.json`,
  {
    headers: { "X-Shopify-Access-Token": process.env.SHOPIFY_ADMIN_TOKEN }
  }
);

// Step 1: Generate image with gpt-image-2
async function generateImage(prompt) {
  const result = await openai.images.generate({
    model: "gpt-image-2",
    prompt,
    size: "1024x1536",
    quality: "high",
    n: 1
  });
  return Buffer.from(result.data[0].b64_json, "base64");
}

// Step 2: Optimize for Shopify upload (AVIF, ~85% quality).
// sharp strips EXIF/location metadata by default — only call
// .withMetadata() if you want to KEEP it.
async function optimizeForShopify(pngBuffer) {
  return await sharp(pngBuffer)
    .avif({ quality: 85, effort: 6 })
    .toBuffer();
}

// Step 3: Get staged upload URL
async function getStagedUpload(filename, mimeType, fileSize) {
  const STAGED_UPLOADS_CREATE = gql`
    mutation stagedUploadsCreate($input: [StagedUploadInput!]!) {
      stagedUploadsCreate(input: $input) {
        stagedTargets {
          resourceUrl
          url
          parameters { name value }
        }
        userErrors { field message }
      }
    }
  `;
  const result = await shopify.request(STAGED_UPLOADS_CREATE, {
    input: [{
      resource: "IMAGE",
      filename,
      mimeType,
      fileSize: fileSize.toString(),
      httpMethod: "POST"
    }]
  });
  return result.stagedUploadsCreate.stagedTargets[0];
}

// Step 4: Push file to staged URL
async function pushToStaged(target, buffer, filename) {
  const formData = new FormData();
  // Parameters first, file last — order matters
  for (const param of target.parameters) {
    formData.append(param.name, param.value);
  }
  formData.append("file", new Blob([buffer]), filename);
  const response = await fetch(target.url, {
    method: "POST",
    body: formData
  });
  if (!response.ok) {
    throw new Error(`Staged upload failed: ${response.status}`);
  }
  return target.resourceUrl;
}

// Step 5: Create Shopify file asset
async function createShopifyFile(stagedUrl, altText) {
  const FILE_CREATE = gql`
    mutation fileCreate($files: [FileCreateInput!]!) {
      fileCreate(files: $files) {
        files {
          id
          fileStatus
          alt
          ... on MediaImage {
            image { url width height }
          }
        }
        userErrors { field message }
      }
    }
  `;
  const result = await shopify.request(FILE_CREATE, {
    files: [{
      alt: altText,
      contentType: "IMAGE",
      originalSource: stagedUrl
    }]
  });
  return result.fileCreate.files[0];
}

// Step 6: Generate alt text via gpt-4o vision.
// Pass the PNG, not the AVIF — OpenAI vision accepts PNG/JPEG/WebP/GIF,
// and AVIF is not on the supported-formats list.
async function generateAltText(imageBuffer, productContext) {
  const base64 = imageBuffer.toString("base64");
  const response = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [{
      role: "user",
      content: [
        {
          type: "text",
          text: `Generate SEO + accessibility alt text for this product image. Context: ${productContext}. Keep under 125 characters. Describe what's actually visible plus the product name. No keyword stuffing.`
        },
        {
          type: "image_url",
          image_url: { url: `data:image/png;base64,${base64}` }
        }
      ]
    }],
    max_tokens: 80
  });
  return response.choices[0].message.content.trim();
}

// Full pipeline — orchestrate it all
async function generateAndUpload(prompt, productContext, filename) {
  console.log(`Generating: ${filename}`);
  const pngBuffer = await generateImage(prompt);
  const avifBuffer = await optimizeForShopify(pngBuffer);
  const altText = await generateAltText(pngBuffer, productContext);
  const target = await getStagedUpload(filename, "image/avif", avifBuffer.length);
  const stagedUrl = await pushToStaged(target, avifBuffer, filename);
  const file = await createShopifyFile(stagedUrl, altText);
  console.log(`✓ Uploaded: ${file.id} — ${altText}`);
  return file;
}

// Usage
generateAndUpload(
  "A clean studio product photograph of a forest-green organic cotton hoodie centered on white seamless background, soft upper-left lighting, e-commerce style, no text",
  "Forest green organic cotton hoodie, [BRAND], unisex",
  "hoodie-forest-green-hero.avif"
);
CDN behavior — Shopify auto-serves AVIF/WebP
Once your file is in Shopify, the platform's CDN handles modern format delivery automatically. AVIF for browsers that support it, WebP for fallback, JPEG/PNG for legacy. You don't manage this — but you do need to use the right Liquid filters.
<img src="Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2868): invalid url input"
alt=""
loading="lazy"><img src="Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2873): invalid url input"
srcset="Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2874): invalid url input 400w,
Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2875): invalid url input 800w,
Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2876): invalid url input 1200w,
Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2877): invalid url input 1600w"
sizes="(max-width: 768px) 100vw, 50vw"
alt=""
loading="lazy"
width="800" height="1200"><picture>
<source srcset="Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2885): invalid url input"
type="image/avif">
<source srcset="Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2887): invalid url input"
type="image/webp">
<img src="Liquid error (templates/page.chatgpt-image-shopify-masterclass line 2889): invalid url input"
alt=""
loading="lazy"
width="1200" height="1500">
</picture>
Alt text generation — apps + AI vision pipeline
Every uploaded image needs alt text for SEO and accessibility. AI search engines (ChatGPT, Claude, Perplexity) and Google all use alt text to understand image content. Missing alt text means missing visibility.
Shopify alt text apps (compared)
| App | Free Credits | Paid Plans | Languages | Notable |
|---|---|---|---|---|
| AltText.ai | 25 | $49–$489 | 130+ | Includes product data in alt; auto-translate sync to Shopify multi-language |
| Caseo.ai | 50 | Tiered credits | 8 (EN, ES, FR, PT, IT, DE, NL, LT) | Vision + product data + WCAG 2.1 AA compliance; meta tags + descriptions too |
| Squirai AI SEO | — | Tiered | — | Alt text + image minification + page speed optimization combined |
| SEO HERO AI Alt Text | — | One-time credit packs | Custom | Brand-aware, custom tone, custom keywords, no subscription required |
| Alt Text Generator AI | — | Tiered | EN, ES, DE+ | Supports JPEG, PNG, GIF, SVG, AVIF, WEBP — full format coverage |
Direct gpt-4o vision pipeline (no app required)
If you're already running the gpt-image-2 generation pipeline programmatically, generating alt text via gpt-4o vision adds maybe $0.001 per image and gives you total control. The code is in the full pipeline example above — it's the generateAltText function.
Alt text best practices for Shopify SEO
Do:
- Describe what's actually visible.
- Include product name + key attribute (color, material).
- Keep it under 125 characters.
- Use natural language.
- Mention context if relevant ("worn by model in urban setting").

Don't:
- Stuff keywords.
- Open with "image of" or "picture of" (screen readers already announce it's an image).
- Repeat the product title verbatim.
- Leave it blank or use filename garbage.
Example good: "Forest green organic cotton hoodie on model walking through urban street at golden hour"
Example bad: "image_001_final_v2.jpg" or "best hoodie sustainable eco-friendly Boston DTC streetwear"
File size optimization (pre-upload)
- AVIF for hero and lifestyle images — best compression, 90%+ browser support in 2026, ~50% smaller than WebP at equivalent quality.
- WebP for product variants and gallery shots — universal browser support, ~30% smaller than JPEG.
- 85% quality for product photos (preserve detail), 75% for lifestyle (forgiving on noise).
- Strip EXIF metadata before upload — reduces size and removes camera/location data leaks.
- Resize to delivery dimensions. Don't upload 4K source if Shopify will only render 1200px on PDP. Shopify handles size variants but you save CDN bandwidth and storage.
Done right, this pipeline ingests dozens of generated images per night, optimizes them, alt-texts them, uploads to Shopify CDN, and surfaces them ready to attach to products. The whole loop runs unattended on a schedule — the architecture that turns one operator into a content-production team of ten.
Cost & budget math — what gpt-image-2 actually costs to run
The real question isn't "how much does an image cost?" It's "what does this replace, and what's the breakeven point?" Here's the honest math. For the broader economics framework — how to model AI labor replacement across an entire creative function — see the Synthetic Director case study, which documents a full autonomous creative agency built with the same constraint-first methodology applied here.
Token-based pricing (official OpenAI)
| Token Type | Price per 1M tokens | What It Charges |
|---|---|---|
| Image input | $8.00 | Reference images you upload (full quality always) |
| Image cached input | $2.00 | Repeat reference images (75% discount on cache hits) |
| Image output | $30.00 | Pixels the model generates (varies with size and quality) |
| Text input | $5.00 | Your prompt text |
| Text cached input | $1.25 | System prompt repeated across calls |
| Text output | $10.00 | Reasoning tokens (Thinking Mode adds these) |
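Because billing is purely token-metered, a call's cost is a dot product of token counts and the rates in this table. A minimal estimator — the 7,000-token figure is back-derived from the ~$0.21 per-image cost in the next table ($0.21 ÷ $30/M), not an official spec:

```python
# Rates per 1M tokens, copied from the pricing table above (USD).
PRICE_PER_M = {
    "image_input": 8.00,
    "image_input_cached": 2.00,
    "image_output": 30.00,
    "text_input": 5.00,
    "text_input_cached": 1.25,
    "text_output": 10.00,
}

def call_cost(usage: dict) -> float:
    """Estimate one call's cost from token counts keyed like PRICE_PER_M."""
    return sum(PRICE_PER_M[kind] * count / 1_000_000
               for kind, count in usage.items())

# ~7,000 image-output tokens reproduces the ~$0.21 1024x1024-high figure.
print(round(call_cost({"text_input": 200, "image_output": 7_000}), 3))  # 0.211
```

Feed it the usage object the API returns per call and you get per-asset cost tracking for free.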
Per-image cost ranges
| Resolution × Quality | Approx. Cost | Use Case |
|---|---|---|
| 1024×768 · low | ~$0.01 | Drafts, exploration, batch thumbnails |
| 1024×1024 · medium | ~$0.05 | Standard social post, IG feed |
| 1024×1024 · high | ~$0.21 | Production e-commerce product photo |
| 1024×1536 · high | ~$0.18 | Vertical hero, IG story, lookbook |
| 2048×2048 · high | ~$0.35 | Hero banner, large print at native res |
| 4K (via fal.ai) · high | ~$0.41 | Print-bound assets, billboard |
gpt-image-2 processes image inputs at maximum quality regardless of your quality parameter. If your workflow involves uploading reference images and iterating, your actual per-asset cost runs higher than the baseline table suggests. Cached image tokens cost 75% less ($2/M vs $8/M) — design iterative workflows to reuse the same source image to capture this discount.
Real-world batch math
Scenario A: Solo Shopify operator, 50 SKUs, 5 images per product
- 50 SKUs × 5 images = 250 images
- 250 × $0.21 (1024×1024 high) = $52.50 total
- Time at Tier 2 (20 IPM): ~13 minutes generation + ~1 hour curation
- Replaces: Traditional product photography at $300–500 per product = $15,000–$25,000
- Savings: ~99.7%
Scenario B: Agency client, 500 SKUs full catalog refresh, 5 images each
- 500 × 5 = 2,500 images
- 2,500 × $0.21 = $525 total
- Time at Tier 3 (50 IPM): ~50 minutes generation + ~6 hours curation
- Bills client at: $25,000–$75,000 for the full catalog refresh deliverable
- Margin: 98%+ before labor and Photoshop polish
Scenario C: Weekly social ad rotation, 4 platforms, 20 ads per week
- 20 ads × 52 weeks × ~$0.18 average = $187/year
- Time at Tier 2: ~10 minutes per week
- Replaces: Freelance designer at $50/ad × 20 × 52 = $52,000/year
Scenario D: Full-resolution print campaign (8 hangtags + 4 hero posters)
- 12 images × $0.35 (2K high) = $4.20 in generation
- + ~$15 in Topaz Gigapixel cloud render credits
- + ~$5 in print proofing
- Total: ~$25
- Replaces: Photographer + designer for print campaign at $5,000–$15,000
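Every scenario above is the same two-line computation. A sketch that reproduces Scenario A — the function name is illustrative, and the IPM figures come from the rate-limit tiers discussed earlier in the class:

```python
def batch_plan(n_images: int, cost_per_image: float, images_per_minute: int) -> dict:
    """Generation cost and unattended wall-clock time for one batch run."""
    return {
        "cost_usd": round(n_images * cost_per_image, 2),
        "gen_minutes": round(n_images / images_per_minute, 1),
    }

# Scenario A: 50 SKUs x 5 images at ~$0.21 each, Tier 2 throughput (20 IPM).
print(batch_plan(250, 0.21, 20))  # {'cost_usd': 52.5, 'gen_minutes': 12.5}
```

Swap in the Scenario B numbers (2,500 images, 50 IPM) and you get the $525 / 50-minute figures the same way.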
AI vs traditional — the real comparison matrix
| Approach | Cost per Product (5 images) | Time | Flexibility | Brand Consistency |
|---|---|---|---|---|
| Traditional photoshoot | $300–$500 | 1-2 weeks | Low (reshoot for changes) | High (same photographer) |
| Freelance designer + stock | $80–$150 | 3-5 days | Medium | Medium |
| Shopify Magic (built-in) | Free with plan | Minutes | Low (1MP cap) | Low |
| Photoroom / Pebblely apps | ~$0.10–$0.50/image | Minutes | Medium | Medium |
| gpt-image-2 (this master class) | $1.05 ($0.21 × 5) | Minutes | Total (re-prompt anything) | High (Custom GPT + 16 refs) |
ChatGPT subscription decision matrix
If you're operating a single brand without automation, ChatGPT subscription often beats API for cost-per-image. If you're running automation, API wins for control. Here's the breakpoint math.
| Monthly Volume | ChatGPT Plus ($20/mo) | API @ $0.21 avg | Recommendation |
|---|---|---|---|
| 0–95 images | $20/mo flat | $0–$20 | API (only if automation needed) |
| 96–500 images | $20/mo flat | $20–$105 | ChatGPT Plus (sweet spot) |
| 500+ images | Plus may rate-limit | $105+ but predictable | API + Tier 2/3 + ChatGPT Plus for exploration |
| 2,000+ images | — | $420+ | API only, Tier 3+ recommended |
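The breakpoint in this matrix falls out of simple division: $20 ÷ $0.21 ≈ 95 images, which is where the first row ends. A sketch of the decision — the thresholds and return strings are illustrative, not a formal rule:

```python
def plan_recommendation(images_per_month: int, needs_automation: bool,
                        api_cost_per_image: float = 0.21,
                        plus_price: float = 20.0) -> str:
    """Rough Plus-vs-API call mirroring the matrix above."""
    if needs_automation:
        return "API"  # scripted pipelines need API access regardless of cost
    metered_cost = images_per_month * api_cost_per_image
    if metered_cost < plus_price:
        return "API or Plus (cost similar)"  # below the ~95-image breakeven
    if images_per_month <= 500:
        return "ChatGPT Plus"  # flat $20 beats metered cost in this band
    return "API + Plus for exploration"  # volume where Plus may rate-limit

print(plan_recommendation(250, needs_automation=False))  # ChatGPT Plus
```

The point of writing it down: the recommendation changes if your average image is a $0.05 medium draft instead of a $0.21 high, so keep the cost-per-image parameter honest.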
Cost optimization tactics
1. Use quality: low for iteration drafts. Promote to high only when composition is locked.
2. Cache reference images. Same source image across 10 edits = 75% discount on subsequent calls.
3. Right-size resolution. 400px mobile display doesn't need 2K input. Generate at 1024 for web, 2K only for print.
4. Batch via n parameter. n=4 is more token-efficient than 4 separate calls.
5. Use Instant Mode by default. Reserve Thinking Mode for prompts where reasoning measurably improves output.
6. Skip the Responses API for simple image gen. Direct images.generate avoids mainline-model overhead.
7. Use gpt-image-1-mini for high-volume drafts at ~25% of gpt-image-2 cost.
8. Hybrid stack: Nano Banana 2 for cheap backgrounds at $0.04, gpt-image-2 for hero work where text matters.
Risk, compliance, and the legal reality of AI imagery on Shopify
This section is not legal advice. It is operator-grounded risk awareness — what OpenAI's terms grant, what they don't, where US copyright law actually stands in 2026, and how to protect a Shopify store from preventable mistakes. For specific legal questions consult a lawyer.
Commercial use rights — what OpenAI grants you
OpenAI's Terms of Use (current as of April 2026) are unambiguous on output ownership. The relevant clause:
"As between you and OpenAI, and to the extent permitted by applicable law, you (a) retain your ownership rights in Input and (b) own the Output. We hereby assign to you all our right, title, and interest, if any, in and to Output."
Translation:
- You own what you generate. Sell it, modify it, license it, distribute it.
- OpenAI does not claim copyright on outputs.
- You can use gpt-image-2 outputs on Shopify product pages, ads, packaging, social media, and physical merchandise.
- No mandatory AI disclosure attribution to OpenAI.
- No exclusivity guarantee — other users may receive similar outputs from similar prompts.
What you remain responsible for
OpenAI grants you ownership of their rights in the output. They cannot grant you rights they don't have. You remain on the hook for third-party rights.
1. Don't generate copyrighted IP. No Disney, Marvel, Pixar, Nintendo, named-brand logos, recognizable game characters, or trademarked designs.
2. Don't generate real-celebrity likenesses without consent. OpenAI's Visual Capabilities terms explicitly prohibit using the model to "reproduce the likeness of any person without express consent and all necessary rights."
4. Don't deceive consumers. FTC concern: depicting a product in a way that misleads consumers about what they're buying. AI-rendered "hand-stitched" detail when the actual product is machine-stitched is a problem.
4. Don't generate fake reviews or testimonials. Showing AI-generated customer faces alongside fake quotes is FTC violation territory regardless of how the image was made.
5. Don't generate medical, legal, or financial claims imagery. A "before/after" weight loss image generated in AI could be deceptive advertising.
US copyright legal status (2026)
Here's where it gets nuanced. OpenAI assigning rights to you doesn't mean US copyright law actually grants traditional copyright protection to the image.
US Copyright Office guidance has held that works generated entirely by AI without meaningful human authorship may not be eligible for copyright protection. Multiple court rulings since 2023 have reinforced this.
What this means practically: you can use, sell, and modify AI-generated images commercially, but you may not be able to stop others from copying them. There is no exclusive ownership in the traditional sense.
To strengthen your copyright claim, add meaningful human contribution beyond "typing a prompt": composite editing, art direction decisions, layered compositing of AI elements with photography, color grading, retouching. The bar is "meaningful human authorship" — not just clicking generate.
Disclosure best practices (2026 regulatory landscape)
- FTC (US): Be transparent if AI imagery would mislead the consumer about a material product attribute. If your "lifestyle" shot shows the product in a setting that misrepresents how it actually performs, disclose.
- EU AI Act 2026: AI-generated content used in commercial advertising may require labeling depending on member-state implementation. Watch this space — rules are evolving.
- Shopify policy: AI-generated product imagery is currently allowed. Verify before launch via Shopify's Acceptable Use Policy.
- Industry shift: Voluntary "AI-assisted" labels are becoming common for trust in DTC. Some brands include a small line in About Us: "Some imagery on this site is AI-assisted. Real product photography is also used."
- Marketplaces: Stricter rules. Shutterstock, Adobe Stock, and Etsy have specific AI-content policies. Verify before listing AI-generated assets there.
Content moderation reality
gpt-image-2 has a separate output review model that is strict on copyrighted IP, real-person likenesses, NSFW, violence, and politically sensitive content. This affects practical workflow.
Refusals consume your quota. A blocked generation still counts against your IPM rate limit and burns the API call cost. False positives happen — the model occasionally refuses benign prompts that pattern-match to restricted categories.
Workarounds within policy: Be specific about safety-relevant elements. "A fictional character" instead of letting the model assume a real person. "Abstract symbol" instead of a generic logo placeholder. "Inspired by [public domain reference]" instead of riffing on a copyrighted character.
If you hit unjustified refusals, file feedback through OpenAI's developer console. The moderation model evolves, but unfounded refusals don't auto-correct without reports.
The DDS-style defensible policy
For any Shopify brand using gpt-image-2 at scale, document your AI usage policy as a brand asset. This protects you with customers, regulators, and platforms. Here's the template that holds up under scrutiny.
# [BRAND] AI Imagery Policy v1.0

## What we do
- We use AI image generation tools (including OpenAI gpt-image-2) to create marketing imagery, lifestyle scenes, packaging mockups, and brand assets for [BRAND].
- We always composite real product photography for hero product pages.
- We always composite our real logo from vector source — we do not rely on AI to reproduce it.
- We use AI imagery primarily for: lifestyle context, social ad variants, seasonal mood, infographics, packaging concepts.

## What we don't do
- We do not generate fake customer photos or testimonials.
- We do not generate likenesses of real people without their consent.
- We do not generate fake certification logos or compliance badges.
- We do not represent AI-rendered features as actual product attributes if those features are not accurate to the physical product.
- We do not generate copyrighted characters, logos, or trademarked IP.

## What we maintain
- All product page hero images are real product photography (not AI).
- All certification badges are sourced from official certification bodies.
- All claims (sustainability, materials, sizing) are independently verifiable through documentation we maintain.
- We retain at least one real photograph per product for transparency.

## Our position on AI disclosure
- We voluntarily disclose AI-assisted imagery on our About page.
- We use the term "AI-assisted" not "AI-generated" because all final imagery passes through human editorial review.
- We comply with FTC, EU AI Act, and platform-specific rules as they evolve.

## Our customer commitment
- If any image on our site materially misrepresents the physical product you receive, we offer free returns + a refund.
- Questions about how a specific image was made: customersupport@[brand-domain].com
This policy is genuinely defensible because it limits the surface area where you can be challenged. It also signals trustworthiness to customers — paradoxically, transparent disclosure of AI usage builds more trust than hiding it.
The hangtag problem (a worked example)
You generated a beautiful hangtag with gpt-image-2, including verbatim text and 5 certification icons. The icons are visually similar to GOTS, GRS, OCS, etc. — but they're AI-rendered, not real.
The mistake: shipping that hangtag with the AI-rendered icons. Even if the certifications are real and you have the documentation, displaying AI-rendered versions of certification logos likely violates the certification body's trademark and brand guidelines. Most certifications require use of their official logo files.
The fix: composite the real official certification logos (downloaded from the certifying body) over the AI-rendered hangtag in Photoshop or Affinity. Or generate the hangtag with empty placeholder areas and add real logos in post.
The lesson: AI generates the canvas. Real assets fill the slots that matter for compliance.
Hero showcase — 5 prompts that demonstrate peak capability
The 70-category catalog covers what you'll use every day. This section is different. These are the prompts that demonstrate just how far this model has moved — the ones that make the audience say "the model can really do that?" For coverage of every other AI model that shipped recently — Sora 2.5, Veo 3.1, Nano Banana Pro, Gemini 3.1 Ultra, and the rest of the model avalanche — see the Vibe Coder's Haven March 2026 edition.
Each one stress-tests a specific gpt-image-2 superpower: mixed-script multilingual rendering, multi-frame character continuity, UI mockups with verbatim small text, print-ready packaging with barcode, mixed-language storefront signage.
Run them in Thinking Mode at quality: high. Be patient — these prompts can take 60+ seconds. The output is shareable.
These prompts demonstrate capability, not production-readiness. The model invents statistics on infographics. Logo reproduction is unreliable. Verify every number, composite real logos post-generation, and never publish data charts without confirming the figures externally.
B1. Multilingual editorial magazine cover (mixed Latin + Japanese)
Stress test: ~99% text accuracy across mixed scripts in a layout-heavy editorial composition. Demonstrates the headline feature — text-in-image at production quality with cultural typography awareness.
B3. 8-panel storyboard with character continuity
Stress test: the n=8 Thinking Mode multi-frame consistency feature. Same character, same wardrobe, same stylistic voice across 8 distinct scenes — the workflow that makes lookbooks practical.
B4. UI mockup with realistic small text rendering
Stress test: dense small text at multiple type sizes inside a recognizable iOS layout. Tab labels, product names, prices, headlines — all verbatim, all readable.
B5. Print-ready packaging label with verbatim copy + barcode
Stress test: precise structured text inside a wrapped 3D label, with barcode and small fine print. Demonstrates the model's ability to render production-ready packaging concepts.
B7. Mixed-language storefront signage
Stress test: realistic environmental rendering with multiple text elements in different scripts at varying scales — main signage, window decals, sandwich board chalk script. Cultural typography awareness.
Don't just paste and run. Use them as capability probes — generate, study what worked, study what failed, then adapt the patterns into your own prompts. The structure (verbatim text rules, layout specifications, mood anchors) is more valuable than the specific subjects. These are templates for your own showcase work.
Commercial-intent FAQ
The 12 questions Shopify operators actually ask before adopting gpt-image-2. Answered without hype, with sources where applicable.
What is OpenAI's newest image model?
gpt-image-2. It replaces DALL-E 3 (which retires May 12, 2026) and introduces native reasoning ("Thinking Mode") that plans, web-searches, and self-checks images before generation. Key wins over DALL-E: ~99% character-level text accuracy across Latin, CJK, Hindi, and Bengali scripts; native 2K resolution; 16 reference images for brand consistency; 8 coherent images per call with character continuity.
How do I upload generated images to Shopify programmatically?
Call stagedUploadsCreate to get a temporary upload URL, POST your file to that URL with the returned auth parameters, then call fileCreate with the staged URL as originalSource. Files process asynchronously — poll fileStatus until READY. Maximum 20MB per file, 250 files per fileCreate batch. The complete Node.js end-to-end pipeline is in Section 10.
The bottom line
gpt-image-2 is the first AI image generator that ships production-ready Shopify imagery on the first try — text legible, brand consistent, scaled to need. The combination of ChatGPT Plus for daily exploration ($20/mo) and the OpenAI API for automated pipelines ($0.04–$0.35/image) replaces what used to require photographers, designers, and stock subscriptions adding up to thousands per month.
This master class is your reference. Bookmark it. Reference the prompt catalog. Build your Custom GPT. Run the pipeline. Ship better visuals than your competitors at a fraction of the cost. The tooling has changed faster than most operators have noticed. The window to gain advantage is now.
Continue the DDS Vibe Academy curriculum
This master class is one node in a 25-node constellation. Below is the recommended next-step reading by ring — Foundation, Development, Application, and Mastery. Every link goes to a free, full-length DDS Vibe Academy master class. No paywall. No email gate.
Multi-Model AI Image Generation Routing
Route across all 6 Gemini image models — Nano Banana Pro, Gemini 3.1 Flash Image, Imagen 4 Ultra. Live pricing, character-consistency chain, paste-ready Python SDK. Pairs directly with this gpt-image-2 master class for full multi-provider coverage.
APPLICATION · SISTER CLASS
Multi-Model AI Video Routing
The AGI-CORE-Pro pattern, end to end. Architect a constraint-aware router across Veo 3.1 Lite, Fast, and Standard with Nano Banana Pro reference frames. Live Gemini API pricing, real SDK code.
FOUNDATION · START HERE
Shopify Sidekick Masterclass
Complete AI prompting guide for Shopify's most powerful assistant. 12 modules covering theme customization, SEO, Flow automation, analytics, and CRO. The Foundation ring of the curriculum.
DEVELOPMENT · FLAGSHIP
Claude Code Masterclass — Part 1
The Foundation & Mastery class. 13 modules covering installation, CLAUDE.md, Skills, Subagents, Agent Teams, Hooks, MCP, and the full Ollama sovereign fallback. The starting point for all serious Claude work.
DEVELOPMENT · FLAGSHIP
Claude Code Masterclass — Part 2: Production Playbook
12 modules on git workflows, the DDS Vibe Coding methodology, React and Shopify build-alongs, multi-agent orchestration, cost control, and three DDS case studies with receipts.
DEVELOPMENT · S-TIER
Google Antigravity Masterclass: S-Tier Edition
22 modules, 81 paste-ready prompts. The complete S-Tier playbook for Google Antigravity including agent swarms, ocean logic, and self-healing systems. Pairs with this image-gen class for full multi-tool coverage.
DEVELOPMENT · DEEP DIVE
Gemini 3.1 Pro Definitive Vibe Coding Guide
Deep-dive on Google's flagship reasoning model. Multi-agent orchestration, thinking levels, 1M token strategies. Direct counterpart to the OpenAI workflows taught here.
APPLICATION · APP BUILDER
One Prompt App Library for Shopify Store Owners
Ship functional Shopify apps from a single prompt. Library of paste-ready app blueprints covering everything from custom inventory dashboards to AI-powered product recommendations.
APPLICATION · CASE STUDY
How I Built My Shopify Store With Claude AI
The full DDS Boston build journey. Real receipts, real timelines, real architecture decisions. The case study that anchors the entire methodology.
APPLICATION · CASE STUDY
The Synthetic Director — Autonomous Creative Agency
Production AI case study: a fully autonomous creative agency replacing $525K/yr of human labor. The economics framework that makes the cost math in this master class actionable.
MASTERY · SOVEREIGN STACK
Ollama for Windows Complete Guide
Run AI locally. RTX 3060 hardware benchmarks. 20+ downloadable models. API setup with code examples. Zero hosting cost. The Mastery-ring foundation for sovereign-stack operators.
MASTERY · CASE STUDY
Atelier OS — Multi-Agent System Case Study
The $15.5M multi-agent AI system built in 52 hours. Production architecture, agent coordination, real outputs. The blueprint for industrial-scale vibe coding.
MASTERY · MONTHLY
The Vibe Coder's Haven — March 2026
The Model Avalanche edition. GPT-5.4, Cursor Composer 2, Gemini 3.1 Ultra, NVIDIA Nemotron 3 Super, the Claude Mythos leak. Monthly coverage of every model and tool that shipped.
MASTERY · WHITEBOARD
AGI Nexus V9 — Autonomous Digital Office
Technical whiteboard for the $1M autonomous digital office architecture. The Mastery-ring engineering reference for operators building toward sovereign AI infrastructure.
PORTFOLIO · ARCHITECT
Robert McCullock Architect Portfolio 2026
The full DDS Sovereign AGI Suite. 11 synthetic employees automating $10.9M+/yr of labor. The production stack the entire DDS Vibe Academy curriculum teaches you to build.
HUB · ALL 25 NODES
DDS Vibe Academy — Full Constellation
The complete map. 25 master classes, guides, case studies, apps, and games organized into four orbital rings: Foundation, Development, Application, Mastery. Everything indexed and filterable.
Every page above is a verified live DDS Vibe Academy master class. The curriculum is interconnected because the methodology is interconnected — Custom GPT setup mirrors Claude Code Skills, prompt engineering compounds across providers, and the production economics scale across image, video, and agent work. Treating each tool as an isolated skill is how operators stay stuck. The architects who pull ahead read across the constellation.
Sources cited in this master class
Every fact, spec, and capability claim in this master class is verifiable against these sources. Tier 1 sources are official OpenAI and Shopify documentation. Tier 2 are major press from launch week. Tier 3 are verified community references. Where any claim could not be verified, it is marked "Not verified" inline.
Tier 1 — Official OpenAI
- openai.com/index/introducing-chatgpt-images-2-0/ — official launch announcement
- developers.openai.com/api/docs/models/gpt-image-2 — model card + snapshots
- developers.openai.com/api/docs/guides/image-generation — official API guide
- developers.openai.com/cookbook/examples/multimodal/image-gen-models-prompting-guide — prompting cookbook
- developers.openai.com/api/docs/pricing — current token pricing
- developers.openai.com/api/docs/guides/rate-limits — rate-limit documentation
- community.openai.com/t/introducing-gpt-image-2-available-today-in-the-api-and-codex/1379479 — dev forum announcement
- openai.com/policies/row-terms-of-use/ — Terms of Use (commercial rights)
- openai.com/policies/service-terms/ — Service Terms (Visual Capabilities)
- help.openai.com/en/articles/5008634 — copyright assignment confirmation
Tier 2 — Launch press
- TechCrunch — "ChatGPT's new Images 2.0 model is surprisingly good at generating text" (April 21, 2026)
- VentureBeat — Multilingual + infographic + manga capability review (April 21, 2026)
- 9to5Mac — Launch coverage and ChatGPT integration details (April 21, 2026)
Tier 3 — Shopify official
- shopify.dev/docs/api/admin-graphql/latest/mutations/stagedUploadsCreate — staged upload mutation
- shopify.dev/docs/api/admin-graphql/latest/mutations/fileCreate — file create mutation
- shopify.dev/docs/apps/build/online-store/product-media — media management guide
- shopify.com/blog/ai-image-generator — Shopify's overview of AI tools
Tier 4 — Verified community + 3rd-party
- fal.ai prompting guide + model page (openai/gpt-image-2 endpoint)
- Replicate model docs (openai/gpt-image-2)
- WaveSpeedAI builder review (production integration patterns)
- ImagineArt prompt guide (70 prompts catalog)
- Promptolis honest 25-prompts review (limitations + workarounds)
- ZeroLu/awesome-gpt-image GitHub (X-sourced viral prompts)
- PixVerse review + prompt guide (5-scenario stress tests)
- MindWiredAI breakdown (capability shifts)
- Lushbinary developer guide (token pricing breakdown)
- Findskill.ai (Etsy/Shopify solo seller workflow)
- Apidog API testing guide (parameter reference)
- chasejarvis.com Topaz vs Magnific comparison
- aiweiweiseeds.com print-resolution upscale guide
- AltText.ai, Caseo.ai, SEO HERO AI, Squirai AI, Alt Text Generator AI app pages
This master class was compiled April 29, 2026 against sources verified that day. AI products evolve quickly — pricing, rate limits, and policies may change. Before basing major business decisions on any spec, verify against the live OpenAI dashboard, current rate-limit page, and current pricing page. Where this master class is wrong, please surface it through customer support so the document can be corrected.
