VisionDraft · 8 min read · June 23, 2026

VisionDraft: MCP-Native Video Editing Infrastructure

VisionDraft is MCP-native video editing infrastructure for AI agents — timeline JSON, cloud renders, caption tools, and MCP API reference for developers.

By VisionDraft Team

Most "AI video" products ship another timeline with a chat sidebar. VisionDraft takes a different bet: MCP-native video editing infrastructure for AI agents — the execution layer ChatGPT, Claude, and custom automations call when they need real projects, captions, and MP4 exports.

This page is the definitive overview of what VisionDraft is, how it is built, and how it fits your agent stack.

Positioning

VisionDraft is            | VisionDraft is not
MCP server + render infra | A gen-AI text-to-video toy
Timeline JSON engine      | A Premiere replacement for cinema
Agent-first API           | A black-box auto-clipper only
Async FFmpeg pipeline     | A synchronous chat gimmick

Tagline: MCP-native video editing infrastructure for AI agents.

System Architecture

┌─────────────────────────────────────────┐
│  MCP Hosts: Claude, ChatGPT, Cursor     │
└──────────────────┬──────────────────────┘
                   │ HTTPS JSON-RPC
                   ▼
┌─────────────────────────────────────────┐
│  VisionDraft /api/mcp                   │
│  Auth: Bearer vd_...                    │
│  Tools: create_project, render_project… │
└──────────────────┬──────────────────────┘
                   │
     ┌─────────────┼─────────────┐
     ▼             ▼             ▼
 Projects      Supabase       Render queue
 + timeline    storage        → FFmpeg worker
   JSON                         → exports

Components map to src/lib/engine/ modules: projects, assets, timeline, captions, render queue, MCP server.

MCP Endpoint

  • URL: https://visiondraft.space/api/mcp (configurable via NEXT_PUBLIC_MCP_SERVER_URL)
  • Auth: Authorization: Bearer vd_<api_key>
  • Protocol: MCP over JSON-RPC — tools/list, tools/call

Setup UI: /mcp. Docs: /docs.
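
As a rough illustration of what travels over the wire, the sketch below builds the JSON-RPC 2.0 envelopes for tools/list and tools/call plus the Bearer header. The helper names and the Accept header are assumptions for illustration, and some MCP servers also expect an initialize handshake first; whether VisionDraft does is not stated here.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

MCP_URL = "https://visiondraft.space/api/mcp"  # URL given in this article
_request_ids = count(1)  # JSON-RPC requests carry a unique id


def build_headers(api_key: str) -> Dict[str, str]:
    """Headers for one MCP call; the key is expected to look like vd_<api_key>."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # Assumption: HTTP-based MCP servers commonly accept both types.
        "Accept": "application/json, text/event-stream",
    }


def build_request(method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request body for an MCP method."""
    body: Dict[str, Any] = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        body["params"] = params
    return body


list_request = build_request("tools/list")
call_request = build_request("tools/call", {"name": "list_projects", "arguments": {}})
print(json.dumps(call_request, indent=2))
```

Either body would be sent as an HTTPS POST to the Server URL with the headers above.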

MCP Tools (V1)

Project lifecycle

create_project — Creates project with empty timeline:

{
  "version": 1,
  "duration": 0,
  "resolution": { "width": 1920, "height": 1080 },
  "fps": 30,
  "clips": [],
  "audioTracks": [],
  "captions": [],
  "overlays": [],
  "effects": []
}
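
An agent or test script may want to sanity-check that a timeline still has this shape before mutating it. The following is a minimal sketch, not VisionDraft's own validation; the function name and the specific checks are assumptions derived only from the fields shown above.

```python
from typing import Any, Dict, List

# Keys taken from the empty timeline shown in this article.
REQUIRED_KEYS = {
    "version", "duration", "resolution", "fps",
    "clips", "audioTracks", "captions", "overlays", "effects",
}
LIST_KEYS = ("clips", "audioTracks", "captions", "overlays", "effects")


def timeline_problems(timeline: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the shape looks right."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - timeline.keys())]
    for key in LIST_KEYS:
        if key in timeline and not isinstance(timeline[key], list):
            problems.append(f"{key} should be a list")
    resolution = timeline.get("resolution")
    if isinstance(resolution, dict):
        for side in ("width", "height"):
            if not isinstance(resolution.get(side), int) or resolution[side] <= 0:
                problems.append(f"resolution.{side} should be a positive integer")
    return problems


EMPTY_TIMELINE = {
    "version": 1, "duration": 0,
    "resolution": {"width": 1920, "height": 1080}, "fps": 30,
    "clips": [], "audioTracks": [], "captions": [], "overlays": [], "effects": [],
}
print(timeline_problems(EMPTY_TIMELINE))
```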

list_projects — Returns authenticated user's projects.

Asset ingest

upload_asset — Base64 for smaller files (under ~4MB practical limit).

create_upload_url + complete_upload — Signed direct-to-storage path for large video.

list_assets — Filter by asset_type: video, image, audio.
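
The choice between the two ingest paths can be made from file size alone. A small sketch, assuming a hard 4 MiB cutoff (the article only gives "~4MB" as a practical limit) and hypothetical helper names:

```python
import base64
from typing import List

# Assumption: treat the "~4MB practical limit" as exactly 4 MiB.
BASE64_LIMIT_BYTES = 4 * 1024 * 1024


def plan_ingest(size_bytes: int) -> List[str]:
    """Return the MCP tools to call, in order, for a file of this size."""
    if size_bytes < BASE64_LIMIT_BYTES:
        return ["upload_asset"]
    return ["create_upload_url", "complete_upload"]


def encode_small_file(data: bytes) -> str:
    """Base64-encode file bytes for the inline upload path."""
    return base64.b64encode(data).decode("ascii")


print(plan_ingest(512 * 1024))         # small clip: inline base64
print(plan_ingest(250 * 1024 * 1024))  # large video: signed upload
```

Base64 grows the payload by roughly a third, which is one reason the inline path only suits small files.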

Captions

generate_captions — Downloads asset, runs Faster-Whisper, persists caption rows, mutates timeline with timed segments. Params: project_id, asset_id, optional language (default en).

Render & delivery

render_project — Enqueues FFmpeg job. Auto-places first video asset on timeline if clips empty. Params: project_id, export_name, burn_captions (default true).
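
The auto-place behaviour can be pictured as a small pure function over the timeline. This is a conceptual sketch only: the clip fields (asset_id, start, end) and the asset fields (id, duration) are hypothetical, since the article does not show the clip schema.

```python
import copy
from typing import Any, Dict, List


def auto_place_first_video(timeline: Dict[str, Any], assets: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of the timeline with the first video asset placed if clips is empty."""
    result = copy.deepcopy(timeline)
    if result["clips"]:
        return result  # the user already arranged clips; leave them alone
    first_video = next((a for a in assets if a.get("asset_type") == "video"), None)
    if first_video is None:
        return result  # nothing to place
    duration = first_video.get("duration", 0)
    # Hypothetical clip shape: one clip covering the whole asset.
    result["clips"].append({"asset_id": first_video["id"], "start": 0, "end": duration})
    result["duration"] = duration
    return result


timeline = {"duration": 0, "clips": []}
assets = [
    {"id": "img1", "asset_type": "image"},
    {"id": "vid1", "asset_type": "video", "duration": 12.5},
]
placed = auto_place_first_video(timeline, assets)
print(placed)
```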

get_render_status — Poll job_id until completed or failed.

download_export — Signed URL for finished file (typically 1-hour expiry).

Usage Enforcement

Each tool call passes enforceMcpAction gates:

  • project — project count limits
  • storage — bytes per upload
  • caption — caption minutes
  • render — render minutes
  • read — list/get operations

Limits vary by plan; see pricing.
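
One way to picture the gates is a lookup from tool name to category followed by a usage check. The sketch below is not the enforceMcpAction implementation; the tool-to-gate mapping, the usage and limit dictionaries, and the message format are all assumptions for illustration.

```python
from typing import Dict, Optional

# Assumed mapping from the V1 tools to the five gate categories.
TOOL_GATES = {
    "create_project": "project",
    "upload_asset": "storage",
    "create_upload_url": "storage",
    "complete_upload": "storage",
    "generate_captions": "caption",
    "render_project": "render",
    "list_projects": "read",
    "list_assets": "read",
    "get_render_status": "read",
    "download_export": "read",
}


def check_gate(tool: str, usage: Dict[str, float], limits: Dict[str, float]) -> Optional[str]:
    """Return None when the call may proceed, otherwise a readable error string."""
    gate = TOOL_GATES[tool]
    limit = limits.get(gate)
    if limit is None:
        return None  # assumption: a gate with no configured limit is unmetered
    used = usage.get(gate, 0)
    if used >= limit:
        return f"{gate} limit reached ({used}/{limit}); upgrade plan to continue"
    return None


print(check_gate("render_project", {"render": 60}, {"render": 60}))
```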

Render Workers

Renders do not run on Vercel serverless functions. External render workers (workers/render-worker.ts) claim render_jobs, execute FFmpeg (trim, merge, caption burn), upload to exports bucket.

Deploy workers on Railway, Fly.io, or VPS with ffmpeg, ffprobe, and Faster-Whisper dependencies.

REST APIs (Dashboard)

MCP is the agent path; REST mirrors capabilities for UI:

  • POST /api/projects/:projectId/assets — multipart upload
  • GET /api/projects/:projectId/assets — list assets
  • GET /api/render/:jobId — job status

Agents should prefer MCP for schema discovery.

Typical Agent Session

  1. User: "Caption and export my webinar."
  2. Agent: create_project → ingest → generate_captions → render_project
  3. Agent polls get_render_status
  4. Agent: returns the download_export URL to the user

Guides: ChatGPT, Claude, complete automation guide.
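
The session above can be sketched as one function that takes a call_tool callable, so it can be exercised against a fake before touching a real account. The parameters project_id, asset_id, language, export_name, and burn_captions come from this article; the upload_asset arguments and every result field name (project_id, asset_id, job_id, status, url) are assumptions.

```python
import time
from typing import Any, Callable, Dict

ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def caption_and_export(
    call_tool: ToolCall,
    video_b64: str,
    export_name: str,
    sleep: Callable[[float], None] = time.sleep,
    poll_seconds: float = 30.0,
    max_polls: int = 60,
) -> str:
    """Run create -> ingest -> caption -> render -> poll -> download; return the export URL."""
    project_id = call_tool("create_project", {"name": export_name})["project_id"]
    asset_id = call_tool(
        "upload_asset",
        # Hypothetical argument names for the inline upload path.
        {"project_id": project_id, "filename": export_name + ".mp4", "data_base64": video_b64},
    )["asset_id"]
    call_tool("generate_captions", {"project_id": project_id, "asset_id": asset_id, "language": "en"})
    job_id = call_tool(
        "render_project",
        {"project_id": project_id, "export_name": export_name, "burn_captions": True},
    )["job_id"]
    for _ in range(max_polls):
        status = call_tool("get_render_status", {"job_id": job_id})["status"]
        if status == "completed":
            return call_tool("download_export", {"job_id": job_id})["url"]
        if status == "failed":
            raise RuntimeError(f"render {job_id} failed")
        sleep(poll_seconds)
    raise TimeoutError(f"render {job_id} did not finish after {max_polls} polls")
```

Injecting call_tool and sleep keeps the flow testable without network access or real waiting.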

Why MCP-Native?

MCP-native software exposes tools as the product API. Benefits:

  • Host portability — same server for Claude and ChatGPT
  • Live schemas — hosts refresh tools without prompt hacks
  • Composable stacks — video MCP beside Slack, filesystem, CRM

VisionDraft is a case study in future AI agent workflows.

Plans

Plan            | Typical user
Starter $5/mo   | Experimentation
Creator $15/mo  | Solo creators
Pro $49/mo      | Small teams
Agency $99/mo   | High render volume

Checkout via Dodo Payments; features include MCP integration and cloud rendering. Details: pricing.

Comparison to "AI Editors"

See best AI video editing tools 2026 and traditional vs agent editing.

VisionDraft wins when you need programmable video execution, not template transitions.

Security

  • API keys per user; regenerate in dashboard
  • RLS policies on Supabase tables
  • Signed export URLs time-limited
  • Webhook billing sync for subscription state

Getting Started Checklist

  • /signup — create account
  • /mcp — copy Server URL + API key
  • Connect Claude or ChatGPT — MCP guides
  • Run first create_project → render_project chain
  • Deploy or confirm render workers for production volume

Database Schema Overview

Core tables include projects (timeline JSON column), assets, captions, render_jobs, exports. RLS policies scope rows to user_id matching API key owner.

Agents never run SQL directly; MCP tools abstract all mutations.

Subscription and Billing Integration

Dodo Payments webhooks update subscriptions in Supabase — plan_id, status. enforceMcpAction reads subscription state before tool execution.

Plans: Starter $5, Creator $15, Pro $49, Agency $99 — see pricing.

Environment Variables (Self-Hosting Context)

Deployers configuring workers set NEXT_PUBLIC_SUPABASE_URL, SUPABASE_SERVICE_ROLE_KEY, WORKER_ID. MCP URL defaults via NEXT_PUBLIC_MCP_SERVER_URL.

Reference .env.example in repository for full list.
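
A pre-flight check can catch a missing variable before a worker starts claiming jobs. The sketch below is a standalone script idea, not part of the VisionDraft repository; the function name is hypothetical, and falling back to the public URL from this article when NEXT_PUBLIC_MCP_SERVER_URL is unset is an assumption.

```python
import os
from typing import Dict, Mapping, Optional

# Variable names listed in this article; .env.example may contain more.
REQUIRED_VARS = ("NEXT_PUBLIC_SUPABASE_URL", "SUPABASE_SERVICE_ROLE_KEY", "WORKER_ID")
DEFAULT_MCP_URL = "https://visiondraft.space/api/mcp"  # assumed fallback


def load_worker_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect worker settings, failing fast with the names of any missing variables."""
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not source.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    config = {name: source[name] for name in REQUIRED_VARS}
    config["NEXT_PUBLIC_MCP_SERVER_URL"] = source.get("NEXT_PUBLIC_MCP_SERVER_URL", DEFAULT_MCP_URL)
    return config


example = load_worker_env({
    "NEXT_PUBLIC_SUPABASE_URL": "https://example.supabase.co",
    "SUPABASE_SERVICE_ROLE_KEY": "service-role-placeholder",
    "WORKER_ID": "worker-1",
})
print(sorted(example))
```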

Roadmap Philosophy

VisionDraft adds tools when agents need new verbs — trim APIs, overlay text, aspect ratio presets — without breaking existing schemas. Follow /docs changelog.

Contributing and Integration Partners

ISVs embedding VisionDraft should contact via site signup for partnership discussion — white-label render infra for vertical SaaS adding agent video.

Comparison to Self-Hosted FFmpeg Scripts

DIY scripts lack multi-tenant auth, quota enforcement, dashboard, and MCP discovery. VisionDraft bundles ops patterns teams reinvent poorly.

Uptime and Status Communication

Subscribe to status communications if offered; pipeline SLAs should not assume 100% uptime — design retries and customer communication for render delays.

Extending VisionDraft With Webhooks (Future)

Watch /docs for outbound webhooks on render complete — today poll get_render_status; webhooks reduce pipeline complexity when shipped.

SDK and Client Libraries

Until official SDKs ship, thin JSON-RPC wrappers in your language of choice suffice — ~100 lines for tool call helper.
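
A thin wrapper of this kind might look like the sketch below. It assumes the server answers each POST with a single plain JSON body (not an event stream) and follows the general MCP result shape of a content list plus an isError flag; the class and function names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Callable, Dict, List

Send = Callable[[Dict[str, Any]], Dict[str, Any]]


class McpToolError(Exception):
    """Raised for JSON-RPC errors and for tool results flagged with isError."""


def http_transport(url: str, api_key: str, timeout: float = 30.0) -> Send:
    """Return a send function that POSTs one JSON-RPC payload and parses the JSON reply."""
    def send(payload: Dict[str, Any]) -> Dict[str, Any]:
        request = urllib.request.Request(
            url,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    return send


class McpClient:
    def __init__(self, send: Send) -> None:
        self._send = send
        self._next_id = 0

    def _rpc(self, method: str, params: Dict[str, Any]) -> Dict[str, Any]:
        self._next_id += 1
        reply = self._send({"jsonrpc": "2.0", "id": self._next_id, "method": method, "params": params})
        if "error" in reply:
            raise McpToolError(reply["error"].get("message", "JSON-RPC error"))
        return reply["result"]

    def list_tools(self) -> List[str]:
        return [tool["name"] for tool in self._rpc("tools/list", {})["tools"]]

    def call_tool(self, name: str, arguments: Dict[str, Any]) -> str:
        result = self._rpc("tools/call", {"name": name, "arguments": arguments})
        # Join the text parts of the result; other content types are ignored here.
        text = "".join(p["text"] for p in result.get("content", []) if p.get("type") == "text")
        if result.get("isError"):
            raise McpToolError(text)
        return text
```

A real client would be built with McpClient(http_transport(server_url, api_key)); passing a fake send function instead makes the wrapper easy to unit test.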

SLA Expectations for Builders

VisionDraft provides execution infra; your SLA to end users combines VisionDraft uptime, worker capacity, and agent host reliability. Communicate composite SLA honestly.

Multi-Tenant Product Builders

SaaS companies embedding VisionDraft issue sub-accounts or proxy MCP with their auth — architecture patterns in /docs as partnership models mature.

Rate Limit Behavior

When enforceMcpAction denies calls, errors return as strings agents can read — design upstream retry with backoff, not infinite tight loops burning tokens.
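
A bounded retry helper along these lines keeps an agent from looping forever. It is a sketch under assumptions: the article does not specify the wording of denial strings, so the is_retryable check below (looking for "rate limit") is purely illustrative, and a quota that is exhausted for the billing period should not be retried at all.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def is_retryable(message: str) -> bool:
    """Hypothetical classifier: retry transient rate limits, not exhausted quotas."""
    return "rate limit" in message.lower()


def call_with_backoff(
    call: Callable[[], T],
    retryable: Callable[[str], bool] = is_retryable,
    max_attempts: int = 4,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call once, then retry with doubling delays; give up after max_attempts."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as exc:
            if attempt == max_attempts or not retryable(str(exc)):
                raise  # out of attempts, or retrying cannot help
            sleep(delay)
            delay *= 2
    raise RuntimeError("max_attempts must be at least 1")
```

Wrapping a tool call as call_with_backoff(lambda: client.call_tool(...)) gives at most four attempts by default, after which the error surfaces to the operator.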

FFmpeg Pipeline Internals (Conceptual)

Worker claims job, reads timeline JSON, resolves asset paths from storage, builds filter graph for caption burn, writes output to exports bucket, marks job complete. Failures capture stderr on job row for debugging.
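
To make the caption-burn step concrete, the sketch below turns caption segments into SRT text and assembles an FFmpeg argument list that uses the subtitles video filter. This is not the worker's actual code: the caption fields (start, end, text), the use of SRT as the intermediate format, and the function names are assumptions, and the subtitles filter requires an FFmpeg build with libass.

```python
from typing import Dict, List, Union

Caption = Dict[str, Union[float, str]]


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, rest = divmod(millis, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def captions_to_srt(captions: List[Caption]) -> str:
    """Render timed segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, cap in enumerate(captions, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(cap['start'])} --> {srt_timestamp(cap['end'])}\n{cap['text']}\n"
        )
    return "\n".join(cues)


def build_ffmpeg_args(input_path: str, srt_path: str, output_path: str, burn_captions: bool = True) -> List[str]:
    """Build an argv list; paths with special characters would need filter escaping."""
    args = ["ffmpeg", "-y", "-i", input_path]
    if burn_captions:
        args += ["-vf", f"subtitles={srt_path}"]  # burns text into the video frames
    args += ["-c:a", "copy", output_path]  # keep the audio stream untouched
    return args


print(captions_to_srt([{"start": 0.0, "end": 1.5, "text": "Welcome to the webinar."}]))
print(build_ffmpeg_args("in.mp4", "captions.srt", "out.mp4"))
```

A worker would pass the list to subprocess.run with check=True and store stderr on failure, matching the debugging note above.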

Changelog Discipline

Subscribe to product updates; tool additions like trim or overlay APIs change what agents can promise stakeholders — communicate roadmap in QBRs with enterprise customers.

Reference Appendix: Implementation Notes

Production teams should treat this guide as a living document tied to VisionDraft's MCP tool surface at /docs. Before any batch automation goes live, run a golden path test on a five-second sample clip: create_project, ingest, generate_captions, render_project, poll get_render_status, and download_export. Archive the resulting job_id and export_id as regression fixtures.

Credential hygiene remains the top security issue. API keys from /mcp belong in host connector settings or secrets managers — never in blog comments, ticket attachments, or Git repositories. Rotate keys when employees leave or when a connector was exposed in a screen share. For agencies, separate keys per client prevent accidental cross-posting of exports between brands.

Planning quota against the pricing tiers avoids mid-campaign surprises. Model monthly demand as number of episodes × (caption minutes + render minutes per episode), plus a Shorts derivative factor for clips cut from each episode. Upgrade tier before Black Friday or conference season, not after queue saturation. VisionDraft enforces limits server-side; agents surface errors but cannot override billing.
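
A worked version of that demand model, kept per gate because caption minutes and render minutes are metered separately. The example figures are invented for illustration, and treating the Shorts factor as extra render minutes per episode is an assumption about how to read the formula.

```python
from typing import Dict


def monthly_demand(
    episodes: int,
    caption_min_per_episode: float,
    render_min_per_episode: float,
    shorts_per_episode: int = 0,
    render_min_per_short: float = 0.0,
) -> Dict[str, float]:
    """Estimate caption and render minutes needed in one month."""
    caption_minutes = episodes * caption_min_per_episode
    # Each episode renders once in full, plus its derived Shorts.
    render_minutes = episodes * (render_min_per_episode + shorts_per_episode * render_min_per_short)
    return {"caption_minutes": caption_minutes, "render_minutes": render_minutes}


# Example: 8 episodes of 45 minutes, each cut into 3 one-minute Shorts.
estimate = monthly_demand(8, 45, 45, shorts_per_episode=3, render_min_per_short=1)
print(estimate)  # {'caption_minutes': 360, 'render_minutes': 384}
```

Comparing the two totals with a plan's limits shows which gate is likely to be hit first.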

Async discipline separates hobby workflows from production. Every operator must internalize: render_project returns immediately; completion requires polling get_render_status until the job is completed or failed. Scripts should use exponential backoff with a cap (for example 30s, then 45s, then 60s for every later poll) and alert if p95 latency exceeds SLA. Do not chain duplicate render calls hoping to "speed up" a stuck job; diagnose the existing job_id first.
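
The polling schedule can be sketched as a generator that multiplies the delay by 1.5 and caps it at 60 seconds, which yields 30, 45, 60, 60, and so on. The get_status callable and the 30-minute overall budget are assumptions; only the completed and failed states come from this article.

```python
import time
from itertools import islice
from typing import Callable, Iterator


def backoff_delays(first: float = 30.0, factor: float = 1.5, cap: float = 60.0) -> Iterator[float]:
    """Yield capped exponential delays: 30, 45, 60, 60, ..."""
    delay = first
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def wait_for_render(
    get_status: Callable[[str], str],
    job_id: str,
    max_wait: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll one job until it reaches a terminal state or the wait budget runs out."""
    waited = 0.0
    for delay in backoff_delays():
        status = get_status(job_id)
        if status in ("completed", "failed"):
            return status
        if waited + delay > max_wait:
            raise TimeoutError(f"job {job_id} still '{status}' after {waited:.0f}s")
        sleep(delay)
        waited += delay
    raise AssertionError("unreachable: backoff_delays is infinite")


print(list(islice(backoff_delays(), 5)))  # [30.0, 45.0, 60.0, 60.0, 60.0]
```

On timeout the same job_id is reported rather than re-submitted, in line with the advice to diagnose the existing job first.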

Human review gates protect brand and compliance. Automate mechanical captioning and encoding; keep humans on claims, regulated statements, music rights, and talent releases. Download URLs from download_export expire — copy files to your CDN or DAM within the signed URL window (typically one hour).

Cross-host portability is a core benefit of MCP-native infrastructure. The same VisionDraft project namespace works from Claude Desktop, ChatGPT connectors, or headless JSON-RPC clients. If one host has an outage, failover procedures should document alternate host configuration hitting identical Server URL and a backup API key.

Observability: log project_id, asset_id, job_id, and export_id for every production run. When stakeholders ask "which export went live Tuesday?", IDs answer definitively unlike chat transcripts. Pair logs with VisionDraft dashboard render history during postmortems.
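
One lightweight way to capture those IDs is a JSON line per run, which stays searchable long after the chat transcript is gone. The logger name and the extra fields below are hypothetical.

```python
import json
import logging
from typing import Any


def run_record(project_id: str, asset_id: str, job_id: str, export_id: str, **extra: Any) -> str:
    """Serialize the four IDs (plus optional context) as one sorted JSON line."""
    record = {
        "project_id": project_id,
        "asset_id": asset_id,
        "job_id": job_id,
        "export_id": export_id,
    }
    record.update(extra)
    return json.dumps(record, sort_keys=True)


logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
logger = logging.getLogger("visiondraft.runs")  # hypothetical logger name
logger.info(run_record("proj_1", "asset_1", "job_1", "export_1", campaign="webinar"))
```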

Related reading: what is MCP, complete guide to AI video automation, VisionDraft MCP infrastructure. Next step: create your account and configure /mcp to run the golden path test today.

Frequently Asked Questions

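What is VisionDraft?

VisionDraft is MCP-native video editing infrastructure for AI agents: cloud projects, timeline JSON, asset storage, caption generation, and FFmpeg rendering exposed as MCP tools.

Is VisionDraft an AI video editor for humans?

It includes a dashboard for visibility, but the primary product surface is the MCP server at /api/mcp for ChatGPT, Claude, Cursor, and custom agents, not a traditional NLE competing on timeline UX.

Which MCP tools does VisionDraft expose?

create_project, list_projects, upload_asset, create_upload_url, complete_upload, list_assets, generate_captions, render_project, get_render_status, and download_export. See /docs.

How do renders execute?

render_project enqueues jobs that external FFmpeg workers claim from the render queue, a design meant for long-running exports outside serverless timeouts.

How do I get started?

Sign up at /signup, copy the MCP Server URL and API key from /mcp, and connect your agent host per /docs.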


VisionDraft is the video layer your agents are missing. Create an account and connect /mcp today.
