VisionDraft · 8 min read · June 23, 2026

VisionDraft: MCP-Native Video Editing Infrastructure

VisionDraft is MCP-native video editing infrastructure for AI agents — timeline JSON, cloud renders, caption tools, and MCP API reference for developers.

By VisionDraft Team

Most "AI video" products ship another timeline with a chat sidebar. VisionDraft takes a different bet: MCP-native video editing infrastructure for AI agents — the execution layer ChatGPT, Claude, and custom automations call when they need real projects, captions, and MP4 exports.

This page is the definitive overview of what VisionDraft is, how it is built, and how it fits your agent stack.

Positioning

VisionDraft is	VisionDraft is not
MCP server + render infra	A gen-AI text-to-video toy
Timeline JSON engine	A Premiere replacement for cinema
Agent-first API	A black-box auto-clipper only
Async FFmpeg pipeline	A synchronous chat gimmick

Tagline: MCP-native video editing infrastructure for AI agents.

System Architecture

┌─────────────────────────────────────────┐
│  MCP Hosts: Claude, ChatGPT, Cursor     │
└──────────────────┬──────────────────────┘
                   │ HTTPS JSON-RPC
                   ▼
┌─────────────────────────────────────────┐
│  VisionDraft /api/mcp                   │
│  Auth: Bearer vd_...                    │
│  Tools: create_project, render_project… │
└──────────────────┬──────────────────────┘
                   │
     ┌─────────────┼─────────────┐
     ▼             ▼             ▼
 Projects      Supabase       Render queue
 + timeline    storage        → FFmpeg worker
   JSON                         → exports

Components map to src/lib/engine/ modules: projects, assets, timeline, captions, render queue, MCP server.

MCP Endpoint

URL: https://visiondraft.space/api/mcp (configurable via NEXT_PUBLIC_MCP_SERVER_URL)
Auth: Authorization: Bearer vd_<api_key>
Protocol: MCP over JSON-RPC — tools/list, tools/call

Setup UI: /mcp. Docs: /docs.
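
As a rough illustration of the wire format, the sketch below builds the headers and JSON-RPC 2.0 body for tools/list and tools/call. The URL and Bearer scheme come from this page; the build_request helper, the placeholder key, and the Accept header are assumptions, and the server may also expect an MCP initialize handshake first, which is not shown.

```python
import json
from typing import Optional

MCP_URL = "https://visiondraft.space/api/mcp"  # endpoint named on this page


def build_request(api_key: str, method: str,
                  params: Optional[dict] = None, request_id: int = 1):
    """Return (headers, body) for one JSON-RPC 2.0 call. Hypothetical helper."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # Assumption: MCP HTTP servers commonly accept both of these types.
        "Accept": "application/json, text/event-stream",
    }
    envelope = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        envelope["params"] = params
    return headers, json.dumps(envelope)


# Discover tools, then call one by name with an arguments object.
headers, body = build_request("vd_example_key", "tools/list")
call_headers, call_body = build_request(
    "vd_example_key", "tools/call",
    {"name": "list_projects", "arguments": {}}, request_id=2,
)
print(body)  # {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
```

Nothing here touches the network; POST the body to MCP_URL with any HTTP client.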

MCP Tools (V1)

Project lifecycle

create_project — Creates project with empty timeline:

{
  "version": 1,
  "duration": 0,
  "resolution": { "width": 1920, "height": 1080 },
  "fps": 30,
  "clips": [],
  "audioTracks": [],
  "captions": [],
  "overlays": [],
  "effects": []
}

list_projects — Returns authenticated user's projects.

Asset ingest

upload_asset — Base64 for smaller files (under ~4MB practical limit).

create_upload_url + complete_upload — Signed direct-to-storage path for large video.

list_assets — Filter by asset_type: video, image, audio.
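
One way an agent-side script might choose between the two ingest paths is by file size. This is a sketch under stated assumptions: the 4 MB cutoff mirrors the "~4MB practical limit" above, and every argument name (filename, data_base64, size_bytes) is hypothetical, so check tools/list for the real schemas.

```python
import base64

# Taken from the "~4MB practical limit" above; the exact cutoff is an assumption.
BASE64_LIMIT_BYTES = 4 * 1024 * 1024


def plan_upload(filename: str, data: bytes) -> dict:
    """Pick the ingest tool for a file based on its raw size."""
    if len(data) < BASE64_LIMIT_BYTES:
        return {
            "tool": "upload_asset",
            # Hypothetical argument names, for illustration only.
            "arguments": {
                "filename": filename,
                "data_base64": base64.b64encode(data).decode("ascii"),
            },
        }
    return {
        "tool": "create_upload_url",
        # Hypothetical argument names, for illustration only.
        "arguments": {"filename": filename, "size_bytes": len(data)},
        "then": "complete_upload",  # finish the signed upload afterwards
    }
```

Base64 inflates a payload by roughly a third, so a stricter cutoff may be safer if the limit applies to the encoded request.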

Captions

generate_captions — Downloads asset, runs Faster-Whisper, persists caption rows, mutates timeline with timed segments. Params: project_id, asset_id, optional language (default en).

Render & delivery

render_project — Enqueues FFmpeg job. Auto-places first video asset on timeline if clips empty. Params: project_id, export_name, burn_captions (default true).

get_render_status — Poll job_id until completed or failed.

download_export — Signed URL for finished file (typically 1-hour expiry).
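
Polling can be sketched as a small loop with a timeout. The get_status callable stands in for a get_render_status tool call, and the {"status": ...} response shape is an assumption; only the completed and failed states come from this page.

```python
import time
from typing import Callable


def wait_for_render(get_status: Callable[[str], dict], job_id: str,
                    interval_s: float = 30.0, timeout_s: float = 3600.0,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the job reports completed or failed, or the timeout passes."""
    waited = 0.0
    while True:
        status = get_status(job_id)  # assumed to return a dict with "status"
        if status.get("status") in ("completed", "failed"):
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"render job {job_id} still running after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Injecting sleep keeps the loop testable without real waiting.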

Usage Enforcement

Each tool call passes enforceMcpAction gates:

project — project count limits
storage — bytes per upload
caption — caption minutes
render — render minutes
read — list/get operations

Limits vary by plan; see pricing.

Render Workers

Renders do not run on Vercel serverless functions. External render workers (workers/render-worker.ts) claim render_jobs, execute FFmpeg (trim, merge, caption burn), and upload the result to the exports bucket.

Deploy workers on Railway, Fly.io, or VPS with ffmpeg, ffprobe, and Faster-Whisper dependencies.

REST APIs (Dashboard)

MCP is the agent path; REST mirrors capabilities for UI:

POST /api/projects/:projectId/assets — multipart upload
GET /api/projects/:projectId/assets — list assets
GET /api/render/:jobId — job status

Agents should prefer MCP for schema discovery.

Typical Agent Session

User: "Caption and export my webinar."
Agent: create_project → ingest → generate_captions → render_project
Agent polls get_render_status
Agent: download_export URL to user
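
The session above can be sketched as one function. The call_tool parameter is a hypothetical helper that sends tools/call and returns the parsed result as a dict. The generate_captions and render_project parameters come from this page; the create_project and download_export arguments and all result field names (project_id, job_id, status, export_id, url) are assumptions.

```python
import time
from typing import Callable

ToolCall = Callable[[str, dict], dict]


def caption_and_export(call_tool: ToolCall, ingest: Callable[[str], str],
                       export_name: str, poll_interval_s: float = 30.0,
                       max_polls: int = 240,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Create, ingest, caption, render, poll, and return a download URL."""
    project_id = call_tool("create_project", {"name": export_name})["project_id"]
    asset_id = ingest(project_id)  # upload_asset, or create_upload_url + complete_upload
    call_tool("generate_captions",
              {"project_id": project_id, "asset_id": asset_id, "language": "en"})
    job_id = call_tool("render_project",
                       {"project_id": project_id, "export_name": export_name,
                        "burn_captions": True})["job_id"]
    for _ in range(max_polls):
        status = call_tool("get_render_status", {"job_id": job_id})
        if status["status"] == "completed":
            # Assumed: the status payload carries the export identifier.
            return call_tool("download_export", {"export_id": status["export_id"]})["url"]
        if status["status"] == "failed":
            raise RuntimeError(f"render job {job_id} failed")
        sleep(poll_interval_s)
    raise TimeoutError(f"render job {job_id} did not finish in {max_polls} polls")
```

Passing call_tool and ingest in as parameters lets the same flow run against a real MCP client or a test double.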

Guides: ChatGPT, Claude, complete automation guide.

Why MCP-Native?

MCP-native software exposes tools as the product API. Benefits:

Host portability — same server for Claude and ChatGPT
Live schemas — hosts refresh tools without prompt hacks
Composable stacks — video MCP beside Slack, filesystem, CRM

VisionDraft is a case study in future AI agent workflows.

Plans

Plan	Typical user
Starter $5/mo	Experimentation
Creator $15/mo	Solo creators
Pro $49/mo	Small teams
Agency $99/mo	High render volume

Checkout via Dodo Payments; features include MCP integration and cloud rendering. Details: pricing.

Comparison to "AI Editors"

See best AI video editing tools 2026 and traditional vs agent editing.

VisionDraft wins when you need programmable video execution, not template transitions.

Security

API keys per user; regenerate in dashboard
RLS policies on Supabase tables
Signed export URLs time-limited
Webhook billing sync for subscription state

Getting Started Checklist

/signup — create account
/mcp — copy Server URL + API key
Connect Claude or ChatGPT — MCP guides
Run first create_project → render_project chain
Deploy or confirm render workers for production volume

Database Schema Overview

Core tables include projects (timeline JSON column), assets, captions, render_jobs, exports. RLS policies scope rows to user_id matching API key owner.

Agents never run SQL directly; MCP tools abstract mutations.

Subscription and Billing Integration

Dodo Payments webhooks update subscriptions in Supabase — plan_id, status. enforceMcpAction reads subscription state before tool execution.

Plans: Starter $5, Creator $15, Pro $49, Agency $99 — see pricing.

Environment Variables (Self-Hosting Context)

Deployers configuring workers set NEXT_PUBLIC_SUPABASE_URL, SUPABASE_SERVICE_ROLE_KEY, WORKER_ID. MCP URL defaults via NEXT_PUBLIC_MCP_SERVER_URL.

Reference .env.example in repository for full list.
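
A preflight check for the three worker variables named above might look like this sketch. The missing_worker_env helper is hypothetical, and .env.example may list more variables than these.

```python
import os
from typing import List, Mapping

# Only the variables named on this page; .env.example may require more.
REQUIRED_WORKER_ENV = (
    "NEXT_PUBLIC_SUPABASE_URL",
    "SUPABASE_SERVICE_ROLE_KEY",
    "WORKER_ID",
)


def missing_worker_env(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_WORKER_ENV if not env.get(name)]
```

Calling it at worker startup and exiting on a non-empty result surfaces misconfiguration before any job is claimed.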

Roadmap Philosophy

VisionDraft adds tools when agents need new verbs — trim APIs, overlay text, aspect ratio presets — without breaking existing schemas. Follow /docs changelog.

Contributing and Integration Partners

ISVs embedding VisionDraft should get in touch via site signup to discuss partnership, for example white-label render infra for vertical SaaS products adding agent video.

Comparison to Self-Hosted FFmpeg Scripts

DIY scripts lack multi-tenant auth, quota enforcement, dashboard, and MCP discovery. VisionDraft bundles ops patterns teams reinvent poorly.

Uptime and Status Communication

Subscribe to status communications if offered; pipeline SLAs should not assume 100% uptime — design retries and customer communication for render delays.

Extending VisionDraft With Webhooks (Future)

Watch /docs for outbound webhooks on render complete — today poll get_render_status; webhooks reduce pipeline complexity when shipped.

SDK and Client Libraries

Until official SDKs ship, a thin JSON-RPC wrapper in your language of choice suffices: a tool-call helper is roughly 100 lines.
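
A minimal wrapper, assuming the endpoint answers each POST with a plain JSON body, could look like the sketch below. McpClient, McpError, and http_transport are hypothetical names, not an official SDK. The tools and isError result fields follow the MCP specification; a server that streams responses as server-sent events or requires an initialize handshake would need more code.

```python
import itertools
import json
import urllib.request
from typing import Callable, Optional


class McpError(RuntimeError):
    """Raised for JSON-RPC errors and for tool results flagged isError."""


def http_transport(url: str, headers: dict, body: bytes) -> bytes:
    """POST the body and return the raw response (assumes a JSON reply)."""
    req = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(req, timeout=60) as resp:
        return resp.read()


class McpClient:
    def __init__(self, url: str, api_key: str,
                 transport: Callable[[str, dict, bytes], bytes] = http_transport):
        self._url = url
        self._headers = {"Authorization": f"Bearer {api_key}",
                         "Content-Type": "application/json"}
        self._ids = itertools.count(1)  # unique id per request
        self._transport = transport

    def rpc(self, method: str, params: Optional[dict] = None) -> dict:
        payload = {"jsonrpc": "2.0", "id": next(self._ids), "method": method}
        if params is not None:
            payload["params"] = params
        raw = self._transport(self._url, self._headers,
                              json.dumps(payload).encode("utf-8"))
        reply = json.loads(raw)
        if "error" in reply:
            raise McpError(reply["error"].get("message", "unknown JSON-RPC error"))
        return reply["result"]

    def list_tools(self) -> list:
        return self.rpc("tools/list")["tools"]

    def call_tool(self, name: str, arguments: dict) -> dict:
        result = self.rpc("tools/call", {"name": name, "arguments": arguments})
        if result.get("isError"):
            texts = [c.get("text", "") for c in result.get("content", [])]
            raise McpError("; ".join(texts) or f"{name} failed")
        return result
```

The injectable transport makes the client testable offline and lets you substitute another HTTP library.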

SLA Expectations for Builders

VisionDraft provides execution infra; your SLA to end users combines VisionDraft uptime, worker capacity, and agent host reliability. Communicate composite SLA honestly.

Multi-Tenant Product Builders

SaaS companies embedding VisionDraft can issue sub-accounts or proxy MCP calls behind their own auth. Architecture patterns will appear in /docs as partnership models mature.

Rate Limit Behavior

When enforceMcpAction denies a call, the error comes back as a string the agent can read. Design upstream retries with backoff rather than tight infinite loops that burn tokens.

FFmpeg Pipeline Internals (Conceptual)

Worker claims job, reads timeline JSON, resolves asset paths from storage, builds filter graph for caption burn, writes output to exports bucket, marks job complete. Failures capture stderr on job row for debugging.
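
To make the caption-burn step concrete, here is one common approach: write the caption segments to an SRT file and pass it to FFmpeg's subtitles filter. This is not necessarily how workers/render-worker.ts builds its filter graph, and the segment keys (start, end, text) are assumptions about the caption rows.

```python
from typing import List


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: List[dict]) -> str:
    """Render timed caption segments as SRT text (assumed segment keys)."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text']}\n"
        )
    return "\n".join(blocks)


def burn_captions_command(video_path: str, srt_path: str, output_path: str) -> List[str]:
    """Build an ffmpeg argv that burns subtitles in and copies the audio."""
    return ["ffmpeg", "-y", "-i", video_path,
            "-vf", f"subtitles={srt_path}",
            "-c:a", "copy", output_path]


print(srt_timestamp(3661.25))  # 01:01:01,250
```

The subtitles filter requires an FFmpeg build with libass, and paths containing colons or quotes need escaping inside the filter string.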

Changelog Discipline

Subscribe to product updates; tool additions like trim or overlay APIs change what agents can promise stakeholders — communicate roadmap in QBRs with enterprise customers.

Reference Appendix: Implementation Notes

Production teams should treat this guide as a living document tied to VisionDraft's MCP tool surface at /docs. Before any batch automation goes live, run a golden path test on a five-second sample clip: create_project, ingest, generate_captions, render_project, poll get_render_status, and download_export. Archive the resulting job_id and export_id as regression fixtures.

Credential hygiene remains the top security issue. API keys from /mcp belong in host connector settings or secrets managers — never in blog comments, ticket attachments, or Git repositories. Rotate keys when employees leave or when a connector was exposed in a screen share. For agencies, separate keys per client prevent accidental cross-posting of exports between brands.

Quota planning against the plan limits on pricing avoids mid-campaign surprises. Model monthly demand: number of episodes × (caption minutes + render minutes per episode), plus an allowance for Shorts derivatives. Upgrade tier before Black Friday or conference season, not after queue saturation. VisionDraft enforces limits server-side; agents surface errors but cannot override billing.
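
As a worked example of the demand model, the sketch below keeps caption minutes and render minutes separate, since they are metered by different gates. How Shorts count toward quota is an assumption here: each Short is treated as an extra render of its own length.

```python
def monthly_minutes(episodes: int, caption_min_per_episode: float,
                    render_min_per_episode: float,
                    shorts_per_episode: int = 0,
                    minutes_per_short: float = 0.0) -> dict:
    """Estimate monthly caption and render minutes for quota planning."""
    caption_total = episodes * caption_min_per_episode
    # Assumption: each Short adds one render of its own length.
    render_total = episodes * (
        render_min_per_episode + shorts_per_episode * minutes_per_short
    )
    return {"caption_minutes": caption_total, "render_minutes": render_total}


# 8 episodes of 45 minutes, each with three 1-minute Shorts.
estimate = monthly_minutes(8, 45, 45, shorts_per_episode=3, minutes_per_short=1)
print(estimate)  # {'caption_minutes': 360, 'render_minutes': 384}
```

Compare both totals against the plan limits on pricing and leave headroom for re-renders.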

Async discipline separates hobby workflows from production. Every operator must internalize: render_project returns immediately; completion requires get_render_status polling until completed or failed. Scripts should use exponential backoff with a cap (for example 30s, then 45s, then 60s for every later poll) and alert if p95 latency exceeds SLA. Do not chain duplicate render calls hoping to "speed up" a stuck job; diagnose the existing job_id first.
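
One reading of the 30s, 45s, 60s guidance is a delay that grows by 1.5× per poll and is capped at 60 seconds. The generator below sketches that schedule; the growth factor is an inference from those three numbers, not a documented value.

```python
import itertools
from typing import Iterator


def poll_delays(start_s: float = 30.0, factor: float = 1.5,
                cap_s: float = 60.0) -> Iterator[float]:
    """Yield an endless capped exponential sequence of poll delays."""
    delay = min(start_s, cap_s)
    while True:
        yield delay
        delay = min(delay * factor, cap_s)


print(list(itertools.islice(poll_delays(), 5)))  # [30.0, 45.0, 60.0, 60.0, 60.0]
```

Feed each value to the sleep between get_render_status calls, and stop after an overall deadline rather than polling forever.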

Human review gates protect brand and compliance. Automate mechanical captioning and encoding; keep humans on claims, regulated statements, music rights, and talent releases. Download URLs from download_export expire — copy files to your CDN or DAM within the signed URL window (typically one hour).

Cross-host portability is a core benefit of MCP-native infrastructure. The same VisionDraft project namespace works from Claude Desktop, ChatGPT connectors, or headless JSON-RPC clients. If one host has an outage, failover procedures should document alternate host configuration hitting identical Server URL and a backup API key.

Observability: log project_id, asset_id, job_id, and export_id for every production run. When stakeholders ask "which export went live Tuesday?", IDs answer definitively, unlike chat transcripts. Pair logs with VisionDraft dashboard render history during postmortems.
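
A structured log line per run keeps those four IDs searchable. This sketch uses Python's standard logging module; the logger name and the one-JSON-object-per-line format are arbitrary choices, not something VisionDraft prescribes.

```python
import json
import logging

logger = logging.getLogger("visiondraft.pipeline")  # arbitrary logger name


def log_run(project_id: str, asset_id: str, job_id: str, export_id: str) -> str:
    """Emit one JSON log line carrying every ID for a production run."""
    line = json.dumps(
        {
            "project_id": project_id,
            "asset_id": asset_id,
            "job_id": job_id,
            "export_id": export_id,
        },
        sort_keys=True,  # stable key order makes lines easy to diff and grep
    )
    logger.info(line)
    return line
```

Returning the line as well as logging it makes it easy to attach the same record to a ticket or postmortem.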

Related reading: what is MCP, complete guide to AI video automation, VisionDraft MCP infrastructure. Next step: create your account and configure /mcp to run the golden path test today.

VisionDraft is the video layer your agents are missing. Create an account and connect /mcp today.

Frequently asked questions

What is VisionDraft?

VisionDraft is MCP-native video editing infrastructure for AI agents — cloud projects, timeline JSON, asset storage, caption generation, and FFmpeg rendering exposed as MCP tools.

Is VisionDraft an AI video editor for humans?

It includes a dashboard for visibility, but the primary product surface is the MCP server at /api/mcp for ChatGPT, Claude, Cursor, and custom agents — not a traditional NLE competing on timeline UX.

Which MCP tools does VisionDraft expose?

create_project, list_projects, upload_asset, create_upload_url, complete_upload, list_assets, generate_captions, render_project, get_render_status, and download_export.

How do renders execute?

render_project enqueues jobs processed by external FFmpeg workers connected to Supabase — designed for long-running exports outside serverless timeouts.

How do I get started?

Sign up at visiondraft.space, copy MCP Server URL and API key from the dashboard MCP Setup page, and connect your agent host per documentation.
