Developer & APIs · Jun 25, 2026 · 5 min read

FFmpeg File Conversion: Build an MCP Server for Claude Code

Hasnain Nisar · Automation engineer · Nisar Automates

TL;DR:

  • MCP (Model Context Protocol) lets AI agents call external tools like FFmpeg as native functions, with no more "I can't process that file" dead ends.
  • This guide shows you how to wrap any FFmpeg file conversion command into a callable MCP server that Claude Code, Cursor, and Windsurf can use directly.
  • You'll ship a working server in ~30 minutes with copy-paste code, real error handling, and a free boilerplate you can clone.
  • Perfect for: developers building AI agents that need to ingest Word docs, HEIC images, obscure audio formats, or anything else LLMs can't touch natively.

Your agent just hit another wall. You dropped a .heic into Claude Code and got the shrug: "I can't directly process HEIC files." Same story with .doc, .odt, .flac, .wmv—the list of formats LLMs can't touch is longer than the ones they can. The fix isn't uploading to some random online converter and copy-pasting back. It's giving your agent a file conversion tool it can call like any other function.

That's what MCP servers do. The Model Context Protocol, open-sourced by Anthropic in late 2024 and now at 97 million monthly SDK downloads, turns external capabilities into native agent tools. This guide walks you through building one specifically for FFmpeg file conversion—the engine behind most professional transcoding—so your agent can convert anything to anything without leaving your terminal.

By the end: a working MCP server you can clone, extend, and plug into Claude Code, Cursor, or Windsurf. Let's build it.


What is file conversion and how does it work?

File conversion is the process of decoding data from one format's binary structure and re-encoding it into another, using codec software that understands both source and target specifications. FFmpeg handles this by reading the source container (AVI, MP4, MKV), extracting each stream with the appropriate decoder, optionally filtering or resampling, then encoding and writing the result into the new container. Converting to a lossy codec such as MP3 or H.264 permanently discards information to reduce size; lossless paths such as FLAC to WAV, or FFV1 video in a Matroska container, preserve every sample.
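The difference between changing only the container and re-encoding the streams shows up directly in FFmpeg's argument list. The following is a minimal sketch, assuming a hypothetical helper named buildConversionArgs that is not part of the server built later: `-c copy` copies streams into the new container without decoding them, while `-c:v` and `-c:a` select encoders and force a decode/encode pass.

```typescript
// Hypothetical helper for illustration only; the name and the Plan type are assumptions.
type Plan =
  | { kind: 'remux' } // change the container, keep the encoded streams
  | { kind: 'transcode'; videoCodec?: string; audioCodec?: string };

export function buildConversionArgs(input: string, output: string, plan: Plan): string[] {
  const args = ['-i', input];
  if (plan.kind === 'remux') {
    // -c copy: demux and remux every stream without decoding it
    args.push('-c', 'copy');
  } else {
    // -c:v / -c:a choose the encoder for the video and audio streams
    if (plan.videoCodec) args.push('-c:v', plan.videoCodec);
    if (plan.audioCodec) args.push('-c:a', plan.audioCodec);
  }
  args.push(output);
  return args;
}
```

A remux is fast and lossless but only works when the target container accepts the existing codecs; a transcode is slower and, with lossy encoders, discards data.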


What is MCP and why does file conversion need it?

MCP is a standard for AI agents to discover and call external tools at runtime. Think of it like USB-C for AI capabilities: one protocol, any tool. Instead of hardcoding API calls or explaining to your agent "please use this script," you register a server that exposes functions the agent invokes directly.

For file conversion, this matters because LLMs are intentionally narrow about what they'll accept. As of mid-2026, Claude, GPT-4o, and Gemini still refuse or fail on dozens of common formats: HEIC/HEIF images, WordPerfect documents, Ogg Vorbis audio, ProRes video, and anything requiring patent-encumbered decoders. According to Anthropic's MCP documentation, the protocol's fastest-growing use case is "tool augmentation"—giving agents capabilities the base model lacks.

The specific problem: when an agent encounters an unsupported file, it has three bad options:

  1. Ask the user to convert it manually (friction, context loss)
  2. Hallucinate that it can process it (wastes tokens, produces garbage)
  3. Refuse to proceed at all (dead end)

An MCP server for FFmpeg file conversion adds a fourth: convert automatically, then continue the task.
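On the wire, that fourth option is an ordinary JSON-RPC 2.0 message: the agent sends a `tools/call` request naming the tool and its arguments. The sketch below shows the request shape, assuming a tool called convert_file with inputPath and outputFormat arguments like the one defined later in this guide; the helper name makeConvertCall is hypothetical.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: builds the request an agent would send to convert a file.
export function makeConvertCall(id: number, inputPath: string, outputFormat: string): ToolCallRequest {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'convert_file', arguments: { inputPath, outputFormat } },
  };
}
```

In practice the MCP client inside the agent builds this message for you; the sketch is only meant to show that a tool call is plain structured data, not a shell command.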


What you'll build: architecture overview

Your MCP server will expose two tools:

  • convert_file: transforms any input to any output format FFmpeg supports
  • probe_file: returns metadata (duration, codecs, resolution) before committing to conversion

The server runs locally or on a small VPS, accepts absolute file paths, spawns FFmpeg as a child process, and returns the result. Claude Code discovers it through a project-level .mcp.json file (or the claude mcp add command), while Claude Desktop reads claude_desktop_config.json.

Architecture at a glance:

| Component | Responsibility | Your Choice |
| --- | --- | --- |
| Transport | How the agent talks to the server | stdio (local) or SSE (remote) |
| FFmpeg | The actual conversion engine | System install or static binary |
| File handling | Where inputs/outputs live | Temp files (local) or S3-compatible (cloud) |
| Error reporting | What the agent sees when things break | Structured JSON with human-readable messages |
| Security | Preventing arbitrary command execution | Strict allowlist of formats, no shell interpolation |

For this build, we'll use stdio transport (simplest, most reliable with Claude Code), system FFmpeg (install once, use everywhere), and temp files with strict path validation (no traversal attacks).
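Strict path validation can be as small as one function. The sketch below is one possible approach, not the guide's boilerplate: a hypothetical helper resolveInside that resolves a candidate path against an allowed working directory and rejects anything that lands outside it. It does not follow symlinks, so treat it as a first line of defence only.

```typescript
import * as path from 'node:path';

// Hypothetical helper: resolve `candidate` under `root` and reject traversal.
// Note: this checks the lexical path only; symlinks are not resolved.
export function resolveInside(root: string, candidate: string): string {
  const absRoot = path.resolve(root);
  const abs = path.resolve(absRoot, candidate);
  const rel = path.relative(absRoot, abs);
  // A relative path that starts with ".." (or is absolute, e.g. another
  // drive on Windows) means the candidate escaped the allowed root.
  if (rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel)) {
    throw new Error(`Path escapes allowed directory: ${candidate}`);
  }
  return abs;
}
```

Calling resolveInside on both the input and output paths before building the FFmpeg argument list keeps an agent from reading or overwriting files outside the working directory.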


Prerequisites and setup

Before touching code, verify your environment:

  • Node.js 18+ (MCP SDK is TypeScript-first)
  • FFmpeg 5.1+ installed and in your PATH (ffmpeg -version to check)
  • Claude Code or Cursor with MCP support enabled
  • ~30 minutes of focused time

Install the MCP SDK (run these inside the project directory you create in Step 1):

npm init -y
npm install @modelcontextprotocol/sdk zod
npm install -D @types/node typescript

FFmpeg installation if missing:

# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt update && sudo apt install ffmpeg

# Verify
ffmpeg -version | head -1
# Expected: ffmpeg version 5.1.x or higher
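
The same version check can run inside the server at startup so a too-old FFmpeg fails fast. A minimal sketch, assuming the first line of `ffmpeg -version` starts with "ffmpeg version" followed by a dotted release number (git snapshot builds use a different scheme and return null here); both function names are hypothetical.

```typescript
// Hypothetical helpers: parse the first line of `ffmpeg -version`.
export function parseFfmpegVersion(firstLine: string): { major: number; minor: number } | null {
  // Matches e.g. "ffmpeg version 5.1.2 ..." or "ffmpeg version n6.0 ..."
  const m = /^ffmpeg version n?(\d+)\.(\d+)/.exec(firstLine);
  return m ? { major: Number(m[1]), minor: Number(m[2]) } : null;
}

// Compare against the minimum this guide asks for (5.1 by default).
export function meetsMinimum(
  v: { major: number; minor: number },
  major = 5,
  minor = 1,
): boolean {
  return v.major > major || (v.major === major && v.minor >= minor);
}
```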

Step-by-step: building the MCP server

Here's the part most guides skip: error handling that actually helps the agent recover. An MCP server that returns "Error: 1" teaches the agent nothing. We'll build verbose, structured responses.
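
As an illustration of what "structured" can mean here, the sketch below maps FFmpeg's stderr to a stable error code plus a hint the agent can act on. The function name, the error codes, and the hint wording are assumptions made for this example; the matched substrings are messages FFmpeg commonly prints, but exact wording can vary between versions.

```typescript
interface ToolError {
  code: string;    // stable identifier the agent can branch on
  message: string; // last lines of FFmpeg's own output
  hint: string;    // what to try next
}

// Hypothetical helper: turn an FFmpeg failure into a structured error.
export function explainFfmpegFailure(exitCode: number | null, stderr: string): ToolError {
  const tail = stderr.trim().split('\n').slice(-3).join('\n');
  if (/No such file or directory/i.test(stderr)) {
    return { code: 'INPUT_NOT_FOUND', message: tail, hint: 'Check that inputPath is absolute and the file exists.' };
  }
  if (/Invalid data found when processing input/i.test(stderr)) {
    return { code: 'UNREADABLE_INPUT', message: tail, hint: 'The file may be corrupt or not a media file; run probe_file first.' };
  }
  if (/Unknown encoder|Unable to (find a suitable|choose an) output format/i.test(stderr)) {
    return { code: 'UNSUPPORTED_OUTPUT', message: tail, hint: 'Pick a different outputFormat or codec.' };
  }
  return {
    code: 'FFMPEG_FAILED',
    message: tail || `exit code ${exitCode}`,
    hint: 'Inspect the message and retry with simpler options.',
  };
}
```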

Step 1: Scaffold the project

mkdir mcp-ffmpeg-converter
cd mcp-ffmpeg-converter
npx tsc --init

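tsconfig.json adjustments (also set "type": "module" in package.json so the compiled output is treated as ES modules, which the top-level await in Step 4 requires):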

{
  "compilerOptions": {
    "target": "ES2022",
    "module": "Node16",
    "moduleResolution": "Node16",
    "outDir": "./dist",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true
  }
}

Step 2: Define your schemas

Create src/types.ts:

import { z } from 'zod';

export const ConvertInput = z.object({
  inputPath: z.string().describe('Absolute path to source file'),
  outputFormat: z.string().describe('Target format extension, e.g. "mp4", "mp3", "jpg"'),
  outputPath: z.string().optional().describe('Optional absolute path for output; defaults to input with new extension'),
  options: z.string().optional().describe('FFmpeg options as single string, e.g. "-crf 23 -preset fast"')
});

export const ProbeInput = z.object({
  filePath: z.string().describe('Absolute path to file to analyze')
});

export type ConvertInput = z.infer<typeof ConvertInput>;
export type ProbeInput = z.infer<typeof ProbeInput>;

Critical security note: We accept file paths, not raw commands. The options string gets split and validated against an allowlist—no ; rm -rf / injection.
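
A per-flag allowlist is stricter than a single pattern match, because spawning without a shell stops shell injection but not FFmpeg options that read or write unexpected files. The sketch below is one way to do it, assuming every permitted flag takes exactly one value; the flag set, the value patterns, and the name parseOptions are illustrative choices, not the boilerplate's actual list.

```typescript
// Hypothetical allowlist: flag name -> pattern its single value must match.
const OPTION_RULES = new Map<string, RegExp>([
  ['-crf', /^\d{1,2}$/],
  ['-preset', /^(ultrafast|superfast|veryfast|faster|fast|medium|slow|slower|veryslow)$/],
  ['-b:a', /^\d+k$/],
  ['-q:a', /^\d$/],
  ['-r', /^\d{1,3}$/],
]);

// Split an options string into argv tokens, rejecting anything not allowlisted.
export function parseOptions(options: string): string[] {
  const tokens = options.split(/\s+/).filter(Boolean);
  if (tokens.length % 2 !== 0) {
    throw new Error('Options must be "-flag value" pairs');
  }
  const args: string[] = [];
  for (let i = 0; i < tokens.length; i += 2) {
    const flag = tokens[i];
    const value = tokens[i + 1];
    const rule = OPTION_RULES.get(flag);
    if (!rule) throw new Error(`Option not allowed: ${flag}`);
    if (!rule.test(value)) throw new Error(`Invalid value for ${flag}: ${value}`);
    args.push(flag, value);
  }
  return args;
}
```

Extending the server then means adding one entry to the map per flag you are willing to expose, rather than loosening a regular expression.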

Step 3: Implement the conversion engine

src/ffmpeg.ts:

import { spawn } from 'child_process';
import { access } from 'fs/promises';
import path from 'path';

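// One or more "-flag value..." groups; flags may contain ":" (e.g. -q:a), values may not start with "-"
const ALLOWED_OPTIONS = /^-[a-zA-Z0-9_:]+( +[^-\s]\S*)*( +-[a-zA-Z0-9_:]+( +[^-\s]\S*)*)*$/;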

export async function convertFile(
  inputPath: string,
  outputFormat: string,
  outputPath?: string,
  options?: string
): Promise<{ success: boolean; outputPath: string; stderr: string }> {
  // Validate input exists
  await access(inputPath);

  // Resolve output path safely
  const resolvedOutput = outputPath 
    ? path.resolve(outputPath)
    : path.format({ ...path.parse(inputPath), ext: `.${outputFormat}`, base: undefined });

  // Validate options don't contain shell metacharacters
  const args = ['-y', '-i', inputPath];
  if (options) {
    if (!ALLOWED_OPTIONS.test(options)) {
      throw new Error('Invalid options format: only -flag value patterns allowed');
    }
    args.push(...options.split(/\s+/).filter(Boolean));
  }
  args.push(resolvedOutput);

  return new Promise((resolve, reject) => {
    const proc = spawn('ffmpeg', args);
    let stderr = '';
    proc.stderr.on('data', (d) => { stderr += d; });
    // Surface spawn failures (e.g. ENOENT when ffmpeg is not on PATH)
    proc.on('error', reject);

    proc.on('close', (code) => {
      if (code === 0) {
        resolve({ success: true, outputPath: resolvedOutput, stderr });
      } else {
        reject(new Error(`FFmpeg exited ${code}: ${stderr.slice(-500)}`));
      }
    });
  });
}

The gotcha that wastes afternoons: when the output file already exists, FFmpeg asks whether to overwrite it and waits on stdin. An MCP server has nobody to answer, so the call hangs. The -y flag overwrites without prompting; always include it in server contexts.
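
Two further guards against stdin hangs are FFmpeg's `-nostdin` flag, which disables interactive input, and spawning the process with stdin set to 'ignore'. The sketch below combines them; the helper names are hypothetical, and `-n` (never overwrite, exit immediately if the output exists) is shown as the alternative to `-y` for callers that must not clobber files.

```typescript
import { spawn } from 'node:child_process';
import type { ChildProcess } from 'node:child_process';

// Hypothetical helper: flags that keep FFmpeg from ever waiting on a prompt.
export function nonInteractiveArgs(overwrite: boolean): string[] {
  // -nostdin: disable interactive input; -y: overwrite; -n: refuse and exit
  return ['-nostdin', overwrite ? '-y' : '-n'];
}

// Hypothetical helper: spawn FFmpeg with stdin closed and output piped.
export function spawnFfmpeg(args: string[], overwrite = true): ChildProcess {
  return spawn('ffmpeg', [...nonInteractiveArgs(overwrite), ...args], {
    stdio: ['ignore', 'pipe', 'pipe'],
  });
}
```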

Step 4: Wire up the MCP server

src/index.ts:

#!/usr/bin/env node
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
import { convertFile } from './ffmpeg.js';
import { ConvertInput, ProbeInput } from './types.js';

const server = new Server(
  { name: 'ffmpeg-converter', version: '1.0.0' },
  { capabilities: { tools: {} } }
);

server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: 'convert_file',
      description: 'Convert a media file to a different format using FFmpeg. Covers the video, audio, and image formats that the installed FFmpeg build supports.',
      inputSchema: {
        type: 'object',
        properties: {
          inputPath: { type: 'string', description: 'Absolute path to source file' },
          outputFormat: { type: 'string', description: 'Target format extension (e.g., "mp4", "mp3", "jpg")' },
          outputPath: { type: 'string', description: 'Optional absolute output path' },
          options: { type: 'string', description: 'FFmpeg encoding options, e.g., "-crf 23 -preset fast"' }
        },
        required: ['inputPath', 'outputFormat']
      }
    },
    {
      name: 'probe_file',
      description: 'Analyze file metadata (duration, codecs, resolution, bitrate) before conversion',
      inputSchema: {
        type: 'object',
        properties: {
          filePath: { type: 'string', description: 'Absolute path to file' }
        },
        required: ['filePath']
      }
    }
  ]
}));

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name, arguments: args } = request.params;

  try {
    if (name === 'convert_file') {
      const parsed = ConvertInput.parse(args);
      const result = await convertFile(
        parsed.inputPath,
        parsed.outputFormat,
        parsed.outputPath,
        parsed.options
      );
      return {
        content: [{ type: 'text', text: `Converted to: ${result.outputPath}\nFFmpeg output: ${result.stderr.slice(-200)}` }]
      };
    }

    if (name === 'probe_file') {
      const parsed = ProbeInput.parse(args);
      // ffprobe implementation omitted for brevity—see full boilerplate
      return { content: [{ type: 'text', text: 'Probe result placeholder' }] };
    }

    throw new Error(`Unknown tool: ${name}`);
  } catch (err) {
    return {
      content: [{ type: 'text', text: `Error: ${err instanceof Error ? err.message : String(err)}` }],
      isError: true
    };
  }
});

const transport = new StdioServerTransport();
await server.connect(transport);
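
The probe_file placeholder can be filled in with ffprobe, which ships with FFmpeg and can print metadata as JSON. The following is a minimal sketch rather than the boilerplate's implementation: probeFile and summarizeProbe are hypothetical names, and the summary keeps only a few fields from ffprobe's `format` and `streams` sections.

```typescript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

interface StreamSummary { type: string; codec: string; width?: number; height?: number }
interface ProbeSummary { format: string; durationSeconds: number | null; streams: StreamSummary[] }

// Reduce ffprobe's JSON output to the fields an agent usually needs.
export function summarizeProbe(raw: string): ProbeSummary {
  const data = JSON.parse(raw);
  return {
    format: data.format?.format_name ?? 'unknown',
    // ffprobe reports duration as a string of seconds
    durationSeconds: data.format?.duration ? Number(data.format.duration) : null,
    streams: (data.streams ?? []).map((s: any): StreamSummary => ({
      type: s.codec_type,
      codec: s.codec_name,
      ...(typeof s.width === 'number' ? { width: s.width, height: s.height } : {}),
    })),
  };
}

// Run ffprobe without a shell and summarize its JSON output.
export async function probeFile(filePath: string): Promise<ProbeSummary> {
  const { stdout } = await execFileAsync('ffprobe', [
    '-v', 'error',
    '-print_format', 'json',
    '-show_format',
    '-show_streams',
    filePath,
  ]);
  return summarizeProbe(stdout);
}
```

Inside the probe_file branch, the handler could then return `JSON.stringify(await probeFile(parsed.filePath), null, 2)` as the text content.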

Step 5: Configure Claude Code to use it

.mcp.json in your project root for Claude Code (or claude_desktop_config.json for Claude Desktop):

{
  "mcpServers": {
    "ffmpeg-converter": {
      "command": "node",
      "args": ["/absolute/path/to/m-converter/dist/index.js"],
      "env": {
        "PATH": "/usr/local/bin:/usr/bin:/bin"
      }
    }
  }
}

Critical: The PATH env var ensures FFmpeg is findable. Claude Code spawns the server with a minimal environment—without this, spawn('ffmpeg') throws ENOENT even when FFmpeg works in your terminal.
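
To turn that ENOENT into a readable message, the server can look for the binary itself at startup. A minimal sketch, assuming a hypothetical helper findOnPath; it walks the PATH entries the server actually received and ignores Windows-specific details such as the .exe suffix.

```typescript
import * as path from 'node:path';
import { accessSync, constants } from 'node:fs';

// True when the file exists and the current user may execute it.
function defaultIsExecutable(file: string): boolean {
  try {
    accessSync(file, constants.X_OK);
    return true;
  } catch {
    return false;
  }
}

// Hypothetical helper: return the first PATH entry containing `binary`, or null.
export function findOnPath(
  binary: string,
  pathEnv: string = process.env.PATH ?? '',
  isExecutable: (file: string) => boolean = defaultIsExecutable,
): string | null {
  for (const dir of pathEnv.split(path.delimiter).filter(Boolean)) {
    const candidate = path.join(dir, binary);
    if (isExecutable(candidate)) return candidate;
  }
  return null;
}
```

If findOnPath('ffmpeg') returns null, the server can report the PATH it was given instead of failing later with a bare ENOENT.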

Compile the server with npx tsc so dist/index.js exists, then restart Claude Code. Test with:

Please convert ~/Downloads/photo.heic to jpg using your ffmpeg tool

You should see the tool invocation, the conversion, and the result path returned in-chat.


Common mistakes and how to avoid them

| Mistake | Symptom | Fix |
| --- | --- | --- |
| Missing PATH in MCP config | ENOENT ffmpeg despite working install | Explicit PATH env var pointing to FFmpeg binary |
| No -y flag | Conversion hangs indefinitely, agent retries forever | Always pass -y to FFmpeg in server contexts |
| Shell injection in options | Arbitrary command execution, security hole | Validate options against ^-[a-zA-Z0-9_]+ pattern, never use exec |
| Relative paths | File not found, or writes to wrong directory | path.resolve() everything, validate with fs.access |
| Omitting isError: true | Agent thinks failed conversion succeeded | Always set isError: true on caught exceptions |
| No format allowlist | Agent requests .exe to .pdf, wastes time | Validate outputFormat against known-good extensions |
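
The format allowlist from the last row takes only a few lines. This is a sketch with an illustrative set of extensions and a hypothetical function name; adjust the list to the formats your FFmpeg build and use case actually need.

```typescript
// Illustrative allowlist; extend or trim it for your own deployment.
const ALLOWED_FORMATS = new Set([
  'mp4', 'mkv', 'webm', 'mov',
  'mp3', 'wav', 'flac', 'ogg', 'm4a',
  'jpg', 'png', 'gif',
]);

// Normalize an extension ("MP4", ".mp4") and reject anything not allowlisted.
export function normalizeFormat(outputFormat: string): string {
  const ext = outputFormat.trim().toLowerCase().replace(/^\./, '');
  if (!ALLOWED_FORMATS.has(ext)) {
    throw new Error(
      `Unsupported output format: ${outputFormat}. Allowed: ${[...ALLOWED_FORMATS].join(', ')}`,
    );
  }
  return ext;
}
```

Listing the allowed values in the error message gives the agent enough information to retry with a valid format on its own.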

Extending: from local tool to production service

The stdio transport works for personal use, but teams need more. Two paths:

Path A: SSE transport for remote access

Swap StdioServerTransport for SSEServerTransport from @modelcontextprotocol/sdk/server/sse.js. Deploy to a small VPS, authenticate with API keys, and point Cursor, Claude Code, or Claude Desktop at the URL. This is how you'd share a conversion API across a team without local FFmpeg installs.
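
For the API key part, one reasonable approach is a constant-time comparison of the presented key against the expected one before a request ever reaches the transport. The sketch below uses only Node's crypto module; the function name, and the idea of reading the key from a header such as x-api-key, are assumptions for illustration rather than anything the MCP SDK prescribes.

```typescript
import { createHash, timingSafeEqual } from 'node:crypto';

// Hypothetical helper: compare API keys without leaking timing information.
// Hashing first gives both buffers the same length, which timingSafeEqual requires.
export function isValidApiKey(presented: string | undefined, expected: string): boolean {
  if (!presented) return false;
  const a = createHash('sha256').update(presented).digest();
  const b = createHash('sha256').update(expected).digest();
  return timingSafeEqual(a, b);
}
```

An HTTP handler would call this with the incoming header value and respond with 401 before opening the SSE stream when it returns false.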

Path B: n8n integration for automation workflows

Your MCP server can be called from n8n workflows using the HTTP Request node or a custom MCP node. For teams already running automation workflows, this bridges agent-based and pipeline-based processing—convert in the agent when interactive, queue through n8n when batched. Convert Fleet's file conversion API handles the heavy lifting if you don't want to manage FFmpeg at scale.

For a deeper n8n setup, see our guide on n8n AI automation workflows for document ingestion.


How this compares to other file conversion approaches

| Approach | Best For | Setup Effort | Cost at Scale | Agent Integration |
| --- | --- | --- | --- | --- |
| MCP Server (this guide) | Developers building AI-native tools | 30 min initial | Free (your hardware) | Native function calling |
| Cloud API (Zamzar, CloudConvert) | Teams without ops capacity | 5 min (API key) | $0.10–$0.50/GB | Manual HTTP calls |
| Convert Fleet API | High-volume, no-FFmpeg maintenance | 5 min (API key) | Free tier, then usage-based | REST + n8n node |
| Local FFmpeg scripts | One-off batch jobs | 1 hour (scripting) | Free | None; user runs manually |
| AWS Lambda with FFmpeg layer | Event-driven, serverless | 2 hours (IAM, layers, limits) | Pay per invocation | Lambda invocation |

The honest trade-off: MCP servers shine when the agent needs to decide to convert—part of a larger reasoning chain. For predictable, high-volume pipelines, a dedicated conversion API or n8n workflow is simpler to operate. Most teams end up hybrid: MCP for interactive agent work, batched API for background processing.


Frequently Asked Questions

What is file conversion and how does it work?
File conversion transforms data from one format to another by decoding the source format's structure and re-encoding it into the target format. FFmpeg handles this via installed codecs, which are software modules that understand specific formats. Lossy conversions (like MP3) discard some data; lossless ones (like FLAC to WAV) preserve it entirely.

How do I convert files without losing quality?
Use lossless formats or specify high-quality encoding parameters. For video, -crf 18 or lower in FFmpeg preserves near-indistinguishable quality. For audio, -q:a 0 (VBR) or -b:a 320k (CBR) for MP3, or switch to FLAC/ALAC for archival. Always probe first with ffprobe to understand source quality: you can't improve quality in conversion, only preserve or reduce it.

What are the best file conversion tools for n8n workflows?
For n8n specifically, the HTTP Request node calling a file conversion API is most reliable. Convert Fleet offers native n8n integration with 178+ formats. For self-hosted, an MCP-wrapped FFmpeg server (this guide's approach) gives you full control but requires infrastructure maintenance.

Can I use FFmpeg for file conversion?
Yes. FFmpeg is the industry standard, supporting over 178 formats across video, audio, images, and some document types. It's free, open-source, and powers most commercial conversion services. The catch: it's a command-line tool, so you need a wrapper (like this MCP server) for agent or workflow integration.

Is MCP the right protocol, or should I use a different integration?
MCP is right when your AI agent needs to discover and invoke conversion dynamically. If you know the conversion step in advance and want reliability, a direct API call or n8n node is simpler. MCP adds value when the agent decides "this file needs converting" as part of broader reasoning, not for fixed pipelines.


Conclusion

You now have a working MCP server that turns FFmpeg file conversion into a tool Claude Code, Cursor, and Windsurf can call natively. The build is intentionally minimal—security-conscious, error-verbose, and extensible. Start with stdio, graduate to SSE or integrate into n8n workflows as your needs grow.

The real win isn't just converting files. It's removing the friction between "agent encounters unsupported format" and "agent continues its task." That gap—where users currently stop, convert manually, and lose context—is where MCP-native tools earn their keep.

Ready to skip the server setup entirely? Convert Fleet's free file conversion API handles 178+ formats with no local install, no FFmpeg maintenance, and direct n8n integration. Use the code above when you need full control; use the API when you need it to just work.
