Testing MCP tools, vitest + in-memory transport
How to write tests for MCP tools without spawning subprocesses. Unit tests for handlers, integration tests via in-memory transport, smoke tests for stdio mode.
Tests for MCP servers fall into three layers. Layer 1 is unit tests on your handlers: fast, easy, and the bulk of your suite. Layer 2 is integration tests through an in-memory MCP transport: slow enough to be real, fast enough to run in CI. Layer 3 is a single smoke test that actually spawns the binary. This recipe covers all three.
Step 1: Make your handlers exportable + pure
The handler should not depend on the SDK's request shape. It takes plain arguments plus a context object and returns a plain response:
// src/tools/create-contact.ts
import { z } from 'zod';

export const CreateContactInput = z.object({
  email: z.string().email().toLowerCase(),
  name: z.string().min(1),
});

// Minimal database shape the handler needs (assumed here; adapt to your client).
export interface Db {
  query(sql: string, params: unknown[]): Promise<{ rows: Array<Record<string, unknown>> }>;
}

export interface ToolResponse {
  content: Array<{ type: 'text'; text: string }>;
  isError?: boolean;
}

export async function handleCreateContact(
  args: unknown,
  ctx: { db: Db },
): Promise<ToolResponse> {
  // Validate first: bad input becomes an error result, not a thrown exception.
  const parsed = CreateContactInput.safeParse(args);
  if (!parsed.success) {
    return {
      isError: true,
      content: [{ type: 'text', text: JSON.stringify({ error: 'INVALID_INPUT' }) }],
    };
  }
  // ... do the work
  return { content: [{ type: 'text', text: JSON.stringify({ id: '...' }) }] };
}
The router just dispatches:
// src/server.ts
case 'crm_create_contact':
return handleCreateContact(req.params.arguments, { db });
This separation makes the handler trivially testable. No SDK mocks needed.
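One way to keep the router as thin as the case statement above is a lookup table from tool name to handler. The sketch below is illustrative only: createRouter, ToolHandler, and the UNKNOWN_TOOL error code are hypothetical names, not part of the MCP SDK, and ToolResponse is repeated so the block stands alone.

```typescript
// Repeated from src/tools/create-contact.ts so this sketch is self-contained.
interface ToolResponse {
  content: Array<{ type: 'text'; text: string }>;
  isError?: boolean;
}

// Hypothetical helper type: a handler takes raw args plus injected dependencies.
type ToolHandler<Ctx> = (args: unknown, ctx: Ctx) => Promise<ToolResponse>;

// Builds a dispatch function from a name -> handler table.
function createRouter<Ctx>(handlers: Map<string, ToolHandler<Ctx>>) {
  return async function dispatch(name: string, args: unknown, ctx: Ctx): Promise<ToolResponse> {
    const handler = handlers.get(name);
    if (!handler) {
      // Unknown tools get the same error envelope the handlers use (assumed error code).
      return {
        isError: true,
        content: [{ type: 'text', text: JSON.stringify({ error: 'UNKNOWN_TOOL', tool: name }) }],
      };
    }
    return handler(args, ctx);
  };
}

// Usage with a stand-in handler; in the real server this would be handleCreateContact.
const handlers = new Map<string, ToolHandler<{ requestId: string }>>();
handlers.set('echo', async (args, ctx) => ({
  content: [{ type: 'text', text: JSON.stringify({ args, requestId: ctx.requestId }) }],
}));
const dispatch = createRouter(handlers);
```

Because dispatch is itself a plain async function, it can be unit-tested the same way as a single handler, with no SDK objects involved.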
Step 2: Layer 1, unit tests on handlers
// tests/create-contact.test.ts
import { describe, it, expect, vi } from 'vitest';
import { handleCreateContact } from '../src/tools/create-contact.js';

describe('handleCreateContact', () => {
  it('rejects missing email', async () => {
    const r = await handleCreateContact({}, { db: mockDb() });
    expect(r.isError).toBe(true);
    expect(r.content[0].text).toContain('INVALID_INPUT');
  });

  it('normalizes email to lowercase', async () => {
    const db = mockDb();
    await handleCreateContact({ email: 'Foo@Example.COM', name: 'Foo' }, { db });
    expect(db.lastCall.params[0]).toBe('foo@example.com'); // .toLowerCase() applied
  });

  it('is idempotent on email', async () => {
    const db = mockDb();
    const r1 = await handleCreateContact({ email: 'x@example.com', name: 'X' }, { db });
    const r2 = await handleCreateContact({ email: 'x@example.com', name: 'X' }, { db });
    expect(JSON.parse(r1.content[0].text).id).toBe(JSON.parse(r2.content[0].text).id);
  });
});

// Records every query so tests can inspect the SQL and parameters the handler sent.
function mockDb() {
  const calls: Array<{ sql: string; params: unknown[] }> = [];
  return {
    query: vi.fn(async (sql: string, params: unknown[]) => {
      calls.push({ sql, params });
      return { rows: [{ id: 'mock-id', email: params[0], created: true }] };
    }),
    get lastCall() { return calls[calls.length - 1]; },
  };
}
Run with npx vitest run. These tests are pure: no subprocess, no MCP wire protocol, just function calls.
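The mockDb above returns mock-id for every call, so the idempotency test passes no matter what the handler does. A stateful fake gives that test something to check. The sketch below is one possible shape, assuming (as the mock does) that the handler passes the email as the first query parameter; createFakeDb and the contact-N id format are made up for illustration.

```typescript
// Row shape copied from the mockDb above.
interface ContactRow {
  id: string;
  email: string;
  created: boolean;
}

// Hypothetical in-memory fake: remembers which id belongs to which email,
// so a second insert for the same email returns the original id.
function createFakeDb() {
  const idsByEmail = new Map<string, string>();
  let nextId = 1;

  return {
    // Assumption: params[0] is the email, as in mockDb. The SQL text is ignored.
    async query(_sql: string, params: unknown[]): Promise<{ rows: ContactRow[] }> {
      const email = String(params[0]);
      const existing = idsByEmail.get(email);
      if (existing !== undefined) {
        return { rows: [{ id: existing, email, created: false }] };
      }
      const id = `contact-${nextId++}`;
      idsByEmail.set(email, id);
      return { rows: [{ id, email, created: true }] };
    },
    // Number of distinct contacts stored, handy for assertions.
    get size() {
      return idsByEmail.size;
    },
  };
}
```

Swapping this in for mockDb in the idempotency test makes the assertion depend on the handler actually sending the same normalized email twice. If call inspection is also needed, query can be wrapped in vi.fn.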
Step 3: Layer 2, integration via in-memory transport
For end-to-end coverage of the MCP protocol (capabilities, tool listing, call dispatch), the SDK ships in-memory transports:
// tests/server.integration.test.ts
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { InMemoryTransport } from '@modelcontextprotocol/sdk/inMemory.js';
import { buildServer } from '../src/server.js';
// Assumption: mockDb from the unit test has been moved to a shared helper file.
// The path below is hypothetical; point it at wherever you keep test helpers.
import { mockDb } from './helpers/mock-db.js';

describe('MCP server integration', () => {
  let client: Client;

  beforeAll(async () => {
    const server = buildServer({ db: mockDb() });
    // The two ends are symmetric, so the naming order here is only a convention.
    const [serverTransport, clientTransport] = InMemoryTransport.createLinkedPair();
    await server.connect(serverTransport);
    client = new Client({ name: 'test-client', version: '0.0.1' }, { capabilities: {} });
    await client.connect(clientTransport);
  });

  afterAll(async () => {
    await client.close();
  });

  it('lists tools', async () => {
    const r = await client.listTools();
    const names = r.tools.map((t) => t.name);
    expect(names).toContain('crm_create_contact');
  });

  it('calls a tool end-to-end', async () => {
    const r = await client.callTool({
      name: 'crm_create_contact',
      arguments: { email: 'x@example.com', name: 'X' },
    });
    expect(r.isError).toBeFalsy();
    // The result type is loose, so narrow content before indexing into it.
    const content = r.content as Array<{ type: string; text: string }>;
    const body = JSON.parse(content[0].text);
    expect(body.id).toBeDefined();
  });
});
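To see why this layer needs no subprocess, it helps to picture what a linked pair does: whatever one side sends is delivered to the other side's message callback inside the same process. The toy model below is not the SDK's InMemoryTransport, only a sketch of the idea; ToyTransport and its members are invented for illustration.

```typescript
type JsonRpcMessage = { jsonrpc: '2.0'; id?: number; method?: string; result?: unknown };

// Toy stand-in for a linked in-memory transport (NOT the SDK class).
class ToyTransport {
  onmessage?: (message: JsonRpcMessage) => void;
  private peer?: ToyTransport;

  // Returns two endpoints wired to each other.
  static createLinkedPair(): [ToyTransport, ToyTransport] {
    const a = new ToyTransport();
    const b = new ToyTransport();
    a.peer = b;
    b.peer = a;
    return [a, b];
  }

  // Delivers a copy of the message to the peer, as a real wire would.
  send(message: JsonRpcMessage): void {
    const copy = JSON.parse(JSON.stringify(message)) as JsonRpcMessage;
    this.peer?.onmessage?.(copy);
  }
}

// A fake "server" end that answers every request with an empty result.
const [clientEnd, serverEnd] = ToyTransport.createLinkedPair();
serverEnd.onmessage = (msg) => {
  if (msg.id !== undefined) serverEnd.send({ jsonrpc: '2.0', id: msg.id, result: {} });
};

// The "client" end collects whatever comes back.
const received: JsonRpcMessage[] = [];
clientEnd.onmessage = (msg) => received.push(msg);
clientEnd.send({ jsonrpc: '2.0', id: 1, method: 'tools/list' });
```

The request and its response cross the "wire" as serialized copies, which is why this layer still exercises message shapes and dispatch even though nothing leaves the test process.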
The buildServer factory is your full server minus the transport:
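// src/server.ts
import { pathToFileURL } from 'node:url';
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';

export function buildServer(deps: { db: Db }) {
  const server = new Server({ name: 'my-mcp', version: '0.1.0' }, { capabilities: { tools: {} } });
  // ... register tools
  return server;
}

// stdio entry point at the bottom of the file.
// pathToFileURL handles Windows paths and characters that need URL escaping,
// which a hand-built `file://${process.argv[1]}` string does not.
if (process.argv[1] && import.meta.url === pathToFileURL(process.argv[1]).href) {
  const server = buildServer({ db: realDb() });
  await server.connect(new StdioServerTransport());
}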
Factory + entry guard: integration tests import the factory, production runs the entry point.
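An entry guard that compares import.meta.url with process.argv[1] can miss when the server is launched through a symlink, such as an npm bin shim. A slightly more defensive check is sketched below; isMainModule is a hypothetical helper name, and it uses only Node's built-in fs and url modules.

```typescript
import { realpathSync } from 'node:fs';
import { fileURLToPath, pathToFileURL } from 'node:url';

// Hypothetical helper: true when the module at `metaUrl` is the script Node was started with.
// Both sides are resolved with realpathSync so a symlinked launcher still matches.
function isMainModule(metaUrl: string, argv1: string | undefined): boolean {
  if (!argv1) return false;
  try {
    return realpathSync(fileURLToPath(metaUrl)) === realpathSync(argv1);
  } catch {
    // Non-file URL or a path that does not exist: treat as "not the entry point".
    return false;
  }
}

// Demo with a file that certainly exists: the Node binary itself.
const sameFile = isMainModule(pathToFileURL(process.execPath).href, process.execPath);
const differentFile = isMainModule('file:///no/such/entry.js', process.execPath);
```

In src/server.ts the guard would then read if (isMainModule(import.meta.url, process.argv[1])) { ... }, with the body unchanged.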
Step 4: Layer 3, one smoke test that spawns
You want one test that actually runs your dist/server.js to catch packaging bugs (runtime imports that fail under Node ESM, plus a missing shebang or wrong bin entry if you spawn the installed bin instead of node dist/server.js):
// tests/smoke.test.ts
import { describe, it, expect } from 'vitest';
import { spawn } from 'node:child_process';

describe('stdio smoke', () => {
  it('spawns + responds to initialize within 2s', async () => {
    const proc = spawn('node', ['dist/server.js'], { stdio: ['pipe', 'pipe', 'pipe'] });
    try {
      const initRequest = JSON.stringify({
        jsonrpc: '2.0', id: 1, method: 'initialize',
        // protocolVersion: use the spec date your installed @modelcontextprotocol/sdk
        // ships with, current as of mid-2026 is "2025-11-25". An older server may answer
        // with an earlier version such as "2024-11-05"; negotiation settles on a version
        // both ends support.
        params: { protocolVersion: '2025-11-25', capabilities: {}, clientInfo: { name: 't', version: '0' } },
      }) + '\n';
      proc.stdin.write(initRequest);

      const response = await new Promise<string>((resolve, reject) => {
        const t = setTimeout(() => reject(new Error('timeout')), 2000);
        proc.stdout.once('data', (chunk) => { clearTimeout(t); resolve(chunk.toString()); });
      });

      expect(JSON.parse(response.split('\n')[0]).result).toBeDefined();
    } finally {
      // Runs on timeout or a failed assertion too, so no orphaned process is left behind.
      proc.kill();
    }
  });
});
This catches:
- Missing #!/usr/bin/env node shebang (only when the test spawns the bin entry directly; node dist/server.js bypasses it)
- Imports that fail at runtime (often a missing .js extension under Node ESM)
- Server hanging instead of responding to initialize
- Anything you log to stdout that corrupts the wire (the most common bug, see 6.5)
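The smoke test reads only the first stdout chunk, which may hold half a line or several lines at once. A small line buffer makes the read more robust and also turns stray log output, the stdout-corruption bug from the list above, into a readable failure. This is a sketch: LineBuffer and parseWireLine are illustrative names, not SDK APIs, and it assumes newline-delimited JSON as used by the stdio test above.

```typescript
// Accumulates stdout chunks and yields only complete, newline-terminated lines.
class LineBuffer {
  private pending = '';

  push(chunk: string): string[] {
    this.pending += chunk;
    const parts = this.pending.split('\n');
    // The last element is an incomplete line (or '' if the chunk ended on '\n').
    this.pending = parts.pop() ?? '';
    return parts.map((line) => line.replace(/\r$/, '')).filter((line) => line.length > 0);
  }
}

// Parses one wire line, turning stray log output into a readable failure.
function parseWireLine(line: string): { jsonrpc?: string; id?: number; result?: unknown } {
  try {
    return JSON.parse(line);
  } catch {
    throw new Error(`non-JSON output on stdout (stray log line?): ${line.slice(0, 80)}`);
  }
}

// Demo: a response split across two chunks, followed by the start of another message.
const buffer = new LineBuffer();
const first = buffer.push('{"jsonrpc":"2.0","id":1,');  // no complete line yet
const second = buffer.push('"result":{}}\n{"jsonrpc"'); // one complete line, one partial
const parsed = parseWireLine(second[0]);
```

In the smoke test this would replace the single once('data') read: feed every chunk through buffer.push and resolve as soon as it returns a first line.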
Step 5: package.json scripts
{
  "scripts": {
    "build": "tsc",
    "test": "vitest run",
    "test:watch": "vitest",
    "test:smoke": "npm run build && vitest run tests/smoke.test.ts"
  }
}
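Run the smoke test after a build and the regular tests on every change. CI runs npm test && npm run test:smoke. Note that a bare vitest run also matches tests/smoke.test.ts under the default globs, so either build before npm test or exclude that file from the default run.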
Step 6: Verify
Run academy_validate_step. The validator checks package.json has @modelcontextprotocol/sdk plus a bin or main entry. If you also added a scripts.test field, you're production-ready.
What to test, what to skip
Test: input validation paths (Layer 1), idempotency (Layer 1), tool listing (Layer 2), one happy path per tool (Layer 2), the smoke (Layer 3).
Skip: mocking every Stripe/Supabase response (test against staging instead), perfect coverage chasing (60-70% on critical paths beats 100% on getters), tests that just re-implement the type checker.
The point of MCP tests is to catch regressions before users do. Six well-chosen tests beat sixty trivial ones.