Tool design, what makes a good MCP tool
Naming, descriptions, input schemas, idempotency, error responses. The difference between a tool the LLM uses correctly and one it ignores or breaks.
You can have a perfect server architecture and still build tools the LLM will refuse to call. Tool design decides whether your work shows up in production. This recipe is the checklist we apply to every MCP tool we ship.
Step 1: Name the tool the way the LLM thinks
Tool names are the first thing the LLM scans. Patterns that work:
- `<server>_<verb>_<noun>` (`crm_get_contact`, `academy_list_recipes`). The server prefix prevents collisions when multiple MCP servers expose similar tools. Claude has 10+ memory tools across servers; only the prefix tells them apart.
- Use a verb the LLM uses in conversation: `get`, `list`, `search`, `create`, `update`, `delete`, `send`. Avoid `fetch`, `do`, `process`, `handle`, `execute`; they read as generic plumbing.
- Singular noun for one-of, plural for collections: `get_contact` vs `list_contacts`. The LLM mirrors this to pick the right one.
Bad: `tool1`, `crmAction`, `processData`. The LLM treats them as last-resort options.
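A naming convention only holds if something enforces it. A minimal lint sketch for the `<server>_<verb>_<noun>` pattern; the helper name and the verb allowlist are assumptions, adapt them to your servers:

```typescript
// Hypothetical lint helper for <server>_<verb>_<noun> names.
// ALLOWED_VERBS mirrors the list above; extend it for your domain.
const ALLOWED_VERBS = new Set(['get', 'list', 'search', 'create', 'update', 'delete', 'send']);

function isValidToolName(name: string): boolean {
  const parts = name.split('_');
  if (parts.length < 3) return false; // needs server prefix, verb, and noun
  return ALLOWED_VERBS.has(parts[1]); // second segment must be an allowed verb
}
```

Run it over your tool list in CI and a `crmAction` gets caught before Claude has to guess what it does.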
Step 2: Write descriptions for the LLM, not for users
The description is the only context the LLM sees before calling. Three rules:
```typescript
const goodTool: Tool = {
  name: 'crm_get_contact',
  description:
    'Look up one CRM contact by email or by contact ID. Returns ' +
    'name, company, last interaction date, and tags. Returns 404 ' +
    'if no match. Use this when the user asks about a person; for ' +
    'company-level data use crm_get_company instead.',
  // ...
};
```
What this description does:
- States the purpose in one sentence, "Look up one CRM contact".
- Lists what comes back, so the LLM knows whether to call it for a given user question.
- Names the disambiguation, when NOT to use this tool ("for company-level data use crm_get_company"). This is the single biggest predictor of correct tool selection.
Anti-patterns: "Helper for contact data", "Wraps the CRM API", "Internal tool". The LLM downgrades these.
Step 3: Make input schemas teach the LLM how to call
JSON Schema is the contract, and the LLM reads the `description` on every field. Use it.
```typescript
inputSchema: {
  type: 'object',
  properties: {
    email: {
      type: 'string',
      format: 'email',
      description: 'Contact email. Mutually exclusive with contactId, pass one or the other.',
    },
    contactId: {
      type: 'string',
      description: 'Internal contact UUID. Mutually exclusive with email.',
    },
    includeNotes: {
      type: 'boolean',
      default: false,
      description: 'Include the timeline of notes (last 20). Slow, only set true when the user explicitly asks for history.',
    },
  },
  // No `required`, the OR-relation is documented in the descriptions.
  additionalProperties: false,
}
```
Three patterns the LLM picks up:
- `additionalProperties: false`. Claude will not pass extra fields it isn't sure about.
- `default` values. Claude treats fields with defaults as optional.
- Per-field `description`. Claude reads each description before constructing the call.
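Later snippets in this recipe check `parsed.success` and read `parsed.data`, the shape zod's `safeParse` returns. Here is a dependency-free sketch of that contract for this schema, which also enforces the exactly-one-of rule the JSON Schema can only state in prose (the function and type names are illustrative):

```typescript
// Dependency-free sketch of the validation the later snippets assume
// (`parsed.success` / `parsed.data`, the shape zod's safeParse returns).
type GetContactInput = { email?: string; contactId?: string; includeNotes: boolean };
type Parsed =
  | { success: true; data: GetContactInput }
  | { success: false; error: string };

function parseGetContactInput(raw: Record<string, unknown>): Parsed {
  // Counterpart of additionalProperties: false — reject unknown fields.
  const known = new Set(['email', 'contactId', 'includeNotes']);
  for (const k of Object.keys(raw)) {
    if (!known.has(k)) return { success: false, error: `Unknown field: ${k}` };
  }
  const email = typeof raw.email === 'string' ? raw.email : undefined;
  const contactId = typeof raw.contactId === 'string' ? raw.contactId : undefined;
  // Enforce the mutual exclusion the schema only documents in prose.
  if ((email ? 1 : 0) + (contactId ? 1 : 0) !== 1) {
    return { success: false, error: 'Pass exactly one of: email or contactId.' };
  }
  return {
    success: true,
    data: { email, contactId, includeNotes: raw.includeNotes === true }, // default false
  };
}
```

In production you would express this with a schema library instead of hand-rolling it, but the contract is the same: reject extras, enforce the XOR, apply defaults.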
Step 4: Idempotency, the single most important property
If a user asks "create that contact" twice, your tool should return the same contact, not two contacts. Two patterns:
Natural-key idempotency (preferred):
```typescript
case 'crm_create_contact': {
  const { email, name } = parsed.data;
  // INSERT ... ON CONFLICT (email) DO UPDATE
  const r = await db.query(
    `INSERT INTO contacts (email, name) VALUES ($1, $2)
     ON CONFLICT (email) DO UPDATE SET name = EXCLUDED.name
     RETURNING id, email, name, (xmax = 0) AS created`,
    [email, name],
  );
  const row = r.rows[0];
  return {
    content: [{ type: 'text', text: JSON.stringify({
      id: row.id, email: row.email, name: row.name,
      result: row.created ? 'created' : 'existed',
    })}],
  };
}
```
Idempotency-key pattern (when no natural key exists):
```typescript
const { idempotencyKey, ...payload } = parsed.data;
const existing = await db.query(`SELECT result FROM idempotency WHERE key = $1`, [idempotencyKey]);
if (existing.rows.length) return { content: [{ type: 'text', text: existing.rows[0].result }] };
// ... do the actual work, then:
await db.query(`INSERT INTO idempotency (key, result) VALUES ($1, $2)`, [idempotencyKey, JSON.stringify(out)]);
```
Why it matters: the LLM will retry on errors, network glitches, or because the user paraphrased. Without idempotency you end up with five "John Doe" contacts and an angry user.
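When you can't rely on the caller to resend the same `idempotencyKey` on retry, one option is to derive the key deterministically from the tool name and payload. A sketch using node's `crypto`; the helper is an assumption, not part of the recipe's server:

```typescript
import { createHash } from 'node:crypto';

// Derive a stable idempotency key from the payload itself, so a
// retried call with identical arguments maps to the same key.
function idempotencyKeyFor(toolName: string, payload: Record<string, unknown>): string {
  // Sort keys so property order doesn't change the hash.
  const canonical = JSON.stringify(payload, Object.keys(payload).sort());
  return createHash('sha256').update(`${toolName}:${canonical}`).digest('hex');
}
```

One caveat: the replacer array filters nested keys too, so this sketch only canonicalizes flat payloads; for nested data, sort keys recursively before hashing.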
Step 5: Error responses that help the LLM recover
Bad error: `{ isError: true, content: [{ text: 'Failed' }] }`. The LLM has nothing to act on.
Good error:
```typescript
function toErrorResponse(code: string, message: string, hint?: string) {
  return {
    isError: true,
    content: [{
      type: 'text',
      text: JSON.stringify({ error: code, message, hint }),
    }],
  };
}

// Use:
if (!parsed.success) return toErrorResponse(
  'INVALID_INPUT',
  'email or contactId is required',
  'Pass exactly one of: email (string) or contactId (UUID).',
);
if (!found) return toErrorResponse(
  'NOT_FOUND',
  `No contact with email ${email}`,
  'Did you mean to call crm_search_contacts to find similar emails?',
);
```
The `hint` field is what makes the difference. The LLM reads it and recovers: it calls the suggested next tool, fixes the input, or asks the user the right question. Without hints, the LLM just apologizes and stops.
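Unexpected exceptions deserve the same structure, otherwise a crash surfaces as an opaque failure the LLM can't act on. A hedged sketch of a handler wrapper; the wrapper name and the `INTERNAL` code are illustrative, and the error helper is redefined here so the sketch is self-contained:

```typescript
// Wrap every tool handler so unexpected exceptions become structured,
// recoverable errors instead of opaque protocol failures.
type ToolResult = { isError?: boolean; content: { type: 'text'; text: string }[] };

function toErrorResponse(code: string, message: string, hint?: string): ToolResult {
  return {
    isError: true,
    content: [{ type: 'text', text: JSON.stringify({ error: code, message, hint }) }],
  };
}

function withErrorBoundary(handler: (args: unknown) => Promise<ToolResult>) {
  return async (args: unknown): Promise<ToolResult> => {
    try {
      return await handler(args);
    } catch (err) {
      return toErrorResponse(
        'INTERNAL',
        err instanceof Error ? err.message : String(err),
        'Server-side failure; retrying with the same input may not help.',
      );
    }
  };
}
```

Applied once around the dispatch switch, this guarantees every path out of the server is a typed `{ error, message, hint }` the LLM can reason about.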
Step 6: Verify
Run `academy_validate_step`. The validator re-checks `package.json` for `@modelcontextprotocol/sdk` and a bin/main entry, the plumbing from 6.1. You'll add the actual tool-design quality checks in tests in 6.4.
Quick checklist before shipping a tool
- Name follows `<server>_<verb>_<noun>`
- Description names the purpose, the return shape, and one disambiguation
- Each field in `inputSchema` has its own `description`
- `additionalProperties: false`
- Idempotent on natural keys (or via idempotency-key)
- Errors are typed `{ error, message, hint }` with actionable hints
- The tool description names another tool in at least one place ("for X use Y")
If a tool fails any of these, it will work, but it will be the tool the LLM picks when nothing else fits. You want to be the first choice, not the fallback.
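Several of these checks are mechanical and can run in CI. A sketch of such a lint, under the assumption that your tool definitions expose `name`, `description`, and `inputSchema` as plain objects; the heuristics are illustrative, not a substitute for review:

```typescript
// Mechanical subset of the checklist; run against each tool definition in CI.
type ToolDef = {
  name: string;
  description: string;
  inputSchema: {
    properties?: Record<string, { description?: string }>;
    additionalProperties?: boolean;
  };
};

function checklistFailures(tool: ToolDef, otherToolNames: string[]): string[] {
  const failures: string[] = [];
  if (!/^[a-z]+(_[a-z]+){2,}$/.test(tool.name)) {
    failures.push('name is not <server>_<verb>_<noun>');
  }
  if (tool.inputSchema.additionalProperties !== false) {
    failures.push('additionalProperties is not false');
  }
  for (const [field, spec] of Object.entries(tool.inputSchema.properties ?? {})) {
    if (!spec.description) failures.push(`field ${field} has no description`);
  }
  if (!otherToolNames.some((n) => tool.description.includes(n))) {
    failures.push('description names no other tool ("for X use Y")');
  }
  return failures;
}
```

The checks for idempotency and hint quality can't be linted this way; those belong in the behavioral tests of 6.4.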