Loading...
    • Developer Guide
    • API Reference
    • MCP
    • Resources
    • Release Notes
    Search...
    ⌘K

    First steps

    Intro to ClaudeQuickstart

    Models & pricing

    Models overviewChoosing a modelWhat's new in Claude 4.5Migrating to Claude 4.5Model deprecationsPricing

    Build with Claude

    Features overviewUsing the Messages APIContext windowsPrompting best practices

    Capabilities

    Prompt cachingContext editingExtended thinkingStreaming MessagesBatch processingCitationsMultilingual supportToken countingEmbeddingsVisionPDF supportFiles APISearch resultsGoogle Sheets add-on

    Tools

    OverviewHow to implement tool useToken-efficient tool useFine-grained tool streamingBash toolCode execution toolComputer use toolText editor toolWeb fetch toolWeb search toolMemory tool

    Agent Skills

    OverviewQuickstartBest practicesUsing Skills with the API

    Agent SDK

    OverviewTypeScript SDKPython SDK

    Guides

    Streaming InputHandling PermissionsSession ManagementHosting the Agent SDKModifying system promptsMCP in the SDKCustom ToolsSubagents in the SDKSlash Commands in the SDKAgent Skills in the SDKTracking Costs and UsageTodo ListsPlugins in the SDK

    MCP in the API

    MCP connectorRemote MCP servers

    Claude on 3rd-party platforms

    Amazon BedrockVertex AI

    Prompt engineering

    OverviewPrompt generatorUse prompt templatesPrompt improverBe clear and directUse examples (multishot prompting)Let Claude think (CoT)Use XML tagsGive Claude a role (system prompts)Prefill Claude's responseChain complex promptsLong context tipsExtended thinking tips

    Test & evaluate

    Define success criteriaDevelop test casesUsing the Evaluation ToolReducing latency

    Strengthen guardrails

    Reduce hallucinationsIncrease output consistencyMitigate jailbreaksStreaming refusalsReduce prompt leakKeep Claude in character

    Administration and monitoring

    Admin API overviewUsage and Cost APIClaude Code Analytics API
    Console
    Test & evaluate

    Using the Evaluation Tool

    The Claude Console features an Evaluation tool that allows you to test your prompts under various scenarios.

    Accessing the Evaluate Feature

    To get started with the Evaluation tool:

    1. Open the Claude Console and navigate to the prompt editor.
    2. After composing your prompt, look for the 'Evaluate' tab at the top of the screen.

    Accessing Evaluate Feature

    Ensure your prompt includes at least 1-2 dynamic variables using the double brace syntax: {{variable}}. This is required for creating eval test sets.

    Generating Prompts

    The Console offers a built-in prompt generator powered by Claude Opus 4.1:

    1. 1

      Click 'Generate Prompt'

      Clicking the 'Generate Prompt' helper tool will open a modal that allows you to enter your task information.

    2. 2

      Describe your task

      Describe your desired task (e.g., "Triage inbound customer support requests") with as much or as little detail as you desire. The more context you include, the more Claude can tailor its generated prompt to your specific needs.

    3. 3

      Generate your prompt

      Clicking the orange 'Generate Prompt' button at the bottom will have Claude generate a high quality prompt for you. You can then further improve those prompts using the Evaluation screen in the Console.

    This feature makes it easier to create prompts with the appropriate variable syntax for evaluation.

    Prompt Generator

    Creating Test Cases

    When you access the Evaluation screen, you have several options to create test cases:

    1. Click the '+ Add Row' button at the bottom left to manually add a case.
    2. Use the 'Generate Test Case' feature to have Claude automatically generate test cases for you.
    3. Import test cases from a CSV file.

    To use the 'Generate Test Case' feature:

    1. 1

      Click on 'Generate Test Case'

      Claude will generate test cases for you, one row at a time for each time you click the button.

    2. 2

      Edit generation logic (optional)

      You can also edit the test case generation logic by clicking on the arrow dropdown to the right of the 'Generate Test Case' button, then on 'Show generation logic' at the top of the Variables window that pops up. You may have to click `Generate' on the top right of this window to populate initial generation logic.

      Editing this allows you to customize and fine tune the test cases that Claude generates to greater precision and specificity.

    Here's an example of a populated Evaluation screen with several test cases:

    Populated Evaluation Screen

    If you update your original prompt text, you can re-run the entire eval suite against the new prompt to see how changes affect performance across all test cases.

    Tips for Effective Evaluation

    Use the 'Generate a prompt' helper tool in the Console to quickly create prompts with the appropriate variable syntax for evaluation.

    Understanding and comparing results

    The Evaluation tool offers several features to help you refine your prompts:

    1. Side-by-side comparison: Compare the outputs of two or more prompts to quickly see the impact of your changes.
    2. Quality grading: Grade response quality on a 5-point scale to track improvements in response quality per prompt.
    3. Prompt versioning: Create new versions of your prompt and re-run the test suite to quickly iterate and improve results.

    By reviewing results across test cases and comparing different prompt versions, you can spot patterns and make informed adjustments to your prompt more efficiently.

    Start evaluating your prompts today to build more robust AI applications with Claude!

    • Accessing the Evaluate Feature
    • Generating Prompts
    • Creating Test Cases
    • Tips for Effective Evaluation
    • Understanding and comparing results
    © 2025 ANTHROPIC PBC

    Products

    • Claude
    • Claude Code
    • Max plan
    • Team plan
    • Enterprise plan
    • Download app
    • Pricing
    • Log in

    Features

    • Claude and Slack
    • Claude in Excel

    Models

    • Opus
    • Sonnet
    • Haiku

    Solutions

    • AI agents
    • Code modernization
    • Coding
    • Customer support
    • Education
    • Financial services
    • Government
    • Life sciences

    Claude Developer Platform

    • Overview
    • Developer docs
    • Pricing
    • Amazon Bedrock
    • Google Cloud’s Vertex AI
    • Console login

    Learn

    • Blog
    • Catalog
    • Courses
    • Connectors
    • Customer stories
    • Engineering at Anthropic
    • Events
    • Powered by Claude
    • Service partners
    • Startups program

    Company

    • Anthropic
    • Careers
    • Economic Futures
    • Research
    • News
    • Responsible Scaling Policy
    • Security and compliance
    • Transparency

    Help and security

    • Availability
    • Status
    • Support center

    Terms and policies

    • Privacy policy
    • Responsible disclosure policy
    • Terms of service: Commercial
    • Terms of service: Consumer
    • Usage policy

    Products

    • Claude
    • Claude Code
    • Max plan
    • Team plan
    • Enterprise plan
    • Download app
    • Pricing
    • Log in

    Features

    • Claude and Slack
    • Claude in Excel

    Models

    • Opus
    • Sonnet
    • Haiku

    Solutions

    • AI agents
    • Code modernization
    • Coding
    • Customer support
    • Education
    • Financial services
    • Government
    • Life sciences

    Claude Developer Platform

    • Overview
    • Developer docs
    • Pricing
    • Amazon Bedrock
    • Google Cloud’s Vertex AI
    • Console login

    Learn

    • Blog
    • Catalog
    • Courses
    • Connectors
    • Customer stories
    • Engineering at Anthropic
    • Events
    • Powered by Claude
    • Service partners
    • Startups program

    Company

    • Anthropic
    • Careers
    • Economic Futures
    • Research
    • News
    • Responsible Scaling Policy
    • Security and compliance
    • Transparency

    Help and security

    • Availability
    • Status
    • Support center

    Terms and policies

    • Privacy policy
    • Responsible disclosure policy
    • Terms of service: Commercial
    • Terms of service: Consumer
    • Usage policy
    © 2025 ANTHROPIC PBC