
Configure Evaluator

Select model


Define Evaluator Prompt


Define the prompt for your LLM evaluator. This prompt instructs a 'judge' LLM on how to score your AI's output, letting you measure nuanced criteria like brand voice or creativity. Your prompt should clearly define the assessment criteria and provide a rubric of categories for the judge to select from.

Your instructions must use {{output}} for the AI's generated response and can optionally use {{input}} for the original user prompt, {{history}} for chat history, {{expected_output}} for the ground truth, and {{metadata.key}} for variables defined in metadata. These variables will be replaced with actual data from your project during evaluation.
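As an illustration, here is a minimal Python sketch of such a prompt and of the variable substitution described above. The rubric categories (Excellent, Fair, Poor), the render helper, and the sample data are hypothetical; only the {{input}} and {{output}} variable names come from this page.

```python
# A minimal sketch of an evaluator prompt using the documented template
# variables. Category names (Excellent/Fair/Poor) are hypothetical; only
# the {{...}} variable names come from the documentation above.
EVALUATOR_PROMPT = """\
You are grading an AI assistant's reply for brand voice.

User prompt: {{input}}
AI response: {{output}}

Rate the response as exactly one of: Excellent, Fair, Poor.
Respond with the category name only."""

def render(template: str, values: dict[str, str]) -> str:
    """Substitute {{variable}} placeholders — a simplified stand-in for
    the evaluation-time replacement described above, not the real engine."""
    for key, value in values.items():
        template = template.replace("{{" + key + "}}", value)
    return template

print(render(EVALUATOR_PROMPT, {
    "input": "Write a tagline for our coffee brand.",
    "output": "Bold mornings start here.",
}))
```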

Map Rubric Categories to Scores

Define how the rubric categories from the evaluator prompt map to numeric scores and display colors, so results show up in Projects and Analytics.

Each mapping row has three fields:

Rubric category: the category name exactly as it is defined in the evaluator prompt.
Score mapping: the numeric score to record when the judge selects that category.
Score color: the display color for the category, one of Green, Orange, or Red.
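As a sketch of how such a mapping could behave downstream, the following Python snippet pairs each rubric category from the earlier prompt sketch with a score and one of the three colors above. The category names, score values, and lookup function are illustrative assumptions, not the product's actual data model.

```python
# Hypothetical rubric-to-score mapping, mirroring the three fields above.
# Category names and score values are illustrative assumptions.
RUBRIC_MAPPING = {
    "Excellent": {"score": 1.0, "color": "Green"},
    "Fair": {"score": 0.5, "color": "Orange"},
    "Poor": {"score": 0.0, "color": "Red"},
}

def score_verdict(category: str) -> tuple[float, str]:
    """Return the (score, color) recorded when the judge picks a category."""
    entry = RUBRIC_MAPPING[category.strip()]
    return entry["score"], entry["color"]

print(score_verdict("Fair"))  # (0.5, 'Orange')
```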