Guides

Welcome to the GuideLLM guides section. These guides document the key components and concepts of the GuideLLM platform: how it works internally, how to configure its components, and how to interpret benchmark results.

Whether you want to understand the system architecture, explore supported backends, configure datasets, analyze metrics, or set service level objectives, each guide below gives you the detail you need to make informed decisions about your LLM deployments.

Key Guides

Architecture

Understand the modular design of GuideLLM and how its core components interact to evaluate LLM deployments.

Architecture Overview

Backends

Learn about supported LLM backends and how to set up OpenAI-compatible servers for benchmarking.

Backend Guide

Datasets

Configure and use different data sources for benchmarking, including synthetic data, Hugging Face datasets, and file-based options.

Dataset Guide

Metrics

Explore the comprehensive metrics provided by GuideLLM to evaluate performance, including latency, throughput, and token-level analysis.

Metrics Guide
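
As a rough illustration of the kinds of quantities named above, the following minimal sketch derives latency, throughput, and token-level figures from per-request timestamps. It is not GuideLLM's API: the `RequestTiming` record, the `summarize` function, and the metric names are hypothetical, and the Metrics Guide remains the reference for how GuideLLM actually defines and reports each value.

```python
from dataclasses import dataclass
from statistics import mean, median


@dataclass
class RequestTiming:
    """Timestamps in seconds for one completed request (hypothetical record type)."""
    start: float        # request sent
    first_token: float  # first output token received
    end: float          # last output token received
    output_tokens: int  # number of generated tokens


def summarize(timings: list[RequestTiming]) -> dict[str, float]:
    """Compute simple aggregate metrics over a non-empty list of requests."""
    # Time to first token: delay between sending and the first output token.
    ttft = [t.first_token - t.start for t in timings]
    # Inter-token latency: decode time averaged over the tokens after the first.
    itl = [
        (t.end - t.first_token) / (t.output_tokens - 1)
        for t in timings
        if t.output_tokens > 1
    ]
    # Wall-clock window covered by the whole run (assumed to be positive).
    window = max(t.end for t in timings) - min(t.start for t in timings)
    return {
        "mean_ttft_s": mean(ttft),
        "median_latency_s": median(t.end - t.start for t in timings),
        "mean_itl_s": mean(itl) if itl else 0.0,
        "requests_per_s": len(timings) / window,
        "output_tokens_per_s": sum(t.output_tokens for t in timings) / window,
    }


if __name__ == "__main__":
    sample = [
        RequestTiming(start=0.0, first_token=0.5, end=2.5, output_tokens=5),
        RequestTiming(start=1.0, first_token=1.25, end=4.0, output_tokens=12),
    ]
    for name, value in summarize(sample).items():
        print(f"{name}: {value:.3f}")
```

Throughput here is measured over the wall-clock window of the whole run rather than summed per request, so overlapping (concurrent) requests are counted correctly.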

Outputs

Learn about supported output formats and how to customize result reporting for your benchmarks.

Output Guide

Service Level Objectives

Define and implement service level objectives (SLOs) and service level agreements (SLAs) for your LLM deployments to ensure reliability and performance.

SLO Guide
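
An SLO is often phrased as a percentile bound, for example "90% of requests see their first token within 250 ms". The sketch below shows one way such a check could be expressed. It assumes a nearest-rank percentile and uses hypothetical helper names (`percentile`, `meets_slo`) and made-up sample values; it does not reflect how GuideLLM itself evaluates SLOs.

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Multiply before dividing to limit floating-point rounding in the rank.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_slo(values: list[float], pct: float, threshold: float) -> bool:
    """Return True when the pct-th percentile is within the threshold."""
    return percentile(values, pct) <= threshold


if __name__ == "__main__":
    # Illustrative time-to-first-token samples in milliseconds (not real results).
    ttft_ms = [120, 95, 180, 210, 150, 130, 450, 110, 140, 160]
    print("p90 within 250 ms:", meets_slo(ttft_ms, 90, 250))
    print("p99 within 250 ms:", meets_slo(ttft_ms, 99, 250))
```

Checking a high percentile rather than the mean keeps a single slow outlier, such as the 450 ms sample here, from being averaged away.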

Over-Saturation Stopping

Automatically detect when a model becomes over-saturated and stop the benchmark, preventing wasted compute resources and keeping results valid.

Over-Saturation Guide

Tool Calling

Benchmark multi-turn tool-calling workloads with pre-anticipated tool-call turns, synthetic data, and dataset-driven tool definitions.

Tool Calling Guide

Multimodal Benchmarking

Set up benchmarks for multimodal models, including text+image, video, and audio tasks.

Multimodal Guide

Troubleshooting

Diagnose and resolve common errors.

Troubleshooting Guide