AI

Latency

Also known as: Response Time, TTFT

The time between sending an API request and receiving the first token of the response (Time To First Token). XALEN models range from 25ms (Llama 3.2 1B) to 2000ms (o3 reasoning).

Related Terms

Inference Streaming

Build with Latency on XALEN's API.

Get Started

Last updated: 2026-05-21