How does XALEN use Latency?

Latency is a measurable property of every request to XALEN's AI models and APIs, including requests to the /v1/chat/completions endpoint.

Latency

Also known as: Response Time, TTFT

The time between sending an API request and receiving the first token of the response, also called Time To First Token (TTFT). On XALEN, latency ranges from 25 ms (Llama 3.2 1B) to 2000 ms (o3 reasoning), depending on the model.
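
As a rough illustration of the definition above, the following Python sketch times the gap between starting a request and receiving the first non-empty chunk of a streamed response. The names measure_ttft and fake_stream are hypothetical, and fake_stream is only a local stand-in that simulates delays. It is not a XALEN client. In practice it would be replaced by a function that sends a streaming request to the /v1/chat/completions endpoint and yields text chunks as they arrive.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def measure_ttft(start_stream: Callable[[], Iterable[str]]) -> Tuple[float, float, str]:
    """Return (ttft_seconds, total_seconds, full_text) for one streamed response.

    start_stream is a hypothetical callable that sends the request and
    returns an iterable of text chunks in arrival order.
    """
    t0 = time.perf_counter()  # clock starts just before the request is issued
    ttft = None
    parts = []
    for chunk in start_stream():
        if ttft is None and chunk:
            # First non-empty chunk: this is the Time To First Token.
            ttft = time.perf_counter() - t0
        parts.append(chunk)
    total = time.perf_counter() - t0  # time until the stream is exhausted
    if ttft is None:
        raise ValueError("stream produced no tokens")
    return ttft, total, "".join(parts)


def fake_stream(first_token_delay: float = 0.05, gap: float = 0.01) -> Iterator[str]:
    """Simulated stream (assumption for illustration, not a real API call)."""
    time.sleep(first_token_delay)  # stands in for network and model start-up time
    for token in ["Hello", ",", " world"]:
        yield token
        time.sleep(gap)  # stands in for per-token generation time


ttft, total, text = measure_ttft(fake_stream)
print(f"TTFT: {ttft * 1000:.0f} ms, total: {total * 1000:.0f} ms, text: {text!r}")
```

Measuring TTFT separately from total response time matters because the two can differ widely: a long response may take seconds to finish even when its first token arrives quickly.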

Related Terms

Inference Streaming


Last updated: 2026-05-21