Is the XALEN API compatible with OpenAI?

Yes. XALEN provides an OpenAI-compatible /v1/chat/completions endpoint. You can use any OpenAI SDK by changing the base URL to https://api.xalen.io/v1 and using your XALEN API key. No other code changes are needed.

How do I authenticate with the XALEN API?

Include your API key in the Authorization header as a Bearer token: 'Authorization: Bearer xln_live_YOUR_KEY'. API keys are created in the dashboard and use the xln_live_ prefix for production.

What astrology endpoints are available?

XALEN offers astrology computation endpoints across four systems: Vedic (Parashari/Jaimini), Western (Tropical), KP (Krishnamurti Paddhati), and Vastu (Shastra). Endpoints include kundali generation, dasha prediction, transit analysis, matchmaking, panchang, and muhurta calculations.

Does XALEN support streaming responses?

Yes. Set stream: true in your chat completions request to receive Server-Sent Events (SSE). This works identically to OpenAI's streaming format with delta tokens.

What SDKs are available?

XALEN provides official SDKs for Python (pip install xalen) and JavaScript/TypeScript (npm install xalen-sdk), plus an MCP server (npx xalen-mcp) for AI agent frameworks. All SDKs are open-source.

API Documentation

Switching from OpenAI, Anthropic, or OpenRouter?

XALEN is OpenAI-compatible. Change one line of code:

# Before: base_url = "https://api.openai.com/v1"
# After:
base_url = "https://api.xalen.io/v1"

Your existing SDK code keeps working, with the same request format, streaming, and function calling. You also get 200+ models, 13 pre-built agents, the Studio website builder, and a custom agent builder.

Authentication

All API requests require a Bearer token. Include your API key in the Authorization header of every request.

Header Authorization: Bearer xln_live_YOUR_API_KEY

API keys start with xln_live_. Generate one from your Dashboard after signing up.

Keep your key secret

Never expose API keys in client-side code or public repositories. Use environment variables or a backend proxy.
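As a minimal sketch of the environment-variable approach, the helper below reads the key at runtime and builds the request headers. The variable name `XALEN_API_KEY` and the function `auth_headers` are illustrative assumptions, not part of the XALEN SDK.

```python
import os
from typing import Mapping, Optional


def auth_headers(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build request headers from an API key stored in the environment."""
    env = os.environ if env is None else env
    key = env.get("XALEN_API_KEY", "")  # hypothetical variable name
    if not key.startswith("xln_live_"):
        raise ValueError("XALEN_API_KEY is missing or lacks the xln_live_ prefix")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }


# Example with an explicit mapping instead of the real environment.
headers = auth_headers({"XALEN_API_KEY": "xln_live_YOUR_KEY"})
print(headers["Authorization"])
```

Failing fast on a missing or malformed key keeps a misconfigured deployment from sending unauthenticated requests.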

Base URL

All endpoints are served from a single base URL. Append the endpoint path to this URL for every request.

Base URL https://api.xalen.io

For example, to call Chat Completions: POST https://api.xalen.io/v1/chat/completions

Quick Start

Make your first API call in under 60 seconds. Install an SDK or use cURL directly.

pip install xalen

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

response = client.chat.completions.create(
    model="vedika-standard",
    messages=[{"role": "user", "content": "Hello, world!"}]
)

print(response.choices[0].message.content)

npm install xalen-sdk

import Xalen from "xalen-sdk";

const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });

const response = await client.chat.completions.create({
  model: "vedika-standard",
  messages: [{ role: "user", content: "Hello, world!" }],
});

console.log(response.choices[0].message.content);

curl https://api.xalen.io/v1/chat/completions \
  -H "Authorization: Bearer xln_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vedika-standard",
    "messages": [{"role": "user", "content": "Hello, world!"}]
  }'

Chat Completions

POST /v1/chat/completions

Generate a model response for a conversation. Compatible with the OpenAI Chat Completions API format, so existing OpenAI SDK code works by changing only the base URL and API key.

Anthropic Claude models are now available. Access Claude Opus 4.7, Sonnet 4.6, Opus 4.6, Haiku 4.5, Sonnet 4.5, and Claude 3.5 Haiku through the same /v1/chat/completions endpoint. No code changes are needed: set model to claude-opus-4.7, claude-sonnet-4.6, and so on.

Request Body

Parameter	Type	Required	Description
model	string	Required	Model ID. e.g. `vedika-standard`, `claude-sonnet-4.6`, `claude-opus-4.7`, `llama-4-maverick`
messages	array	Required	Array of message objects with `role` (`system`, `user`, `assistant`) and `content`.
temperature	number	Optional	Sampling temperature between 0 and 2. Default: `1`.
max_tokens	integer	Optional	Maximum tokens to generate. Default: model-specific.
stream	boolean	Optional	Stream partial responses as Server-Sent Events. Default: `false`.
top_p	number	Optional	Nucleus sampling threshold. Default: `1`.
stop	string | array	Optional	Up to 4 sequences where the model will stop generating.
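The constraints in the table can be checked on the client before a request is sent. The sketch below is one possible validator, not SDK behavior: `build_chat_request` is a hypothetical helper, and the rule that `messages` must be non-empty is an assumption.

```python
from typing import List, Optional, Union

ALLOWED_ROLES = ("system", "user", "assistant")


def build_chat_request(
    model: str,
    messages: List[dict],
    temperature: float = 1.0,
    top_p: float = 1.0,
    max_tokens: Optional[int] = None,
    stream: bool = False,
    stop: Union[str, List[str], None] = None,
) -> dict:
    """Validate parameters against the table and return a JSON-ready body."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    for msg in messages:
        if msg.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {msg.get('role')!r}")
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("stop accepts at most 4 sequences")

    body = {"model": model, "messages": messages,
            "temperature": temperature, "top_p": top_p, "stream": stream}
    # Optional fields are omitted so the server applies its own defaults.
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    if stop is not None:
        body["stop"] = stop
    return body


body = build_chat_request(
    "vedika-standard",
    [{"role": "user", "content": "Hello, world!"}],
    temperature=0.7,
)
print(sorted(body))
```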

Code Examples

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

response = client.chat.completions.create(
    model="vedika-pro",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is my birth chart?"}
    ],
    temperature=0.7,
    max_tokens=1024
)

print(response.choices[0].message.content)

import Xalen from "xalen-sdk";

const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });

const response = await client.chat.completions.create({
  model: "vedika-pro",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is my birth chart?" },
  ],
  temperature: 0.7,
  max_tokens: 1024,
});

console.log(response.choices[0].message.content);

curl https://api.xalen.io/v1/chat/completions \
  -H "Authorization: Bearer xln_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vedika-pro",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is my birth chart?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'

Response

JSON Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1717200000,
  "model": "vedika-pro",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "To generate your birth chart, I need your date, time, and place of birth..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 52,
    "total_tokens": 80
  }
}
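When `stream` is set to `true`, the endpoint returns Server-Sent Events instead of a single JSON body. The sketch below shows how a client could reassemble the text, assuming the chunks follow the OpenAI streaming layout (`data:` lines carrying `choices[].delta.content`, terminated by `data: [DONE]`). The sample lines are hand-written for illustration, not captured from a live XALEN response.

```python
import json
from typing import Iterable


def collect_stream_text(sse_lines: Iterable[str]) -> str:
    """Concatenate delta content from OpenAI-style SSE lines."""
    parts = []
    for raw in sse_lines:
        line = raw.strip()
        # SSE payload lines start with "data:"; skip blanks and comments.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # end-of-stream sentinel in the OpenAI format
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# Hand-written sample lines (assumed shape).
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "To generate"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": " your chart"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))  # To generate your chart
```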

List Models

GET /v1/models

Returns a list of all available models. Use this to discover model IDs, capabilities, and pricing.

Code Examples

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

models = client.models.list()
for m in models.data:
    print(m.id, m.owned_by)

import Xalen from "xalen-sdk";
const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });

const models = await client.models.list();
models.data.forEach(m => console.log(m.id, m.owned_by));

curl https://api.xalen.io/v1/models \
  -H "Authorization: Bearer xln_live_YOUR_KEY"

Response

JSON Response

{
  "object": "list",
  "data": [
    {
      "id": "vedika-standard",
      "object": "model",
      "owned_by": "xalen",
      "permission": []
    },
    {
      "id": "vedika-pro",
      "object": "model",
      "owned_by": "xalen",
      "permission": []
    },
    {
      "id": "claude-opus-4.7",
      "object": "model",
      "owned_by": "anthropic",
      "permission": []
    },
    {
      "id": "claude-sonnet-4.6",
      "object": "model",
      "owned_by": "anthropic",
      "permission": []
    }
  ]
}

Embeddings

POST /v1/embeddings

Generate vector embeddings for text input. Use for semantic search, clustering, or recommendation systems.

Request Body

Parameter	Type	Required	Description
model	string	Required	Embedding model ID. e.g. `text-embedding-3-small`
input	string | array	Required	Text to embed. Can be a single string or array of strings.
encoding_format	string	Optional	`float` (default) or `base64`.

Code Examples

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="Vedic astrology birth chart analysis"
)

print(len(response.data[0].embedding))  # 1536

import Xalen from "xalen-sdk";
const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });

const response = await client.embeddings.create({
  model: "text-embedding-3-small",
  input: "Vedic astrology birth chart analysis",
});

console.log(response.data[0].embedding.length); // 1536

curl https://api.xalen.io/v1/embeddings \
  -H "Authorization: Bearer xln_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "text-embedding-3-small", "input": "Vedic astrology birth chart analysis"}'

Response

JSON Response

{
  "object": "list",
  "data": [{
    "object": "embedding",
    "index": 0,
    "embedding": [0.0023, -0.0091, 0.0152, ...]
  }],
  "model": "text-embedding-3-small",
  "usage": { "prompt_tokens": 6, "total_tokens": 6 }
}
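For semantic search, the returned vectors are typically compared with cosine similarity. The sketch below ranks documents against a query using only the standard library. The toy three-dimensional vectors stand in for real embeddings, and `rank_documents` is an illustrative helper rather than an SDK function.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scores = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors; real embeddings come from response.data[i].embedding.
docs = {"dasha": [0.9, 0.1, 0.0], "panchang": [0.0, 0.2, 0.9]}
print(rank_documents([1.0, 0.0, 0.0], docs)[0][0])  # dasha
```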

Image Generation

POST /v1/images/generations

Generate images from text prompts. Returns one or more image URLs or base64-encoded data.

Request Body

Parameter	Type	Required	Description
prompt	string	Required	Text description of the image to generate.
model	string	Optional	Image model ID. Default: platform default.
n	integer	Optional	Number of images. Default: `1`. Max: `4`.
size	string	Optional	`256x256`, `512x512`, or `1024x1024`. Default: `1024x1024`.
response_format	string	Optional	`url` (default) or `b64_json`.

Code Examples

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

response = client.images.generate(
    prompt="A serene Hindu temple at sunrise, watercolor style",
    size="1024x1024"
)

print(response.data[0].url)

curl https://api.xalen.io/v1/images/generations \
  -H "Authorization: Bearer xln_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A serene Hindu temple at sunrise, watercolor style", "size": "1024x1024"}'

Response

JSON Response

{
  "created": 1717200000,
  "data": [{
    "url": "https://api.xalen.io/files/img-abc123.png"
  }]
}

Text to Speech

POST /v1/audio/speech

Convert text to natural-sounding speech. Supports multiple voices and output formats.

Request Body

Parameter	Type	Required	Description
model	string	Required	TTS model ID. e.g. `tts-1`, `tts-1-hd`
input	string	Required	Text to convert. Max 4096 characters.
voice	string	Required	Voice ID. Options: `alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer`
response_format	string	Optional	`mp3` (default), `opus`, `aac`, `flac`, `wav`
speed	number	Optional	0.25 to 4.0. Default: `1.0`.

Code Examples

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

response = client.audio.speech.create(
    model="tts-1",
    voice="nova",
    input="Welcome to your daily horoscope reading."
)

with open("output.mp3", "wb") as f:
    f.write(response.content)

curl https://api.xalen.io/v1/audio/speech \
  -H "Authorization: Bearer xln_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "tts-1", "voice": "nova", "input": "Welcome to your daily horoscope reading."}' \
  --output output.mp3

Returns raw audio bytes in the requested format.
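Because `input` is limited to 4096 characters, longer text has to be split across several requests. The following sketch shows one way to do that, preferring sentence boundaries. `split_for_tts` is a hypothetical helper, and a single sentence longer than the limit is cut at the limit, possibly mid-word.

```python
import re
from typing import List

MAX_TTS_CHARS = 4096  # limit stated for the `input` parameter


def split_for_tts(text: str, limit: int = MAX_TTS_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split any single sentence that is longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_for_tts("One. Two. Three.", limit=10))  # ['One. Two.', 'Three.']
```

Each chunk can then be sent as a separate `/v1/audio/speech` request and the audio segments joined in order.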

Speech to Text

POST /v1/audio/transcriptions

Transcribe audio to text. Supports multiple languages including 14 Indian languages.

Request Body (multipart/form-data)

Parameter	Type	Required	Description
file	file	Required	Audio file (mp3, mp4, mpeg, mpga, m4a, wav, webm). Max 25 MB.
model	string	Required	Transcription model. e.g. `whisper-1`
language	string	Optional	ISO-639-1 code. e.g. `hi`, `ta`, `te`, `en`
response_format	string	Optional	`json` (default), `text`, `verbose_json`
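A pre-flight check against the format and size limits in the table avoids uploading a file that cannot be accepted. This is a sketch under two assumptions: the format is inferred from the file extension, and "25 MB" is read as 25 × 1024 × 1024 bytes. `check_audio_upload` is an illustrative name.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
MAX_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" means 25 MiB


def check_audio_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file falls outside the documented limits."""
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {ext or 'none'}")
    if size_bytes > MAX_BYTES:
        raise ValueError(f"file is {size_bytes} bytes; the limit is {MAX_BYTES}")


check_audio_upload("audio.mp3", 3_000_000)  # passes silently
```

In practice the size would come from `os.path.getsize(path)` before the file is opened for upload.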

Code Examples

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=f,
        language="hi"
    )

print(transcript.text)

curl https://api.xalen.io/v1/audio/transcriptions \
  -H "Authorization: Bearer xln_live_YOUR_KEY" \
  -F file=@audio.mp3 \
  -F model=whisper-1 \
  -F language=hi

Response

JSON Response

{
  "text": "Transcribed text content here..."
}

Voice AI

POST /v1/voice/binary

End-to-end voice conversation: send audio, get audio back. Combines speech recognition, AI reasoning, and text-to-speech in a single call. Supports 31 languages with sub-200ms latency.

Request Body (multipart/form-data)

Parameter	Type	Required	Description
audio	file	Required	Audio input file (wav, mp3, webm, ogg).
language	string	Optional	ISO-639-1 code. Auto-detected if omitted.
voice	string	Optional	Response voice ID. Default: `nova`.
context	string	Optional	System prompt for the AI reasoning layer.
birth_details	object	Optional	For astrology queries: `{ "date": "1990-01-15", "time": "14:30", "place": "Mumbai" }`

Code Examples

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

with open("question.wav", "rb") as f:
    response = client.voice.binary(
        audio=f,
        language="hi",
        voice="nova"
    )

with open("answer.mp3", "wb") as f:
    f.write(response.audio)

curl https://api.xalen.io/v1/voice/binary \
  -H "Authorization: Bearer xln_live_YOUR_KEY" \
  -F audio=@question.wav \
  -F language=hi \
  -F voice=nova \
  --output answer.mp3

Returns binary audio in mp3 format by default. The response includes an X-Transcript header with the text transcription and an X-Response-Text header with the AI's text reply.

Astrology AI Query

POST /v1/chat/completions

Ask any astrology question in natural language. Use model: "vedika-standard" or model: "vedika-fast" in the standard Chat Completions endpoint. The Vedika engine handles birth chart computation, classical text grounding, RAG retrieval, and multi-language response generation automatically.

Example

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

response = client.chat.completions.create(
    model="vedika-standard",
    messages=[
        {"role": "user", "content": "I was born on 15 Jan 1990 at 2:30 PM in Pune. What is my current Mahadasha and its effects?"}
    ]
)

print(response.choices[0].message.content)
# Includes: grounded answer, classical citations, follow-up suggestions

curl -X POST "https://api.xalen.io/v1/chat/completions" \
  -H "Authorization: Bearer xln_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vedika-standard",
    "messages": [{"role": "user", "content": "What is Rahu in the 7th house according to BPHS?"}]
  }'

The Vedika AI engine supports: birth chart analysis, dasha predictions, transit effects, compatibility matching, muhurta selection, panchang queries, yoga identification, and remedial suggestions, all through natural language conversation.

Structured Astrology Data

For structured JSON endpoints, use the Vedika API directly

If you need raw structured data (birth charts, planetary positions, panchang, dasha timelines, divisional charts D1-D60, yoga calculations, compatibility scores), the Vedika API provides 130+ computation endpoints with structured JSON responses.

✓ Birth chart (Kundali) generation

✓ Vimshottari Dasha timelines

✓ Panchang (Tithi, Nakshatra, Yoga)

✓ Ashtakoot compatibility (36 gunas)

✓ Transit & progression analysis

✓ Divisional charts (D1-D60)

✓ Yoga identification (131 yogas)

✓ Western synastry & composite

When to use XALEN vs Vedika: Use XALEN's /v1/chat/completions with vedika-standard for natural language AI queries with grounding and citations. Use Vedika's structured API directly when you need raw JSON computation data (chart objects, planetary degrees, dasha trees) for building custom UIs.

Kundali Generation

GET /v1/astrology/kundali

Generate a complete Vedic birth chart (Kundali) with planetary positions, house placements, nakshatras, yogas, and dashas. Powered by the Vedika Ephemeris engine with arc-second precision.

Query Parameters

Parameter	Type	Required	Description
date	string	Required	Birth date in `YYYY-MM-DD` format.
time	string	Required	Birth time in `HH:MM` 24-hour format.
lat	number	Required	Latitude of birth place. e.g. `18.5204`
lon	number	Required	Longitude of birth place. e.g. `73.8567`
tz	number	Optional	Timezone offset in hours. e.g. `5.5` for IST. Auto-detected from coordinates if omitted.
system	string	Optional	`vedic` (default), `western`, or `kp`.
language	string	Optional	Response language. ISO-639-1 code. Default: `en`.

Code Examples

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

chart = client.astrology.kundali(
    date="1990-01-15",
    time="14:30",
    lat=18.5204,
    lon=73.8567
)

print(chart.ascendant)       # Ascendant sign, degree, and nakshatra
print(chart.moon_sign)       # "Scorpio"
print(chart.planets)         # Detailed planetary positions
print(chart.yogas)           # Active yogas

curl "https://api.xalen.io/v1/astrology/kundali?date=1990-01-15&time=14:30&lat=18.5204&lon=73.8567" \
  -H "Authorization: Bearer xln_live_YOUR_KEY"

Response

JSON Response (abbreviated)

{
  "ascendant": { "sign": "Taurus", "degree": 14.32, "nakshatra": "Rohini" },
  "moon_sign": "Scorpio",
  "sun_sign": "Capricorn",
  "planets": [
    { "name": "Sun", "sign": "Capricorn", "house": 9, "degree": 0.85, "nakshatra": "Uttara Ashadha", "retrograde": false },
    { "name": "Moon", "sign": "Scorpio", "house": 7, "degree": 22.41, "nakshatra": "Jyeshtha", "retrograde": false }
  ],
  "houses": [ ... ],
  "yogas": [
    { "name": "Gaja Kesari Yoga", "description": "Jupiter in kendra from Moon", "strength": "strong" }
  ],
  "dasha": { "current": "Venus", "sub": "Mercury", "start": "2024-03-12", "end": "2026-08-04" },
  "engine": "vedika-ephemeris"
}
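The `house` values in this sample are consistent with whole-sign houses, where the ascendant's sign is house 1 and each following sign is the next house. The sketch below reproduces that arithmetic as a client-side cross-check. Whether the engine uses whole-sign houses for every `system` value is an assumption, and `whole_sign_house` is an illustrative helper.

```python
SIGNS = [
    "Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo",
    "Libra", "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces",
]


def whole_sign_house(ascendant_sign: str, planet_sign: str) -> int:
    """House number (1-12) counted from the ascendant sign, inclusive."""
    offset = SIGNS.index(planet_sign) - SIGNS.index(ascendant_sign)
    return offset % 12 + 1


print(whole_sign_house("Taurus", "Capricorn"))  # 9, matching the Sun in the sample
print(whole_sign_house("Taurus", "Scorpio"))    # 7, matching the Moon in the sample
```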

Run Agent

POST /v1/agents/run

Execute a pre-built AI agent with a single API call. Agents are purpose-built for specific tasks like temple management, devotional content, and spiritual guidance.

Request Body

Parameter	Type	Required	Description
agent_id	string	Required	Agent identifier. e.g. `temple-assistant`, `kundali-reader`, `puja-planner`
input	string	Required	User message or task description.
context	object	Optional	Additional data for the agent (birth details, location, preferences).
stream	boolean	Optional	Stream the response. Default: `false`.

Code Examples

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

result = client.agents.run(
    agent_id="kundali-reader",
    input="What career paths suit my chart?",
    context={
        "birth_date": "1990-01-15",
        "birth_time": "14:30",
        "birth_place": "Pune, India"
    }
)

print(result.output)

curl https://api.xalen.io/v1/agents/run \
  -H "Authorization: Bearer xln_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "kundali-reader",
    "input": "What career paths suit my chart?",
    "context": {"birth_date": "1990-01-15", "birth_time": "14:30", "birth_place": "Pune, India"}
  }'

Response

JSON Response

{
  "agent_id": "kundali-reader",
  "output": "Based on your chart with Taurus ascendant and strong 10th house...",
  "usage": { "prompt_tokens": 450, "completion_tokens": 320, "total_tokens": 770 }
}

Check Balance

GET /v1/billing/balance

Retrieve your current wallet balance and usage summary.

Code Examples

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

balance = client.billing.balance()
print(f"${balance.available / 100:.2f}")  # Wallet in USD

curl https://api.xalen.io/v1/billing/balance \
  -H "Authorization: Bearer xln_live_YOUR_KEY"

Response

JSON Response

{
  "available": 4250,
  "currency": "usd_cents",
  "total_spent": 1750,
  "plan": "pay-as-you-go"
}

Add Funds

POST /v1/billing/deposit

Add funds to your wallet. Returns a Razorpay payment link. Minimum deposit: $10.

Request Body

Parameter	Type	Required	Description
amount	integer	Required	Amount in USD cents. Minimum: `1000` ($10).
currency	string	Optional	`usd` (default) or `inr`.

Code Examples

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

payment = client.billing.deposit(amount=5000)  # $50
print(payment.payment_url)  # Razorpay checkout link

curl https://api.xalen.io/v1/billing/deposit \
  -H "Authorization: Bearer xln_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"amount": 5000}'

Response

JSON Response

{
  "payment_id": "pay_abc123",
  "payment_url": "https://rzp.io/l/xalen-deposit",
  "amount": 5000,
  "currency": "usd_cents",
  "status": "pending"
}

Billing Mechanics

XALEN uses a prepaid wallet model. Deposit funds, then pay per API call. No surprise invoices, no credit card holds, no overages unless you opt in.

Wallet Model

Feature	Details
Deposit	Add funds via card or UPI. Minimum deposit: $10 (Pay-as-you-go).
First deposit bonus	$10 bonus on your first $100+ deposit. Applied automatically.
Per-call billing	Each API call deducts from your wallet based on model, tokens, and endpoint type.
Currency	Wallet stored in USD cents. Display values divide by 100.
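The wallet rules in the table reduce to simple integer arithmetic in cents. The sketch below models them: a $10 minimum deposit, a $10 bonus on a first deposit of $100 or more, and display values divided by 100. The constant and function names are illustrative, and actual crediting happens server-side.

```python
MIN_DEPOSIT_CENTS = 1_000        # $10 minimum deposit
BONUS_THRESHOLD_CENTS = 10_000   # first deposit of $100 or more
BONUS_CENTS = 1_000              # $10 bonus


def wallet_credit(amount_cents: int, is_first_deposit: bool) -> int:
    """Return the cents credited to the wallet for a deposit, including any bonus."""
    if amount_cents < MIN_DEPOSIT_CENTS:
        raise ValueError("minimum deposit is 1000 cents ($10)")
    qualifies = is_first_deposit and amount_cents >= BONUS_THRESHOLD_CENTS
    return amount_cents + (BONUS_CENTS if qualifies else 0)


def format_usd(cents: int) -> str:
    """Convert integer cents to a display string."""
    return f"${cents / 100:.2f}"


print(format_usd(wallet_credit(10_000, is_first_deposit=True)))  # $110.00
```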

Balance & Usage Tracking

Endpoint	Method	Description
`/v1/billing/balance`	GET	Real-time wallet balance, total spent, and current plan.
`/v1/billing/usage?period=current`	GET	Detailed usage breakdown for the current billing period.

Plan Switching

Upgrades take effect immediately. Your new rate limits and features are available within seconds. Downgrades are deferred to the end of the current billing cycle to avoid mid-cycle disruption.

Spending Controls

Control	Details
Spending alerts	Configurable notifications at 50%, 80%, 90%, and 100% of your budget threshold.
Hard spending cap	Set via dashboard. API calls return `402 Payment Required` when the cap is exceeded.
Auto-pause	Optional. Stops API calls entirely when your budget is reached instead of allowing overage.
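To mirror the alert thresholds on the client, for example to warn users before the server does, spending can be compared against the budget with integer arithmetic. This is a sketch only: `crossed_alerts` is a hypothetical helper, and the budget value would come from your own configuration.

```python
from typing import List

ALERT_THRESHOLDS = (50, 80, 90, 100)  # percent of budget, from the table


def crossed_alerts(spent_cents: int, budget_cents: int) -> List[int]:
    """Return the alert thresholds (in percent) that current spending has reached."""
    if budget_cents <= 0:
        raise ValueError("budget_cents must be positive")
    # Integer comparison avoids floating-point rounding at the boundaries.
    return [t for t in ALERT_THRESHOLDS if spent_cents * 100 >= t * budget_cents]


print(crossed_alerts(4_250, 5_000))  # [50, 80]
```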

No Free Tier

Every API call is billed. Apart from the first-deposit bonus, XALEN does not offer free credits, trials, or complimentary usage. Use the Playground to test models before committing to a deposit.

Fine-Tuning

Fine-tune any supported model on your own data. Upload JSONL training files, configure hyperparameters, and deploy a custom model tailored to your use case. XALEN supports standard full-parameter fine-tuning and lightweight LoRA adapters depending on the base model.

Supported Base Models

Model	Type	Min Training Examples
Llama 3.3 70B	Standard	100
DeepSeek V3	Standard	100
Qwen 2.5 72B	Standard	100
Llama 4 Scout 17B	LoRA	50
Qwen3 235B	LoRA	50

Training Data Format

Upload training data as a JSONL file. Each line must be a valid JSON object containing a messages array with system, user, and assistant turns.

JSONL Format (one line per example)

{"messages": [{"role": "system", "content": "You are a Vedic astrology expert."}, {"role": "user", "content": "What does Saturn in the 7th house mean?"}, {"role": "assistant", "content": "Saturn in the 7th house indicates..."}]}
{"messages": [{"role": "system", "content": "You are a Vedic astrology expert."}, {"role": "user", "content": "Explain Rahu Mahadasha effects."}, {"role": "assistant", "content": "During Rahu Mahadasha..."}]}
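Validating the file locally before upload catches malformed lines early. The sketch below checks each line for valid JSON, a non-empty `messages` array, known roles, and string content, then compares the example count with the minimum for the chosen base model. The rule that an example should end with an assistant turn is an assumption, and `validate_training_jsonl` is an illustrative helper.

```python
import json
from typing import List

ALLOWED_ROLES = ("system", "user", "assistant")


def validate_training_jsonl(text: str, min_examples: int = 100) -> List[str]:
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    count = 0
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines, e.g. a trailing newline
        count += 1
        try:
            example = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        messages = example.get("messages") if isinstance(example, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append(f"line {number}: missing non-empty 'messages' array")
            continue
        well_formed = all(
            isinstance(m, dict)
            and m.get("role") in ALLOWED_ROLES
            and isinstance(m.get("content"), str)
            for m in messages
        )
        if not well_formed:
            problems.append(f"line {number}: each message needs a known role and string content")
        elif messages[-1]["role"] != "assistant":
            # Assumption: every example should end with the assistant reply being taught.
            problems.append(f"line {number}: last message is not from the assistant")
    if count < min_examples:
        problems.append(f"found {count} examples, fewer than the required {min_examples}")
    return problems
```

Pass `min_examples=50` for the LoRA base models listed above and `min_examples=100` for the standard ones.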

Upload Training Data

POST /v1/files

Upload your JSONL training file. The response includes a file_id used when creating a fine-tuning job.

Parameter	Type	Required	Description
file	file	Required	The JSONL file to upload.
purpose	string	Required	Must be `fine-tune`.

Create Fine-Tuning Job

POST /v1/fine-tuning/jobs

Create a new fine-tuning job. The job trains asynchronously and you can poll its status or list all jobs.

Parameter	Type	Required	Description
model	string	Required	Base model ID. e.g. `llama-3.3-70b`, `deepseek-v3`, `qwen-2.5-72b`
training_file	string	Required	File ID returned from the upload endpoint.
hyperparameters	object	Optional	Training config: `epochs` (default 3), `learning_rate` (default auto), `batch_size` (default auto).
suffix	string	Optional	Custom suffix for the fine-tuned model name. Max 18 characters.

List Fine-Tuning Jobs

GET /v1/fine-tuning/jobs

Returns a list of all fine-tuning jobs for your account, including status, progress, and result model IDs.

Get Job Status

GET /v1/fine-tuning/jobs/{id}

Retrieve the current status and details of a specific fine-tuning job.

Cancel Job

POST /v1/fine-tuning/jobs/{id}/cancel

Cancel a running fine-tuning job. Jobs that have already completed cannot be cancelled.

Code Examples

import time

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

# 1. Upload training data (the with-block closes the file handle afterwards)
with open("training_data.jsonl", "rb") as f:
    training_file = client.files.create(
        file=f,
        purpose="fine-tune"
    )

# 2. Create fine-tuning job
job = client.fine_tuning.jobs.create(
    model="llama-3.3-70b",
    training_file=training_file.id,
    hyperparameters={
        "epochs": 3,
        "learning_rate": 1e-5,
        "batch_size": 4
    },
    suffix="astro-expert"
)

print(f"Job created: {job.id}, status: {job.status}")

# 3. Poll for completion
while job.status not in ["succeeded", "failed", "cancelled"]:
    time.sleep(30)
    job = client.fine_tuning.jobs.retrieve(job.id)
    print(f"Status: {job.status}")

# 4. Use the fine-tuned model
if job.status == "succeeded":
    response = client.chat.completions.create(
        model=job.fine_tuned_model,
        messages=[{"role": "user", "content": "Analyze my chart"}]
    )
    print(response.choices[0].message.content)

import Xalen from "xalen-sdk";
import fs from "fs";

const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });

// 1. Upload training data
const trainingFile = await client.files.create({
  file: fs.createReadStream("training_data.jsonl"),
  purpose: "fine-tune",
});

// 2. Create fine-tuning job
const job = await client.fineTuning.jobs.create({
  model: "llama-3.3-70b",
  training_file: trainingFile.id,
  hyperparameters: {
    epochs: 3,
    learning_rate: 1e-5,
    batch_size: 4,
  },
  suffix: "astro-expert",
});

console.log(`Job created: ${job.id}, status: ${job.status}`);

// 3. Poll for completion, keeping the latest job object
let current = job;
while (!["succeeded", "failed", "cancelled"].includes(current.status)) {
  await new Promise((r) => setTimeout(r, 30000));
  current = await client.fineTuning.jobs.retrieve(job.id);
  console.log(`Status: ${current.status}`);
}

// 4. Use the fine-tuned model
if (current.status === "succeeded") {
  const response = await client.chat.completions.create({
    model: current.fine_tuned_model,
    messages: [{ role: "user", content: "Analyze my chart" }],
  });
  console.log(response.choices[0].message.content);
}

Response

JSON Response

{
  "id": "ftjob-abc123",
  "object": "fine_tuning.job",
  "model": "llama-3.3-70b",
  "status": "running",
  "training_file": "file-xyz789",
  "hyperparameters": {
    "epochs": 3,
    "learning_rate": 1e-5,
    "batch_size": 4
  },
  "fine_tuned_model": null,
  "created_at": 1717200000,
  "finished_at": null,
  "trained_tokens": 45200
}

Fine-Tuning Pricing

Fine-tuning is billed per training token. Pricing varies by base model. Check the Pricing page for current per-token training rates. Inference on fine-tuned models is billed at the base model rate.

Dedicated Endpoints

Reserved GPU capacity for predictable performance. Dedicated endpoints give you isolated compute with no shared rate limits, built-in prompt caching, and support for hot-swapping LoRA adapters. Ideal for production workloads that demand consistent latency and throughput.

Hardware Options

Hardware	VRAM	Price	Best For
NVIDIA H100 80GB	80 GB	$3.49/GPU-hr	Large models, high throughput
NVIDIA A100 80GB	80 GB	$2.09/GPU-hr	Mid-range models
NVIDIA L40S 48GB	48 GB	$1.19/GPU-hr	Smaller models, cost-optimized
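The hourly rates above make cost estimates straightforward. The sketch below multiplies rate, replica count, and hours. It assumes one GPU per replica unless told otherwise, which may not hold for models that need several GPUs, and it uses the hardware identifiers accepted by the create-endpoint call. `endpoint_cost` is an illustrative helper.

```python
GPU_HOURLY_USD = {"h100-80gb": 3.49, "a100-80gb": 2.09, "l40s-48gb": 1.19}


def endpoint_cost(hardware: str, replicas: int, hours: float, gpus_per_replica: int = 1) -> float:
    """Estimated USD cost: hourly rate x GPUs per replica x replicas x hours."""
    if hardware not in GPU_HOURLY_USD:
        raise ValueError(f"unknown hardware: {hardware}")
    if replicas < 1 or gpus_per_replica < 1 or hours < 0:
        raise ValueError("replicas and gpus_per_replica must be >= 1, hours >= 0")
    return round(GPU_HOURLY_USD[hardware] * gpus_per_replica * replicas * hours, 2)


print(endpoint_cost("h100-80gb", replicas=1, hours=24))  # 83.76
```

With autoscaling, the replica count varies over time, so a realistic estimate sums this figure over each period at its own replica count.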

Features

Feature	Details
Prompt Caching	Up to 90% cost reduction on repeated prompt prefixes. Cached automatically on dedicated infrastructure.
Autoscaling	Scale from 1 to 16 replicas based on traffic. Configure min/max replicas and scaling thresholds.
LoRA Adapters	Hot-swap fine-tuned LoRA weights without redeploying. Attach multiple adapters to a single base model.
Custom Scaling Policies	Define scaling rules based on request queue depth, latency percentiles, or GPU utilization.
Isolated Rate Limits	Your endpoint is not affected by other tenants. Full throughput capacity is reserved for your traffic.

Create Dedicated Endpoint

POST /v1/endpoints

Parameter	Type	Required	Description
model	string	Required	Model to deploy. e.g. `llama-3.3-70b`, `deepseek-v3`
hardware	string	Required	GPU type: `h100-80gb`, `a100-80gb`, or `l40s-48gb`
min_replicas	integer	Optional	Minimum replicas. Default: `1`.
max_replicas	integer	Optional	Maximum replicas for autoscaling. Default: `1`.

List Endpoints

GET /v1/endpoints

Returns all dedicated endpoints for your account, including status, hardware, model, and current replica count.

Delete Endpoint

POST /v1/endpoints/{id}/delete

Shut down a dedicated endpoint. Billing stops when the endpoint is fully deprovisioned.

Code Examples

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

# Create a dedicated endpoint
endpoint = client.endpoints.create(
    model="llama-3.3-70b",
    hardware="h100-80gb",
    min_replicas=1,
    max_replicas=4
)

print(f"Endpoint: {endpoint.id}, status: {endpoint.status}")

# Use the dedicated endpoint for inference
response = client.chat.completions.create(
    model=endpoint.model,
    messages=[{"role": "user", "content": "Hello!"}],
    extra_headers={"X-Endpoint-Id": endpoint.id}
)

print(response.choices[0].message.content)

# List all endpoints
endpoints = client.endpoints.list()
for ep in endpoints.data:
    print(f"{ep.id}: {ep.model} on {ep.hardware} ({ep.status})")

import Xalen from "xalen-sdk";

const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });

// Create a dedicated endpoint
const endpoint = await client.endpoints.create({
  model: "llama-3.3-70b",
  hardware: "h100-80gb",
  min_replicas: 1,
  max_replicas: 4,
});

console.log(`Endpoint: ${endpoint.id}, status: ${endpoint.status}`);

// Use the dedicated endpoint for inference
const response = await client.chat.completions.create({
  model: endpoint.model,
  messages: [{ role: "user", content: "Hello!" }],
}, {
  headers: { "X-Endpoint-Id": endpoint.id },
});

console.log(response.choices[0].message.content);

// List all endpoints
const endpoints = await client.endpoints.list();
for (const ep of endpoints.data) {
  console.log(`${ep.id}: ${ep.model} on ${ep.hardware} (${ep.status})`);
}

Available on Scale Tier and Above

Dedicated endpoints require a Scale ($2,499/mo), Dedicated ($5,000+/mo), or Enterprise plan. Pay-as-you-go and Growth plans use shared serverless infrastructure.

Batch Processing

Process large workloads asynchronously at 50% lower cost. Submit up to 1,000 requests per batch, and XALEN processes them in the background with optimized throughput. Ideal for data labeling, content generation, bulk analysis, and any workload that does not require real-time responses.

Create Batch

POST /v1/batch

Parameter	Type	Required	Description
requests	array	Required	Array of up to 1,000 request objects. Each object has the same schema as a `/v1/chat/completions` request body.
model	string	Required	Model to use for all requests in the batch. e.g. `deepseek-v3`, `llama-3.3-70b`
metadata	object	Optional	Key-value metadata to attach to the batch for tracking.
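Workloads larger than the 1,000-request limit need to be split across several batches. The sketch below chunks a request list accordingly. `chunk_requests` is a hypothetical helper, and each chunk would be submitted as its own `/v1/batch` call.

```python
from typing import Iterator, List

MAX_BATCH_SIZE = 1000  # per-batch request limit from the table


def chunk_requests(requests: List[dict], size: int = MAX_BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` requests."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(requests), size):
        yield requests[start:start + size]


requests = [
    {"messages": [{"role": "user", "content": f"Summarize article {i}"}]}
    for i in range(2500)
]
print([len(chunk) for chunk in chunk_requests(requests)])  # [1000, 1000, 500]
```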

Get Batch Status

GET /v1/batch/{id}

Retrieve the current status and results of a batch job. When the batch completes, the response includes an array of all outputs.

Code Examples

from xalen import Xalen
import time

client = Xalen(api_key="xln_live_YOUR_KEY")

# Create a batch of requests
batch = client.batch.create(
    model="deepseek-v3",
    requests=[
        {"messages": [{"role": "user", "content": f"Summarize article {i}"}]}
        for i in range(100)
    ],
    metadata={"project": "content-pipeline"}
)

print(f"Batch {batch.id} created, status: {batch.status}")

# Poll for completion
while batch.status not in ["completed", "failed"]:
    time.sleep(10)
    batch = client.batch.retrieve(batch.id)
    print(f"Progress: {batch.completed_requests}/{batch.total_requests}")

# Process results
if batch.status == "completed":
    for result in batch.results:
        print(result.choices[0].message.content[:100])

import Xalen from "xalen-sdk";

const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });

// Create a batch of requests
const requests = Array.from({ length: 100 }, (_, i) => ({
  messages: [{ role: "user", content: `Summarize article ${i}` }],
}));

const batch = await client.batch.create({
  model: "deepseek-v3",
  requests,
  metadata: { project: "content-pipeline" },
});

console.log(`Batch ${batch.id} created, status: ${batch.status}`);

// Poll for completion, keeping the most recent batch object
let current = batch;
while (!["completed", "failed"].includes(current.status)) {
  await new Promise((r) => setTimeout(r, 10000));
  current = await client.batch.retrieve(batch.id);
  console.log(`Progress: ${current.completed_requests}/${current.total_requests}`);
}

// Process results from the last poll (no extra retrieve call needed)
if (current.status === "completed") {
  for (const result of current.results) {
    console.log(result.choices[0].message.content.slice(0, 100));
  }
}

Batch Pricing

All batch requests are billed at 50% of standard per-token rates. The same model, the same quality, half the cost. Batches typically complete within 1-6 hours depending on size and model.

Evaluations

Automated model evaluation using LLM-as-a-Judge. Compare models head-to-head, score outputs against custom criteria, and track quality over time. Evaluations run server-side so you can benchmark without writing scoring infrastructure.

Evaluation Types

Type	Description	Output
Classification	Judge assigns a discrete label to each response (e.g. "accurate", "inaccurate", "partial").	Label + reasoning
Scoring	Judge rates each response on a numeric scale (e.g. 1-5 for relevance, accuracy, helpfulness).	Score + reasoning
Comparison	Judge selects the better response from two model outputs for the same prompt.	Winner + reasoning

Supported Judge Models

Any chat model available on XALEN can serve as a judge. Recommended judges for high-quality evaluations:

Model	Strengths
Llama 3.3 70B	Strong reasoning, good at nuanced scoring. Excellent general-purpose judge.
DeepSeek V3	High accuracy on technical and domain-specific evaluations.
Qwen 2.5 72B	Multilingual evaluation strength. Good for non-English content scoring.

Create Evaluation

POST /v1/evaluations

Parameter	Type	Required	Description
type	string	Required	Evaluation type: `classification`, `scoring`, or `comparison`.
judge_model	string	Required	Model ID for the judge. e.g. `llama-3.3-70b`
test_cases	array	Required	Array of test case objects. Each has `input` (messages array) and `output` (string or array of strings for comparison).
criteria	string	Optional	Custom evaluation criteria for the judge. e.g. "Rate accuracy of astrological predictions on a 1-5 scale."
scale	object	Optional	For scoring type: `min` and `max` values. Default: `{"min": 1, "max": 5}`.
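
The shape of `output` depends on the evaluation type, so it can help to check test cases locally before submitting them. The following is a minimal sketch, not SDK code: `validate_test_case` is a hypothetical helper, and the rule that a comparison case carries exactly two outputs is an assumption drawn from the Comparison row of the types table.

```python
def validate_test_case(eval_type: str, case: dict) -> None:
    """Raise ValueError if `case` does not match the documented shape for `eval_type`."""
    if eval_type not in {"classification", "scoring", "comparison"}:
        raise ValueError(f"unknown evaluation type: {eval_type!r}")

    messages = case.get("input")
    if not isinstance(messages, list) or not messages:
        raise ValueError("`input` must be a non-empty messages array")

    output = case.get("output")
    if eval_type == "comparison":
        # Assumption: the judge compares exactly two candidate outputs
        is_pair = (
            isinstance(output, list)
            and len(output) == 2
            and all(isinstance(o, str) for o in output)
        )
        if not is_pair:
            raise ValueError("comparison `output` must be a list of two strings")
    elif not isinstance(output, str):
        raise ValueError(f"{eval_type} `output` must be a string")


question = [{"role": "user", "content": "What is Ketu in astrology?"}]
validate_test_case("scoring", {"input": question, "output": "Ketu is the south node..."})
validate_test_case("comparison", {"input": question, "output": ["Answer A", "Answer B"]})
```

Both calls return without raising. Passing a single string as the `output` of a comparison case raises `ValueError`.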

Get Evaluation Results

GET /v1/evaluations/{id}

Retrieve the results of a completed evaluation, including per-test-case scores and aggregate metrics.

Code Examples

Python

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

# Score model outputs on a 1-5 scale
evaluation = client.evaluations.create(
    type="scoring",
    judge_model="llama-3.3-70b",
    criteria="Rate the accuracy and helpfulness of each response on a 1-5 scale.",
    scale={"min": 1, "max": 5},
    test_cases=[
        {
            "input": [{"role": "user", "content": "What is Ketu in astrology?"}],
            "output": "Ketu is the south node of the Moon..."
        },
        {
            "input": [{"role": "user", "content": "Explain Venus Mahadasha."}],
            "output": "Venus Mahadasha lasts 20 years..."
        }
    ]
)

# Retrieve results
results = client.evaluations.retrieve(evaluation.id)
print(f"Average score: {results.aggregate.mean_score}")
for case in results.results:
    print(f"Score: {case.score}, Reasoning: {case.reasoning[:80]}")

JavaScript

import Xalen from "xalen-sdk";

const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });

// Compare two model outputs head-to-head
const evaluation = await client.evaluations.create({
  type: "comparison",
  judge_model: "llama-3.3-70b",
  criteria: "Which response is more accurate and detailed?",
  test_cases: [
    {
      input: [{ role: "user", content: "What is Ketu in astrology?" }],
      output: [
        "Ketu is the south node of the Moon in Vedic astrology...",
        "Ketu represents past karma and spiritual liberation..."
      ],
    },
  ],
});

// Retrieve results
const results = await client.evaluations.retrieve(evaluation.id);
for (const r of results.results) {
  console.log(`Winner: Output ${r.winner}, Reasoning: ${r.reasoning}`);
}

GPU Clusters

On-demand GPU clusters for training, large batch jobs, and custom workloads. Self-service provisioning via API or dashboard. Spin up multi-node clusters with high-bandwidth interconnects, run your training jobs, and tear down when finished.

Available GPUs

GPU	VRAM	Interconnect	Price
NVIDIA H100 SXM	80 GB	NVLink 900 GB/s	$3.49/GPU-hr
NVIDIA B200	192 GB	NVLink 1800 GB/s	$5.99/GPU-hr
NVIDIA A100 SXM	80 GB	NVLink 600 GB/s	$2.09/GPU-hr
NVIDIA L40S	48 GB	PCIe Gen4	$1.19/GPU-hr
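
Since pricing is per GPU-hour, a cluster's cost scales with both GPU count and runtime. The sketch below turns the table into a rough estimator. `estimate_cluster_cost` is an illustrative helper, the keys reuse the `gpu_type` identifiers from the Create Cluster parameters, and the calculation ignores billing granularity, storage, and any reserved-capacity discounts.

```python
from decimal import Decimal

# USD per GPU-hour, copied from the table above
GPU_HOURLY_PRICE = {
    "h100-sxm": Decimal("3.49"),
    "b200": Decimal("5.99"),
    "a100-sxm": Decimal("2.09"),
    "l40s": Decimal("1.19"),
}


def estimate_cluster_cost(gpu_type: str, gpu_count: int, hours: float) -> Decimal:
    """Estimated cost in USD: price per GPU-hour x GPUs x hours, rounded to cents."""
    price = GPU_HOURLY_PRICE[gpu_type]
    total = price * gpu_count * Decimal(str(hours))
    return total.quantize(Decimal("0.01"))


# An 8xH100 cluster running for 24 hours
print(estimate_cluster_cost("h100-sxm", 8, 24))  # 670.08
```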

Features

Feature	Details
Cluster Management	Slurm and Kubernetes orchestration. Choose your preferred scheduler at provisioning time.
Persistent Storage	NVMe SSD storage attached to every node. Data persists across job restarts within the cluster lifetime.
Multi-Region	Clusters available in US, Europe, and Asia Pacific. Select region at creation time for data residency compliance.
Health Monitoring	Real-time GPU utilization, memory, temperature, and error metrics via API and dashboard.
SSH Access	Direct SSH access to cluster nodes for custom setup, debugging, and environment configuration.

Create Cluster

POST /v1/clusters

Parameter	Type	Required	Description
gpu_type	string	Required	GPU model: `h100-sxm`, `b200`, `a100-sxm`, or `l40s`
gpu_count	integer	Required	Number of GPUs. Must be a multiple of 8 for H100/B200/A100.
region	string	Optional	Deployment region: `us-east`, `us-west`, `eu-west`, `ap-south`. Default: `us-east`.
scheduler	string	Optional	Orchestration: `slurm` or `kubernetes`. Default: `kubernetes`.
max_runtime_hours	integer	Optional	Auto-terminate after this many hours. Default: no limit.
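
The constraints in this table can be checked on the client before a request is sent. The sketch below is one way to do that, assuming the allowed values are exactly those listed above. `build_cluster_request` is a hypothetical helper that only builds the request body and makes no network call.

```python
from typing import Optional

# Allowed values transcribed from the parameter table above
GPU_TYPES = {"h100-sxm", "b200", "a100-sxm", "l40s"}
MULTIPLE_OF_8_GPUS = {"h100-sxm", "b200", "a100-sxm"}
REGIONS = {"us-east", "us-west", "eu-west", "ap-south"}
SCHEDULERS = {"slurm", "kubernetes"}


def build_cluster_request(
    gpu_type: str,
    gpu_count: int,
    region: str = "us-east",
    scheduler: str = "kubernetes",
    max_runtime_hours: Optional[int] = None,
) -> dict:
    """Validate arguments locally and return a body for POST /v1/clusters."""
    if gpu_type not in GPU_TYPES:
        raise ValueError(f"unsupported gpu_type: {gpu_type!r}")
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    if gpu_type in MULTIPLE_OF_8_GPUS and gpu_count % 8 != 0:
        raise ValueError(f"gpu_count must be a multiple of 8 for {gpu_type}")
    if region not in REGIONS:
        raise ValueError(f"unsupported region: {region!r}")
    if scheduler not in SCHEDULERS:
        raise ValueError(f"unsupported scheduler: {scheduler!r}")

    body = {
        "gpu_type": gpu_type,
        "gpu_count": gpu_count,
        "region": region,
        "scheduler": scheduler,
    }
    # Omit the field entirely to keep the documented default of no limit
    if max_runtime_hours is not None:
        body["max_runtime_hours"] = max_runtime_hours
    return body


print(build_cluster_request("h100-sxm", 16, scheduler="slurm", max_runtime_hours=24))
```

Requesting 4 H100 GPUs raises `ValueError`, while 4 L40S GPUs pass, because the multiple-of-8 rule applies only to H100, B200, and A100.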

Get Cluster Status

GET /v1/clusters/{id}

Retrieve cluster status, node health, GPU utilization, and connection details (SSH host, Kubernetes config).

Delete Cluster

POST /v1/clusters/{id}/delete

Terminate a cluster and release all resources. Billing stops when deprovisioning completes.

Code Examples

Python

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

# Provision an 8xH100 cluster
cluster = client.clusters.create(
    gpu_type="h100-sxm",
    gpu_count=8,
    region="us-east",
    scheduler="kubernetes",
    max_runtime_hours=24
)

print(f"Cluster {cluster.id}: {cluster.status}")
print(f"SSH: {cluster.ssh_host}")
print(f"K8s config: {cluster.kubeconfig_url}")

# Monitor GPU utilization
status = client.clusters.retrieve(cluster.id)
for node in status.nodes:
    print(f"Node {node.id}: GPU util {node.gpu_utilization}%, "
          f"memory {node.gpu_memory_used_gb}/{node.gpu_memory_total_gb} GB")

# Tear down when done
client.clusters.delete(cluster.id)

JavaScript

import Xalen from "xalen-sdk";

const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });

// Provision an 8xH100 cluster
const cluster = await client.clusters.create({
  gpu_type: "h100-sxm",
  gpu_count: 8,
  region: "us-east",
  scheduler: "kubernetes",
  max_runtime_hours: 24,
});

console.log(`Cluster ${cluster.id}: ${cluster.status}`);
console.log(`SSH: ${cluster.ssh_host}`);
console.log(`K8s config: ${cluster.kubeconfig_url}`);

// Monitor GPU utilization
const status = await client.clusters.retrieve(cluster.id);
for (const node of status.nodes) {
  console.log(
    `Node ${node.id}: GPU util ${node.gpu_utilization}%, ` +
    `memory ${node.gpu_memory_used_gb}/${node.gpu_memory_total_gb} GB`
  );
}

// Tear down when done
await client.clusters.delete(cluster.id);

Available on Dedicated and Enterprise Tiers

GPU clusters require a Dedicated ($5,000+/mo) or Enterprise plan. Contact sales for custom cluster configurations and reserved capacity pricing.

Prompt Caching

Reduce costs and latency by caching repeated prompt prefixes. When you send the same system prompt, few-shot examples, or long context prefix across multiple requests, XALEN caches the prefix and reuses it. Cache reads cost up to 90% less than processing the same tokens from scratch.

How It Works

Step	Details
1. Cache Creation	On the first request, the prompt prefix is processed and cached. Cache creation costs the same as standard input tokens.
2. Cache Hit	Subsequent requests with an identical prefix hit the cache. Cached tokens are billed at a 90% discount to the standard input token rate.
3. Cache Eviction	Caches expire after a period of inactivity (typically 5-10 minutes). Frequently used prefixes stay cached for as long as requests keep hitting them within that window.

Per-Model Caching Support

Model	Serverless Cache	Dedicated Cache	Cache Discount
Claude Opus 4.7	Yes	N/A	90% on reads
Claude Sonnet 4.6	Yes	N/A	90% on reads
Claude Haiku 4.5	Yes	N/A	90% on reads
DeepSeek V3	No	Yes	90% on reads
Llama 3.3 70B	No	Yes	90% on reads
Qwen 2.5 72B	No	Yes	90% on reads

Claude models receive automatic caching through the inference provider. Other models require a dedicated endpoint for caching support.

Usage Tracking

When prompt caching is active, the response usage object includes two additional fields:

Field	Type	Description
cache_creation_tokens	integer	Number of tokens written to cache on this request. Billed at standard input rate.
cache_read_tokens	integer	Number of tokens read from cache. Billed at 90% discount.
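
These two fields make it possible to measure how often a workload actually benefits from caching. The sketch below aggregates them over several responses. It assumes each usage record is available as a plain dict (SDK response objects may expose the same fields as attributes), and `cache_hit_ratio` is an illustrative name.

```python
def cache_hit_ratio(usages: list[dict]) -> float:
    """Share of cache-related tokens that were read from cache rather than written."""
    created = sum(u.get("cache_creation_tokens", 0) for u in usages)
    read = sum(u.get("cache_read_tokens", 0) for u in usages)
    total = created + read
    return read / total if total else 0.0


# Hypothetical usage from three requests sharing one 2,000-token prefix
usages = [
    {"cache_creation_tokens": 2000, "cache_read_tokens": 0},  # first request writes the cache
    {"cache_creation_tokens": 0, "cache_read_tokens": 2000},  # later requests read it
    {"cache_creation_tokens": 0, "cache_read_tokens": 2000},
]
print(f"{cache_hit_ratio(usages):.0%} of cached-prefix tokens were cache reads")
```

A ratio that stays near zero suggests the prefix differs between requests or the cache expires before the next request arrives.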

Code Examples

Python

from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

# Long system prompt that stays the same across requests
system_prompt = """You are a Vedic astrology expert trained on classical texts
including BPHS, Phaladeepika, and Brihat Jataka. Provide detailed analysis
based on planetary positions, houses, aspects, and dasha periods.
Always cite the relevant classical reference for each interpretation.
...(2000+ tokens of instructions)..."""

# First request: creates cache (billed at standard rate)
r1 = client.chat.completions.create(
    model="claude-sonnet-4.6",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Analyze Saturn in the 10th house."}
    ]
)
print(f"Cache created: {r1.usage.cache_creation_tokens} tokens")

# Second request: hits cache (90% cheaper on cached prefix)
r2 = client.chat.completions.create(
    model="claude-sonnet-4.6",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "What does Jupiter in the 5th house mean?"}
    ]
)
print(f"Cache read: {r2.usage.cache_read_tokens} tokens (90% discount)")

JavaScript

import Xalen from "xalen-sdk";

const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });

// Long system prompt that stays the same across requests
const systemPrompt = `You are a Vedic astrology expert trained on classical texts
including BPHS, Phaladeepika, and Brihat Jataka. Provide detailed analysis
based on planetary positions, houses, aspects, and dasha periods.
Always cite the relevant classical reference for each interpretation.
...(2000+ tokens of instructions)...`;

// First request: creates cache
const r1 = await client.chat.completions.create({
  model: "claude-sonnet-4.6",
  messages: [
    { role: "system", content: systemPrompt },
    { role: "user", content: "Analyze Saturn in the 10th house." },
  ],
});
console.log(`Cache created: ${r1.usage.cache_creation_tokens} tokens`);

// Second request: hits cache (90% cheaper on cached prefix)
const r2 = await client.chat.completions.create({
  model: "claude-sonnet-4.6",
  messages: [
    { role: "system", content: systemPrompt },
    { role: "user", content: "What does Jupiter in the 5th house mean?" },
  ],
});
console.log(`Cache read: ${r2.usage.cache_read_tokens} tokens (90% discount)`);

Maximize Cache Hits

Structure your prompts with the static content first (system instructions, few-shot examples, reference data) and dynamic content last (user query). The longer the shared prefix, the greater the cost savings. A 2,000-token cached prefix saves roughly $0.027 per request on Claude Sonnet 4.6.
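
The ordering advice and the savings arithmetic can both be sketched in a few lines. Everything here is illustrative: `build_messages` and `cache_savings_per_request` are hypothetical helpers, and the input rate is a parameter rather than a published XALEN price. Under this formula, the $0.027 figure corresponds to a standard input rate of $15 per million tokens. That rate is inferred from the arithmetic, so check the pricing page for the actual rate of your model.

```python
def build_messages(system_prompt: str, few_shot: list[dict], user_query: str) -> list[dict]:
    """Static content first (system prompt, few-shot examples), dynamic query last."""
    return [
        {"role": "system", "content": system_prompt},
        *few_shot,
        {"role": "user", "content": user_query},
    ]


def cache_savings_per_request(
    cached_tokens: int, input_rate_per_million: float, discount: float = 0.90
) -> float:
    """USD saved on one request when `cached_tokens` are read from cache."""
    return cached_tokens / 1_000_000 * input_rate_per_million * discount


few_shot = [
    {"role": "user", "content": "Example question"},
    {"role": "assistant", "content": "Example answer"},
]
a = build_messages("Long static instructions...", few_shot, "Analyze Saturn in the 10th house.")
b = build_messages("Long static instructions...", few_shot, "What does Jupiter in the 5th house mean?")

# Everything except the final user message is identical, so it can be served from cache
print(a[:-1] == b[:-1])

# Assumed rate of $15 per million input tokens (not a published price)
print(round(cache_savings_per_request(2000, 15.0), 4))
```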

Model Feature Matrix

Not every model supports every feature. Use this matrix to find the right model for your use case. Features marked "Dedicated" are available only when using a dedicated endpoint.

Model	Chat	Vision	Streaming	Fine-tune	Batch	Dedicated	Caching	Functions
Claude Opus 4.7	Yes	Yes	Yes	No	No	No	Yes	Yes
Claude Sonnet 4.6	Yes	Yes	Yes	No	No	No	Yes	Yes
Claude Haiku 4.5	Yes	No	Yes	No	No	No	Yes	Yes
DeepSeek V3	Yes	No	Yes	Yes	Yes	Yes	Dedicated	Yes
DeepSeek R1	Yes	No	Yes	Yes	Yes	Yes	Dedicated	No
Llama 3.3 70B	Yes	No	Yes	Yes	Yes	Yes	Dedicated	Yes
Llama 4 Scout	Yes	Yes	Yes	LoRA	Yes	Yes	Dedicated	Yes
Qwen 2.5 72B	Yes	No	Yes	Yes	Yes	Yes	Dedicated	Yes
Qwen3 235B	Yes	No	Yes	LoRA	Yes	Yes	Dedicated	Yes
GPT-OSS 120B	Yes	No	Yes	No	Yes	Yes	Dedicated	Yes
Vedika Standard	Yes	No	Yes	No	No	No	No	No
Vedika Fast	Yes	No	Yes	No	No	No	No	No

Legend

Yes = fully supported on serverless. Dedicated = available only on dedicated endpoints. LoRA = LoRA adapter fine-tuning only (not full-parameter). No = not available for this model. Use GET /v1/models for the latest capabilities.
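
To pick a model programmatically, the matrix can be treated as a lookup table. The sketch below transcribes the rows above into a Python dict and filters on it. It is a static snapshot keyed by display name, `models_supporting` is an illustrative helper, and counting "LoRA" as available is a choice made for this example.

```python
FEATURES = ("chat", "vision", "streaming", "fine_tune", "batch",
            "dedicated", "caching", "functions")

# One row per model, in FEATURES order, transcribed from the matrix above
_ROWS = {
    "Claude Opus 4.7":   "Yes Yes Yes No No No Yes Yes",
    "Claude Sonnet 4.6": "Yes Yes Yes No No No Yes Yes",
    "Claude Haiku 4.5":  "Yes No Yes No No No Yes Yes",
    "DeepSeek V3":       "Yes No Yes Yes Yes Yes Dedicated Yes",
    "DeepSeek R1":       "Yes No Yes Yes Yes Yes Dedicated No",
    "Llama 3.3 70B":     "Yes No Yes Yes Yes Yes Dedicated Yes",
    "Llama 4 Scout":     "Yes Yes Yes LoRA Yes Yes Dedicated Yes",
    "Qwen 2.5 72B":      "Yes No Yes Yes Yes Yes Dedicated Yes",
    "Qwen3 235B":        "Yes No Yes LoRA Yes Yes Dedicated Yes",
    "GPT-OSS 120B":      "Yes No Yes No Yes Yes Dedicated Yes",
    "Vedika Standard":   "Yes No Yes No No No No No",
    "Vedika Fast":       "Yes No Yes No No No No No",
}
MATRIX = {model: dict(zip(FEATURES, row.split())) for model, row in _ROWS.items()}


def models_supporting(*features: str, allow_dedicated: bool = False) -> list[str]:
    """Models where every requested feature is available.

    "Yes" and "LoRA" count as available; "Dedicated" counts only when
    allow_dedicated is True.
    """
    accepted = {"Yes", "LoRA"}
    if allow_dedicated:
        accepted.add("Dedicated")
    return [
        model for model, caps in MATRIX.items()
        if all(caps[f] in accepted for f in features)
    ]


print(models_supporting("vision", "functions"))
# ['Claude Opus 4.7', 'Claude Sonnet 4.6', 'Llama 4 Scout']
print(models_supporting("batch", "caching"))  # [] on serverless
```

Passing `allow_dedicated=True` to the second call returns the seven models that combine batch support with dedicated-endpoint caching.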

SDKs

Official client libraries for Python and JavaScript/TypeScript, plus an MCP server for AI coding assistants.

Python

pip install xalen

from xalen import Xalen

# Uses XALEN_API_KEY env var by default
client = Xalen()

# Or pass explicitly
client = Xalen(api_key="xln_live_YOUR_KEY")

# All OpenAI-compatible methods work
response = client.chat.completions.create(
    model="vedika-standard",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript / TypeScript

npm install xalen-sdk

import Xalen from "xalen-sdk";

const client = new Xalen({ apiKey: process.env.XALEN_API_KEY });

const response = await client.chat.completions.create({
  model: "vedika-standard",
  messages: [{ role: "user", content: "Hello" }],
});

MCP Server

Connect XALEN to AI coding assistants like Claude Code, Cursor, and GitHub Copilot using the Model Context Protocol.

npx xalen-mcp

The MCP server exposes 15 tools covering chat, embeddings, image generation, voice, astrology, and billing. See the npm package for configuration details.

Mobile

The XALEN REST API works from any platform, including React Native, Flutter, Swift, and Kotlin. Since the API is OpenAI-compatible, any existing OpenAI mobile wrapper or HTTP client works out of the box.

React Native

// Standard fetch from React Native; no SDK required.
// Do not ship a production key in the app bundle: route requests through a backend proxy.
const response = await fetch('https://api.xalen.io/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer xln_live_YOUR_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'vedika-standard',
    messages: [{ role: 'user', content: 'Hello' }],
  }),
});
const data = await response.json();

Flutter

// Flutter SDK (pub.dev: xalen ^0.1.0)
import 'package:xalen/xalen.dart';

final client = XALEN(apiKey: 'xln_live_YOUR_KEY');

final response = await client.chatCompletion(
  model: 'vedika-standard',
  messages: [ChatMessage(role: 'user', content: 'Hello')],
);
print(response.choices.first.message.content);

Native SDKs Available

React hooks: npm install xalen-react xalen-sdk. Includes useChat, useCompletion, useVoice, and useAstrology hooks with an XALENProvider context.

Flutter/Dart: add xalen: ^0.1.0 to pubspec.yaml. Typed client with chat completions, astrology endpoints, and error handling.

Offline caching and an Expo module are coming soon.

Errors

XALEN uses standard HTTP status codes. All error responses include a JSON body with error.type and error.message.

Status	Type	Description
400	invalid_request	Missing or invalid parameters.
401	authentication_error	Invalid or missing API key.
403	permission_denied	Key lacks permission for this endpoint.
404	not_found	Endpoint or resource not found.
429	rate_limit_exceeded	Too many requests. Retry after the interval given in the `Retry-After` header.
500	server_error	Internal error. Contact support if persistent.
503	service_unavailable	Temporary overload. Retry with exponential backoff.

Error Response Format

JSON Error

{
  "error": {
    "type": "authentication_error",
    "message": "Invalid API key. Keys must start with xln_live_.",
    "status": 401
  }
}
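
When calling the REST API without an SDK, the error body can be parsed and classified in one step. The sketch below is a minimal example. `parse_error` is a hypothetical helper, and treating only 429 and 503 as retryable follows the retry guidance in this section rather than any SDK behavior.

```python
import json

RETRYABLE_STATUSES = {429, 503}  # statuses this section says to retry


def parse_error(body: str) -> dict:
    """Extract type, message, and status from an error body and flag retryable errors."""
    err = json.loads(body)["error"]
    status = err.get("status")
    return {
        "type": err["type"],
        "message": err["message"],
        "status": status,
        "retryable": status in RETRYABLE_STATUSES,
    }


sample = """
{
  "error": {
    "type": "authentication_error",
    "message": "Invalid API key. Keys must start with xln_live_.",
    "status": 401
  }
}
"""
info = parse_error(sample)
print(info["type"], info["retryable"])  # authentication_error False
```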

Retry Strategy

For 429 and 503 errors, use exponential backoff starting at 1 second. The SDKs handle this automatically with up to 3 retries.
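
If you implement retries yourself, the wait time can combine the `Retry-After` header with exponential backoff. The sketch below is one possible policy, not the SDKs' internal logic. `retry_delay` is an illustrative name, the 60-second cap is an assumption, and the header is assumed to carry a number of seconds.

```python
import math
from typing import Optional


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    A usable numeric Retry-After value wins; otherwise the delay doubles
    each attempt starting at `base`, up to `cap`.
    """
    if retry_after is not None:
        try:
            seconds = float(retry_after)
        except ValueError:
            seconds = math.nan  # not numeric: fall through to backoff
        if math.isfinite(seconds) and seconds >= 0:
            return seconds
    return min(base * 2 ** attempt, cap)


print([retry_delay(a) for a in range(3)])  # [1.0, 2.0, 4.0]
print(retry_delay(0, retry_after="7"))     # 7.0
```

Adding a small random jitter to the computed delay is a common refinement when many clients retry at once.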

Production Retry Examples

For production workloads, implement retry logic with exponential backoff and proper error classification. These examples handle rate limits, transient server errors, and hard failures differently.

Python

import time

import xalen
from xalen import Xalen

client = Xalen(api_key="xln_live_YOUR_KEY")

def query_with_retry(prompt, max_retries=3):
    """Retry rate limits and 5xx errors with exponential backoff; re-raise anything else."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="vedika-pro-ultra",
                messages=[{"role": "user", "content": prompt}]
            )
        except xalen.RateLimitError:
            # 429: wait 1s, 2s, 4s, ... capped at 60s
            time.sleep(min(2 ** attempt, 60))
        except xalen.APIError as e:
            # 4xx errors are hard failures; only 5xx responses are retried
            if e.status_code < 500:
                raise
            time.sleep(min(2 ** attempt, 60))
    raise RuntimeError("Max retries exceeded")

JavaScript

import Xalen from 'xalen-sdk';

const client = new Xalen({ apiKey: 'xln_live_YOUR_KEY' });

async function queryWithRetry(prompt, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await client.chat.completions.create({
        model: 'vedika-pro-ultra',
        messages: [{ role: 'user', content: prompt }]
      });
    } catch (err) {
      if (err.status === 429 || err.status >= 500) {
        const wait = Math.min(Math.pow(2, attempt) * 1000, 60000);
        await new Promise(r => setTimeout(r, wait));
        continue;
      }
      throw err;
    }
  }
  throw new Error('Max retries exceeded');
}

Rate Limits

Rate limits are applied per API key and vary by plan. Limits are returned in response headers.

Plan	RPM	TPM
Pay-as-you-go	60 RPM	100K TPM
Growth	300 RPM	500K TPM
Scale	1,000 RPM	2M TPM
Dedicated	5,000 RPM	10M TPM
Enterprise	Custom	Custom

Rate Limit Headers

Header	Description
x-ratelimit-limit-requests	Maximum requests per minute for your plan.
x-ratelimit-remaining-requests	Remaining requests in the current window.
x-ratelimit-reset-requests	Seconds until the request limit resets.
x-ratelimit-limit-tokens	Maximum tokens per minute for your plan.
x-ratelimit-remaining-tokens	Remaining tokens in the current window.
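
A client can read these headers after each response and pause before it runs into a 429. The sketch below shows the idea, assuming the header values are plain numbers as described in the table. `seconds_to_wait` is an illustrative helper and works on any mapping of header names to strings.

```python
def seconds_to_wait(headers: dict[str, str]) -> float:
    """Return how long to pause before the next request, or 0.0 if requests remain."""
    # Header names are case-insensitive, so normalize before lookup
    h = {name.lower(): value for name, value in headers.items()}
    remaining = int(h.get("x-ratelimit-remaining-requests", "1"))
    if remaining > 0:
        return 0.0
    return float(h.get("x-ratelimit-reset-requests", "0"))


# Hypothetical headers from a response on the Pay-as-you-go plan
exhausted = {
    "X-RateLimit-Limit-Requests": "60",
    "X-RateLimit-Remaining-Requests": "0",
    "X-RateLimit-Reset-Requests": "12",
}
print(seconds_to_wait(exhausted))  # 12.0
```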

API Versioning & Deprecation

XALEN uses URL-path versioning to guarantee backward compatibility. The current stable version is v1, and all endpoints are prefixed with /v1/.

Versioning Scheme

Aspect	Policy
Current version	`v1` (stable)
Version in URL	Major version in path: `/v1/`, `/v2/`
Backward compatibility	`v1` will be supported for a minimum of 24 months after `v2` launches
Breaking changes	No breaking changes to existing endpoints without a version bump

Deprecation Notice Periods

When an API version or model is deprecated, you will receive advance notice based on your plan tier.

Plan	API Deprecation	Model Deprecation
Enterprise	90 days	60 days + migration guide
Scale	60 days	60 days + migration guide
Growth	30 days	60 days + migration guide
Pay-as-you-go	30 days	60 days + migration guide

Sunset Headers

Deprecated endpoints include a Sunset header indicating the final date of availability, plus a Deprecation header with the date the deprecation was announced.

Deprecation Headers

Sunset: Mon, 01 Nov 2027 00:00:00 GMT
Deprecation: Sun, 01 Aug 2027 00:00:00 GMT
Link: <https://xalen.io/docs#versioning>; rel="sunset"
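
Clients can watch for the Sunset header and warn well before an endpoint disappears. The sketch below parses the HTTP-date with Python's standard library and reports the days remaining. `days_until_sunset` is an illustrative helper, and the `now` parameter exists only to make the example reproducible.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def days_until_sunset(headers: dict[str, str], now: Optional[datetime] = None) -> Optional[int]:
    """Whole days until the Sunset date, or None if the header is absent."""
    h = {name.lower(): value for name, value in headers.items()}
    sunset = h.get("sunset")
    if sunset is None:
        return None
    sunset_at = parsedate_to_datetime(sunset)  # timezone-aware for "GMT" dates
    current = now if now is not None else datetime.now(timezone.utc)
    return (sunset_at - current).days


headers = {"Sunset": "Mon, 01 Nov 2027 00:00:00 GMT"}
announced = datetime(2027, 8, 1, tzinfo=timezone.utc)
print(days_until_sunset(headers, now=announced))  # 92
```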

Model Deprecation

When a model is deprecated, all notices include an equivalent model recommendation so you can migrate with minimal code changes. The deprecated model continues to function for the full 60-day notice period.

Data Portability & Export

XALEN is built on open standards. Your data, your code, your choice of provider. No lock-in, no proprietary formats, no exit fees.

Open Standards

Aspect	Details
Response format	All API responses use standard JSON. No proprietary serialization or binary formats.
OpenAI compatibility	Drop-in compatible with any OpenAI-compatible provider. Migrate to or from XALEN by changing the base URL and API key.
Query language	Standard REST API. No proprietary query language or DSL to learn.
SDKs	Open-source Python and JavaScript SDKs. Portable, no vendor-specific runtime dependencies.

Data Export

GET /v1/account/export

Export all account data as a single JSON archive. Includes usage history, billing records, API key metadata, and account settings.

JSON Response

{
  "account": { "email": "user@example.com", "plan": "growth", "created": "2026-01-15" },
  "usage": [ { "date": "2026-05-14", "requests": 1420, "tokens": 892300, "cost_cents": 178 } ],
  "billing": [ { "id": "pay_abc123", "amount": 5000, "date": "2026-05-01", "status": "completed" } ],
  "api_keys": [ { "id": "key_1", "prefix": "xln_live_a3f...", "created": "2026-01-15", "last_used": "2026-05-14" } ]
}
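
Because the archive is plain JSON, it can be processed with standard tooling. The sketch below totals the `usage` entries from a parsed archive. `summarize_export` is an illustrative helper, and the assumption that `cost_cents` is in US cents is based only on the field name and the dollar-denominated plans.

```python
def summarize_export(archive: dict) -> dict:
    """Total requests, tokens, and cost across all daily usage entries."""
    usage = archive.get("usage", [])
    return {
        "days": len(usage),
        "requests": sum(day["requests"] for day in usage),
        "tokens": sum(day["tokens"] for day in usage),
        # Assumption: cost_cents is denominated in US cents
        "cost_usd": sum(day["cost_cents"] for day in usage) / 100,
    }


# Usage entry taken from the sample response above
archive = {
    "usage": [
        {"date": "2026-05-14", "requests": 1420, "tokens": 892300, "cost_cents": 178},
    ],
}
print(summarize_export(archive))
# {'days': 1, 'requests': 1420, 'tokens': 892300, 'cost_usd': 1.78}
```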

Code Ownership

Code generated through XALEN Studio is 100% customer-owned. Download your projects as a ZIP archive at any time. No licensing restrictions, no attribution requirements, no runtime dependency on XALEN infrastructure.

Regulatory Compliance

XALEN supports data portability rights under GDPR Article 20 and India's DPDPA. Submit export requests via the dashboard, or contact support by email for assisted exports.