Switching from OpenAI, Anthropic, or OpenRouter?
XALEN is OpenAI-compatible. Change one line of code:
# Before: base_url = "https://api.openai.com/v1"
# After:
base_url = "https://api.xalen.io/v1"
Your existing SDK code works unchanged: same request format, same streaming, same function calling. You also get 200+ models, 13 pre-built agents, the Studio website builder, and a custom agent builder.
Authentication
All API requests require a Bearer token. Include your API key in the Authorization header of every request.
Authorization: Bearer xln_live_YOUR_API_KEY
API keys start with xln_live_. Generate one from your Dashboard after signing up.
Never expose API keys in client-side code or public repositories. Use environment variables or a backend proxy.
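A minimal sketch of the environment-variable approach, using only the Python standard library. The variable name `XALEN_API_KEY` and both helper functions are assumptions made for illustration and are not part of the XALEN SDK; the `xln_live_` prefix check follows the key format described above.

```python
import os

KEY_PREFIX = "xln_live_"  # key format stated in the Authentication section


def load_api_key(env=os.environ, var_name="XALEN_API_KEY"):
    """Read the API key from the environment instead of hard-coding it.

    `XALEN_API_KEY` is a hypothetical variable name; use whatever your
    deployment defines.
    """
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    if not key.startswith(KEY_PREFIX):
        raise RuntimeError(f"{var_name} does not look like a XALEN key")
    return key


def auth_headers(api_key):
    """Build the Authorization header expected by every request."""
    return {"Authorization": f"Bearer {api_key}"}
```

Passing the environment as a parameter keeps the helper easy to test without touching real credentials.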
Base URL
All endpoints are served from a single base URL. Append the endpoint path to this URL for every request.
https://api.xalen.io
For example, to call Chat Completions: POST https://api.xalen.io/v1/chat/completions
Quick Start
Make your first API call in under 60 seconds. Install an SDK or use cURL directly.
pip install xalen
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
response = client.chat.completions.create(
model="vedika-standard",
messages=[{"role": "user", "content": "Hello, world!"}]
)
print(response.choices[0].message.content)
npm install xalen-sdk
import Xalen from "xalen-sdk";
const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });
const response = await client.chat.completions.create({
model: "vedika-standard",
messages: [{ role: "user", content: "Hello, world!" }],
});
console.log(response.choices[0].message.content);
curl https://api.xalen.io/v1/chat/completions \
-H "Authorization: Bearer xln_live_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "vedika-standard",
"messages": [{"role": "user", "content": "Hello, world!"}]
}'
Chat Completions
Generate a model response for a conversation. Compatible with the OpenAI Chat Completions API format, so existing OpenAI SDK code works by changing only the base URL and API key.
Claude models are served through the same /v1/chat/completions endpoint. No code changes are needed: just set model to claude-opus-4.7, claude-sonnet-4.6, etc.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Model ID. e.g. vedika-standard, claude-sonnet-4.6, claude-opus-4.7, llama-4-maverick |
| messages | array | Required | Array of message objects with role (system, user, assistant) and content. |
| temperature | number | Optional | Sampling temperature between 0 and 2. Default: 1. |
| max_tokens | integer | Optional | Maximum tokens to generate. Default: model-specific. |
| stream | boolean | Optional | Stream partial responses as Server-Sent Events. Default: false. |
| top_p | number | Optional | Nucleus sampling threshold. Default: 1. |
| stop | string or array | Optional | Up to 4 sequences where the model will stop generating. |
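When `stream` is true, the response arrives as Server-Sent Events rather than a single JSON body. The sketch below shows one way to reassemble the text. It assumes XALEN emits OpenAI-style chunks (`data: {...}` lines carrying `choices[0].delta.content`, terminated by `data: [DONE]`), which the compatibility claim on this page suggests but does not spell out. `iter_stream_text` is a hypothetical helper, not an SDK function.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of SSE lines.

    Assumes OpenAI-style chunks: each event is a `data: {...}` line whose
    JSON carries `choices[0].delta.content`, and the stream ends with
    `data: [DONE]`.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text


# Canned lines standing in for a live HTTP response body.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": ", world!"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

In practice the SDK's own streaming iterator is likely the simpler choice; a parser like this is mainly useful when calling the endpoint with a plain HTTP client.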
Code Examples
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
response = client.chat.completions.create(
model="vedika-pro",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is my birth chart?"}
],
temperature=0.7,
max_tokens=1024
)
print(response.choices[0].message.content)
import Xalen from "xalen-sdk";
const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });
const response = await client.chat.completions.create({
model: "vedika-pro",
messages: [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "What is my birth chart?" },
],
temperature: 0.7,
max_tokens: 1024,
});
console.log(response.choices[0].message.content);
curl https://api.xalen.io/v1/chat/completions \
-H "Authorization: Bearer xln_live_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "vedika-pro",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is my birth chart?"}
],
"temperature": 0.7,
"max_tokens": 1024
}'
Response
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1717200000,
"model": "vedika-pro",
"choices": [{
"index": 0,
"message": {
"role": "assistant",
"content": "To generate your birth chart, I need your date, time, and place of birth..."
},
"finish_reason": "stop"
}],
"usage": {
"prompt_tokens": 28,
"completion_tokens": 52,
"total_tokens": 80
}
}
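For callers working with the raw JSON rather than SDK objects, the fields above can be pulled out as follows. This is a small sketch against the sample response shown here; `summarize_completion` is a hypothetical helper, and it assumes a `finish_reason` of `stop` means the model ended its reply naturally, as in the OpenAI format.

```python
import json

SAMPLE = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "vedika-pro",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "To generate your birth chart, I need your date, time, and place of birth..."},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 28, "completion_tokens": 52, "total_tokens": 80}
}
"""


def summarize_completion(body):
    """Extract the reply text, completion state, and token count."""
    data = json.loads(body)
    choice = data["choices"][0]
    usage = data["usage"]
    return {
        "text": choice["message"]["content"],
        "finished": choice["finish_reason"] == "stop",
        "total_tokens": usage["total_tokens"],
    }


summary = summarize_completion(SAMPLE)
print(summary["total_tokens"], summary["finished"])
```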
List Models
Returns a list of all available models. Use this to discover model IDs, capabilities, and pricing.
Code Examples
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
models = client.models.list()
for m in models.data:
    print(m.id, m.owned_by)
import Xalen from "xalen-sdk";
const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });
const models = await client.models.list();
models.data.forEach(m => console.log(m.id, m.owned_by));
curl https://api.xalen.io/v1/models \
-H "Authorization: Bearer xln_live_YOUR_KEY"
Response
{
"object": "list",
"data": [
{
"id": "vedika-standard",
"object": "model",
"owned_by": "xalen",
"permission": []
},
{
"id": "vedika-pro",
"object": "model",
"owned_by": "xalen",
"permission": []
},
{
"id": "claude-opus-4.7",
"object": "model",
"owned_by": "anthropic",
"permission": []
},
{
"id": "claude-sonnet-4.6",
"object": "model",
"owned_by": "anthropic",
"permission": []
}
]
}
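One common use of this response is grouping model IDs by provider. The sketch below works on the raw JSON shape shown above; `group_by_owner` is a hypothetical helper and the sample data simply mirrors the example response.

```python
from collections import defaultdict


def group_by_owner(models_response):
    """Map each `owned_by` value to the list of model IDs it provides."""
    grouped = defaultdict(list)
    for model in models_response["data"]:
        grouped[model["owned_by"]].append(model["id"])
    return dict(grouped)


sample = {
    "object": "list",
    "data": [
        {"id": "vedika-standard", "object": "model", "owned_by": "xalen"},
        {"id": "vedika-pro", "object": "model", "owned_by": "xalen"},
        {"id": "claude-opus-4.7", "object": "model", "owned_by": "anthropic"},
        {"id": "claude-sonnet-4.6", "object": "model", "owned_by": "anthropic"},
    ],
}
print(group_by_owner(sample))
```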
Embeddings
Generate vector embeddings for text input. Use for semantic search, clustering, or recommendation systems.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Embedding model ID. e.g. text-embedding-3-small |
| input | string or array | Required | Text to embed. Can be a single string or array of strings. |
| encoding_format | string | Optional | float (default) or base64. |
Code Examples
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
response = client.embeddings.create(
model="text-embedding-3-small",
input="Vedic astrology birth chart analysis"
)
print(len(response.data[0].embedding)) # 1536
import Xalen from "xalen-sdk";
const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });
const response = await client.embeddings.create({
model: "text-embedding-3-small",
input: "Vedic astrology birth chart analysis",
});
console.log(response.data[0].embedding.length); // 1536
curl https://api.xalen.io/v1/embeddings \
-H "Authorization: Bearer xln_live_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "text-embedding-3-small", "input": "Vedic astrology birth chart analysis"}'
Response
{
"object": "list",
"data": [{
"object": "embedding",
"index": 0,
"embedding": [0.0023, -0.0091, 0.0152, ...]
}],
"model": "text-embedding-3-small",
"usage": { "prompt_tokens": 6, "total_tokens": 6 }
}
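For semantic search, embeddings are typically compared with cosine similarity. The sketch below ranks documents against a query vector using only the standard library; the function names are illustrative, and the tiny two-dimensional vectors stand in for real embedding arrays returned in `data[i].embedding`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vec, v) for v in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for real embeddings.
print(rank_documents([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]))
```

For large collections a vector index is usually a better fit than this linear scan.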
Image Generation
Generate images from text prompts. Returns one or more image URLs or base64-encoded data.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | Required | Text description of the image to generate. |
| model | string | Optional | Image model ID. Default: platform default. |
| n | integer | Optional | Number of images. Default: 1. Max: 4. |
| size | string | Optional | 256x256, 512x512, or 1024x1024. Default: 1024x1024. |
| response_format | string | Optional | url (default) or b64_json. |
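The limits in this table can be checked client-side before a request is sent. The following is a sketch only: `build_image_request` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
ALLOWED_SIZES = {"256x256", "512x512", "1024x1024"}
ALLOWED_FORMATS = {"url", "b64_json"}
MAX_IMAGES = 4


def build_image_request(prompt, n=1, size="1024x1024",
                        response_format="url", model=None):
    """Validate arguments against the documented limits and build the body."""
    if not prompt:
        raise ValueError("prompt is required")
    if not 1 <= n <= MAX_IMAGES:
        raise ValueError(f"n must be between 1 and {MAX_IMAGES}")
    if size not in ALLOWED_SIZES:
        raise ValueError(f"size must be one of {sorted(ALLOWED_SIZES)}")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"response_format must be one of {sorted(ALLOWED_FORMATS)}")
    body = {"prompt": prompt, "n": n, "size": size,
            "response_format": response_format}
    if model is not None:
        body["model"] = model  # omitted so the platform default applies
    return body
```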
Code Examples
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
response = client.images.generate(
prompt="A serene Hindu temple at sunrise, watercolor style",
size="1024x1024"
)
print(response.data[0].url)
curl https://api.xalen.io/v1/images/generations \
-H "Authorization: Bearer xln_live_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"prompt": "A serene Hindu temple at sunrise, watercolor style", "size": "1024x1024"}'
Response
{
"created": 1717200000,
"data": [{
"url": "https://api.xalen.io/files/img-abc123.png"
}]
}
Text to Speech
Convert text to natural-sounding speech. Supports multiple voices and output formats.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | TTS model ID. e.g. tts-1, tts-1-hd |
| input | string | Required | Text to convert. Max 4096 characters. |
| voice | string | Required | Voice ID. Options: alloy, echo, fable, onyx, nova, shimmer |
| response_format | string | Optional | mp3 (default), opus, aac, flac, wav |
| speed | number | Optional | 0.25 to 4.0. Default: 1.0. |
Code Examples
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
response = client.audio.speech.create(
model="tts-1",
voice="nova",
input="Welcome to your daily horoscope reading."
)
with open("output.mp3", "wb") as f:
    f.write(response.content)
curl https://api.xalen.io/v1/audio/speech \
-H "Authorization: Bearer xln_live_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "tts-1", "voice": "nova", "input": "Welcome to your daily horoscope reading."}' \
--output output.mp3
Returns raw audio bytes in the requested format.
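Because `input` is capped at 4096 characters, longer texts need to be split and synthesized in several calls. The sketch below shows one possible splitting strategy that prefers sentence boundaries; `split_for_tts` is a hypothetical helper, and the sentence detection is deliberately simple.

```python
import re

MAX_TTS_CHARS = 4096  # input limit from the request table


def split_for_tts(text, limit=MAX_TTS_CHARS):
    """Split text into chunks of at most `limit` characters.

    Prefers sentence boundaries; a single sentence longer than the limit
    is hard-split.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as a separate speech request and the resulting audio segments concatenated in order.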
Speech to Text
Transcribe audio to text. Supports multiple languages including 14 Indian languages.
Request Body (multipart/form-data)
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | file | Required | Audio file (mp3, mp4, mpeg, mpga, m4a, wav, webm). Max 25 MB. |
| model | string | Required | Transcription model. e.g. whisper-1 |
| language | string | Optional | ISO-639-1 code. e.g. hi, ta, te, en |
| response_format | string | Optional | json (default), text, verbose_json |
Code Examples
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=f,
        language="hi"
    )
print(transcript.text)
curl https://api.xalen.io/v1/audio/transcriptions \
-H "Authorization: Bearer xln_live_YOUR_KEY" \
-F file=@audio.mp3 \
-F model=whisper-1 \
-F language=hi
Response
{
"text": "Transcribed text content here..."
}
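Uploads that violate the file constraints in the request table can be caught before any bytes are sent. This is a sketch under stated assumptions: `check_audio_upload` is a hypothetical helper, the extension check is only a proxy for the real container format, and "25 MB" is interpreted conservatively as 25,000,000 bytes.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}
MAX_BYTES = 25 * 1000 * 1000  # assumption: "25 MB" read as decimal megabytes


def check_audio_upload(filename, size_bytes):
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    return problems
```

For a file on disk, the size argument can come from `Path(filename).stat().st_size`.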
Voice AI
End-to-end voice conversation: send audio, get audio back. Combines speech recognition, AI reasoning, and text-to-speech in a single call. Supports 31 languages with sub-200ms latency.
Request Body (multipart/form-data)
| Parameter | Type | Required | Description |
|---|---|---|---|
| audio | file | Required | Audio input file (wav, mp3, webm, ogg). |
| language | string | Optional | ISO-639-1 code. Auto-detected if omitted. |
| voice | string | Optional | Response voice ID. Default: nova. |
| context | string | Optional | System prompt for the AI reasoning layer. |
| birth_details | object | Optional | For astrology queries: { "date": "1990-01-15", "time": "14:30", "place": "Mumbai" } |
Code Examples
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
with open("question.wav", "rb") as f:
    response = client.voice.binary(
        audio=f,
        language="hi",
        voice="nova"
    )
with open("answer.mp3", "wb") as f:
    f.write(response.audio)
curl https://api.xalen.io/v1/voice/binary \
-H "Authorization: Bearer xln_live_YOUR_KEY" \
-F audio=@question.wav \
-F language=hi \
-F voice=nova \
--output answer.mp3
Returns binary audio in mp3 format by default. The response includes an X-Transcript header with the text transcription and an X-Response-Text header with the AI's text reply.
Astrology AI Query
Ask any astrology question in natural language. Use model: "vedika-standard" or model: "vedika-fast" in the standard Chat Completions endpoint. The Vedika engine handles birth chart computation, classical text grounding, RAG retrieval, and multi-language response generation automatically.
Example
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
response = client.chat.completions.create(
model="vedika-standard",
messages=[
{"role": "user", "content": "I was born on 15 Jan 1990 at 2:30 PM in Pune. What is my current Mahadasha and its effects?"}
]
)
print(response.choices[0].message.content)
# Includes: grounded answer, classical citations, follow-up suggestions
curl -X POST "https://api.xalen.io/v1/chat/completions" \
-H "Authorization: Bearer xln_live_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "vedika-standard",
"messages": [{"role": "user", "content": "What is Rahu in the 7th house according to BPHS?"}]
}'
The Vedika AI engine supports: birth chart analysis, dasha predictions, transit effects, compatibility matching, muhurta selection, panchang queries, yoga identification, and remedial suggestions, all through natural language conversation.
Structured Astrology Data
For structured JSON endpoints, use the Vedika API directly.
If you need raw structured data (birth charts, planetary positions, panchang, dasha timelines, divisional charts D1-D60, yoga calculations, compatibility scores), the Vedika API provides 130+ computation endpoints with structured JSON responses.
When to use XALEN vs Vedika: Use XALEN's /v1/chat/completions with vedika-standard for natural language AI queries with grounding and citations. Use Vedika's structured API directly when you need raw JSON computation data (chart objects, planetary degrees, dasha trees) for building custom UIs.
Kundali Generation
Generate a complete Vedic birth chart (Kundali) with planetary positions, house placements, nakshatras, yogas, and dashas. Powered by the Vedika Ephemeris engine with arc-second precision.
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| date | string | Required | Birth date in YYYY-MM-DD format. |
| time | string | Required | Birth time in HH:MM 24-hour format. |
| lat | number | Required | Latitude of birth place. e.g. 18.5204 |
| lon | number | Required | Longitude of birth place. e.g. 73.8567 |
| tz | number | Optional | Timezone offset in hours. e.g. 5.5 for IST. Auto-detected from coordinates if omitted. |
| system | string | Optional | vedic (default), western, or kp. |
| language | string | Optional | Response language. ISO-639-1 code. Default: en. |
Code Examples
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
chart = client.astrology.kundali(
date="1990-01-15",
time="14:30",
lat=18.5204,
lon=73.8567
)
print(chart.ascendant) # "Taurus"
print(chart.moon_sign) # "Scorpio"
print(chart.planets) # Detailed planetary positions
print(chart.yogas) # Active yogas
curl "https://api.xalen.io/v1/astrology/kundali?date=1990-01-15&time=14:30&lat=18.5204&lon=73.8567" \
-H "Authorization: Bearer xln_live_YOUR_KEY"
Response
{
"ascendant": { "sign": "Taurus", "degree": 14.32, "nakshatra": "Rohini" },
"moon_sign": "Scorpio",
"sun_sign": "Capricorn",
"planets": [
{ "name": "Sun", "sign": "Capricorn", "house": 9, "degree": 0.85, "nakshatra": "Uttara Ashadha", "retrograde": false },
{ "name": "Moon", "sign": "Scorpio", "house": 7, "degree": 22.41, "nakshatra": "Jyeshtha", "retrograde": false }
],
"houses": [ ... ],
"yogas": [
{ "name": "Gaja Kesari Yoga", "description": "Jupiter in kendra from Moon", "strength": "strong" }
],
"dasha": { "current": "Venus", "sub": "Mercury", "start": "2024-03-12", "end": "2026-08-04" },
"engine": "vedika-ephemeris"
}
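When calling this endpoint without the SDK, the query string can be assembled from the documented parameters. The sketch below validates the date and time formats locally and omits optional parameters that are not set; `kundali_url` is a hypothetical helper, and keeping `:` unescaped is a choice made to match the cURL example.

```python
from datetime import datetime
from urllib.parse import urlencode

BASE_URL = "https://api.xalen.io"


def kundali_url(birth_date, birth_time, lat, lon,
                tz=None, system=None, language=None):
    """Build the Kundali request URL from the documented query parameters."""
    # Raise ValueError early if the formats differ from YYYY-MM-DD / HH:MM.
    datetime.strptime(birth_date, "%Y-%m-%d")
    datetime.strptime(birth_time, "%H:%M")
    params = {"date": birth_date, "time": birth_time, "lat": lat, "lon": lon}
    optional = {"tz": tz, "system": system, "language": language}
    params.update({k: v for k, v in optional.items() if v is not None})
    return f"{BASE_URL}/v1/astrology/kundali?{urlencode(params, safe=':')}"


print(kundali_url("1990-01-15", "14:30", 18.5204, 73.8567, tz=5.5))
```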
Run Agent
Execute a pre-built AI agent with a single API call. Agents are purpose-built for specific tasks like temple management, devotional content, and spiritual guidance.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| agent_id | string | Required | Agent identifier. e.g. temple-assistant, kundali-reader, puja-planner |
| input | string | Required | User message or task description. |
| context | object | Optional | Additional data for the agent (birth details, location, preferences). |
| stream | boolean | Optional | Stream the response. Default: false. |
Code Examples
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
result = client.agents.run(
agent_id="kundali-reader",
input="What career paths suit my chart?",
context={
"birth_date": "1990-01-15",
"birth_time": "14:30",
"birth_place": "Pune, India"
}
)
print(result.output)
curl https://api.xalen.io/v1/agents/run \
-H "Authorization: Bearer xln_live_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"agent_id": "kundali-reader",
"input": "What career paths suit my chart?",
"context": {"birth_date": "1990-01-15", "birth_time": "14:30", "birth_place": "Pune, India"}
}'
Response
{
"agent_id": "kundali-reader",
"output": "Based on your chart with Taurus ascendant and strong 10th house...",
"usage": { "prompt_tokens": 450, "completion_tokens": 320, "total_tokens": 770 }
}
Check Balance
Retrieve your current wallet balance and usage summary.
Code Examples
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
balance = client.billing.balance()
print(f"${balance.available / 100:.2f}") # Wallet in USD
curl https://api.xalen.io/v1/billing/balance \
-H "Authorization: Bearer xln_live_YOUR_KEY"
Response
{
"available": 4250,
"currency": "usd_cents",
"total_spent": 1750,
"plan": "pay-as-you-go"
}
Add Funds
Add funds to your wallet. Returns a Razorpay payment link. Minimum deposit: $10.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| amount | integer | Required | Amount in USD cents. Minimum: 1000 ($10). |
| currency | string | Optional | usd (default) or inr. |
Code Examples
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
payment = client.billing.deposit(amount=5000)  # $50
print(payment.payment_url)  # Razorpay checkout link
curl https://api.xalen.io/v1/billing/deposit \
-H "Authorization: Bearer xln_live_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"amount": 5000}'
Response
{
"payment_id": "pay_abc123",
"payment_url": "https://rzp.io/l/xalen-deposit",
"amount": 5000,
"currency": "usd_cents",
"status": "pending"
}
Billing Mechanics
XALEN uses a prepaid wallet model. Deposit funds, then pay per API call. No surprise invoices, no credit card holds, no overages unless you opt in.
Wallet Model
| Feature | Details |
|---|---|
| Deposit | Add funds via card or UPI. Minimum deposit: $10 (Pay-as-you-go). |
| First deposit bonus | $10 bonus on your first $100+ deposit. Applied automatically. |
| Per-call billing | Each API call deducts from your wallet based on model, tokens, and endpoint type. |
| Currency | Wallet stored in USD cents. Display values divide by 100. |
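The wallet arithmetic in this table can be sketched in a few lines. The helpers below are illustrative only: they assume the first-deposit bonus is credited to the wallet on top of the deposited amount, which the table implies but does not state, and they keep all amounts in integer cents as the wallet does.

```python
MIN_DEPOSIT_CENTS = 1_000       # $10 minimum deposit
BONUS_THRESHOLD_CENTS = 10_000  # first deposit of $100 or more
BONUS_CENTS = 1_000             # $10 bonus


def credited_cents(deposit_cents, is_first_deposit):
    """Wallet credit for a deposit, including any first-deposit bonus."""
    if deposit_cents < MIN_DEPOSIT_CENTS:
        raise ValueError("deposit is below the $10 minimum")
    qualifies = is_first_deposit and deposit_cents >= BONUS_THRESHOLD_CENTS
    return deposit_cents + (BONUS_CENTS if qualifies else 0)


def format_usd(cents):
    """Display a wallet value stored in USD cents."""
    return f"${cents / 100:.2f}"


print(format_usd(credited_cents(10_000, is_first_deposit=True)))
```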
Balance & Usage Tracking
| Endpoint | Method | Description |
|---|---|---|
| /v1/billing/balance | GET | Real-time wallet balance, total spent, and current plan. |
| /v1/billing/usage?period=current | GET | Detailed usage breakdown for the current billing period. |
Plan Switching
Upgrades take effect immediately. Your new rate limits and features are available within seconds. Downgrades are deferred to the end of the current billing cycle to avoid mid-cycle disruption.
Spending Controls
| Control | Details |
|---|---|
| Spending alerts | Configurable notifications at 50%, 80%, 90%, and 100% of your budget threshold. |
| Hard spending cap | Set via dashboard. API calls return 402 Payment Required when the cap is exceeded. |
| Auto-pause | Optional. Stops API calls entirely when your budget is reached instead of allowing overage. |
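Applications that track their own spend can mirror the alert thresholds locally. The sketch below is a client-side illustration only and does not replace the dashboard controls; `crossed_alerts` is a hypothetical helper, and the thresholds are the four percentages listed in the table.

```python
ALERT_THRESHOLDS = (50, 80, 90, 100)  # percent of budget, from the table


def crossed_alerts(spent_cents, budget_cents, already_sent=()):
    """Return thresholds reached by current spend that were not yet reported."""
    if budget_cents <= 0:
        raise ValueError("budget must be positive")
    percent_used = spent_cents * 100 / budget_cents
    return [t for t in ALERT_THRESHOLDS
            if percent_used >= t and t not in already_sent]


print(crossed_alerts(8_500, 10_000, already_sent=(50,)))
```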
Every API call costs money. XALEN does not offer free credits, trials, or complimentary usage. Use the Playground to test models before committing to a deposit.
Fine-Tuning
Fine-tune any supported model on your own data. Upload JSONL training files, configure hyperparameters, and deploy a custom model tailored to your use case. XALEN supports standard full-parameter fine-tuning and lightweight LoRA adapters depending on the base model.
Supported Base Models
| Model | Type | Min Training Examples |
|---|---|---|
| Llama 3.3 70B | Standard | 100 |
| DeepSeek V3 | Standard | 100 |
| Qwen 2.5 72B | Standard | 100 |
| Llama 4 Scout 17B | LoRA | 50 |
| Qwen3 235B | LoRA | 50 |
Training Data Format
Upload training data as a JSONL file. Each line must be a valid JSON object containing a messages array with system, user, and assistant turns.
{"messages": [{"role": "system", "content": "You are a Vedic astrology expert."}, {"role": "user", "content": "What does Saturn in the 7th house mean?"}, {"role": "assistant", "content": "Saturn in the 7th house indicates..."}]}
{"messages": [{"role": "system", "content": "You are a Vedic astrology expert."}, {"role": "user", "content": "Explain Rahu Mahadasha effects."}, {"role": "assistant", "content": "During Rahu Mahadasha..."}]}
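Validating the file locally before upload can save a failed job. The sketch below checks each line against the format described above; `validate_training_lines` is a hypothetical helper, and the requirement that every example contain at least one assistant turn is an assumption drawn from the description rather than a documented rule.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_training_lines(lines):
    """Return (valid_count, errors) for an iterable of JSONL lines."""
    valid, errors = 0, []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            errors.append(f"line {number}: missing or empty messages array")
            continue
        roles = [m.get("role") if isinstance(m, dict) else None for m in messages]
        if any(r not in ALLOWED_ROLES for r in roles):
            errors.append(f"line {number}: unexpected or missing role")
            continue
        if any(not isinstance(m.get("content"), str) for m in messages):
            errors.append(f"line {number}: content must be a string")
            continue
        if "assistant" not in roles:
            errors.append(f"line {number}: no assistant turn to learn from")
            continue
        valid += 1
    return valid, errors
```

To check a file, pass the open file object directly: `validate_training_lines(open("training_data.jsonl", encoding="utf-8"))`.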
Upload Training Data
Upload your JSONL training file. The response includes a file_id used when creating a fine-tuning job.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | file | Required | The JSONL file to upload. |
| purpose | string | Required | Must be fine-tune. |
Create Fine-Tuning Job
Create a new fine-tuning job. The job trains asynchronously and you can poll its status or list all jobs.
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Base model ID. e.g. llama-3.3-70b, deepseek-v3, qwen-2.5-72b |
| training_file | string | Required | File ID returned from the upload endpoint. |
| hyperparameters | object | Optional | Training config: epochs (default 3), learning_rate (default auto), batch_size (default auto). |
| suffix | string | Optional | Custom suffix for the fine-tuned model name. Max 18 characters. |
List Fine-Tuning Jobs
Returns a list of all fine-tuning jobs for your account, including status, progress, and result model IDs.
Get Job Status
Retrieve the current status and details of a specific fine-tuning job.
Cancel Job
Cancel a running fine-tuning job. Jobs that have already completed cannot be cancelled.
Code Examples
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
# 1. Upload training data
training_file = client.files.create(
file=open("training_data.jsonl", "rb"),
purpose="fine-tune"
)
# 2. Create fine-tuning job
job = client.fine_tuning.jobs.create(
model="llama-3.3-70b",
training_file=training_file.id,
hyperparameters={
"epochs": 3,
"learning_rate": 1e-5,
"batch_size": 4
},
suffix="astro-expert"
)
print(f"Job created: {job.id}, status: {job.status}")
# 3. Poll for completion
import time
while job.status not in ["succeeded", "failed", "cancelled"]:
    time.sleep(30)
    job = client.fine_tuning.jobs.retrieve(job.id)
    print(f"Status: {job.status}")
# 4. Use the fine-tuned model
if job.status == "succeeded":
    response = client.chat.completions.create(
        model=job.fine_tuned_model,
        messages=[{"role": "user", "content": "Analyze my chart"}]
    )
    print(response.choices[0].message.content)
import Xalen from "xalen-sdk";
import fs from "fs";
const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });
// 1. Upload training data
const trainingFile = await client.files.create({
file: fs.createReadStream("training_data.jsonl"),
purpose: "fine-tune",
});
// 2. Create fine-tuning job
const job = await client.fineTuning.jobs.create({
model: "llama-3.3-70b",
training_file: trainingFile.id,
hyperparameters: {
epochs: 3,
learning_rate: 1e-5,
batch_size: 4,
},
suffix: "astro-expert",
});
console.log(`Job created: ${job.id}, status: ${job.status}`);
// 3. Poll for completion
let status = job.status;
while (!["succeeded", "failed", "cancelled"].includes(status)) {
  await new Promise((r) => setTimeout(r, 30000));
  const updated = await client.fineTuning.jobs.retrieve(job.id);
  status = updated.status;
  console.log(`Status: ${status}`);
}
// 4. Use the fine-tuned model
if (status === "succeeded") {
  const updated = await client.fineTuning.jobs.retrieve(job.id);
  const response = await client.chat.completions.create({
    model: updated.fine_tuned_model,
    messages: [{ role: "user", content: "Analyze my chart" }],
  });
  console.log(response.choices[0].message.content);
}
Response
{
"id": "ftjob-abc123",
"object": "fine_tuning.job",
"model": "llama-3.3-70b",
"status": "running",
"training_file": "file-xyz789",
"hyperparameters": {
"epochs": 3,
"learning_rate": 1e-5,
"batch_size": 4
},
"fine_tuned_model": null,
"created_at": 1717200000,
"finished_at": null,
"trained_tokens": 45200
}
Fine-tuning is billed per training token. Pricing varies by base model. Check the Pricing page for current per-token training rates. Inference on fine-tuned models is billed at the base model rate.
Dedicated Endpoints
Reserved GPU capacity for predictable performance. Dedicated endpoints give you isolated compute with no shared rate limits, built-in prompt caching, and support for hot-swapping LoRA adapters. Ideal for production workloads that demand consistent latency and throughput.
Hardware Options
| Hardware | VRAM | Price | Best For |
|---|---|---|---|
| NVIDIA H100 80GB | 80 GB | $3.49/GPU-hr | Large models, high throughput |
| NVIDIA A100 80GB | 80 GB | $2.09/GPU-hr | Mid-range models |
| NVIDIA L40S 48GB | 48 GB | $1.19/GPU-hr | Smaller models, cost-optimized |
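A rough cost estimate follows directly from the hourly rates in this table. The sketch below assumes one GPU per replica unless stated otherwise and that replicas run continuously for the given number of hours; actual billing may differ, and `estimate_cost` is a hypothetical helper. The hardware keys match the `hardware` values accepted when creating an endpoint.

```python
HOURLY_RATE_USD = {
    "h100-80gb": 3.49,
    "a100-80gb": 2.09,
    "l40s-48gb": 1.19,
}


def estimate_cost(hardware, replicas, hours, gpus_per_replica=1):
    """Estimated USD cost for running `replicas` replicas for `hours` hours."""
    if hardware not in HOURLY_RATE_USD:
        raise ValueError(f"unknown hardware: {hardware}")
    rate = HOURLY_RATE_USD[hardware]
    return round(rate * gpus_per_replica * replicas * hours, 2)


# Two H100 replicas running for a full day.
print(estimate_cost("h100-80gb", replicas=2, hours=24))
```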
Features
| Feature | Details |
|---|---|
| Prompt Caching | Up to 90% cost reduction on repeated prompt prefixes. Cached automatically on dedicated infrastructure. |
| Autoscaling | Scale from 1 to 16 replicas based on traffic. Configure min/max replicas and scaling thresholds. |
| LoRA Adapters | Hot-swap fine-tuned LoRA weights without redeploying. Attach multiple adapters to a single base model. |
| Custom Scaling Policies | Define scaling rules based on request queue depth, latency percentiles, or GPU utilization. |
| Isolated Rate Limits | Your endpoint is not affected by other tenants. Full throughput capacity is reserved for your traffic. |
Create Dedicated Endpoint
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Model to deploy. e.g. llama-3.3-70b, deepseek-v3 |
| hardware | string | Required | GPU type: h100-80gb, a100-80gb, or l40s-48gb |
| min_replicas | integer | Optional | Minimum replicas. Default: 1. |
| max_replicas | integer | Optional | Maximum replicas for autoscaling. Default: 1. |
List Endpoints
Returns all dedicated endpoints for your account, including status, hardware, model, and current replica count.
Delete Endpoint
Shut down a dedicated endpoint. Billing stops when the endpoint is fully deprovisioned.
Code Examples
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
# Create a dedicated endpoint
endpoint = client.endpoints.create(
model="llama-3.3-70b",
hardware="h100-80gb",
min_replicas=1,
max_replicas=4
)
print(f"Endpoint: {endpoint.id}, status: {endpoint.status}")
# Use the dedicated endpoint for inference
response = client.chat.completions.create(
model=endpoint.model,
messages=[{"role": "user", "content": "Hello!"}],
extra_headers={"X-Endpoint-Id": endpoint.id}
)
print(response.choices[0].message.content)
# List all endpoints
endpoints = client.endpoints.list()
for ep in endpoints.data:
    print(f"{ep.id}: {ep.model} on {ep.hardware} ({ep.status})")
import Xalen from "xalen-sdk";
const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });
// Create a dedicated endpoint
const endpoint = await client.endpoints.create({
model: "llama-3.3-70b",
hardware: "h100-80gb",
min_replicas: 1,
max_replicas: 4,
});
console.log(`Endpoint: ${endpoint.id}, status: ${endpoint.status}`);
// Use the dedicated endpoint for inference
const response = await client.chat.completions.create({
model: endpoint.model,
messages: [{ role: "user", content: "Hello!" }],
}, {
headers: { "X-Endpoint-Id": endpoint.id },
});
console.log(response.choices[0].message.content);
// List all endpoints
const endpoints = await client.endpoints.list();
for (const ep of endpoints.data) {
  console.log(`${ep.id}: ${ep.model} on ${ep.hardware} (${ep.status})`);
}
Dedicated endpoints require a Scale ($2,499/mo), Dedicated ($5,000+/mo), or Enterprise plan. Pay-as-you-go and Growth plans use shared serverless infrastructure.
Batch Processing
Process large workloads asynchronously at 50% lower cost. Submit up to 1,000 requests per batch, and XALEN processes them in the background with optimized throughput. Ideal for data labeling, content generation, bulk analysis, and any workload that does not require real-time responses.
Create Batch
| Parameter | Type | Required | Description |
|---|---|---|---|
| requests | array | Required | Array of up to 1,000 request objects. Each object has the same schema as a /v1/chat/completions request body. |
| model | string | Required | Model to use for all requests in the batch. e.g. deepseek-v3, llama-3.3-70b |
| metadata | object | Optional | Key-value metadata to attach to the batch for tracking. |
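Workloads larger than 1,000 requests need to be split across several batches. A minimal sketch of that chunking step, with `chunk_requests` as a hypothetical helper:

```python
MAX_BATCH_SIZE = 1_000  # per-batch request limit from the table


def chunk_requests(requests, size=MAX_BATCH_SIZE):
    """Split a list of request objects into batch-sized lists."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [requests[i:i + size] for i in range(0, len(requests), size)]


requests = [
    {"messages": [{"role": "user", "content": f"Summarize article {i}"}]}
    for i in range(2_500)
]
print([len(chunk) for chunk in chunk_requests(requests)])
```

Each chunk can then be submitted as its own batch, with `metadata` used to tie the batches back to one logical job.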
Get Batch Status
Retrieve the current status and results of a batch job. When the batch completes, the response includes an array of all outputs.
Code Examples
from xalen import Xalen
import time
client = Xalen(api_key="xln_live_YOUR_KEY")
# Create a batch of requests
batch = client.batch.create(
model="deepseek-v3",
requests=[
{"messages": [{"role": "user", "content": f"Summarize article {i}"}]}
for i in range(100)
],
metadata={"project": "content-pipeline"}
)
print(f"Batch {batch.id} created, status: {batch.status}")
# Poll for completion
while batch.status not in ["completed", "failed"]:
    time.sleep(10)
    batch = client.batch.retrieve(batch.id)
    print(f"Progress: {batch.completed_requests}/{batch.total_requests}")
# Process results
if batch.status == "completed":
    for result in batch.results:
        print(result.choices[0].message.content[:100])
import Xalen from "xalen-sdk";
const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });
// Create a batch of requests
const requests = Array.from({ length: 100 }, (_, i) => ({
messages: [{ role: "user", content: `Summarize article ${i}` }],
}));
const batch = await client.batch.create({
model: "deepseek-v3",
requests,
metadata: { project: "content-pipeline" },
});
console.log(`Batch ${batch.id} created, status: ${batch.status}`);
// Poll for completion
let status = batch.status;
while (!["completed", "failed"].includes(status)) {
  await new Promise((r) => setTimeout(r, 10000));
  const updated = await client.batch.retrieve(batch.id);
  status = updated.status;
  console.log(`Progress: ${updated.completed_requests}/${updated.total_requests}`);
}
// Process results
if (status === "completed") {
  const completed = await client.batch.retrieve(batch.id);
  for (const result of completed.results) {
    console.log(result.choices[0].message.content.slice(0, 100));
  }
}
All batch requests are billed at 50% of standard per-token rates. The same model, the same quality, half the cost. Batches typically complete within 1-6 hours depending on size and model.
Evaluations
Automated model evaluation using LLM-as-a-Judge. Compare models head-to-head, score outputs against custom criteria, and track quality over time. Evaluations run server-side so you can benchmark without writing scoring infrastructure.
Evaluation Types
| Type | Description | Output |
|---|---|---|
| Classification | Judge assigns a discrete label to each response (e.g. "accurate", "inaccurate", "partial"). | Label + reasoning |
| Scoring | Judge rates each response on a numeric scale (e.g. 1-5 for relevance, accuracy, helpfulness). | Score + reasoning |
| Comparison | Judge selects the better response from two model outputs for the same prompt. | Winner + reasoning |
Supported Judge Models
Any chat model available on XALEN can serve as a judge. Recommended judges for high-quality evaluations:
| Model | Strengths |
|---|---|
| Llama 3.3 70B | Strong reasoning, good at nuanced scoring. Excellent general-purpose judge. |
| DeepSeek V3 | High accuracy on technical and domain-specific evaluations. |
| Qwen 2.5 72B | Multilingual evaluation strength. Good for non-English content scoring. |
Create Evaluation
| Parameter | Type | Required | Description |
|---|---|---|---|
| type | string | Required | Evaluation type: classification, scoring, or comparison. |
| judge_model | string | Required | Model ID for the judge. e.g. llama-3.3-70b |
| test_cases | array | Required | Array of test case objects. Each has input (messages array) and output (string or array of strings for comparison). |
| criteria | string | Optional | Custom evaluation criteria for the judge. e.g. "Rate accuracy of astrological predictions on a 1-5 scale." |
| scale | object | Optional | For scoring type: min and max values. Default: {"min": 1, "max": 5}. |
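The shape of `test_cases` depends on the evaluation type, so a local check before submission can catch mismatches early. This is a sketch based on the parameter descriptions above: `validate_test_case` is a hypothetical helper, and the requirement of exactly two outputs for comparison is inferred from the description of that type.

```python
EVAL_TYPES = {"classification", "scoring", "comparison"}


def validate_test_case(case, eval_type):
    """Raise ValueError if `case` does not match the documented shape."""
    if eval_type not in EVAL_TYPES:
        raise ValueError(f"unknown evaluation type: {eval_type}")
    messages = case.get("input")
    if not isinstance(messages, list) or not messages:
        raise ValueError("input must be a non-empty messages array")
    output = case.get("output")
    if eval_type == "comparison":
        # Comparison judges pick between two outputs for the same prompt.
        if not (isinstance(output, list) and len(output) == 2
                and all(isinstance(o, str) for o in output)):
            raise ValueError("comparison output must be a list of two strings")
    elif not isinstance(output, str):
        raise ValueError(f"{eval_type} output must be a single string")
```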
Get Evaluation Results
Retrieve the results of a completed evaluation, including per-test-case scores and aggregate metrics.
Code Examples
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
# Score model outputs on a 1-5 scale
evaluation = client.evaluations.create(
type="scoring",
judge_model="llama-3.3-70b",
criteria="Rate the accuracy and helpfulness of each response on a 1-5 scale.",
scale={"min": 1, "max": 5},
test_cases=[
{
"input": [{"role": "user", "content": "What is Ketu in astrology?"}],
"output": "Ketu is the south node of the Moon..."
},
{
"input": [{"role": "user", "content": "Explain Venus Mahadasha."}],
"output": "Venus Mahadasha lasts 20 years..."
}
]
)
# Retrieve results
results = client.evaluations.retrieve(evaluation.id)
print(f"Average score: {results.aggregate.mean_score}")
for case in results.results:
    print(f"Score: {case.score}, Reasoning: {case.reasoning[:80]}")
import Xalen from "xalen-sdk";
const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });
// Compare two model outputs head-to-head
const evaluation = await client.evaluations.create({
type: "comparison",
judge_model: "llama-3.3-70b",
criteria: "Which response is more accurate and detailed?",
test_cases: [
{
input: [{ role: "user", content: "What is Ketu in astrology?" }],
output: [
"Ketu is the south node of the Moon in Vedic astrology...",
"Ketu represents past karma and spiritual liberation..."
],
},
],
});
// Retrieve results
const results = await client.evaluations.retrieve(evaluation.id);
for (const r of results.results) {
console.log(`Winner: Output ${r.winner}, Reasoning: ${r.reasoning}`);
}
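For comparison evaluations with many test cases, it can help to aggregate the per-case winners into win rates. This is a rough sketch that assumes each result has been converted to a plain dict with a `winner` key, mirroring the `winner` field read in the example above; `win_rates` is a hypothetical helper, not an SDK method.

```python
from collections import Counter

def win_rates(results: list[dict]) -> dict:
    """Map each winner label to the fraction of test cases it won."""
    counts = Counter(r["winner"] for r in results)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {winner: n / total for winner, n in counts.items()}
```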
GPU Clusters
On-demand GPU clusters for training, large batch jobs, and custom workloads. Self-service provisioning via API or dashboard. Spin up multi-node clusters with high-bandwidth interconnects, run your training jobs, and tear down when finished.
Available GPUs
| GPU | VRAM | Interconnect | Price |
|---|---|---|---|
| NVIDIA H100 SXM | 80 GB | NVLink 900 GB/s | $3.49/GPU-hr |
| NVIDIA B200 | 192 GB | NVLink 1800 GB/s | $5.99/GPU-hr |
| NVIDIA A100 SXM | 80 GB | NVLink 600 GB/s | $2.09/GPU-hr |
| NVIDIA L40S | 48 GB | PCIe Gen4 | $1.19/GPU-hr |
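As a rough budgeting aid, the per-GPU-hour prices in the table can be turned into a cost estimate. This sketch assumes billing is linear in GPU-hours with no minimums, storage charges, or discounts, which the table does not confirm; `estimate_cluster_cost` is a hypothetical helper and the keys reuse the `gpu_type` identifiers from the Create Cluster parameters.

```python
# Prices copied from the Available GPUs table (USD per GPU-hour).
PRICE_PER_GPU_HOUR = {
    "h100-sxm": 3.49,
    "b200": 5.99,
    "a100-sxm": 2.09,
    "l40s": 1.19,
}

def estimate_cluster_cost(gpu_type: str, gpu_count: int, hours: float) -> float:
    """Estimate compute cost in USD, assuming simple linear GPU-hour billing."""
    return round(PRICE_PER_GPU_HOUR[gpu_type] * gpu_count * hours, 2)
```

For example, `estimate_cluster_cost("h100-sxm", 8, 24)` covers one 8-GPU H100 node running for a day.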
Features
| Feature | Details |
|---|---|
| Cluster Management | Slurm and Kubernetes orchestration. Choose your preferred scheduler at provisioning time. |
| Persistent Storage | NVMe SSD storage attached to every node. Data persists across job restarts within the cluster lifetime. |
| Multi-Region | Clusters available in US, Europe, and Asia Pacific. Select region at creation time for data residency compliance. |
| Health Monitoring | Real-time GPU utilization, memory, temperature, and error metrics via API and dashboard. |
| SSH Access | Direct SSH access to cluster nodes for custom setup, debugging, and environment configuration. |
Create Cluster
| Parameter | Type | Required | Description |
|---|---|---|---|
| gpu_type | string | Required | GPU model: h100-sxm, b200, a100-sxm, or l40s |
| gpu_count | integer | Required | Number of GPUs. Must be a multiple of 8 for H100/B200/A100. |
| region | string | Optional | Deployment region: us-east, us-west, eu-west, ap-south. Default: us-east. |
| scheduler | string | Optional | Orchestration: slurm or kubernetes. Default: kubernetes. |
| max_runtime_hours | integer | Optional | Auto-terminate after this many hours. Default: no limit. |
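The constraints in the parameter table can be enforced before calling the API. The sketch below is illustrative only: `build_cluster_request` is a hypothetical helper, and it simply encodes the allowed values, defaults, and the multiple-of-8 rule listed above.

```python
NEEDS_MULTIPLE_OF_8 = {"h100-sxm", "b200", "a100-sxm"}
GPU_TYPES = NEEDS_MULTIPLE_OF_8 | {"l40s"}
REGIONS = {"us-east", "us-west", "eu-west", "ap-south"}
SCHEDULERS = {"slurm", "kubernetes"}

def build_cluster_request(gpu_type, gpu_count, region="us-east",
                          scheduler="kubernetes", max_runtime_hours=None):
    """Validate cluster parameters and return a request body as a dict."""
    if gpu_type not in GPU_TYPES:
        raise ValueError(f"unknown gpu_type: {gpu_type}")
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    if gpu_type in NEEDS_MULTIPLE_OF_8 and gpu_count % 8 != 0:
        raise ValueError(f"gpu_count must be a multiple of 8 for {gpu_type}")
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region}")
    if scheduler not in SCHEDULERS:
        raise ValueError(f"unknown scheduler: {scheduler}")
    request = {"gpu_type": gpu_type, "gpu_count": gpu_count,
               "region": region, "scheduler": scheduler}
    if max_runtime_hours is not None:
        # Omitted by default, which the table documents as "no limit".
        request["max_runtime_hours"] = max_runtime_hours
    return request
```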
Get Cluster Status
Retrieve cluster status, node health, GPU utilization, and connection details (SSH host, Kubernetes config).
Delete Cluster
Terminate a cluster and release all resources. Billing stops when deprovisioning completes.
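Because billing continues until the cluster is deleted, it may be worth guaranteeing teardown even when a job fails. One possible pattern is a context manager around the `clusters.create` and `clusters.delete` calls used in this section's examples; `managed_cluster` itself is a hypothetical wrapper, not an SDK feature.

```python
from contextlib import contextmanager

@contextmanager
def managed_cluster(client, **params):
    """Create a cluster and always delete it on exit, even if the body raises."""
    cluster = client.clusters.create(**params)
    try:
        yield cluster
    finally:
        # Runs on success and on exceptions, so billing stops either way.
        client.clusters.delete(cluster.id)
```

Usage would look like `with managed_cluster(client, gpu_type="h100-sxm", gpu_count=8) as cluster: ...`, with the training job launched inside the block.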
Code Examples
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
# Provision an 8xH100 cluster
cluster = client.clusters.create(
gpu_type="h100-sxm",
gpu_count=8,
region="us-east",
scheduler="kubernetes",
max_runtime_hours=24
)
print(f"Cluster {cluster.id}: {cluster.status}")
print(f"SSH: {cluster.ssh_host}")
print(f"K8s config: {cluster.kubeconfig_url}")
# Monitor GPU utilization
status = client.clusters.retrieve(cluster.id)
for node in status.nodes:
    print(f"Node {node.id}: GPU util {node.gpu_utilization}%, "
          f"memory {node.gpu_memory_used_gb}/{node.gpu_memory_total_gb} GB")
# Tear down when done
client.clusters.delete(cluster.id)
import Xalen from "xalen-sdk";
const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });
// Provision an 8xH100 cluster
const cluster = await client.clusters.create({
gpu_type: "h100-sxm",
gpu_count: 8,
region: "us-east",
scheduler: "kubernetes",
max_runtime_hours: 24,
});
console.log(`Cluster ${cluster.id}: ${cluster.status}`);
console.log(`SSH: ${cluster.ssh_host}`);
console.log(`K8s config: ${cluster.kubeconfig_url}`);
// Monitor GPU utilization
const status = await client.clusters.retrieve(cluster.id);
for (const node of status.nodes) {
console.log(
`Node ${node.id}: GPU util ${node.gpu_utilization}%, ` +
`memory ${node.gpu_memory_used_gb}/${node.gpu_memory_total_gb} GB`
);
}
// Tear down when done
await client.clusters.delete(cluster.id);
GPU clusters require a Dedicated ($5,000+/mo) or Enterprise plan. Contact sales for custom cluster configurations and reserved capacity pricing.
Prompt Caching
Reduce costs and latency by caching repeated prompt prefixes. When you send the same system prompt, few-shot examples, or long context prefix across multiple requests, XALEN caches the prefix and reuses it. Cache reads cost up to 90% less than processing the same tokens from scratch.
How It Works
| Step | Details |
|---|---|
| 1. Cache Creation | On the first request, the prompt prefix is processed and cached. Cache creation costs the same as standard input tokens. |
| 2. Cache Hit | Subsequent requests with an identical prefix hit the cache. Cached tokens are billed at a 90% discount off the standard input token rate. |
| 3. Cache Eviction | Caches expire after a period of inactivity (typically 5-10 minutes). A prefix that keeps receiving requests within that window stays cached. |
Per-Model Caching Support
| Model | Serverless Cache | Dedicated Cache | Cache Discount |
|---|---|---|---|
| Claude Opus 4.7 | Yes | N/A | 90% on reads |
| Claude Sonnet 4.6 | Yes | N/A | 90% on reads |
| Claude Haiku 4.5 | Yes | N/A | 90% on reads |
| DeepSeek V3 | No | Yes | 90% on reads |
| Llama 3.3 70B | No | Yes | 90% on reads |
| Qwen 2.5 72B | No | Yes | 90% on reads |
Claude models receive automatic caching through the inference provider. Other models require a dedicated endpoint for caching support.
Usage Tracking
When prompt caching is active, the response usage object includes two additional fields:
| Field | Type | Description |
|---|---|---|
| cache_creation_tokens | integer | Number of tokens written to cache on this request. Billed at standard input rate. |
| cache_read_tokens | integer | Number of tokens read from cache. Billed at 90% discount. |
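The two fields above are enough to estimate what a request's input cost. The sketch below is an approximation under stated assumptions: the caller supplies the count of uncached input tokens separately (this page does not say whether other usage fields include cached tokens), and `price_per_million_usd` is a placeholder for the model's input rate, which is not listed here. `input_cost_usd` is a hypothetical helper.

```python
def input_cost_usd(uncached_tokens: int, cache_creation_tokens: int,
                   cache_read_tokens: int, price_per_million_usd: float,
                   read_discount: float = 0.90) -> float:
    """Estimate input cost: cache writes at full rate, cache reads discounted."""
    full_rate = price_per_million_usd / 1_000_000
    full_price_tokens = uncached_tokens + cache_creation_tokens
    read_rate = full_rate * (1 - read_discount)
    return full_price_tokens * full_rate + cache_read_tokens * read_rate
```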
Code Examples
from xalen import Xalen
client = Xalen(api_key="xln_live_YOUR_KEY")
# Long system prompt that stays the same across requests
system_prompt = """You are a Vedic astrology expert trained on classical texts
including BPHS, Phaladeepika, and Brihat Jataka. Provide detailed analysis
based on planetary positions, houses, aspects, and dasha periods.
Always cite the relevant classical reference for each interpretation.
...(2000+ tokens of instructions)..."""
# First request: creates cache (billed at standard rate)
r1 = client.chat.completions.create(
model="claude-sonnet-4.6",
messages=[
{"role": "system", "content": system_prompt},
{"role": "user", "content": "Analyze Saturn in the 10th house."}
]
)
print(f"Cache created: {r1.usage.cache_creation_tokens} tokens")
# Second request: hits cache (90% cheaper on cached prefix)
r2 = client.chat.completions.create(
model="claude-sonnet-4.6",
messages=[
{"role": "system", "content": system_prompt},
{"role": "user", "content": "What does Jupiter in the 5th house mean?"}
]
)
print(f"Cache read: {r2.usage.cache_read_tokens} tokens (90% discount)")
import Xalen from "xalen-sdk";
const client = new Xalen({ apiKey: "xln_live_YOUR_KEY" });
// Long system prompt that stays the same across requests
const systemPrompt = `You are a Vedic astrology expert trained on classical texts
including BPHS, Phaladeepika, and Brihat Jataka. Provide detailed analysis
based on planetary positions, houses, aspects, and dasha periods.
Always cite the relevant classical reference for each interpretation.
...(2000+ tokens of instructions)...`;
// First request: creates cache
const r1 = await client.chat.completions.create({
model: "claude-sonnet-4.6",
messages: [
{ role: "system", content: systemPrompt },
{ role: "user", content: "Analyze Saturn in the 10th house." },
],
});
console.log(`Cache created: ${r1.usage.cache_creation_tokens} tokens`);
// Second request: hits cache (90% cheaper on cached prefix)
const r2 = await client.chat.completions.create({
model: "claude-sonnet-4.6",
messages: [
{ role: "system", content: systemPrompt },
{ role: "user", content: "What does Jupiter in the 5th house mean?" },
],
});
console.log(`Cache read: ${r2.usage.cache_read_tokens} tokens (90% discount)`);
Structure your prompts with the static content first (system instructions, few-shot examples, reference data) and dynamic content last (user query). The longer the shared prefix, the greater the cost savings. A 2,000-token cached prefix saves roughly $0.027 per request on Claude Sonnet 4.6.
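The static-first ordering can be made mechanical with a small message builder. This is a sketch only: `build_messages` and `shared_prefix_length` are hypothetical helpers, and counting identical leading messages is just a rough proxy for how much of the prompt can be served from cache.

```python
def build_messages(system_prompt: str, few_shot: list[tuple[str, str]],
                   user_query: str) -> list[dict]:
    """Place static content first and the per-request query last."""
    messages = [{"role": "system", "content": system_prompt}]
    for question, answer in few_shot:
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    # Only this final message changes between requests.
    messages.append({"role": "user", "content": user_query})
    return messages

def shared_prefix_length(a: list[dict], b: list[dict]) -> int:
    """Count how many leading messages two requests have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n
```

Two requests built from the same system prompt and few-shot examples share every message except the last, which is the situation the cache rewards.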
Model Feature Matrix
Not every model supports every feature. Use this matrix to find the right model for your use case. Features marked "Dedicated" are available only when using a dedicated endpoint.
| Model | Chat | Vision | Streaming | Fine-tune | Batch | Dedicated | Caching | Functions |
|---|---|---|---|---|---|---|---|---|
| Claude Opus 4.7 | Yes | Yes | Yes | No | No | No | Yes | Yes |
| Claude Sonnet 4.6 | Yes | Yes | Yes | No | No | No | Yes | Yes |
| Claude Haiku 4.5 | Yes | No | Yes | No | No | No | Yes | Yes |
| DeepSeek V3 | Yes | No | Yes | Yes | Yes | Yes | Dedicated | Yes |
| DeepSeek R1 | Yes | No | Yes | Yes | Yes | Yes | Dedicated | No |
| Llama 3.3 70B | Yes | No | Yes | Yes | Yes | Yes | Dedicated | Yes |
| Llama 4 Scout | Yes | Yes | Yes | LoRA | Yes | Yes | Dedicated | Yes |
| Qwen 2.5 72B | Yes | No | Yes | Yes | Yes | Yes | Dedicated | Yes |
| Qwen3 235B | Yes | No | Yes | LoRA | Yes | Yes | Dedicated | Yes |
| GPT-OSS 120B | Yes | No | Yes | No | Yes | Yes | Dedicated | Yes |
| Vedika Standard | Yes | No | Yes | No | No | No | No | No |
| Vedika Fast | Yes | No | Yes | No | No | No | No | No |
Yes = fully supported on serverless. Dedicated = available only on dedicated endpoints. LoRA = LoRA adapter fine-tuning only (not full-parameter). No = not available for this model. Use GET /v1/models for the latest capabilities.
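To pick a model programmatically, the matrix can be encoded as data and filtered by required features. The snippet below is a static snapshot of the table above, so it will drift as capabilities change; `models_with` is a hypothetical helper, and treating "LoRA" as supported is a simplification.

```python
COLUMNS = ["chat", "vision", "streaming", "fine_tune",
           "batch", "dedicated", "caching", "functions"]

# One row per model, values in COLUMNS order, copied from the matrix.
MATRIX = {
    "Claude Opus 4.7":   "Yes Yes Yes No No No Yes Yes",
    "Claude Sonnet 4.6": "Yes Yes Yes No No No Yes Yes",
    "Claude Haiku 4.5":  "Yes No Yes No No No Yes Yes",
    "DeepSeek V3":       "Yes No Yes Yes Yes Yes Dedicated Yes",
    "DeepSeek R1":       "Yes No Yes Yes Yes Yes Dedicated No",
    "Llama 3.3 70B":     "Yes No Yes Yes Yes Yes Dedicated Yes",
    "Llama 4 Scout":     "Yes Yes Yes LoRA Yes Yes Dedicated Yes",
    "Qwen 2.5 72B":      "Yes No Yes Yes Yes Yes Dedicated Yes",
    "Qwen3 235B":        "Yes No Yes LoRA Yes Yes Dedicated Yes",
    "GPT-OSS 120B":      "Yes No Yes No Yes Yes Dedicated Yes",
    "Vedika Standard":   "Yes No Yes No No No No No",
    "Vedika Fast":       "Yes No Yes No No No No No",
}

def models_with(*features: str, serverless_only: bool = False) -> list[str]:
    """Return models supporting every requested feature, in table order."""
    accepted = {"Yes", "LoRA"}
    if not serverless_only:
        accepted = accepted | {"Dedicated"}
    matches = []
    for model, row in MATRIX.items():
        caps = dict(zip(COLUMNS, row.split()))
        if all(caps[f] in accepted for f in features):
            matches.append(model)
    return matches
```

For live data, the same filter could be applied to the response of GET /v1/models, whose response shape is not shown on this page.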
SDKs
Official client libraries for Python and JavaScript, plus an MCP server for AI coding assistants.
Python
pip install xalen
from xalen import Xalen
# Uses XALEN_API_KEY env var by default
client = Xalen()
# Or pass explicitly
client = Xalen(api_key="xln_live_YOUR_KEY")
# All OpenAI-compatible methods work
response = client.chat.completions.create(
model="vedika-standard",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript / TypeScript
npm install xalen-sdk
import Xalen from "xalen-sdk";
const client = new Xalen({ apiKey: process.env.XALEN_API_KEY });
const response = await client.chat.completions.create({
model: "vedika-standard",
messages: [{ role: "user", content: "Hello" }],
});
MCP Server
Connect XALEN to AI coding assistants like Claude Code, Cursor, and GitHub Copilot using the Model Context Protocol.
npx xalen-mcp
The MCP server exposes 15 tools covering chat, embeddings, image generation, voice, astrology, and billing. See the npm package for configuration details.
Mobile
The XALEN REST API works from any platform, including React Native, Flutter, Swift, and Kotlin. Since the API is OpenAI-compatible, any existing OpenAI mobile wrapper or HTTP client works out of the box.
// Works today — standard fetch from React Native
const response = await fetch('https://api.xalen.io/v1/chat/completions', {
method: 'POST',
headers: {
'Authorization': 'Bearer xln_live_YOUR_KEY',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'vedika-standard',
messages: [{ role: 'user', content: 'Hello' }],
}),
});
const data = await response.json();
// Flutter SDK (pub.dev: xalen ^0.1.0)
import 'package:xalen/xalen.dart';
final client = XALEN(apiKey: 'xln_live_YOUR_KEY');
final response = await client.chatCompletion(
  model: 'vedika-standard',
  messages: [ChatMessage(role: 'user', content: 'Hello')],
);
print(response.choices.first.message.content);
React hooks: npm install xalen-react xalen-sdk. Includes useChat, useCompletion, useVoice, and useAstrology hooks with XALENProvider context.
Flutter/Dart: add xalen: ^0.1.0 to pubspec.yaml. Typed client with chat completions, astrology endpoints, and error handling.
Offline caching and an Expo module are coming soon.
Errors
XALEN uses standard HTTP status codes. All error responses include a JSON body with error.type and error.message.
| Status | Type | Description |
|---|---|---|
| 400 | invalid_request | Missing or invalid parameters. |
| 401 | authentication_error | Invalid or missing API key. |
| 403 | permission_denied | Key lacks permission for this endpoint. |
| 404 | not_found | Endpoint or resource not found. |
| 429 | rate_limit_exceeded | Too many requests. Retry after the interval given in the Retry-After header. |
| 500 | server_error | Internal error. Contact support if persistent. |
| 503 | service_unavailable | Temporary overload. Retry with exponential backoff. |
Error Response Format
{
"error": {
"type": "authentication_error",
"message": "Invalid API key. Keys must start with xln_live_.",
"status": 401
}
}
For 429 and 503 errors, use exponential backoff starting at 1 second. The SDKs handle this automatically with up to 3 retries.
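When calling the API without an SDK, the error body and status code can be mapped to a retry decision. The following is a minimal sketch based only on the error format and the 429/503 guidance above; `ApiErrorInfo`, `parse_error`, and `backoff_delays` are hypothetical names, not SDK classes.

```python
import json
from dataclasses import dataclass

# Statuses this page says to retry with exponential backoff.
RETRYABLE_STATUSES = {429, 503}

@dataclass
class ApiErrorInfo:
    status: int
    type: str
    message: str

    @property
    def retryable(self) -> bool:
        return self.status in RETRYABLE_STATUSES

def parse_error(status: int, body: str) -> ApiErrorInfo:
    """Extract error.type and error.message from a JSON error body."""
    err = json.loads(body).get("error", {})
    return ApiErrorInfo(status=status,
                        type=err.get("type", "unknown"),
                        message=err.get("message", ""))

def backoff_delays(retries: int = 3, base: float = 1.0) -> list[float]:
    """Delays in seconds: starts at `base` and doubles on each retry."""
    return [base * 2 ** i for i in range(retries)]
```

With the defaults, `backoff_delays()` yields waits of 1, 2, and 4 seconds, matching a 1-second start and three retries.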
Production Retry Examples
For production workloads, implement retry logic with exponential backoff and proper error classification. These examples handle rate limits, transient server errors, and hard failures differently.
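import time
import xalen
# Same client class and key format as the other examples on this page
client = xalen.Xalen(api_key="xln_live_YOUR_KEY")
def query_with_retry(prompt, max_retries=3):
    """Retry 429 and 5xx errors with exponential backoff; re-raise other errors."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="vedika-pro-ultra",
                messages=[{"role": "user", "content": prompt}]
            )
        except xalen.RateLimitError:
            # Rate limited: wait 1s, 2s, 4s, ... capped at 60s
            time.sleep(min(2 ** attempt, 60))
        except xalen.APIError as e:
            if e.status_code >= 500:
                # Transient server error: same capped backoff, then retry
                time.sleep(min(2 ** attempt, 60))
                continue
            # Any other client error is a hard failure
            raise
    raise Exception("Max retries exceeded")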
import Xalen from 'xalen-sdk';
const client = new Xalen({ apiKey: 'xln_live_YOUR_KEY' });
async function queryWithRetry(prompt, maxRetries = 3) {
for (let attempt = 0; attempt < maxRetries; attempt++) {
try {
return await client.chat.completions.create({
model: 'vedika-pro-ultra',
messages: [{ role: 'user', content: prompt }]
});
} catch (err) {
if (err.status === 429 || err.status >= 500) {
const wait = Math.min(Math.pow(2, attempt) * 1000, 60000);
await new Promise(r => setTimeout(r, wait));
continue;
}
throw err;
}
}
throw new Error('Max retries exceeded');
}
Rate Limits
Rate limits are applied per API key and vary by plan. Limits are returned in response headers.
| Plan | RPM | TPM |
|---|---|---|
| Pay-as-you-go | 60 RPM | 100K TPM |
| Growth | 300 RPM | 500K TPM |
| Scale | 1,000 RPM | 2M TPM |
| Dedicated | 5,000 RPM | 10M TPM |
| Enterprise | Custom | Custom |
Rate Limit Headers
| Header | Description |
|---|---|
| x-ratelimit-limit-requests | Maximum requests per minute for your plan. |
| x-ratelimit-remaining-requests | Remaining requests in the current window. |
| x-ratelimit-reset-requests | Seconds until the request limit resets. |
| x-ratelimit-limit-tokens | Maximum tokens per minute for your plan. |
| x-ratelimit-remaining-tokens | Remaining tokens in the current window. |
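A client can read these headers to pause before it hits a 429. This sketch assumes the header values are plain integers (and seconds for the reset header), as the descriptions above suggest; `seconds_to_wait` is a hypothetical helper.

```python
def seconds_to_wait(headers: dict) -> float:
    """Return how long to pause before the next request, based on rate limit headers."""
    # Header names are case-insensitive, so normalise them first.
    h = {k.lower(): v for k, v in headers.items()}
    remaining = int(h.get("x-ratelimit-remaining-requests", 1))
    reset = float(h.get("x-ratelimit-reset-requests", 0))
    # Only wait when the request budget for the current window is used up.
    return reset if remaining <= 0 else 0.0
```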
API Versioning & Deprecation
XALEN uses URL-path versioning to guarantee backward compatibility. The current stable version is v1, and all endpoints are prefixed with /v1/.
Versioning Scheme
| Aspect | Policy |
|---|---|
| Current version | v1 (stable) |
| Version in URL | Major version in path: /v1/, /v2/ |
| Backward compatibility | v1 will be supported for a minimum of 24 months after v2 launches |
| Breaking changes | No breaking changes to existing endpoints without a version bump |
Deprecation Notice Periods
When an API version or model is deprecated, you will receive advance notice based on your plan tier.
| Plan | API Deprecation | Model Deprecation |
|---|---|---|
| Enterprise | 90 days | 60 days + migration guide |
| Scale | 60 days | 60 days + migration guide |
| Growth | 30 days | 60 days + migration guide |
| Pay-as-you-go | 30 days | 60 days + migration guide |
Sunset Headers
Deprecated endpoints include a Sunset header indicating the final date of availability, plus a Deprecation header with the date the deprecation was announced.
Sunset: Mon, 01 Nov 2027 00:00:00 GMT
Deprecation: Sun, 01 Aug 2027 00:00:00 GMT
Link: <https://xalen.io/docs#versioning>; rel="sunset"
When a model is deprecated, all notices include an equivalent model recommendation so you can migrate with minimal code changes. The deprecated model continues to function for the full 60-day notice period.
Data Portability & Export
XALEN is built on open standards. Your data, your code, your choice of provider. No lock-in, no proprietary formats, no exit fees.
Open Standards
| Aspect | Details |
|---|---|
| Response format | All API responses use standard JSON. No proprietary serialization or binary formats. |
| OpenAI compatibility | Drop-in compatible with any OpenAI-compatible provider. Migrate to or from XALEN by changing the base URL and API key. |
| Query language | Standard REST API. No proprietary query language or DSL to learn. |
| SDKs | Open-source Python and JavaScript SDKs. Portable, no vendor-specific runtime dependencies. |
Data Export
Export all account data as a single JSON archive. Includes usage history, billing records, API key metadata, and account settings.
{
"account": { "email": "[email protected]", "plan": "growth", "created": "2026-01-15" },
"usage": [ { "date": "2026-05-14", "requests": 1420, "tokens": 892300, "cost_cents": 178 } ],
"billing": [ { "id": "pay_abc123", "amount": 5000, "date": "2026-05-01", "status": "completed" } ],
"api_keys": [ { "id": "key_1", "prefix": "xln_live_a3f...", "created": "2026-01-15", "last_used": "2026-05-14" } ]
}
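Because the archive is plain JSON, it can be processed with any standard tooling. As an illustration, the sketch below totals the `usage` entries shown in the sample; `summarize_export` is a hypothetical helper, and it assumes `cost_cents` is an integer count of cents.

```python
import json

def summarize_export(archive_json: str) -> dict:
    """Total requests, tokens, and cost across all usage entries in an export."""
    archive = json.loads(archive_json)
    usage = archive.get("usage", [])
    return {
        "requests": sum(day["requests"] for day in usage),
        "tokens": sum(day["tokens"] for day in usage),
        # Convert cents to whole currency units.
        "cost": sum(day["cost_cents"] for day in usage) / 100,
    }
```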
Code Ownership
Code generated through XALEN Studio is 100% customer-owned. Download your projects as a ZIP archive at any time. No licensing restrictions, no attribution requirements, no runtime dependency on XALEN infrastructure.
XALEN supports data portability rights under GDPR Article 20 and India's DPDPA. Submit export requests via the dashboard or email [email protected] for assisted exports.