Compact · backend
Use Llama 3.1 8B Turbo with FastAPI
Updated 2026-05-21 · By XALEN
How to use Llama 3.1 8B Turbo (Compact, 8B) with FastAPI. Install, authenticate, and make your first API call in minutes. Working code example included.
Model
Llama 3.1 8B Turbo
8B · 128K context · $0.01 input
Framework
FastAPI
backend · pip install xalen fastapi uvicorn
1. Install
pip install xalen fastapi uvicorn
2. Code
from fastapi import FastAPI
from pydantic import BaseModel
from xalen import XALEN
app = FastAPI()
client = XALEN(api_key="xln_test_YOUR_KEY")
class ChatRequest(BaseModel):
messages: list
@app.post("/chat")
async def chat(req: ChatRequest):
response = client.chat.completions.create(
model="llama-3-1-8b-turbo",
messages=[m.dict() for m in req.messages]
)
return {"reply": response.choices[0].message.content}
Llama 3.1 8B Turbo with Other Frameworks
Other Models with FastAPI
200+ models. One API. Works with any framework.
Get API KeyLast updated: 2026-05-21