Updated May 2026

All models. One API.

Text, coding, image, video, audio, embedding & rerank — one OpenAI-compatible API.

74 models

DeepSeek V4 Pro

NEWText

DeepSeek's next-generation flagship. 1M context, native reasoning mode, strong agentic coding and STEM.

Context Window
1M
Max Output
384K
Latency
Medium
Input price
$1.69 / 1M
Output price
$3.38 / 1M

DeepSeek V3.2

Text

Frontier-tier general intelligence with strong reasoning, coding, and multilingual coverage.

Context Window
128K
Max Output
32K
Latency
Fast
Input price
$0.270 / 1M
Output price
$0.400 / 1M

DeepSeek V4 Flash

NEWText

DeepSeek V4's fast variant. Same 1M context and reasoning mode, optimised for high concurrency and low latency.

Context Window
1M
Max Output
384K
Latency
Fast
Input price
$0.140 / 1M
Output price
$0.280 / 1M

Claude Sonnet 4.6

TextArabic

Anthropic's workhorse coding and chat model. Best Arabic among Western flagships, strong tool use, 200K context.

Context Window
200K
Max Output
64K
Latency
Medium
Input price
$3.00 / 1M
Output price
$15.00 / 1M

Qwen3.5-397B-A17B

TextArabic

Alibaba's flagship MoE — 397B total / 17B active. Frontier general intelligence with the best Arabic in the catalog.

Context Window
256K
Max Output
32K
Latency
Medium
Input price
$0.600 / 1M
Output price
$3.60 / 1M

GPT-5.4

TextArabic

OpenAI's flagship. Frontier reasoning, vision input, 400K context. Tier 1 pricing up to 272K.

Context Window
400K
Max Output
16K
Latency
Medium
Input price
$2.50 / 1M
Output price
$15.00 / 1M

Claude Opus 4.7

TextArabic

Anthropic's frontier model. Highest capability for complex reasoning, long-form, and agentic tasks.

Context Window
200K
Max Output
32K
Latency
Medium
Input price
$5.00 / 1M
Output price
$25.00 / 1M

Grok 4

Text

xAI's Grok 4. Real-time-aware reasoning model with strong coding and math.

Context Window
128K
Max Output
8K
Latency
Medium
Input price
$3.00 / 1M
Output price
$15.00 / 1M

Llama 3.3 70B Instruct

Text

Meta's Llama 3.3 70B Instruct. Open-weight foundation, widely fine-tuned, reliable baseline.

Context Window
128K
Max Output
8K
Latency
Medium
Input price
$0.130 / 1M
Output price
$0.390 / 1M

DeepSeek R1 0528

Text

Open chain-of-thought reasoning. Competitive on math, coding, and logic benchmarks against frontier reasoning models.

Context Window
128K
Max Output
8K
Latency
Slow
Input price
$0.700 / 1M
Output price
$2.50 / 1M

Kimi K2.5

Text

Kimi K2.5 — Moonshot AI flagship. 200K context, strong agentic and tool-use performance.

Context Window
200K
Max Output
16K
Latency
Medium
Input price
$0.600 / 1M
Output price
$3.00 / 1M

DeepSeek V3.1 Terminus

Text

V3.1 with the Terminus refresh — improved coding, longer effective context, lower hallucination rate.

Context Window
128K
Max Output
8K
Latency
Fast
Input price
$0.270 / 1M
Output price
$1.00 / 1M

Gemma 4 31B IT

TextArabic

Google text generation model. Available via THALAM.

Context Window
128K
Max Output
8K
Latency
Fast
Input price
$0.140 / 1M
Output price
$0.400 / 1M

GLM 4.6V

Text

Zhipu's flagship vision-language model. Strong document, OCR, and chart understanding.

Context Window
128K
Max Output
8K
Latency
Fast
Input price
$0.300 / 1M
Output price
$0.900 / 1M

GLM-5

Text

GLM-5 — the prior generation. Cheaper than 5.1, still very capable for general use.

Context Window
128K
Max Output
8K
Latency
Medium
Input price
$1.00 / 1M
Output price
$3.20 / 1M

GLM-5.1

Text

Zhipu's flagship GLM 5.1 — long context, strong multilingual coverage, reasonable pricing.

Context Window
128K
Max Output
8K
Latency
Medium
Input price
$1.40 / 1M
Output price
$4.40 / 1M

GPT-5 mini

TextArabic

GPT-5 mini — smaller, faster sibling. Capable and well-suited to high-volume production tasks.

Context Window
400K
Max Output
16K
Latency
Medium
Input price
$0.250 / 1M
Output price
$2.00 / 1M

GPT-OSS 120B

TextArabic

OpenAI's open-weight 120B coding model. Apache-licensed, fully self-hostable.

Context Window
128K
Max Output
16K
Latency
Fast
Input price
$0.100 / 1M
Output price
$0.500 / 1M

Kimi K2 Thinking

Text

MoonshotAI text generation model. Available via THALAM.

Context Window
128K
Max Output
8K
Latency
Slow
Input price
$0.600 / 1M
Output price
$2.50 / 1M

Llama 4 Maverick

Text

Llama 4 Maverick — Meta MoE flagship. Open-weight, 128 experts, FP8 quantized for speed.

Context Window
128K
Max Output
8K
Latency
Fast
Input price
$0.270 / 1M
Output price
$0.850 / 1M

Llama 4 Scout

Text

Llama 4 Scout — efficient sibling. 16 experts, lower cost, still very capable.

Context Window
128K
Max Output
8K
Latency
Fast
Input price
$0.180 / 1M
Output price
$0.590 / 1M

MiniMax M2.5

Text

M2.5 — the slightly older MiniMax. Same pricing as M2.7, kept for backward-compat.

Context Window
128K
Max Output
8K
Latency
Medium
Input price
$0.300 / 1M
Output price
$1.20 / 1M

MiniMax M2.7

Text

MiniMax M2.7 — fast multilingual chat. Strong Asian languages, decent English, cheap.

Context Window
128K
Max Output
8K
Latency
Medium
Input price
$0.300 / 1M
Output price
$1.20 / 1M

Mistral Nemo

Text

Cheapest model in the catalog. Mistral-Nemo for budget pipelines.

Context Window
128K
Max Output
8K
Latency
Fast
Input price
$0.040 / 1M
Output price
$0.170 / 1M

Qwen3 Max

TextArabic

Qwen3 Max — long context, strong reasoning, tiered pricing for high-volume use.

Context Window
256K
Max Output
16K
Latency
Medium
Input price
Output price

Qwen3.5-122B-A10B

TextArabic

Mid-tier Qwen3.5 MoE. The price/performance sweet spot for production workloads.

Context Window
256K
Max Output
16K
Latency
Medium
Input price
$0.400 / 1M
Output price
$3.20 / 1M

Qwen3 Coder 30B A3B

CodingArabic

IDE-tier coding model. Cheap, fast, fits inline-completion workloads.

Context Window
128K
Max Output
16K
Latency
Fast
Input price
$0.070 / 1M
Output price
$0.270 / 1M

Qwen3 Coder 480B A35B

CodingArabic

Qwen3 Coder 480B — Alibaba's flagship coding MoE. 256K context, 100+ languages, fill-in-the-middle.

Context Window
256K
Max Output
32K
Latency
Fast
Input price
$0.300 / 1M
Output price
$1.30 / 1M

Qwen3 Coder Next

CodingArabic

Alibaba coding model. Available via THALAM.

Context Window
128K
Max Output
8K
Latency
Fast
Input price
$0.200 / 1M
Output price
$1.50 / 1M

FLUX 2 Pro

Image

FLUX 2 Pro — frontier text-to-image. State-of-the-art aesthetics, prompt fidelity, fine detail.

Context Window
Max Output
Latency
Fast
Input price
$0.030 / image
Output price

FLUX.1 Kontext Max

Image

FLUX.1 Kontext Max — image edit and conditional generation with text prompts.

Context Window
Max Output
Latency
Fast
Input price
$0.072 / image
Output price

GLM Image

Image

GLM Image — Zhipu’s general image model. Solid quality at low cost.

Context Window
Max Output
Latency
Fast
Input price
$0.014 / image
Output price

Hunyuan Image 3

Image

Tencent's Hunyuan Image 3 — strong photorealism, multi-aspect ratio support.

Context Window
Max Output
Latency
Fast
Input price
$0.100 / image
Output price

Seedream 4.0

Image

Seedream 4.0 — best image deal in the catalog.

Context Window
Max Output
Latency
Fast
Input price
$0.030 / image
Output price

Seedream 4.5

Image

Seedream 4.5 — refresh of 4.0, same anchor pricing, slightly better fidelity.

Context Window
Max Output
Latency
Fast
Input price
$0.030 / image
Output price

Seedream 5.0 Lite

Image

Seedream 5.0 Lite — newest generation, smaller checkpoint, fast.

Context Window
Max Output
Latency
Fast
Input price
$0.035 / image
Output price

Z Image Turbo

Image

Z Image Turbo — cheapest image model, $0.005/image price anchor.

Context Window
Max Output
Latency
Fast
Input price
$0.0050 / image
Output price

Hunyuan Video Fast

Video

Hunyuan Video Fast — Tencent’s speed-optimised T2V at $0.30/video.

Context Window
Max Output
Latency
Slow
Input price
$0.300 / video
Output price

Kling V2.6 Pro Motion

Video

Kuaishou video generation model.

Context Window
Max Output
Latency
Slow
Input price
$0.350 / video
Output price

Kling V3.0 Pro I2V

Video

Kling V3.0 Pro — image-to-video. Animate stills with the same engine as T2V.

Context Window
Max Output
Latency
Slow
Input price
$1.120 / video
Output price

Kling V3.0 Pro T2V

Video

Kling V3.0 Pro — Kuaishou's flagship text-to-video. State-of-the-art motion fidelity.

Context Window
Max Output
Latency
Slow
Input price
$1.120 / video
Output price

Kling-o1 Edit Video

Video

Kuaishou video generation model.

Context Window
Max Output
Latency
Slow
Input price
Output price

MiniMax Hailuo 02

Video

MiniMax Hailuo 02 — fast, affordable text-to-video.

Context Window
Max Output
Latency
Slow
Input price
Output price

PixVerse V4.5 T2V

Video

PixVerse video generation model.

Context Window
Max Output
Latency
Slow
Input price
$0.350 / video
Output price

Seedance 1.5 Pro I2V

Video

ByteDance video generation model.

Context Window
Max Output
Latency
Slow
Input price
$0.270 / video
Output price

Seedance 1.5 Pro T2V

Video

ByteDance video generation model.

Context Window
Max Output
Latency
Slow
Input price
$0.270 / video
Output price

Vidu Q3 Pro T2V

Video

Shengshu video generation model.

Context Window
Max Output
Latency
Slow
Input price
$0.670 / video
Output price

Wan 2.5 T2V Preview

VideoArabic

Wan 2.5 T2V Preview — best video deal in the catalog.

Context Window
Max Output
Latency
Slow
Input price
$0.500 / video
Output price

Wan 2.6 T2V

VideoArabic

Wan 2.6 T2V — Alibaba’s text-to-video. Strong on cinematic shots and long takes.

Context Window
Max Output
Latency
Slow
Input price
$0.500 / video
Output price

ElevenLabs v3

Audio

ElevenLabs v3 — gold-standard voice synthesis.

Context Window
Max Output
Latency
Fast
Input price
$0.120 / min
Output price

Fish Audio TTS

Audio

Fish Audio TTS — multilingual voice synthesis with cloning.

Context Window
Max Output
Latency
Fast
Input price
$15.00 / 1M chars
Output price

MiniMax 2.8 HD Async

Audio

MiniMax audio model (TTS / STT).

Context Window
Max Output
Latency
Fast
Input price
$100.00 / 1M chars
Output price

MiniMax 2.8 Turbo

Audio

MiniMax 2.8 Turbo — latency-optimized variant for real-time TTS use cases.

Context Window
Max Output
Latency
Fast
Input price
$60.00 / 1M chars
Output price

MiniMax Speech 2.8 HD

Audio

MiniMax Speech 2.8 HD — top-tier multilingual TTS with natural prosody.

Context Window
Max Output
Latency
Fast
Input price
$100.00 / 1M chars
Output price

Llama 3.1 8B Instruct

NEWText

Meta's compact 8B Instruct — strong baseline for high-throughput backend chat and agent loops at near-zero cost.

Context Window
128K
Max Output
8K
Latency
Fast
Input price
$0.020 / 1M
Output price
$0.050 / 1M

Qwen MT Plus

NEWTextArabic

Alibaba's dedicated machine-translation model. Tuned for accuracy on Arabic↔English and 90+ language pairs.

Context Window
16K
Max Output
16K
Latency
Fast
Input price
$0.250 / 1M
Output price
$0.750 / 1M

Qwen3 VL 235B A22B Instruct

NEWTextArabic

Flagship vision-language model from Alibaba. 235B MoE with active-22B routing, 131K context, strong image understanding.

Context Window
131K
Max Output
32K
Latency
Medium
Input price
$0.300 / 1M
Output price
$1.500 / 1M

Qwen2.5 VL 72B Instruct

NEWTextArabic

Mature 72B dense vision-language model. Baseline VLM tier — different price-quality point from the 235B flagship.

Context Window
32K
Max Output
8K
Latency
Medium
Input price
$0.800 / 1M
Output price
$0.800 / 1M

BAAI BGE-M3

NEWEmbeddingArabic

Industry-standard multilingual embedding. Dense, sparse, and multi-vector retrieval in one model. The RAG default.

Context Window
8K
Max Output
Latency
Fast
Input price
$0.010 / 1M
Output price

Qwen3 Embedding 0.6B

NEWEmbeddingArabic

Tiny 0.6B Qwen3 embedding — designed for high-throughput RAG with a 32K window. Arabic-strong alternative to BGE.

Context Window
32K
Max Output
Latency
Fast
Input price
$0.070 / 1M
Output price

BGE Reranker v2-M3

NEWRerankingArabic

Industry-standard multilingual reranker. Drop in as the second stage of any RAG pipeline behind a vector recall step.

Context Window
8K
Max Output
Latency
Fast
Input price
$0.010 / 1M
Output price

Qwen Image Edit

NEWImage

Prompt-driven image editor from Alibaba. Single-pass edits like 'add sunglasses', 'change background to beach', 'remove the person on the left'.

Context Window
Max Output
Latency
Medium
Input price
$0.020 / image
Output price

FLUX.1 Kontext Pro

NEWImage

Mid-tier FLUX.1 Kontext — image edit and text-guided generation. Slots between Schnell (fast/cheap) and Kontext Max (premium).

Context Window
Max Output
Latency
Medium
Input price
$0.036 / image
Output price

Kling V3.0 4K T2V

NEWVideo

Kling 3.0 in 4K — text-to-video at premium resolution. Different SKU from the standard-resolution V3.0 Pro.

Context Window
Max Output
Latency
Slow
Input price
$0.420 / sec
Output price

Kling V3.0 4K I2V

NEWVideo

Kling 3.0 in 4K — image-to-video. Animate a still in premium resolution for hero / campaign output.

Context Window
Max Output
Latency
Slow
Input price
$0.420 / sec
Output price

MiniMax Hailuo 2.3 T2V

NEWVideo

Current-generation MiniMax Hailuo — text-to-video. Sharper motion and longer coherence than the Hailuo 02 predecessor.

Context Window
Max Output
Latency
Slow
Input price
$0.490 / video
Output price

MiniMax Hailuo 2.3 I2V

NEWVideo

Image-to-video variant of Hailuo 2.3. MiniMax's first I2V model — animate stills with prompt-controlled motion.

Context Window
Max Output
Latency
Slow
Input price
$0.490 / video
Output price

Wan 2.7 T2V

NEWVideoArabic

Latest-gen Wan — text-to-video. Per-second pricing at $0.10/s makes it the cost anchor for long-form video.

Context Window
Max Output
Latency
Slow
Input price
$0.100 / sec
Output price

Wan 2.7 I2V

NEWVideoArabic

Latest-gen Wan — image-to-video. Same $0.10/s pricing as T2V — the budget default for animating stills.

Context Window
Max Output
Latency
Slow
Input price
$0.100 / sec
Output price

GLM ASR

NEWAudio

Zhipu's speech-to-text — multilingual ASR for transcription, meeting capture, and voice-input flows.

Context Window
Max Output
Latency
Fast
Input price
$0.021 / 1M chars
Output price

MiniMax Voice Cloning

NEWAudio

Clone any voice from a 30-second reference sample. The premium tier for voice replication — MiniMax leads the quality bar in this category.

Context Window
Max Output
Latency
Medium
Input price
$2.400 / voice
Output price

MiniMax Voice Design

NEWAudio

Design a custom voice from a text description — 'middle-aged Arabic male, warm tone, slight raspy edge'. Companion to Voice Cloning.

Context Window
Max Output
Latency
Medium
Input price
$3.000 / voice
Output price

Fish Audio Voice Clone

NEWAudio

Cheap voice cloning at $0.10/voice — 24× cheaper than MiniMax. Trade quality for cost; use for higher-volume cases.

Context Window
Max Output
Latency
Medium
Input price
$0.100 / voice
Output price

MiniMax Music

NEWMusic

Text-to-music generation. Describe the song you want — genre, mood, instrumentation, lyrics — and the model composes and renders a track.

Context Window
Max Output
Latency
Medium
Input price
$0.150 / song
Output price