Weights & Biases — kaged models

id: wandb

npm: @ai-sdk/openai-compatible

env: WANDB_API_KEY

api: https://api.inference.wandb.ai/v1

doc: https://docs.wandb.ai/guides/integrations/inference/

Models

DeepSeek V3.1

deepseek-ai/DeepSeek-V3.1

in $0.55/M

out $1.65/M

ctx: 161,000 max out: 161,000 in: text out: text

reasoning tools vision structured temp open weights

GLM 5

zai-org/GLM-5-FP8

in $1.00/M

out $3.20/M

ctx: 200,000 max out: 200,000 in: text out: text

reasoning tools vision structured temp open weights

GLM-5.1

zai-org/GLM-5.1

in $1.40/M

out $4.40/M

cache read $0.26/M

cache write $0.00/M

ctx: 200,000 max out: 131,072 in: text out: text

reasoning tools vision structured temp open weights

gpt-oss-120b

openai/gpt-oss-120b

in $0.15/M

out $0.60/M

ctx: 131,072 max out: 131,072 in: text out: text

reasoning tools vision structured temp open weights

gpt-oss-20b

openai/gpt-oss-20b

in $0.05/M

out $0.20/M

ctx: 131,072 max out: 131,072 in: text out: text

reasoning tools vision structured temp open weights

Kimi K2.5

moonshotai/Kimi-K2.5

in $0.50/M

out $2.85/M

ctx: 262,144 max out: 262,144 in: text, image out: text

reasoning tools vision structured temp open weights

Llama 3.1 70B

meta-llama/Llama-3.1-70B-Instruct

in $0.80/M

out $0.80/M

ctx: 128,000 max out: 128,000 in: text out: text

reasoning tools vision structured temp open weights

Llama 4 Scout 17B 16E Instruct

meta-llama/Llama-4-Scout-17B-16E-Instruct

in $0.17/M

out $0.66/M

ctx: 64,000 max out: 64,000 in: text, image out: text

reasoning tools vision structured temp open weights

Llama-3.3-70B-Instruct

meta-llama/Llama-3.3-70B-Instruct

in $0.71/M

out $0.71/M

ctx: 128,000 max out: 128,000 in: text out: text

reasoning tools vision structured temp open weights

Meta-Llama-3.1-8B-Instruct

meta-llama/Llama-3.1-8B-Instruct

in $0.22/M

out $0.22/M

ctx: 128,000 max out: 128,000 in: text out: text

reasoning tools vision structured temp open weights

MiniMax M2.5

MiniMaxAI/MiniMax-M2.5

in $0.30/M

out $1.20/M

ctx: 196,608 max out: 196,608 in: text out: text

reasoning tools vision structured temp open weights

NVIDIA Nemotron 3 Super 120B

nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8

in $0.20/M

out $0.80/M

ctx: 262,144 max out: 262,144 in: text out: text

reasoning tools vision structured temp open weights

OpenPipe Qwen3 14B Instruct

OpenPipe/Qwen3-14B-Instruct

in $0.05/M

out $0.22/M

ctx: 32,768 max out: 32,768 in: text out: text

reasoning tools vision structured temp open weights

Phi-4-mini-instruct

microsoft/Phi-4-mini-instruct

in $0.08/M

out $0.35/M

ctx: 128,000 max out: 128,000 in: text out: text

reasoning tools vision structured temp open weights

Qwen3 235B A22B Instruct 2507

Qwen/Qwen3-235B-A22B-Instruct-2507

in $0.10/M

out $0.10/M

ctx: 262,144 max out: 262,144 in: text out: text

reasoning tools vision structured temp open weights

Qwen3 30B A3B Instruct 2507

Qwen/Qwen3-30B-A3B-Instruct-2507

in $0.10/M

out $0.30/M

ctx: 262,144 max out: 262,144 in: text out: text

reasoning tools vision structured temp open weights

Qwen3-235B-A22B-Thinking-2507

Qwen/Qwen3-235B-A22B-Thinking-2507

in $0.10/M

out $0.10/M

ctx: 262,144 max out: 262,144 in: text out: text

reasoning tools vision structured temp open weights

Qwen3-Coder-480B-A35B-Instruct

Qwen/Qwen3-Coder-480B-A35B-Instruct

in $1.00/M

out $1.50/M

ctx: 262,144 max out: 262,144 in: text out: text

reasoning tools vision structured temp open weights