Deep Infra

[PROVIDER]
id: deepinfra
npm: @ai-sdk/deepinfra
env: DEEPINFRA_API_KEY
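
The fields above give the provider id, the npm package, and the environment variable that holds the API key. As a rough sketch of how the key and a model ID from this listing fit into a request, the helper below builds (but does not send) a chat request. It does not use `@ai-sdk/deepinfra`. It assumes that Deep Infra exposes an OpenAI-compatible chat endpoint at the URL shown, which this listing does not state. The names `buildChatRequest` and `ChatRequest` are hypothetical.

```typescript
// ASSUMPTION: OpenAI-compatible endpoint; the URL is not part of this listing.
const DEEPINFRA_CHAT_URL = "https://api.deepinfra.com/v1/openai/chat/completions";

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: assembles the URL, headers, and JSON body for one user prompt.
// The caller passes the value of DEEPINFRA_API_KEY; a missing key fails early.
function buildChatRequest(
  modelId: string,
  prompt: string,
  apiKey: string | undefined,
): ChatRequest {
  if (!apiKey) {
    throw new Error("DEEPINFRA_API_KEY is not set");
  }
  return {
    url: DEEPINFRA_CHAT_URL,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model: modelId, // a model ID from the listing, e.g. "deepseek-ai/DeepSeek-V3.2"
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

In Node, a caller would pass `process.env.DEEPINFRA_API_KEY` as the third argument and hand the result to `fetch(req.url, req.init)`.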

Models

DeepSeek V4 Flash

deepseek-ai/DeepSeek-V4-Flash
in $0.10/M
out $0.20/M
cache read $0.02/M
ctx: 1,048,576; max out: 16,384; in: text; out: text
reasoning tools vision structured temp open weights

DeepSeek V4 Pro

deepseek-ai/DeepSeek-V4-Pro
in $1.30/M
out $2.60/M
cache read $0.10/M
ctx: 1,048,576; max out: 16,384; in: text; out: text
reasoning tools vision structured temp open weights

DeepSeek-R1-0528

deepseek-ai/DeepSeek-R1-0528
in $0.50/M
out $2.15/M
cache read $0.35/M
ctx: 163,840; max out: 64,000; in: text; out: text
reasoning tools vision structured temp open weights

DeepSeek-V3.2

deepseek-ai/DeepSeek-V3.2
in $0.26/M
out $0.38/M
cache read $0.13/M
ctx: 163,840; max out: 64,000; in: text; out: text
reasoning tools vision structured temp open weights

Gemma 4 26B A4B IT

google/gemma-4-26B-A4B-it
in $0.07/M
out $0.34/M
ctx: 262,144; max out: 32,768; in: text, image; out: text
reasoning tools vision structured temp open weights

Gemma 4 31B IT

google/gemma-4-31B-it
in $0.13/M
out $0.38/M
ctx: 262,144; max out: 32,768; in: text, image; out: text
reasoning tools vision structured temp open weights

GLM-4.6

zai-org/GLM-4.6
in $0.43/M
out $1.74/M
cache read $0.08/M
ctx: 202,752; max out: 131,072; in: text; out: text
reasoning tools vision structured temp open weights

GLM-4.7

zai-org/GLM-4.7
in $0.40/M
out $1.75/M
cache read $0.08/M
ctx: 202,752; max out: 16,384; in: text; out: text
reasoning tools vision structured temp open weights

GLM-4.7-Flash

zai-org/GLM-4.7-Flash
in $0.06/M
out $0.40/M
ctx: 202,752; max out: 16,384; in: text; out: text
reasoning tools vision structured temp open weights

GLM-5

zai-org/GLM-5
in $0.60/M
out $2.08/M
cache read $0.12/M
ctx: 202,752; max out: 16,384; in: text; out: text
reasoning tools vision structured temp open weights

GLM-5.1

zai-org/GLM-5.1
in $1.05/M
out $3.50/M
cache read $0.20/M
ctx: 202,752; max out: 16,384; in: text; out: text
reasoning tools vision structured temp open weights

GPT OSS 120B

openai/gpt-oss-120b
in $0.04/M
out $0.19/M
ctx: 131,072; max out: 16,384; in: text; out: text
reasoning tools vision structured temp open weights

GPT OSS 20B

openai/gpt-oss-20b
in $0.03/M
out $0.14/M
ctx: 131,072; max out: 16,384; in: text; out: text
reasoning tools vision structured temp open weights

Kimi K2.5

moonshotai/Kimi-K2.5
in $0.45/M
out $2.25/M
cache read $0.07/M
ctx: 262,144; max out: 32,768; in: text, image, video; out: text
reasoning tools vision structured temp open weights

Kimi K2.6

moonshotai/Kimi-K2.6
in $0.75/M
out $3.50/M
cache read $0.15/M
ctx: 262,144; max out: 16,384; in: text, image, video; out: text
reasoning tools vision structured temp open weights

Llama 3.3 70B Turbo

meta-llama/Llama-3.3-70B-Instruct-Turbo
in $0.10/M
out $0.32/M
ctx: 131,072; max out: 16,384; in: text; out: text
reasoning tools vision structured temp open weights

Llama 4 Maverick 17B FP8

meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
in $0.15/M
out $0.60/M
ctx: 1,048,576; max out: 16,384; in: text, image; out: text
reasoning tools vision structured temp open weights

Llama 4 Scout 17B

meta-llama/Llama-4-Scout-17B-16E-Instruct
in $0.10/M
out $0.30/M
ctx: 327,680; max out: 16,384; in: text, image; out: text
reasoning tools vision structured temp open weights

MiMo-V2.5

XiaomiMiMo/MiMo-V2.5
in $0.40/M
out $2.00/M
cache read $0.08/M
ctx: 262,144; max out: 16,384; in: text, image, audio, video; out: text
reasoning tools vision structured temp open weights

MiMo-V2.5-Pro

XiaomiMiMo/MiMo-V2.5-Pro
in $1.00/M
out $3.00/M
cache read $0.20/M
ctx: 1,048,576; max out: 16,384; in: text; out: text
reasoning tools vision structured temp open weights

MiniMax M2.5

MiniMaxAI/MiniMax-M2.5
in $0.15/M
out $1.15/M
cache read $0.03/M
cache write $0.38/M
ctx: 196,608; max out: 131,072; in: text; out: text
reasoning tools vision structured temp open weights

Qwen 3.5 35B A3B

Qwen/Qwen3.5-35B-A3B
in $0.14/M
out $1.00/M
cache read $0.05/M
ctx: 262,144; max out: 81,920; in: text, image, video; out: text
reasoning tools vision structured temp open weights

Qwen 3.5 397B A17B

Qwen/Qwen3.5-397B-A17B
in $0.45/M
out $3.00/M
cache read $0.22/M
ctx: 262,144; max out: 81,920; in: text, image, video; out: text
reasoning tools vision structured temp open weights

Qwen3 Coder 480B A35B Instruct Turbo

Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo
in $0.30/M
out $1.00/M
ctx: 262,144; max out: 66,536; in: text; out: text
reasoning tools vision structured temp open weights

Qwen3.6 35B A3B

Qwen/Qwen3.6-35B-A3B
in $0.15/M
out $0.95/M
ctx: 262,144; max out: 81,920; in: text, image, video; out: text
reasoning tools vision structured temp open weights
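
Prices in this listing are quoted in USD per million tokens, and each model has a context limit (ctx) and a maximum output length. The sketch below shows one way to turn those figures into a per-request cost estimate and a limit check, using the DeepSeek-V3.2 entry as sample data. The type and function names are hypothetical. Two things are assumptions rather than anything the listing states: that cached prompt tokens are billed at the cache read price instead of the input price, and that the context limit covers prompt and output tokens together.

```typescript
// Prices in USD per million tokens, as quoted in the listing.
interface ModelPricing {
  inputPerM: number;
  outputPerM: number;
  cacheReadPerM?: number; // omitted for models with no cache read price
}

interface ModelLimits {
  context: number;   // "ctx" in the listing
  maxOutput: number; // "max out" in the listing
}

interface Usage {
  inputTokens: number;        // prompt tokens not served from cache
  outputTokens: number;
  cachedInputTokens?: number; // prompt tokens served from cache
}

// Estimated request cost in USD.
function estimateCostUSD(p: ModelPricing, u: Usage): number {
  const cached = u.cachedInputTokens ?? 0;
  // ASSUMPTION: without a cache read price, cached tokens cost the normal input rate.
  const cacheRate = p.cacheReadPerM ?? p.inputPerM;
  const total =
    u.inputTokens * p.inputPerM +
    cached * cacheRate +
    u.outputTokens * p.outputPerM;
  return total / 1_000_000;
}

// ASSUMPTION: prompt and output tokens share the context window.
function fitsLimits(l: ModelLimits, promptTokens: number, requestedOutput: number): boolean {
  return requestedOutput <= l.maxOutput && promptTokens + requestedOutput <= l.context;
}

// Sample data copied from the DeepSeek-V3.2 entry.
const deepseekV32Pricing: ModelPricing = { inputPerM: 0.26, outputPerM: 0.38, cacheReadPerM: 0.13 };
const deepseekV32Limits: ModelLimits = { context: 163_840, maxOutput: 64_000 };

// 100k uncached + 50k cached prompt tokens and 10k output tokens: about $0.0363.
const sampleCost = estimateCostUSD(deepseekV32Pricing, {
  inputTokens: 100_000,
  cachedInputTokens: 50_000,
  outputTokens: 10_000,
});

// 150k prompt tokens plus 10k output tokens stays under the 163,840-token context.
const sampleFits = fitsLimits(deepseekV32Limits, 150_000, 10_000);
```

Keeping prices as per-million rates and dividing once at the end mirrors how the listing quotes them, so entries can be copied in without conversion.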