Inference

[PROVIDER]
id: inference
npm: @ai-sdk/openai-compatible
env: INFERENCE_API_KEY
api: https://inference.net/v1
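
The provider block above can be turned into a request with nothing but the base URL and the API key. The following is a minimal sketch, not the provider's documented client: it assumes the endpoint follows the OpenAI convention of `POST {api}/chat/completions` with a Bearer token, which the `@ai-sdk/openai-compatible` package name suggests but this listing does not state. `buildChatRequest`, `ChatMessage`, and `PreparedRequest` are hypothetical names introduced for illustration.

```typescript
// Base URL copied from the `api:` field of the provider block.
const BASE_URL = "https://inference.net/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds (but does not send) a chat request.
// Assumption: the provider accepts OpenAI-style chat completion payloads
// at {BASE_URL}/chat/completions and authenticates with a Bearer token.
function buildChatRequest(
  model: string,
  messages: ChatMessage[],
  apiKey: string,
  maxTokens: number = 4096,
): PreparedRequest {
  return {
    url: `${BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages, max_tokens: maxTokens }),
    },
  };
}
```

To send the request, read the key from the `INFERENCE_API_KEY` environment variable, call `buildChatRequest("meta/llama-3.1-8b-instruct", messages, key)`, and pass the resulting `url` and `init` to `fetch`. Keeping construction separate from sending makes the payload easy to inspect before any tokens are billed.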

Models

Google Gemma 3

google/gemma-3
in $0.15/M
out $0.30/M
ctx: 125,000; max out: 4,096; in: text, image; out: text

Llama 3.1 8B Instruct

meta/llama-3.1-8b-instruct
in $0.03/M
out $0.03/M
ctx: 16,000; max out: 4,096; in: text; out: text

Llama 3.2 11B Vision Instruct

meta/llama-3.2-11b-vision-instruct
in $0.06/M
out $0.06/M
ctx: 16,000; max out: 4,096; in: text, image; out: text

Llama 3.2 1B Instruct

meta/llama-3.2-1b-instruct
in $0.01/M
out $0.01/M
ctx: 16,000; max out: 4,096; in: text; out: text

Llama 3.2 3B Instruct

meta/llama-3.2-3b-instruct
in $0.02/M
out $0.02/M
ctx: 16,000; max out: 4,096; in: text; out: text

Mistral Nemo 12B Instruct

mistral/mistral-nemo-12b-instruct
in $0.04/M
out $0.10/M
ctx: 16,000; max out: 4,096; in: text; out: text

Osmosis Structure 0.6B

osmosis/osmosis-structure-0.6b
in $0.10/M
out $0.50/M
ctx: 4,000; max out: 2,048; in: text; out: text

Qwen 2.5 7B Vision Instruct

qwen/qwen-2.5-7b-vision-instruct
in $0.20/M
out $0.20/M
ctx: 125,000; max out: 4,096; in: text, image; out: text

Qwen 3 Embedding 4B

qwen/qwen3-embedding-4b
in $0.01/M
out $0.00/M
ctx: 32,000; max out: 2,048; in: text; out: text
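
The per-model prices listed above can be combined into a quick cost estimate. The sketch below assumes that "/M" means "per one million tokens" and that input and output tokens are billed separately at the listed rates; the listing itself does not spell this out. `PRICES` and `estimateCostUSD` are hypothetical names, and only three of the listed models are included to keep the example short.

```typescript
interface Pricing {
  inputPerMillion: number;  // USD per 1,000,000 input tokens (assumed meaning of "in $X/M")
  outputPerMillion: number; // USD per 1,000,000 output tokens (assumed meaning of "out $X/M")
}

// Prices copied from the listing above for a subset of models.
const PRICES: Record<string, Pricing> = {
  "google/gemma-3": { inputPerMillion: 0.15, outputPerMillion: 0.30 },
  "meta/llama-3.1-8b-instruct": { inputPerMillion: 0.03, outputPerMillion: 0.03 },
  "osmosis/osmosis-structure-0.6b": { inputPerMillion: 0.10, outputPerMillion: 0.50 },
};

// Hypothetical helper: estimates the cost of one request in USD.
function estimateCostUSD(modelId: string, inputTokens: number, outputTokens: number): number {
  const price = PRICES[modelId];
  if (price === undefined) {
    throw new Error(`no pricing listed for ${modelId}`);
  }
  const weighted =
    inputTokens * price.inputPerMillion + outputTokens * price.outputPerMillion;
  return weighted / 1_000_000;
}
```

Under these assumptions, a request to `google/gemma-3` with 100,000 input tokens and 4,000 output tokens works out to (100,000 × 0.15 + 4,000 × 0.30) / 1,000,000, or about $0.0162.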