id: inference
npm: @ai-sdk/openai-compatible
env: INFERENCE_API_KEY
api: https://inference.net/v1
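The provider fields above are enough to assemble a request by hand. The following is a minimal sketch, not the provider's documented client. It assumes the endpoint follows the usual OpenAI-style chat completions convention (`POST {api}/chat/completions` with a Bearer token), which the listing implies through the `@ai-sdk/openai-compatible` package but does not state. `buildChatRequest` and its types are hypothetical names introduced for illustration.

```typescript
// Provider fields copied from the listing above.
const provider = {
  id: "inference",
  baseURL: "https://inference.net/v1",
  apiKeyEnv: "INFERENCE_API_KEY",
} as const;

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds the request but does not send it.
// Assumption: the API accepts OpenAI-style chat completion payloads
// at {baseURL}/chat/completions with "Authorization: Bearer <key>".
function buildChatRequest(
  apiKey: string,
  model: string,
  messages: ChatMessage[],
  maxTokens: number,
): PreparedRequest {
  return {
    url: `${provider.baseURL}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model, messages, max_tokens: maxTokens }),
  };
}
```

In practice the key would be read from the `INFERENCE_API_KEY` environment variable and the result passed to `fetch(req.url, req)`. Keeping the builder free of I/O makes it easy to inspect the payload before sending anything.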
Models
Google Gemma 3
google/gemma-3
in $0.15/M
out $0.30/M
ctx: 125,000
max out: 4,096
in: text, image
out: text
reasoning
tools
vision
structured
temp
open weights
Llama 3.1 8B Instruct
meta/llama-3.1-8b-instruct
in $0.03/M
out $0.03/M
ctx: 16,000
max out: 4,096
in: text
out: text
reasoning
tools
vision
structured
temp
open weights
Llama 3.2 11B Vision Instruct
meta/llama-3.2-11b-vision-instruct
in $0.06/M
out $0.06/M
ctx: 16,000
max out: 4,096
in: text, image
out: text
reasoning
tools
vision
structured
temp
open weights
Llama 3.2 1B Instruct
meta/llama-3.2-1b-instruct
in $0.01/M
out $0.01/M
ctx: 16,000
max out: 4,096
in: text
out: text
reasoning
tools
vision
structured
temp
open weights
Llama 3.2 3B Instruct
meta/llama-3.2-3b-instruct
in $0.02/M
out $0.02/M
ctx: 16,000
max out: 4,096
in: text
out: text
reasoning
tools
vision
structured
temp
open weights
Mistral Nemo 12B Instruct
mistral/mistral-nemo-12b-instruct
in $0.04/M
out $0.10/M
ctx: 16,000
max out: 4,096
in: text
out: text
reasoning
tools
vision
structured
temp
open weights
Osmosis Structure 0.6B
osmosis/osmosis-structure-0.6b
in $0.10/M
out $0.50/M
ctx: 4,000
max out: 2,048
in: text
out: text
reasoning
tools
vision
structured
temp
open weights
Qwen 2.5 7B Vision Instruct
qwen/qwen-2.5-7b-vision-instruct
in $0.20/M
out $0.20/M
ctx: 125,000
max out: 4,096
in: text, image
out: text
reasoning
tools
vision
structured
temp
open weights
Qwen 3 Embedding 4B
qwen/qwen3-embedding-4b
in $0.01/M
out $0.00/M
ctx: 32,000
max out: 2,048
in: text
out: text
reasoning
tools
vision
structured
temp
open weights
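The per-million-token prices and the `ctx` / `max out` limits listed above can be turned into a quick cost and budget check. This is a sketch under stated assumptions: only three of the listed models are copied in, the helper names (`estimateCostUSD`, `fitsLimits`) are hypothetical, and the listing does not say whether `ctx` covers input and output together, so the check below assumes it does.

```typescript
interface ModelSpec {
  inputPerMillion: number;  // USD per 1M input tokens ("in $X/M")
  outputPerMillion: number; // USD per 1M output tokens ("out $X/M")
  contextTokens: number;    // "ctx"
  maxOutputTokens: number;  // "max out"
}

// Values copied from the listing above (subset only).
const models: Record<string, ModelSpec> = {
  "google/gemma-3": {
    inputPerMillion: 0.15, outputPerMillion: 0.30,
    contextTokens: 125_000, maxOutputTokens: 4_096,
  },
  "meta/llama-3.1-8b-instruct": {
    inputPerMillion: 0.03, outputPerMillion: 0.03,
    contextTokens: 16_000, maxOutputTokens: 4_096,
  },
  "osmosis/osmosis-structure-0.6b": {
    inputPerMillion: 0.10, outputPerMillion: 0.50,
    contextTokens: 4_000, maxOutputTokens: 2_048,
  },
};

function getSpec(modelId: string): ModelSpec {
  const spec = models[modelId];
  if (!spec) throw new Error(`unknown model: ${modelId}`);
  return spec;
}

// Cost in USD: tokens times the per-million price, divided by one million.
function estimateCostUSD(modelId: string, inputTokens: number, outputTokens: number): number {
  const spec = getSpec(modelId);
  return (
    (inputTokens * spec.inputPerMillion + outputTokens * spec.outputPerMillion) / 1_000_000
  );
}

// Assumption: the context window is shared by input and output tokens.
function fitsLimits(modelId: string, inputTokens: number, outputTokens: number): boolean {
  const spec = getSpec(modelId);
  return (
    outputTokens <= spec.maxOutputTokens &&
    inputTokens + outputTokens <= spec.contextTokens
  );
}
```

For example, a request with 10,000 input tokens and 1,000 output tokens on `google/gemma-3` works out to 10,000 × $0.15/M + 1,000 × $0.30/M, or about $0.0018.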