Enhance model performance with custom fine-tuning tailored to your specific datasets. Our fine-tuning services are priced based on the total number of tokens in your training dataset.
Serverless Text Model Pricing
Model                                            $/1M tokens in training
Models up to 16B parameters                      $0.6
Models 16.1B - 80B                               $3.6
MoE 0B - 56B (e.g., Mixtral 8x7B)                $2.4
MoE 56.1B - 176B (e.g., DBRX, Mixtral 8x22B)     $7.2
Apperture charges are based on the total number of tokens in your fine-tuning dataset (dataset size × number of epochs).
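As a rough illustration of the billing rule above, the sketch below multiplies dataset size by epoch count and applies the per-1M-token rate from the fine-tuning table. The tier keys and the function name are hypothetical labels chosen for this example, not part of any Apperture API, and the dataset size and epoch count are made-up inputs.

```python
from decimal import Decimal

# USD per 1M training tokens, copied from the fine-tuning table.
# The dictionary keys are hypothetical labels for this sketch.
FINE_TUNING_RATES = {
    "up_to_16b": Decimal("0.6"),
    "16.1b_to_80b": Decimal("3.6"),
    "moe_up_to_56b": Decimal("2.4"),
    "moe_56.1b_to_176b": Decimal("7.2"),
}


def fine_tuning_cost(dataset_tokens: int, epochs: int, tier: str) -> Decimal:
    """Estimate the fine-tuning charge in USD for one training job."""
    # Billed tokens = dataset size x number of epochs.
    billed_tokens = dataset_tokens * epochs
    return FINE_TUNING_RATES[tier] * billed_tokens / Decimal(1_000_000)


# Example: a 10M-token dataset trained for 3 epochs on a model up to 16B.
# 30M billed tokens at $0.6 per 1M tokens comes to 18 USD.
cost = fine_tuning_cost(10_000_000, 3, "up_to_16b")
print(f"Estimated fine-tuning cost: ${cost}")
```

Decimal is used instead of float so that rates such as 0.6 are not subject to binary rounding when multiplied by large token counts.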
Inference
Harness the power of our serverless inference API to deploy a variety of open-source chat models and multi-modal language models. Pay only for what you use with our transparent per-token pricing.
Serverless Text Model Pricing
Base Model Parameter Count                       $/1M Tokens
0B - 4B                                          $0.12
4B - 16B                                         $0.24
16.1B+                                           $1.08
MoE 0B - 56B (e.g., Mixtral 8x7B)                $0.6
MoE 56.1B - 176B (e.g., DBRX, Mixtral 8x22B)     $1.44
Yi Large                                         $3.6
Meta Llama 3.1 405B                              $3.6
Note: Per-token pricing is applied only for serverless inference. See below for on-demand deployment pricing.
LoRA Models
LoRA models deployed to our serverless inference service are charged at the same rate as the underlying base model. There is no additional cost for serving LoRA models.
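The per-token pricing and the LoRA rule can be sketched as a small estimator. This is a minimal example under stated assumptions: the tier keys and function names are hypothetical, only a selection of rows from the inference table is included, and the token count is taken to be whatever total appears on the bill, since the text does not say how input and output tokens are counted.

```python
from decimal import Decimal

# USD per 1M tokens, selected rows copied from the serverless inference table.
# The dictionary keys are hypothetical labels for this sketch.
INFERENCE_RATES = {
    "0b_to_4b": Decimal("0.12"),
    "4b_to_16b": Decimal("0.24"),
    "16.1b_plus": Decimal("1.08"),
    "moe_56.1b_to_176b": Decimal("1.44"),
    "yi_large": Decimal("3.6"),
    "llama_3.1_405b": Decimal("3.6"),
}


def inference_cost(tokens: int, tier: str) -> Decimal:
    """Estimate the serverless inference charge in USD for a token count."""
    return INFERENCE_RATES[tier] * tokens / Decimal(1_000_000)


def lora_inference_cost(tokens: int, base_model_tier: str) -> Decimal:
    """A LoRA model is billed at its base model's rate, with no surcharge."""
    return inference_cost(tokens, base_model_tier)


# Example: 2.5M tokens on a 16.1B+ base model, with and without a LoRA adapter.
base_cost = inference_cost(2_500_000, "16.1b_plus")
lora_cost = lora_inference_cost(2_500_000, "16.1b_plus")
print(f"Base model: ${base_cost}, LoRA on the same base: ${lora_cost}")
```

Both calls return the same amount, which is the point of the LoRA rule: the adapter does not change the rate, only the base model's tier does.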
From zero to IPO. Built for every stage of your journey