Supported Models

Well-supported LLMs

Fine-tuning is currently available for the following models:

Model Name	Parameters	Architecture	License	Context Window	Supported Fine-Tuning Context Window
mistral-7b	7 billion	Mistral	Apache 2.0	32768	32768
mistral-7b-instruct	7 billion	Mistral	Apache 2.0	32768	32768
mistral-7b-instruct-v0-2	7 billion	Mistral	Apache 2.0	32768	32768
mixtral-8x7b	46.7 billion	Mixtral	Apache 2.0	32768	7168
mixtral-8x7b-v0-1	46.7 billion	Mixtral	Apache 2.0	32768	7168
mixtral-8x7b-instruct-v0-1	46.7 billion	Mixtral	Apache 2.0	32768	7168
llama-3-8b	8 billion	Llama-3	Meta (request for commercial use)	8192	8192
llama-3-8b-instruct	8 billion	Llama-3	Meta (request for commercial use)	8192	8192
llama-3-70b	70 billion	Llama-3	Meta (request for commercial use)	8192	8192
llama-3-70b-instruct	70 billion	Llama-3	Meta (request for commercial use)	8192	8192
llama-2-7b	7 billion	Llama-2	Meta (request for commercial use)	4096	4096
llama-2-7b-chat	7 billion	Llama-2	Meta (request for commercial use)	4096	4096
llama-2-13b	13 billion	Llama-2	Meta (request for commercial use)	4096	4096
llama-2-13b-chat	13 billion	Llama-2	Meta (request for commercial use)	4096	4096
llama-2-70b	70 billion	Llama-2	Meta (request for commercial use)	4096	4096
llama-2-70b-chat	70 billion	Llama-2	Meta (request for commercial use)	4096	4096
codellama-13b-instruct	13 billion	Llama-2	Meta (request for commercial use)	16384	16384
codellama-70b-instruct	70 billion	Llama-2	Meta (request for commercial use)	4096	4096
codellama-7b	7 billion	Llama-2	Meta (request for commercial use)	4096	4096
codellama-7b-instruct	7 billion	Llama-2	Meta (request for commercial use)	4096	4096
gemma-2b	2.5 billion	Gemma	Google	8192	8192
gemma-2b-instruct	2.5 billion	Gemma	Google	8192	8192
gemma-7b	8.5 billion	Gemma	Google	8192	8192
gemma-7b-instruct	8.5 billion	Gemma	Google	8192	8192
zephyr-7b-beta	7 billion	Mistral	MIT	32768	32768
phi-2	2.78 billion	Phi-2	Microsoft	2048	2048
phi-3-mini-4k-instruct	3.92 billion	Phi-3	Microsoft	4096	4096
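
The last two columns matter when preparing training data: for the Mixtral models, the fine-tuning context window (7168 tokens) is much smaller than the model's own context window (32768 tokens). As a minimal sketch, the helper below checks whether a training example fits a model's fine-tuning window. The dictionary copies a few rows from the table above; the function name and the idea of passing a precomputed token count are assumptions made for illustration and are not part of any SDK.

```python
# Fine-tuning context windows in tokens, copied from a few rows of the table above.
FINETUNING_CONTEXT_WINDOW = {
    "mistral-7b": 32768,
    "mixtral-8x7b": 7168,
    "llama-3-8b": 8192,
    "llama-2-7b": 4096,
    "phi-2": 2048,
}


def fits_finetuning_window(model_name: str, token_count: int) -> bool:
    """Return True if an example with token_count tokens fits the model's fine-tuning window.

    Hypothetical helper: token_count is assumed to have been computed
    with the tokenizer of the chosen model.
    """
    return token_count <= FINETUNING_CONTEXT_WINDOW[model_name]


# A 10,000-token example fits mistral-7b but exceeds mixtral-8x7b's fine-tuning window.
print(fits_finetuning_window("mistral-7b", 10_000))
print(fits_finetuning_window("mixtral-8x7b", 10_000))
```

A check like this could be run over a dataset before starting a job to find examples longer than the window.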

Many of the latest OSS models are released in two variants:

Base model (llama-2-7b, etc.): These models are trained primarily on the objective of text completion.
Instruction-tuned (llama-2-7b-chat, mistral-7b-instruct, etc.): These models have been further trained on (instruction, output) pairs so that they respond better to instruction-style inputs. The instructions effectively constrain the model's output to align with the desired response characteristics or domain knowledge.

Best-Effort LLMs (via HuggingFace)

Best-effort fine-tuning is also offered for any Hugging Face LLM that meets the following criteria:

Has the "Text Generation" and "Transformer" tags
Does not have a "custom_code" tag
Is not post-quantized (e.g., a model whose name contains a quantization method such as "AWQ")
Has text inputs and outputs

"Best-effort" means we will try to support these models, but support is not guaranteed.
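
The criteria above can be sketched as a simple pre-check. This is an illustrative helper, not an official validator: the function name is hypothetical, and the tag strings "text-generation" and "transformers" are assumed to be the machine-readable forms of the "Text Generation" and "Transformer" tags. It covers only the tag and name checks; whether a model has text inputs and outputs still needs to be confirmed separately.

```python
from typing import Iterable

# Assumed machine-readable forms of the "Text Generation" and "Transformer" tags.
REQUIRED_TAGS = {"text-generation", "transformers"}
# The source names only "AWQ"; extend this tuple with other methods as needed.
QUANTIZATION_MARKERS = ("awq",)


def passes_best_effort_precheck(model_id: str, tags: Iterable[str]) -> bool:
    """Hypothetical pre-check for the tag and name criteria listed above."""
    normalized = {tag.lower() for tag in tags}
    has_required_tags = REQUIRED_TAGS <= normalized
    has_custom_code = "custom_code" in normalized
    looks_quantized = any(m in model_id.lower() for m in QUANTIZATION_MARKERS)
    return has_required_tags and not has_custom_code and not looks_quantized


# The tag lists below are made up for illustration.
print(passes_best_effort_precheck(
    "BioMistral/BioMistral-7B", ["transformers", "text-generation"]))
print(passes_best_effort_precheck(
    "example-org/example-model-AWQ", ["transformers", "text-generation"]))
```

Passing this pre-check does not guarantee support; it only filters out models that clearly miss the listed criteria.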

Fine-tuning a custom LLM

Get the Hugging Face ID for your model by clicking the copy icon on the custom base model's page, e.g. "BioMistral/BioMistral-7B".


Pass the Huggingface ID as the base_model.

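# Assumes the Predibase Python SDK; replace the placeholder with your own API token
from predibase import Predibase, FinetuningConfig

pb = Predibase(api_token="<PREDIBASE API TOKEN>")

# Create an adapter repository (reuses the repository if it already exists)
repo = pb.repos.create(name="bio-summarizer", description="Bio News Summarizer", exists_ok=True)

# Start a fine-tuning job; this call blocks until training is finished
adapter = pb.adapters.create(
    config=FinetuningConfig(
        base_model="BioMistral/BioMistral-7B"  # Hugging Face ID of the custom base model
    ),
    dataset="bio-dataset",
    repo=repo,
    description="initial model with defaults"
)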

Dedicated deployment needed for inference

Note that if you fine-tune a custom model not on our serverless deployments list, you'll need to deploy the custom base model as a dedicated deployment in order to run inference on your newly trained adapter.
