Open Source Models

A wide variety of open-source large language models are available, many of which are refinements of foundation models. Some of these models have been fine-tuned to excel in specific tasks. Below is a curated list of some of the top open-source models currently available.

General Models

Lllama3.1 from Meta

https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct

Model Summary The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models in 8B, 70B and 405B sizes (text in/text out). The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.

Intended use Llama 3.1 is intended for commercial and research use in multiple languages. Instruction tuned text only models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. The Llama 3.1 model collection also supports the ability to leverage the outputs of its models to improve other models including synthetic data generation and distillation. The Llama 3.1 Community License allows for these use cases.

Coding Models

Granite from IBM

https://huggingface.co/ibm-granite/granite-8b-code-instruct-128k

Model Summary Granite-8B-Code-Instruct-128K is a 8B parameter long-context instruct model fine tuned from Granite-8B-Code-Base-128K on a combination of permissively licensed data used in training the original Granite code instruct models, in addition to synthetically generated code instruction datasets tailored for solving long context problems. By exposing the model to both short and long context data, we aim to enhance its long-context capability without sacrificing code generation performance at short input context.

Intended use The model is designed to respond to coding related instructions over long context input up to 128K length and can be used to build coding assistants.

Codestral from Mistral

https://huggingface.co/mistralai/Mamba-Codestral-7B-v0.1

Model Summary Codestral Mamba is an open code model based on the Mamba2 architecture. It performs on par with state-of-the-art Transformer-based code models.

Maths

Mathstral from Mixtral

https://huggingface.co/mistralai/Mathstral-7B-v0.1

Model Summary Mathstral 7B is a model specializing in mathematical and scientific tasks, based on Mistral 7B. You can read more in the official blog post.