Meta claimed on Friday that its new AI-powered large language model (LLM), LLaMA-13B, can outperform OpenAI’s GPT-3 model despite being “10x smaller.” Smaller AI models could enable ChatGPT-style language assistants to run locally on PCs and smartphones. The “Large Language Model Meta AI,” or LLaMA for short, is a new family of language models.
The number of parameters in the LLaMA collection of language models ranges from 7 billion to 65 billion. In contrast, OpenAI’s GPT-3 model, which serves as the basis for ChatGPT, has 175 billion parameters.
Meta trained its LLaMA models with datasets that are accessible to the public, such as Common Crawl, Wikipedia, and C4, so the company may be able to open source the model and the weights. That is a significant new development in an industry where the Big Tech players in the AI race have remained secretive regarding their most potent AI technology up until this point.
“Unlike Chinchilla, PaLM, or GPT-3, we only use datasets publicly available, making our work compatible with open-sourcing and reproducible, while most existing models rely on data which is either not publicly available or undocumented,” tweeted project member Guillaume Lample.
Meta refers to its LLaMA models as “foundational models,” implying that the company intends for them to serve as the foundation for future, more sophisticated AI models built on the technology, much like OpenAI did when it built ChatGPT on top of GPT-3. The company hopes LLaMA will be useful in natural language research, including “question answering, natural language understanding or reading comprehension, understanding capabilities and limitations of current language models.”
The LLaMA-13B model, which can reportedly outperform GPT-3 while running on a single GPU, is arguably the most intriguing development, while the top-of-the-line LLaMA-65B model, with 65 billion parameters, competes with similar offerings from rival AI labs DeepMind, Google, and OpenAI. Unlike GPT-3 derivatives, which require data-center-class hardware, LLaMA-13B could enable ChatGPT-like performance on consumer hardware in the near future.
In AI, parameter size matters a lot. A variable that a machine-learning model uses to predict or classify based on input data is called a parameter. A language model’s performance is largely influenced by the number of parameters it contains; larger models typically have the capacity to handle more challenging tasks and produce output that is more coherent. However, running with more parameters necessitates more computing power and takes up more space. Therefore, if a model has fewer parameters but still produces the same results as another model, it is significantly more efficient.
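To make the trade-off concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone would occupy at different numeric precisions. The parameter counts come from the article; the bytes-per-parameter values are common assumptions (fp32, fp16, 8-bit, 4-bit) and the estimate ignores activations and other runtime overhead, so treat it as illustrative rather than a hardware requirement.

```python
# Illustrative estimate: memory needed just to store a model's weights.
# Assumptions: bytes per parameter for typical precisions; no runtime overhead.

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate gigabytes required to hold the weights."""
    return num_params * bytes_per_param / 1e9

models = {"LLaMA-7B": 7e9, "LLaMA-13B": 13e9, "LLaMA-65B": 65e9, "GPT-3": 175e9}
precisions = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

for name, params in models.items():
    row = ", ".join(
        f"{prec}: {weight_memory_gb(params, nbytes):.1f} GB"
        for prec, nbytes in precisions.items()
    )
    print(f"{name:10s} -> {row}")
```

By this rough arithmetic, a 13-billion-parameter model at reduced precision lands in the range of a single high-end consumer GPU, whereas a 175-billion-parameter model does not, which is why the parameter gap matters so much for local use.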
In a Mastodon thread analyzing the impact of Meta’s new AI models, independent AI researcher Simon Willison wrote, “I’m now thinking that we will be running language models with a sizable portion of the capabilities of ChatGPT on our own (top of the range) mobile phones and laptops within a year or two.”
A simplified version of LLaMA is currently available on GitHub. Meta offers a request form for researchers who want access to the full code and weights, the parameters a neural network “learns” during training. At this time, Meta has not announced plans for a wider release of the model and weights.