(Illustration: Lac de Neuchatel, Switzerland. Image source: Ernest)
Briefing
History
From the first release to the latest, in chronological order, to cover the full context.
- 2023-02-24: Introducing Meta LLaMA
- 2023-07-18: Introducing Meta Llama 2
- 2024-04-18: Introducing Meta Llama 3
Products
Meta LLaMA
- LLaMA (Large Language Model Meta AI)
- aka Meta Llama 1
- Blog = Introducing LLaMA: A foundational, 65-billion-parameter large language model
- Paper = [2302.13971] LLaMA: Open and Efficient Foundation Language Models
Meta Llama 2
Meta announced Llama-2 on July 18, 2023, as the successor to LLaMA, in sizes of 7 billion, 13 billion, and 70 billion parameters. The architecture stays largely the same as in the LLaMA-1 models, but the newer models were trained on 40% more data. The accompanying preprint also mentions a 34 billion parameter model, to be released once it meets the required safety criteria.
- Product = Meta Llama 2
- Blog = Meta and Microsoft Introduce the Next Generation of Llama
- Paper = [2307.09288] Llama 2: Open Foundation and Fine-Tuned Chat Models
- GitHub = meta-llama/llama: Inference code for Llama models
- Summary
- Llama-2 is weak in coding.
- Claude-2 excels in coding, mathematics, and logical reasoning, and can also comprehend PDFs – a task that GPT-4 still struggles with.
(Safety human evaluation results. Lower is safer.)
(Carbon Footprint of Pretraining.)
Meta Llama 3
Meta launched two configurations of the Llama-3 model on April 18, 2024, with 8 billion (8B) and 70 billion (70B) parameters. Both were pre-trained on about 15 trillion tokens of text from publicly available sources, then fine-tuned on publicly available instruction datasets plus more than 10 million human-annotated examples. Meta plans to add multimodal capabilities, multilingual support, and longer context windows in upcoming models. A model with over 400 billion parameters (400B+) was still in training at the time of the announcement.
- Product = Meta Llama 3
- Blog = Introducing Meta Llama 3: The most capable openly available LLM to date
- GitHub = meta-llama/llama3: The official Meta Llama 3 GitHub site
- Interview = Mark Zuckerberg - Llama 3, $10B Models, Caesar Augustus, & 1 GW Datacenters - YouTube
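To put the Llama 3 pre-training scale in perspective, here is a minimal sketch that divides the roughly 15 trillion tokens quoted above by each released model size. The variable and function names are illustrative. The sketch assumes both sizes saw the full token count, and because "about 15 trillion" is itself approximate, the results are ballpark ratios rather than official figures.

```python
# Approximate pre-training token count quoted in the Llama 3 paragraph.
PRETRAINING_TOKENS = 15e12

# Parameter counts of the two released Llama 3 configurations.
MODEL_PARAMS = {"8B": 8e9, "70B": 70e9}


def tokens_per_parameter(tokens: float, params: float) -> float:
    """Return the number of pre-training tokens per model parameter."""
    return tokens / params


# Assumption: both model sizes were trained on the full token count.
ratios = {
    name: tokens_per_parameter(PRETRAINING_TOKENS, params)
    for name, params in MODEL_PARAMS.items()
}

for name, ratio in ratios.items():
    print(f"Llama 3 {name}: ~{ratio:,.0f} tokens per parameter")
```

Under these assumptions the smaller model sees far more data per parameter than the larger one, which is one rough way to compare how heavily each size was trained.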