My Study Notes on Meta AI

Published: 2023-08-22

Lastmod: 2024-04-26

by Ernest Chiang

(Illustration: Lac de Neuchatel, Switzerland. Image source: Ernest)

Briefing

History

A timeline from the first release to the latest, to give the full context.

2023-02-24: Introducing Meta LLaMA
2023-07-18: Introducing Meta Llama 2
2024-04-18: Introducing Meta Llama 3

Products

Meta LLaMA

LLaMA (Large Language Model Meta AI)
aka Meta Llama 1
Blog = Introducing LLaMA: A foundational, 65-billion-parameter language model
Paper = [2302.13971] LLaMA: Open and Efficient Foundation Language Models

Meta Llama 2

Meta released Llama 2 on July 18, 2023, as the successor to LLaMA, in model sizes of 7 billion, 13 billion, and 70 billion parameters. The architecture stays close to that of the LLaMA 1 models, but the newer versions were trained on 40% more data. The preprint ¹ also mentions a 34 billion parameter model, whose release is pending until it meets the required safety criteria.

Product = Meta Llama 2
Blog = Meta and Microsoft Introduce the Next Generation of Llama
Paper = [2307.09288] Llama 2: Open Foundation and Fine-Tuned Chat Models
GitHub = meta-llama/llama: Inference code for Llama models
Summary
- Llama 2 is weak in coding.
- Claude 2 excels in coding, mathematics, and logical reasoning, and it can read PDFs, a task that GPT-4 still struggles with.

(Safety human evaluation results. Lower is safer.)

(Carbon Footprint of Pretraining.)

Meta Llama 3

Meta launched two configurations of Llama 3 on April 18, 2024, with 8 billion (8B) and 70 billion (70B) parameters. Both were pre-trained on about 15 trillion text tokens from publicly available sources, then fine-tuned on publicly available instruction datasets plus more than 10 million human-annotated examples. Meta plans to add multimodal capabilities, multilingual support, and longer context windows in upcoming models. A model with over 400 billion parameters (400B+) is currently in training.

Product = Meta Llama 3
Blog = Introducing Meta Llama 3: The most capable openly available LLM to date
GitHub = meta-llama/llama3: The official Meta Llama 3 GitHub site
Interview = Mark Zuckerberg - Llama 3, $10B Models, Caesar Augustus, & 1 GW Datacenters - YouTube
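
To put the Llama 3 figures above in perspective, here is a minimal sketch that divides the quoted pre-training token count by each model size. It assumes both sizes saw the same corpus of roughly 15 trillion tokens; the function and variable names are my own, chosen only for illustration.

```python
# Figures quoted in the notes above; "about 15 trillion" is approximate.
PRETRAINING_TOKENS = 15e12

# Assumption: both model sizes were trained on the same ~15T-token corpus.
MODEL_PARAMS = {
    "Llama 3 8B": 8e9,
    "Llama 3 70B": 70e9,
}


def tokens_per_parameter(tokens: float, params: float) -> float:
    """Return how many pre-training tokens were seen per model parameter."""
    return tokens / params


ratios = {
    name: tokens_per_parameter(PRETRAINING_TOKENS, params)
    for name, params in MODEL_PARAMS.items()
}

for name, ratio in ratios.items():
    print(f"{name}: ~{ratio:,.0f} tokens per parameter")
```

Under these assumptions the 8B model works out to roughly 1,875 tokens per parameter and the 70B model to roughly 214, which shows how heavily the smaller model was trained relative to its size.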
