My Meta AI Learning Notes

Published: 2023-08-22

Lastmod: 2024-04-26

by Ernest Chiang

(Caption: Taken on the shore of Lac de Neuchatel, Switzerland. Image credit: Ernest.)

Briefing

History

A timeline from the first release to the present, tracing how the models connect.

2023-02-24: Introducing Meta LLaMA
2023-07-18: Introducing Meta Llama 2
2024-04-18: Introducing Meta Llama 3

Products

Meta LLaMA

LLaMA (Large Language Model Meta AI)
aka Meta Llama 1
Blog = Introducing LLaMA: A foundational, 65-billion-parameter language model
Paper = [2302.13971] LLaMA: Open and Efficient Foundation Language Models

Meta Llama 2

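Meta announced Llama-2 on July 18, 2023 as the successor to LLaMA, in three sizes: 7 billion (7B), 13 billion (13B), and 70 billion (70B) parameters. The new models keep an architecture similar to the LLaMA-1 models but were trained on 40% more data. A preprint ¹ also describes a 34-billion-parameter (34B) model whose release depends on meeting the required safety standards.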

Product = Meta Llama 2
Blog = Meta and Microsoft Introduce the Next Generation of Llama
Paper = [2307.09288] Llama 2: Open Foundation and Fine-Tuned Chat Models
GitHub = meta-llama/llama: Inference code for Llama models
Summary
- Llama-2 is weak in coding.
- Claude-2 excels in coding, mathematics, and logical reasoning, and it can comprehend PDFs, a task that GPT-4 still struggles with.

(Safety human evaluation results. Lower is safer.)

(Carbon Footprint of Pretraining.)

Meta Llama 3

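Meta released Llama-3 on April 18, 2024 in two configurations, with 8 billion (8B) and 70 billion (70B) parameters. The models were pretrained on roughly 15 trillion tokens of text from publicly available sources, then fine-tuned on publicly available instruction datasets plus more than 10 million human-annotated examples. Meta plans to release models with multimodal capabilities, multilingual support, and longer context windows. At the time of writing, a model with more than 400 billion (400B+) parameters is still in training.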

Product = Meta Llama 3
Blog = Introducing Meta Llama 3: The most capable openly available LLM to date
GitHub = meta-llama/llama3: The official Meta Llama 3 GitHub site
Interview = Mark Zuckerberg - Llama 3, $10B Models, Caesar Augustus, & 1 GW Datacenters - YouTube
