(Illustration: A lot of hard work has been going on behind the scenes. Le Bouchon Ogasawara, Shibuya, Tokyo. Image source: Ernest)
tl;dr
- TOPS (Trillions of Operations Per Second) is a key indicator for measuring the computational power of AI chips and NPU chips, reflecting the trillions of operations a processor can execute per second.
- Using the “frying eggs” analogy to understand TOPS intuitively: a regular CPU is like a chef who can fry only one egg at a time, while an AI chip with a high TOPS rating is like a super chef who can fry an enormous number of eggs simultaneously.
- TOPS is an important reference for comparing AI chip performance, but when evaluating AI hardware, factors such as energy efficiency and memory bandwidth should also be considered. Additionally, TOPS values typically reflect theoretical peak performance, and actual performance needs to be judged based on a combination of other metrics suitable for the application scenario.
What is TOPS (Simple Life Version)
TOPS, which stands for Trillion Operations Per Second, is an important indicator for measuring the computational power of Artificial Intelligence (AI) chips and Neural Processing Units (NPUs). It expresses the maximum number of operations a processor can execute per second, measured in trillions (10^12). If computational power continues to grow, the initial T may one day give way to a larger unit prefix.
We can use an example from daily life to explain and more intuitively understand TOPS:
Imagine AI computation as the process of frying eggs, where data is the egg being heated.
A regular chef (ordinary processor, CPU) might only be able to fry one egg at a time, while a super chef (AI chip) might be able to fry 1 trillion eggs simultaneously! TOPS is like a measure of this “super chef’s” ability, telling us how many “data eggs” it can “fry” per second.
TOPS is one of the important indicators for understanding and comparing AI chip performance, but it’s not the only one.
When evaluating AI hardware, AI phones, or AI computers, remember to consider other factors such as energy efficiency, memory bandwidth, software ecosystem, etc. Using TOPS can help us compare the computational power of different AI chips, providing a reference point for choosing AI hardware devices suitable for specific applications.
What is TOPS (In-depth Version, for Those Who Insist)
Before delving deeper into understanding TOPS, we need to first understand what an “operation” is:
In digital circuits and computer science, an “operation” typically refers to a basic mathematical or logical computation. For AI chips or NPUs, these operations mainly include:
- Floating-point operations: such as addition, subtraction, multiplication, and division.
- Matrix operations: Large-scale matrix multiplication is one of the most common operations in deep learning.
- Vector operations: including dot product (scalar product), cross product (vector product), etc.
- Activation functions: such as ReLU, Sigmoid, Tanh, etc.
- Convolution operations: widely used in Convolutional Neural Networks (CNN).
These operations are typically performed in FP32 (32-bit floating-point) or FP16 (16-bit floating-point) formats. Some AI chips also support lower precision formats like INT8 (8-bit integer) to improve performance and reduce energy consumption, typically used for inference.
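To make “operation” concrete, here is a minimal Python sketch that counts the operations in the two workhorse layers above. The helper names are illustrative, not from any vendor spec; the common convention counts each multiply-accumulate (MAC) as two operations (one multiply plus one add):

```python
def matmul_ops(m: int, k: int, n: int) -> int:
    """An (m x k) @ (k x n) matrix multiply performs m*n*k MACs.
    Counting each MAC as 2 operations (1 multiply + 1 add)."""
    return 2 * m * k * n

def conv2d_ops(h_out: int, w_out: int, c_in: int, c_out: int,
               kh: int, kw: int) -> int:
    """A 2D convolution is a matmul in disguise: one MAC per
    (output pixel x output channel x kernel element x input channel)."""
    return 2 * h_out * w_out * c_out * kh * kw * c_in

# Example: a single 1024 x 1024 @ 1024 x 1024 matrix multiply
print(matmul_ops(1024, 1024, 1024))  # 2147483648, about 2.1 billion operations
```

Numbers like this show why TOPS-scale hardware matters: one modest matmul already costs billions of operations, and a deep network chains thousands of them.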
The calculation of TOPS can be simplified as:
TOPS = (Number of operations per clock cycle × Clock frequency) / 10^12 (1 trillion)
For example, if an AI chip can perform 1000 operations per clock cycle and has a clock frequency of 1GHz, then its theoretical peak performance is 1 TOPS.
1000 operations/cycle × 1GHz = 1000 × 10^9 operations/second = 10^12 operations/second = 1 TOPS
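The calculation above can be sketched in a few lines of Python (`theoretical_tops` is an illustrative helper name, not a standard API):

```python
def theoretical_tops(ops_per_cycle: int, clock_hz: float) -> float:
    """Theoretical peak TOPS = operations per cycle x clock frequency / 10^12."""
    return ops_per_cycle * clock_hz / 1e12

# The example from the text: 1000 operations/cycle at 1 GHz
print(theoretical_tops(1000, 1e9))  # 1.0 TOPS
```

Doubling either factor doubles the theoretical peak, which is why vendors push both wider execution units (more ops per cycle) and higher clocks.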
When understanding TOPS, please note the following points:
- TOPS typically represents theoretical peak performance; actual performance may vary due to factors such as memory bandwidth and chip architecture.
- TOPS values may differ for different types of operations (such as FP32, FP16, INT8).
- A high TOPS value doesn’t necessarily mean better performance in all AI tasks, as actual performance also depends on software optimization and the characteristics of specific tasks.
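To illustrate the gap between the peak number and delivered performance, here is a toy Python model. The utilization factor is a made-up knob for this sketch, not a published metric; in practice it folds in memory-bandwidth stalls, software overhead, and how well a workload maps onto the chip:

```python
def effective_tops(peak_tops: float, utilization: float) -> float:
    """Discount a chip's advertised peak TOPS by a utilization factor
    between 0 and 1 (hypothetical model for illustration only)."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return peak_tops * utilization

# A chip advertised at 45 TOPS running a memory-bound model
# at an assumed 30% utilization
print(effective_tops(45.0, 0.30))  # ~13.5 effective TOPS
```

The point of the sketch: two chips with the same headline TOPS can deliver very different real throughput once their memory systems and software stacks are taken into account.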
TOPS Comparison Table
(Focus mainly on the “INT8 Ops” column. You can swipe left and right to see more comparison data)
INT8 Ops | FP32 FLOPS | Company Name | Type | Target Market | Product Family | Product Name | Product Generation | Code Name | Release Year | First Used On | Fab Process | CPU | GPU | NPU | Memory Tech | Memory Bandwidth | TDP Base (W) | Remark |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
73 TOPS | n/a | AMD | SoC | PC | Ryzen AI 300 | Ryzen AI 9 365 | n/a | Strix Point | 2024 | n/a | TSMC 4nm FinFET | n/a | AMD Radeon™ 880M | n/a | DDR5-5600 or LPDDR5X-7500 | n/a | 28.0 | - Total 73 TOPS (50 TOPS from NPU). |
80 TOPS | n/a | AMD | SoC | PC | Ryzen AI 300 | Ryzen AI 9 HX 370 | n/a | Strix Point | 2024 | n/a | TSMC 4nm FinFET | n/a | AMD Radeon™ 890M | n/a | DDR5-5600 or LPDDR5X-7500 | n/a | 28.0 | - Total 80 TOPS (50 TOPS from NPU). |
50 TOPS | n/a | AMD | NPU | n/a | Ryzen | XDNA 2 | n/a | AI | 2024 | Ryzen AI 9 HX 370 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
1961.2 TOPS 3922.3 TOPS (with Sparsity) | 122.6 TFLOPS | AMD | GPU | Datacenter | AMD Data Center GPUs (AMD Instinct) | MI300A | n/a | n/a | 2023 | n/a | n/a | n/a | n/a | n/a | HBM3 | 5300 GB/s | 550.0 | n/a |
2614.9 TOPS 5229.8 TOPS (with Sparsity) | 163.4 TFLOPS | AMD | GPU | Datacenter | AMD Data Center GPUs (AMD Instinct) | MI300X | n/a | n/a | 2023 | n/a | XCD: TSMC N5 IOD: TSMC N6 | n/a | n/a | n/a | HBM3 | 5300 GB/s | 750.0 | n/a |
2614.9 TOPS 5229.8 TOPS (with Sparsity) | 163.4 TFLOPS | AMD | GPU | Datacenter | AMD Data Center GPUs (AMD Instinct) | MI325X | n/a | n/a | 2024 | n/a | XCD: TSMC N5 IOD: TSMC N6 | n/a | n/a | n/a | HBM3E | 6000 GB/s | 750.0 | n/a |
n/a | n/a | ARM | IP | n/a | Neoverse | Neoverse E1 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
n/a | n/a | ARM | IP | n/a | Neoverse | Neoverse N1 | n/a | Ares | 2019 | Ampere Altra, AWS Graviton2 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
n/a | n/a | ARM | IP | Datacenter (Infrastructure Processor) | Neoverse | Neoverse N2 | n/a | Perseus | 2020 | Microsoft Azure Cobalt 100 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
n/a | n/a | ARM | IP | Datacenter (Infrastructure Processor) | Neoverse | Neoverse N3 | n/a | Hermes | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
n/a | n/a | ARM | IP | Datacenter (Infrastructure Processor) | Neoverse | Neoverse V1 | n/a | Zeus | 2020 | AWS Graviton3 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | - first announcements coming out of Arm’s TechCon convention 2018 in San Jose. |
n/a | n/a | ARM | IP | Datacenter (Infrastructure Processor) | Neoverse | Neoverse V2 | n/a | n/a | 2022 | NVIDIA Grace, AWS Graviton4, Google Axion | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
n/a | n/a | ARM | IP | Datacenter (Infrastructure Processor) | Neoverse | Neoverse V3 | n/a | Poseidon | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
825 TOPS ??? | n/a | Alibaba | SoC | Datacenter (AI inference) | Hanguang 含光 | Hanguang 800 | 1 | n/a | 2019 | n/a | TSMC 12nm | n/a | n/a | n/a | n/a | n/a | 280.0 | - 16x PCIe gen4 - SRAM, No DDR |
n/a | n/a | Alibaba | SoC | Datacenter (Infra) | Yitian 倚天 | Yitian 710 | 1 | n/a | 2021 | Alibaba ECS g8m | N5 | 128 Neoverse N2 core | n/a | n/a | n/a | n/a | n/a | n/a |
n/a | n/a | Amazon | SoC | Datacenter (Infra) (Scale out) | AWS Graviton | Graviton | 1 | Alpine | 2018 | Amazon EC2 A1 | TSMC 16nm | Cortex A72 | n/a | n/a | DDR4-1600 | 51.2 GB/s | 95.0 | - 32 lanes of PCIe gen3 |
n/a | n/a | Amazon | SoC | Datacenter (Infra) (General Purpose) | AWS Graviton | Graviton 2 | 2 | Alpine+ | 2019 | Amazon EC2 M6g, M6gd, C6g, C6gd, C6gn, R6g, R6gd, T4g, X2gd, G5g, Im4gn, Is4gen, I4g | TSMC 7nm | 128 Neoverse N1 core | n/a | n/a | DDR4-3200 | 204.8 GB/s | 110.0 | - 64 lanes of PCIe gen4 |
n/a | n/a | Amazon | SoC | Datacenter (Infra) (ML, HPC, SIMD) | AWS Graviton | Graviton 3 | 3 | n/a | 2021 | Amazon EC2 C7g, M7g, R7g; with local disk: C7gd, M7gd, R7gd | TSMC 5nm | 64 Neoverse V1 core | n/a | n/a | DDR5-4800 | 307.2 GB/s | 100.0 | - 32 lanes of PCIe gen5 |
n/a | n/a | Amazon | SoC | Datacenter (Infra) | AWS Graviton | Graviton 3E | 3 | n/a | 2022 | Amazon EC2 C7gn, HPC7g | n/a | 64 Neoverse V1 core | n/a | n/a | n/a | n/a | n/a | n/a |
n/a | n/a | Amazon | SoC | Datacenter (Infra) (Scale up) | AWS Graviton | Graviton 4 | 4 | n/a | 2023 | Amazon EC2 R8g | n/a | 96 Neoverse V2 core | n/a | n/a | DDR5-5600 | 537.6 GB/s | n/a | - 96 lanes of PCIe gen5 |
63.3 TOPS | 0.97 TFLOPS | Amazon | SoC | Datacenter (AI inference) | AWS Inferentia | Inferentia 1 | 1 | n/a | 2018 | Amazon EC2 Inf1 | TSMC 16nm | 16 NeuronCore v1 | n/a | n/a | n/a | 50 GB/s | n/a | n/a
380 TOPS | 2.9 TFLOPS | Amazon | SoC | Datacenter (AI inference) | AWS Inferentia | Inferentia 2 | 2 | n/a | 2022 | Amazon EC2 Inf2 | TSMC 5nm | 24 NeuronCore v2 | n/a | n/a | n/a | 820 GB/s | n/a | n/a
380 TOPS | 2.9 TFLOPS | Amazon | SoC | Datacenter (AI train) | AWS Trainium | Trainium 1 | 1 | n/a | 2020 | Amazon EC2 Trn1 | TSMC 7nm | 32 NeuronCore v2 | n/a | n/a | n/a | 820 GB/s | n/a | n/a
861 TOPS | 6.57 TFLOPS | Amazon | SoC | Datacenter (AI train) | AWS Trainium | Trainium 2 | 2 | n/a | 2023 | Amazon EC2 Trn2 | TSMC 4nm | 64 NeuronCore v2 | n/a | n/a | n/a | 4,096 GB/s | n/a | n/a
11 TOPS | 748.8 GFLOPS | Apple | SoC | Mobile | A | A14 Bionic | n/a | APL1W01 | 2020 | iPhone 12 | TSMC N5 | Firestorm + Icestorm | n/a | n/a | LPDDR4X-4266 | 34.1 GB/s | n/a | n/a |
15.8 TOPS | 1.37 TFLOPS | Apple | SoC | Mobile | A | A15 Bionic | n/a | APL1W07 | 2021 | iPhone 13 | TSMC N5P | Avalanche + Blizzard | n/a | n/a | LPDDR4X-4266 | 34.1 GB/s | n/a | n/a |
17 TOPS | 1.789 TFLOPS | Apple | SoC | Mobile | A | A16 Bionic | n/a | APL1W10 | 2022 | iPhone 14 | TSMC N4P | Everest + Sawtooth | n/a | n/a | LPDDR5-6400 | 51.2 GB/s | n/a | - 6GB LPDDR5 |
35 TOPS | 2.147 TFLOPS | Apple | SoC | Mobile | A | A17 Pro | n/a | APL1V02 | 2023 | iPhone 15 Pro, iPhone 15 Pro Max | TSMC N3B | 6 cores (2 performance + 4 efficiency) | Apple-designed 6-core | 16-core Neural Engine | LPDDR5-6400 | 51.2 GB/s | n/a | - 8GB LPDDR5 |
35 TOPS | n/a | Apple | SoC | Mobile | A | A18 | n/a | n/a | 2024 | iPhone 16 | TSMC N3P | 6 cores (2 performance + 4 efficiency) | Apple-designed 5-core | 16-core Neural Engine | n/a | n/a | n/a | n/a |
35 TOPS | n/a | Apple | SoC | Mobile | A | A18 Pro | n/a | n/a | 2024 | iPhone 16 Pro | TSMC N3P | 6 cores (2 performance + 4 efficiency) | Apple-designed 6-core | 16-core Neural Engine | n/a | n/a | n/a | n/a |
11 TOPS | 2.6 TFLOPS | Apple | SoC | Mobile, PC | M | M1 | n/a | APL1102 | 2020 | n/a | TSMC N5 | high-performance “Firestorm” + energy-efficient “Icestorm” | n/a | n/a | LPDDR4X-4266 | 68.3 GB/s | n/a | n/a |
11 TOPS | 10.4 TFLOPS | Apple | SoC | Mobile, PC | M | M1 Max | n/a | APL1105 | 2021 | n/a | TSMC N5 | n/a | n/a | n/a | LPDDR5-6400 | 409.6 GB/s | n/a | n/a |
11 TOPS | n/a | Apple | SoC | Mobile, PC | M | M1 Pro | n/a | APL1103 | 2021 | n/a | TSMC N5 | n/a | n/a | n/a | LPDDR5-6400 | 204.8 GB/s | n/a | n/a |
22 TOPS | 21 TFLOPS | Apple | SoC | Mobile, PC | M | M1 Ultra | n/a | APL1W06 | 2022 | n/a | TSMC N5 | The M1 Ultra consists of two M1 Max units connected with UltraFusion Interconnect with a total of 20 CPU cores and 96 MB system level cache (SLC). | n/a | n/a | LPDDR5-6400 | 819.2 GB/s | n/a | n/a |
15.8 TOPS | 2.863 TFLOPS, 3.578 TFLOPS | Apple | SoC | Mobile, PC | M | M2 | n/a | APL1109 | 2022 | n/a | TSMC N5P | high-performance @3.49 GHz “Avalanche” + energy-efficient @2.42 GHz “Blizzard” | n/a | n/a | LPDDR5-6400 | 102.4 GB/s | n/a | n/a |
15.8 TOPS | 10.736 TFLOPS, 13.599 TFLOPS | Apple | SoC | Mobile, PC | M | M2 Max | n/a | APL1111 | 2023 | n/a | TSMC N5P | n/a | n/a | n/a | LPDDR5-6400 | 409.6 GB/s | n/a | n/a |
15.8 TOPS | 5.726 TFLOPS, 6.799 TFLOPS | Apple | SoC | Mobile, PC | M | M2 Pro | n/a | APL1113 | 2023 | n/a | TSMC N5P | n/a | n/a | n/a | LPDDR5-6400 | 204.8 GB/s | n/a | n/a |
31.6 TOPS | 21.473 TFLOPS, 27.199 TFLOPS | Apple | SoC | Mobile, PC | M | M2 Ultra | n/a | APL1W12 | 2023 | n/a | TSMC N5P | n/a | n/a | n/a | LPDDR5-6400 | 819.2 GB/s | n/a | n/a |
18 TOPS | 2.826 TFLOPS, 3.533 TFLOPS | Apple | SoC | Mobile, PC | M | M3 | n/a | APL1201 | 2023 | MacBook Pro | TSMC N3B | n/a | n/a | n/a | LPDDR5-6400 | 102.4 GB/s | n/a | n/a |
18 TOPS | 10.598 TFLOPS, 14.131 TFLOPS | Apple | SoC | Mobile, PC | M | M3 Max | n/a | APL1204 | 2023 | n/a | TSMC N3B | n/a | n/a | n/a | LPDDR5-6400 | 307.2 GB/s, 409.6 GB/s | n/a | n/a |
18 TOPS | 4.946 TFLOPS, 6.359 TFLOPS | Apple | SoC | Mobile, PC | M | M3 Pro | n/a | APL1203 | 2023 | n/a | TSMC N3B | n/a | n/a | n/a | LPDDR5-6400 | 153.6 GB/s | n/a | n/a |
38 TOPS | 3.763 TFLOPS | Apple | SoC | Mobile, PC | M | M4 | n/a | APL1206 | 2024 | iPad Pro (7th generation) | TSMC N3E | 10 cores (4 performance + 6 efficiency) | Apple-designed 10-core | 16-core Neural Engine | LPDDR5X-7500 | 120 GB/s | n/a | n/a |
38 TOPS | n/a | Apple | SoC | Mobile, PC | M | M4 Max | n/a | n/a | 2024 | MacBook Pro M4 Max | TSMC N3E | 14 cores (10 performance + 4 efficiency) 16 cores (12 performance + 4 efficiency) | Apple-designed 16-core Apple-designed 20-core | 16-core Neural Engine | LPDDR5X-8533 | 409.6 GB/s (36GB), 546 GB/s (48GB, 64GB, 128GB) | n/a | n/a |
38 TOPS | n/a | Apple | SoC | Mobile, PC | M | M4 Pro | n/a | n/a | 2024 | MacBook Pro M4 Pro, Mac mini M4 Pro | TSMC N3E | 12 cores (8 performance + 4 efficiency) 14 cores (10 performance + 4 efficiency) | Apple-designed 32-core Apple-designed 40-core | 16-core Neural Engine | LPDDR5X-8533 | 273 GB/s | n/a | n/a |
n/a | n/a | Google | SoC | Datacenter (Infra) | GCP CPU | Axion | n/a | Axion | 2024 | GCP Compute Engine ??? | n/a | ?? Neoverse V2 core | n/a | n/a | n/a | n/a | n/a | n/a
1.6 TOPS | n/a | Google | SoC | Mobile | Google Tensor (Edge TPU) | G1 | 1 | Whitechapel | 2021 | Pixel 6, Pixel 6 Pro, Pixel 6a | Samsung 5 nm LPE | Octa-core: 2.8 GHz Cortex-X1 (2×) 2.25 GHz Cortex-A76 (2×) 1.8 GHz Cortex-A55 (4×) | Mali-G78 MP20 at 848 MHz | Google Edge TPU | LPDDR5 | 51.2 GB/s | n/a | n/a
n/a | n/a | Google | SoC | Mobile | Google Tensor (Edge TPU) | G2 | 2 | Cloudripper | 2022 | Pixel 7, Pixel 7 Pro, Pixel 7a, Pixel Fold, Pixel Tablet | Samsung 5 nm LPE | Octa-core: 2.85 GHz Cortex-X1 (2×) 2.35 GHz Cortex-A78 (2×) 1.8 GHz Cortex-A55 (4×) | Mali-G710 MP7 at 850 MHz | Google Edge TPU | LPDDR5 | 51.2 GB/s | n/a | n/a
27 TOPS | n/a | Google | SoC | Mobile | Google Tensor (Edge TPU) | G3 | 3 | Zuma (Dev Board: Ripcurrent) | 2023 | Pixel 8, Pixel 8 Pro, Pixel 8a | Samsung 4nm LPP | Nona-core: 2.91 GHz Cortex-X3 (1×) 2.37 GHz Cortex-A715 (4×) 1.7 GHz Cortex-A510 (4×) | Mali-G715 MP10 at 890 MHz | Google Edge TPU (Rio) | LPDDR5X | 68.2 GB/s | n/a | n/a
45 TOPS | n/a | Google | SoC | Mobile | Google Tensor (Edge TPU) | G4 | 4 | Zuma Pro | 2024 | Pixel 9, Pixel 9 Pro | Samsung 4nm LPP | Octa-core: 3.1 GHz Cortex-X4 (1×) 2.6 GHz Cortex-A720 (3×) 1.92 GHz Cortex-A520 (4×) | Mali-G715 MP10 at 940 MHz | n/a | LPDDR5X | n/a | n/a | - 8Gen3 = 45 TOPS, D9300 = 48 TOPS
n/a | n/a | Google | SoC | Mobile | Google Tensor (Edge TPU) | G5 | 5 | Laguna Beach (Dev Board: Deepspace) | 2025 | Pixel 10, Pixel 10 Pro | TSMC N3 + InFO-POP packaging | n/a | n/a | n/a | n/a | n/a | n/a | n/a
23 TOPS | n/a | Google | SoC | Datacenter (AI inference) | TPU | TPUv1 | 1 | n/a | 2015 | n/a | 28nm | n/a | n/a | n/a | DDR3-2133 | 34 GB/s | 75.0 | - The core of TPU: Systolic Array - Matrix Multiply Unit (MXU): a big systolic array - PCIe Gen3 x16
45 TOPS | 3 TFLOPS | Google | SoC | Datacenter (AI inference) | TPU | TPUv2 | 2 | n/a | 2017 | n/a | 16nm | n/a | n/a | n/a | n/a | 600 GB/s | 280.0 | - 16GB HBM - BF16
123 TOPS | 4 TFLOPS | Google | SoC | Datacenter (AI inference) | TPU | TPUv3 | 3 | n/a | 2018 | n/a | 16nm | n/a | n/a | n/a | n/a | 900 GB/s | 220.0 | n/a
275 TOPS | n/a | Google | SoC | Datacenter (AI inference) | TPU | TPUv4 | 4 | n/a | 2021 | n/a | 7nm | n/a | n/a | n/a | n/a | 1,200 GB/s | 170.0 | - 32GB HBM2
393 TOPS | n/a | Google | SoC | Datacenter (AI inference) | TPU | TPUv5e | 5 | n/a | 2023 | n/a | n/a | n/a | n/a | n/a | n/a | 819 GB/s | n/a | n/a
918 TOPS | n/a | Google | SoC | Datacenter (AI inference) | TPU | TPUv5p | 5 | n/a | 2023 | n/a | n/a | n/a | n/a | n/a | n/a | 2,765 GB/s | n/a | n/a
n/a | n/a | Google | SoC | Datacenter (AI inference) | TPU | TPUv6? Trillium? | 6 | n/a | 2024 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
n/a | 31 TFLOPS | Graphcore | SoC | Datacenter | Colossus | Colossus MK1 GC2 IPU | 1 | n/a | 2017 | n/a | TSMC 16nm | 1216 processor cores | n/a | n/a | n/a | 45,000 GB/s | n/a | n/a |
n/a | 62 TFLOPS | Graphcore | SoC | Datacenter | Colossus | Colossus MK2 GC200 IPU | 2 | n/a | 2020 | n/a | TSMC 7nm | 1472 processor cores | n/a | n/a | n/a | 47,500 GB/s | n/a | n/a |
n/a | n/a | Graphcore | SoC | Datacenter | Colossus | Colossus MK3 (TBD) | 3 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
n/a | n/a | Intel | SoC | HP Mobile, PC | n/a | n/a | n/a | Arrow Lake | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
120 TOPS | n/a | Intel | SoC | LP Mobile | Core Ultra | Core Ultra | Series 2 | Lunar Lake | 2024 | n/a | TSMC N3B (Compute tile), TSMC N6 (Platform controller tile) | P-core: Lion Cove E-core: Skymont | Xe2 | NPU 4 | n/a | n/a | n/a | - Total 120 TOPS (48 TOPS from NPU 4 + 67 TOPS from GPU + 5 TOPS from CPU).
34 TOPS | n/a | Intel | SoC | Mobile | Core Ultra | Core Ultra | Series 1 | Meteor Lake | 2023 | n/a | Intel 4 (7nm EUV, Compute tile), TSMC N5 (Graphics tile), TSMC N6 (Soc tile, I/O extender tile) | P-core: Redwood Cove E-core: Crestmont | Xe-LPG | NPU 3720 | n/a | n/a | n/a | - Total 34 TOPS (11 TOPS from NPU + 18 TOPS from GPU + 5 TOPS from CPU). |
0.5 TOPS | n/a | Intel | NPU | n/a | n/a | NPU 1 | 1 | n/a | 2018 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
7 TOPS | n/a | Intel | NPU | n/a | n/a | NPU 2 | 2 | n/a | 2021 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
11.5 TOPS | n/a | Intel | NPU | n/a | n/a | NPU 3 | 3 | n/a | 2023 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
48 TOPS | n/a | Intel | NPU | n/a | n/a | NPU 4 | 4 | n/a | 2024 | Lunar Lake | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
n/a | n/a | MediaTek | SoC | Mobile | Dimensity | Dimensity 9000 | 9000 | n/a | 2021 | Redmi K50 Pro OPPO Find X5 Pro Dimensity Edition vivo X80 / X80 Pro Dimensity Edition | TSMC N4 | 1× Cortex-X2 @ 3.05 GHz 3× Cortex-A710 @ 2.85 GHz 4× Cortex-A510 @ 1.8 GHz | Mali-G710 MP10 @ 850 MHz | MediaTek APU 590 | n/a | n/a | n/a | - 5G NR Sub-6GHz, LTE
n/a | n/a | MediaTek | SoC | Mobile | Dimensity | Dimensity 9000+ | 9000 | n/a | 2022 | Xiaomi 12 Pro Dimensity Edition ASUS ROG Phone 6D Ultimate iQOO Neo 7 OPPO Find N2 Flip | TSMC N4 | 1× Cortex-X2 @ 3.2 GHz 3× Cortex-A710 @ 2.85 GHz 4× Cortex-A510 @ 1.8 GHz | Mali-G710 MC10 | MediaTek APU 590 | n/a | n/a | n/a | - 5G NR Sub-6GHz, LTE
n/a | n/a | MediaTek | SoC | Mobile | Dimensity | Dimensity 9200 | 9000 | n/a | 2022 | vivo X90, vivo X90 Pro OPPO Find X6 OPPO Find N3 Flip | TSMC N4 | 1× Cortex-X3 @ 3.05GHz 3× Cortex-A715 @ 2.85GHz 4× Cortex-A510 @ 1.8GHz | Mali-Immortalis-G715 MP11 @ 981 MHz | MediaTek APU 690 | n/a | n/a | n/a | - 5G NR Sub-6 GHz, 5G mmWave, LTE
n/a | n/a | MediaTek | SoC | Mobile | Dimensity | Dimensity 9200+ | 9000 | n/a | 2023 | iQOO Neo8 Pro vivo X90s Redmi K60 Ultra | TSMC N4 | 1× Cortex-X3 @ 3.35 GHz 3× Cortex-A715 @ 3.0 GHz 4× Cortex-A510 @ 2.0 GHz | Mali-Immortalis-G715 MC11 | MediaTek APU 690 | n/a | n/a | n/a | - 5G NR Sub-6 GHz, 5G mmWave, LTE
n/a | n/a | MediaTek | SoC | Mobile | Dimensity | Dimensity 9300 | 9000 | n/a | 2023 | vivo X100, vivo X100 Pro OPPO Find X7 | TSMC N4P | 1× Cortex-X4 @ 3.25 GHz 3× Cortex-X4 @ 2.85 GHz 4× Cortex-A720 @ 2.0 GHz | Mali-Immortalis-G720 MC12 @ 1300 MHz | MediaTek APU 790 | n/a | n/a | n/a | - 5G NR (Sub-6 GHz & mmWave), 4G LTE, quad-band GNSS (BeiDou, Galileo, GLONASS, GPS, NavIC, QZSS), Bluetooth 5.4, Wi-Fi 7 (2x2)
n/a | n/a | MediaTek | SoC | Mobile | Dimensity | Dimensity 9300+ | 9000 | n/a | 2024 | vivo X100S, vivo X100X Pro | TSMC N4P | 1× Cortex-X4 @ 3.4 GHz 3× Cortex-X4 @ 2.85 GHz 4× Cortex-A720 @ 2.0 GHz | Mali-Immortalis-G720 MC12 @ 1300 MHz | MediaTek APU 790 | n/a | n/a | n/a | - 5G NR (Sub-6 GHz & mmWave), 4G LTE, quad-band GNSS (BeiDou, Galileo, GLONASS, GPS, NavIC, QZSS), Bluetooth 5.4, Wi-Fi 7 (2x2)
n/a | n/a | MediaTek | SoC | Mobile | Dimensity | Dimensity 9400 | 9000 | n/a | 2024 | vivo X200, OPPO Find X8 / Pro | TSMC N3 | 1× Cortex-X925 @ 3.63 GHz 3× Cortex-X4 @ 2.8 GHz 4× Cortex-A725 @ 2.1 GHz | Mali-Immortalis-G925 MC12 @ ??? MHz | n/a | n/a | n/a | n/a | n/a
n/a | n/a | Microsoft | SoC | Datacenter (Infra) | Azure Cobalt | Cobalt 100 | 1 | n/a | 2024 | Azure VM Dpsv6, Dplsv6, Epsv6 | n/a | 128 Neoverse V2 core | n/a | n/a | LPDDR5 ??? | n/a | n/a | - PCIe gen5 - CXL 1.1 - from project start to silicon in 13 months. |
1,600 TOPS | n/a | Microsoft | SoC | Datacenter (AI inference) | Azure Maia | Maia 100 | 1 | n/a | 2024 | Microsoft Copilot | TSMC N5 + CoWoS-S | n/a | n/a | n/a | n/a | 18,000 GB/s ??? | 500.0 | - 32Gb/s PCIe gen5x8 - Design to TDP = 700W - Provision TDP = 500W |
n/a | 15.1 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4060 | n/a | AD107-400 | 2023 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6 | 272 GB/s | 115.0 | - PCIe 4.0 x8 |
n/a | 22.1 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4060 Ti | n/a | AD106-351 | 2023 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6 | 288 GB/s | 160.0 | - PCIe 4.0 x8 |
n/a | 29.1 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4070 | n/a | AD104-250 | 2023 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6X | 504 GB/s | 200.0 | - PCIe 4.0 x16 |
n/a | 35.48 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4070 Super | n/a | AD104-350 | 2024 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6X | 504 GB/s | 220.0 | - PCIe 4.0 x16 |
n/a | 40.1 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4070 Ti | n/a | AD104-400 | 2023 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6X | 504 GB/s | 285.0 | - PCIe 4.0 x16 |
n/a | 44.10 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4070 Ti Super | n/a | AD103-275 | 2024 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6X | 672 GB/s | 285.0 | - PCIe 4.0 x16 |
n/a | 48.7 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4080 | n/a | AD103-300 | 2022 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6X | 717 GB/s | 320.0 | - PCIe 4.0 x16 |
n/a | 52.22 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4080 Super | n/a | AD103-400 | 2024 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6X | 736 GB/s | 320.0 | - PCIe 4.0 x16 |
n/a | 82.6 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4090 | n/a | AD102-300 | 2022 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6X | 1008 GB/s | 450.0 | - PCIe 4.0 x16 |
n/a | 73.5 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4090 D | n/a | AD102-250 | 2023 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6X | 1008 GB/s | 425.0 | - PCIe 4.0 x16 |
n/a | 124.96 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | A10 | Ampere | n/a | 2021 | n/a | n/a | n/a | 1× GA102-890-A1 | n/a | GDDR6 | 600 GB/s | n/a | n/a |
624 TOPS | 312.0 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | A100 | Ampere | n/a | 2020 | n/a | TSMC N7 | n/a | 1× GA100-883AA-A1 | n/a | HBM2 | 1555 GB/s | 400.0 | n/a |
n/a | 73.728 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | A16 | Ampere | n/a | 2021 | n/a | n/a | n/a | 4× GA107 | n/a | GDDR6 | 4x 200 GB/s | n/a | n/a |
n/a | 18.124 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | A2 | Ampere | n/a | 2021 | n/a | n/a | n/a | 1× GA107 | n/a | GDDR6 | 200 GB/s | 60.0 | n/a |
n/a | 165.12 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | A30 | Ampere | n/a | 2021 | n/a | n/a | n/a | 1× GA100 | n/a | HBM2 | 933.1 GB/s | n/a | n/a |
n/a | 149.68 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | A40 | Ampere | n/a | 2020 | n/a | n/a | n/a | 1× GA102 | n/a | GDDR6 | 695.8 GB/s | n/a | n/a |
3500 TOPS (3.5 POPS) | n/a | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | B100 (SXM6 card) | Blackwell | n/a | 2024 | n/a | TSMC 4NP (custom N4P) | n/a | n/a | n/a | HBM3E | 8000 GB/s | 700.0 | n/a |
4500 TOPS (4.5 POPS) | n/a | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | B200 (SXM6 card) | Blackwell | n/a | 2024 | n/a | TSMC 4NP (custom N4P) | n/a | n/a | n/a | HBM3E | 8000 GB/s | 1000.0 | n/a |
n/a | 756.449 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | H100 (PCIe card) | Hopper | n/a | 2022 | n/a | TSMC 4N (custom N4) | n/a | 1× GH100 | n/a | HBM2E | 2039 GB/s | n/a | n/a |
1980 TOPS (1.98 POPS) | 989.43 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | H100 (SXM5 card) | Hopper | n/a | 2022 | n/a | TSMC 4N (custom N4) | n/a | 1× GH100 | n/a | HBM3 | 3352 GB/s | 700.0 | n/a |
1980 TOPS (1.98 POPS) | 67 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | H200 (SXM5 card) | Hopper | n/a | 2023 | n/a | TSMC 4N (custom N4) | n/a | n/a | n/a | HBM3E | 4800 GB/s | 1000.0 | n/a |
n/a | 121.0 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | L4 | Ada Lovelace | n/a | 2023 | n/a | n/a | n/a | 1x AD104 | n/a | GDDR6 | 1563 GB/s | n/a | n/a |
n/a | 362.066 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | L40 | Ada Lovelace | n/a | 2022 | n/a | n/a | n/a | 1× AD102 | n/a | GDDR6 | 2250 GB/s | n/a | n/a |
n/a | 2.774 TFLOPS | Qualcomm | SoC | Mobile | Snapdragon 8 | Snapdragon 8 Gen 3 | 8 | n/a | 2023 | n/a | TSMC N4P | 1× 3.30 GHz Kryo Prime (Cortex-X4) + 3× 3.15 GHz Kryo Gold (Cortex-A720) + 2× 2.96 GHz Kryo Gold (Cortex-A720) + 2× 2.27 GHz Kryo Silver (Cortex-A520) | Adreno 750 @ 903 MHz | n/a | LPDDR5X | 76.8 GB/s | n/a | n/a |
n/a | 1.689 TFLOPS | Qualcomm | SoC | Mobile | Snapdragon 8 | Snapdragon 8s Gen 3 | 8 | n/a | 2024 | n/a | TSMC N4P | 1× 3.0 GHz Kryo Prime (Cortex-X4) + 4× 2.8 GHz Kryo Gold (Cortex-A720) + 3× 2.0 GHz Kryo Silver (Cortex-A520) | Adreno 735 @ 1100 MHz | n/a | LPDDR5X | 76.8 GB/s | n/a | n/a |
45 TOPS | 4.6 TFLOPS | Qualcomm | SoC | PC | Snapdragon X | Snapdragon X Elite | X | n/a | 2023 | n/a | TSMC N4 | Oryon | Adreno X1 | Hexagon | LPDDR5X-8448 @ 4224 MHz | 135 GB/s | n/a | - Total 75 TOPS (45 TOPS from NPU). |
45 TOPS | 3.8 TFLOPS | Qualcomm | SoC | PC | Snapdragon X | Snapdragon X Plus | X | n/a | 2024 | n/a | TSMC N4 | Oryon | Adreno X1-45 1107 MHz (1.7 TFLOPS) Adreno X1-45 (2.1 TFLOPS) Adreno X1-85 1250 MHz (3.8 TFLOPS) | Hexagon | LPDDR5X-8448 @ 4224 MHz | 135 GB/s | n/a | n/a |
45 TOPS | n/a | Qualcomm | NPU | n/a | Hexagon | Hexagon | n/a | n/a | n/a | Snapdragon X Plus | n/a | n/a | n/a | n/a | n/a | n/a | n/a | - Hexagon is the brand name for a family of digital signal processor (DSP) and later neural processing unit (NPU) products by Qualcomm. Hexagon is also known as QDSP6, standing for “sixth generation digital signal processor.” |
n/a | 2.1 TFLOPS | Qualcomm | GPU | n/a | Adreno | Adreno X1-45 | X | Adreno 726 | n/a | n/a | TSMC N4 | n/a | n/a | n/a | LPDDR5X-8448 @ 4224 MHz or LPDDR5X-8533 @ 4266.5 MHz | 125.1 GB/s or 136.5 GB/s | n/a | - The Adreno X1-45 is internally called the Adreno 726, suggesting it’s a scaled-up version of the Adreno 725 from the Snapdragon 7+ Gen 2.
n/a | 4.6 TFLOPS | Qualcomm | GPU | n/a | Adreno | Adreno X1-85 | X | Adreno 741 | n/a | Snapdragon X Plus | TSMC N4 | n/a | n/a | n/a | LPDDR5X-8448 @ 4224 MHz or LPDDR5X-8533 @ 4266.5 MHz | 125.1 GB/s or 136.5 GB/s | n/a | - The Adreno X1-85 is internally called the Adreno 741, suggesting it’s a scaled-up version of the Adreno 730 from the Snapdragon 8 Gen 1/8+ Gen 1.