Decoding AI TOPS: Essential Metrics for AI Chips and TOPS Comparison Chart

(Illustration: A lot of hard work has been going on behind the scenes. Le Bouchon Ogasawara, Shibuya, Tokyo. Image source: Ernest)



tl;dr

  • TOPS (Trillions of Operations Per Second) is a key indicator of the computational power of AI chips and NPUs: it states how many trillions of operations a processor can execute per second.
  • Using the “frying eggs” analogy to intuitively understand TOPS: A regular CPU is like a chef who can only fry one egg at a time, while a high TOPS performance AI chip is like a super chef who can fry an incredibly large number of eggs simultaneously.
  • TOPS is an important reference for comparing AI chip performance, but when evaluating AI hardware, factors such as energy efficiency and memory bandwidth should also be considered. Additionally, TOPS values typically reflect theoretical peak performance, and actual performance needs to be judged based on a combination of other metrics suitable for the application scenario.

What is TOPS (Simple Life Version)

TOPS, which stands for Trillions of Operations Per Second, is an important indicator for measuring the computational power of Artificial Intelligence (AI) chips or Neural Processing Units (NPUs). It expresses the maximum number of operations a processor can execute per second, counted in trillions. As computational power keeps increasing, the initial T may give way to a larger prefix, such as P for peta (POPS).

We can use an example from daily life to explain and more intuitively understand TOPS:

Imagine AI computation as the process of frying eggs, where data is the egg being heated.

A regular chef (ordinary processor, CPU) might only be able to fry one egg at a time, while a super chef (AI chip) can fry an enormous number of eggs in parallel. TOPS measures this “super chef’s” throughput: a 1 TOPS chip can “fry” one trillion “data eggs” per second.

TOPS is one of the important indicators for understanding and comparing AI chip performance, but it’s not the only one.

When evaluating AI hardware, AI phones, or AI computers, remember to consider other factors such as energy efficiency, memory bandwidth, software ecosystem, etc. Using TOPS can help us compare the computational power of different AI chips, providing a reference point for choosing AI hardware devices suitable for specific applications.


What is TOPS (In-depth Version, for Those Who Insist)

Before delving deeper into understanding TOPS, we need to first understand what an “operation” is:

In digital circuits and computer science, an “operation” typically refers to a basic mathematical or logical computation. For AI chips or NPUs, these operations mainly include:

  1. Floating-point operations: such as addition, subtraction, multiplication, and division.
  2. Matrix operations: Large-scale matrix multiplication is one of the most common operations in deep learning.
  3. Vector operations: including dot product (scalar product), cross product (vector product), etc.
  4. Activation functions: such as ReLU, Sigmoid, Tanh, etc.
  5. Convolution operations: widely used in Convolutional Neural Networks (CNN).

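These operations are typically performed in FP32 (32-bit floating-point) or FP16 (16-bit floating-point) formats. Some AI chips also support lower-precision formats such as INT8 (8-bit integer), typically for inference, to improve performance and reduce energy consumption. Headline TOPS figures are usually quoted for INT8, while floating-point throughput is usually reported in FLOPS, which is why the comparison table below keeps “INT8 Ops” and “FP32 FLOps” in separate columns.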
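To make the idea of an “operation” concrete, here is a minimal sketch that counts the operations in one matrix multiplication and estimates how long it would take at a given peak rate. The function names are hypothetical, and the convention of counting each multiply and each add as a separate operation is an assumption; vendors do not all count the same way.

```python
def matmul_ops(m: int, k: int, n: int) -> int:
    """Operations in an (m x k) @ (k x n) matrix multiplication.

    Assumption: each of the m*n outputs needs k multiplies and k adds,
    and every multiply and every add counts as one operation.
    """
    return 2 * m * k * n


def ideal_seconds(ops: int, peak_tops: float) -> float:
    """Best-case runtime if the chip sustained its peak rate the whole time."""
    return ops / (peak_tops * 1e12)


ops = matmul_ops(1024, 1024, 1024)
print(ops)                       # total operations for one 1024-cubed matmul
print(ideal_seconds(ops, 1.0))   # seconds at a sustained 1 TOPS
```

Real workloads rarely sustain the peak rate, so this is a lower bound on runtime rather than a prediction.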

The calculation of TOPS can be simplified as:

TOPS = (Number of operations per clock cycle) × (Clock frequency in Hz) / 10^12

For example, if an AI chip can perform 1000 operations per clock cycle and has a clock frequency of 1 GHz (10^9 cycles per second), then its theoretical peak performance is 1 TOPS:

1000 operations/cycle × 1 GHz = 1000 × 10^9 operations/second = 10^12 operations/second = 1 TOPS
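The same arithmetic can be written as a short Python sketch. The function names are hypothetical. The second helper assumes a chip built from multiply-accumulate (MAC) units where each MAC counts as two operations, which is a common counting convention but not something the formula above states.

```python
def peak_tops(ops_per_cycle: float, clock_hz: float) -> float:
    """Theoretical peak in TOPS: operations per cycle times cycles per second, over 10^12."""
    return ops_per_cycle * clock_hz / 1e12


def peak_tops_from_macs(mac_units: int, clock_hz: float, ops_per_mac: int = 2) -> float:
    """Peak TOPS for a hypothetical MAC array (assumes one multiply plus one add per MAC)."""
    return peak_tops(mac_units * ops_per_mac, clock_hz)


# Worked example from the text: 1000 operations per cycle at 1 GHz.
print(peak_tops(1000, 1e9))

# Hypothetical chip: 4096 MAC units clocked at 1.5 GHz.
print(peak_tops_from_macs(4096, 1.5e9))
```

The first call reproduces the 1 TOPS result above; the second shows how the choice of counting convention directly scales the headline number.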

When understanding TOPS, please note the following points:

  1. TOPS typically represents theoretical peak performance; actual performance may vary due to factors such as memory bandwidth and chip architecture.
  2. TOPS values may differ for different types of operations (such as FP32, FP16, INT8).
  3. A high TOPS value doesn’t necessarily mean better performance in all AI tasks, as actual performance also depends on software optimization and the characteristics of specific tasks.
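One way to picture the first point is a roofline-style estimate. This is a standard back-of-the-envelope technique that the article itself does not describe, shown here only as a rough sketch. It assumes a workload is limited either by peak compute or by how fast memory can feed it, and `ops_per_byte` (operations performed per byte moved from memory) is a hypothetical workload parameter. The peak and bandwidth figures are the NVIDIA A100 values from the comparison table.

```python
def attainable_tops(peak_tops: float, bandwidth_gb_s: float, ops_per_byte: float) -> float:
    """Roofline-style upper bound: the lower of the compute limit and the memory limit."""
    memory_bound_tops = bandwidth_gb_s * 1e9 * ops_per_byte / 1e12
    return min(peak_tops, memory_bound_tops)


# Figures taken from the comparison table (NVIDIA A100): 624 TOPS, 1555 GB/s.
PEAK_TOPS = 624.0
BANDWIDTH_GB_S = 1555.0

# ops_per_byte values are illustrative assumptions, not measurements.
for ops_per_byte in (10, 100, 1000):
    bound = attainable_tops(PEAK_TOPS, BANDWIDTH_GB_S, ops_per_byte)
    print(f"{ops_per_byte} ops/byte -> at most {bound:.1f} TOPS")
```

Under these assumptions, a workload that does little computation per byte is capped by memory bandwidth well below the headline TOPS value, and only compute-heavy workloads can approach the peak.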

TOPS Comparison Table

(Focus mainly on the “INT8 Ops” column.)

INT8 Ops | FP32 FLOps | Company Name | Type | Target Market | Product Family | Product Name | Product Generation | Code Name | Release Year | First Used On | Fab Process | CPU | GPU | NPU | Memory Tech | Memory Bandwidth | TDP Base (W) | Remark
73 TOPS | n/a | AMD | SoC | PC | Ryzen AI 300 | Ryzen AI 9 365 | n/a | Strix Point | 2024 | n/a | TSMC 4nm FinFET | n/a | AMD Radeon™ 880M | n/a | DDR5-5600 or LPDDR5X-7500 | n/a | 28.0 | - Total 73 TOPS (50 TOPS from NPU).
80 TOPS | n/a | AMD | SoC | PC | Ryzen AI 300 | Ryzen AI 9 HX 370 | n/a | Strix Point | 2024 | n/a | TSMC 4nm FinFET | n/a | AMD Radeon™ 890M | n/a | DDR5-5600 or LPDDR5X-7500 | n/a | 28.0 | - Total 80 TOPS (50 TOPS from NPU).
50 TOPS | n/a | AMD | NPU | n/a | Ryzen | XDNA 2 | n/a | AI | 2024 | Ryzen AI 9 HX 370 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
1961.2 TOPS, 3922.3 TOPS (with Sparsity) | 122.6 TFLOPS | AMD | GPU | Datacenter | AMD Data Center GPUs (AMD Instinct) | MI300A | n/a | n/a | 2023 | n/a | n/a | n/a | n/a | n/a | HBM3 | 5300 GB/s | 550.0 | n/a
2614.9 TOPS, 5229.8 TOPS (with Sparsity) | 163.4 TFLOPS | AMD | GPU | Datacenter | AMD Data Center GPUs (AMD Instinct) | MI300X | n/a | n/a | 2023 | n/a | XCD: TSMC N5, IOD: TSMC N6 | n/a | n/a | n/a | HBM3 | 5300 GB/s | 750.0 | n/a
2614.9 TOPS, 5229.8 TOPS (with Sparsity) | 163.4 TFLOPS | AMD | GPU | Datacenter | AMD Data Center GPUs (AMD Instinct) | MI325X | n/a | n/a | 2024 | n/a | XCD: TSMC N5, IOD: TSMC N6 | n/a | n/a | n/a | HBM3E | 6000 GB/s | 750.0 | n/a
n/a | n/a | ARM | IP | n/a | Neoverse | Neoverse E1 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
n/a | n/a | ARM | IP | n/a | Neoverse | Neoverse N1 | n/a | Ares | 2019 | Ampere Altra, AWS Graviton2 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
n/a | n/a | ARM | IP | Datacenter (Infrastructure Processor) | Neoverse | Neoverse N2 | n/a | Perseus | 2020 | Microsoft Azure Cobalt 100 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
n/a | n/a | ARM | IP | Datacenter (Infrastructure Processor) | Neoverse | Neoverse N3 | n/a | Hermes | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
n/a | n/a | ARM | IP | Datacenter (Infrastructure Processor) | Neoverse | Neoverse V1 | n/a | Zeus | 2020 | AWS Graviton3 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | - first announcements coming out of Arm’s TechCon convention 2018 in San Jose.
n/a | n/a | ARM | IP | Datacenter (Infrastructure Processor) | Neoverse | Neoverse V2 | n/a | n/a | 2022 | NVIDIA Grace, AWS Graviton4, Google Axion | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
n/a | n/a | ARM | IP | Datacenter (Infrastructure Processor) | Neoverse | Neoverse V3 | n/a | Poseidon | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
825 TOPS ??? | n/a | Alibaba | SoC | Datacenter (AI inference) | Hanguang 含光 | Hanguang 800 | 1 | n/a | 2019 | n/a | TSMC 12nm | n/a | n/a | n/a | n/a | n/a | 280.0 | - 16x PCIe gen4 - SRAM, No DDR
n/a | n/a | Alibaba | SoC | Datacenter (Infra) | Yitian 倚天 | Yitian 710 | 1 | n/a | 2021 | Alibaba ECS g8m | N5 | 128 Neoverse N2 core | n/a | n/a | n/a | n/a | n/a | n/a
n/a | n/a | Amazon | SoC | Datacenter (Infra) (Scale out) | AWS Graviton | Graviton | 1 | Alpine | 2018 | Amazon EC2 A1 | TSMC 16nm | Cortex A72 | n/a | n/a | DDR4-1600 | 51.2 GB/s | 95.0 | - 32 lanes of PCIe gen3
n/a | n/a | Amazon | SoC | Datacenter (Infra) (General Purpose) | AWS Graviton | Graviton 2 | 2 | Alpine+ | 2019 | Amazon EC2 M6g, M6gd, C6g, C6gd, C6gn, R6g, R6gd, T4g, X2gd, G5g, Im4gn, Is4gen, I4g | TSMC 7nm | 128 Neoverse N1 core | n/a | n/a | DDR4-3200 | 204.8 GB/s | 110.0 | - 64 lanes of PCIe gen4
n/a | n/a | Amazon | SoC | Datacenter (Infra) (ML, HPC, SIMD) | AWS Graviton | Graviton 3 | 3 | n/a | 2021 | Amazon EC2 C7g, M7g, R7g; with local disk: C7gd, M7gd, R7gd | TSMC 5nm | 64 Neoverse V1 core | n/a | n/a | DDR5-4800 | 307.2 GB/s | 100.0 | - 32 lanes of PCIe gen5
n/a | n/a | Amazon | SoC | Datacenter (Infra) | AWS Graviton | Graviton 3E | 3 | n/a | 2022 | Amazon EC2 C7gn, HPC7g | n/a | 64 Neoverse V1 core | n/a | n/a | n/a | n/a | n/a | n/a
n/a | n/a | Amazon | SoC | Datacenter (Infra) (Scale up) | AWS Graviton | Graviton 4 | 4 | n/a | 2023 | Amazon EC2 R8g | n/a | 96 Neoverse V2 core | n/a | n/a | DDR5-5600 | 537.6 GB/s | n/a | - 96 lanes of PCIe gen5
63.3 TOPS | 0.97 TFLOPS | Amazon | SoC | Datacenter (AI inference) | AWS Inferentia | Inferentia 1 | 1 | n/a | 2018 | Amazon EC2 Inf1 | TSMC 16nm | 16 NeuronCore v1 | n/a | n/a | n/a | 50 GB/s | n/a | n/a
380 TOPS | 2.9 TFLOPS | Amazon | SoC | Datacenter (AI inference) | AWS Inferentia | Inferentia 2 | 2 | n/a | 2022 | Amazon EC2 Inf2 | TSMC 5nm | 24 NeuronCore v2 | n/a | n/a | n/a | 820 GB/s | n/a | n/a
380 TOPS | 2.9 TFLOPS | Amazon | SoC | Datacenter (AI train) | AWS Trainium | Trainium 1 | 1 | n/a | 2020 | Amazon EC2 Trn1 | TSMC 7nm | 32 NeuronCore v2 | n/a | n/a | n/a | 820 GB/s | n/a | n/a
861 TOPS | 6.57 TFLOPS | Amazon | SoC | Datacenter (AI train) | AWS Trainium | Trainium 2 | 2 | n/a | 2023 | Amazon EC2 Trn2 | TSMC 4nm | 64 NeuronCore v2 | n/a | n/a | n/a | 4,096 GB/s | n/a | n/a
11 TOPS | 748.8 GFLOPS | Apple | SoC | Mobile | A | A14 Bionic | n/a | APL1W01 | 2020 | iPhone 12 | TSMC N5 | Firestorm + Icestorm | n/a | n/a | LPDDR4X-4266 | 34.1 GB/s | n/a | n/a
15.8 TOPS | 1.37 TFLOPS | Apple | SoC | Mobile | A | A15 Bionic | n/a | APL1W07 | 2021 | iPhone 13 | TSMC N5P | Avalanche + Blizzard | n/a | n/a | LPDDR4X-4266 | 34.1 GB/s | n/a | n/a
17 TOPS | 1.789 TFLOPS | Apple | SoC | Mobile | A | A16 Bionic | n/a | APL1W10 | 2022 | iPhone 14 | TSMC N4P | Everest + Sawtooth | n/a | n/a | LPDDR5-6400 | 51.2 GB/s | n/a | - 6GB LPDDR5
35 TOPS | 2.147 TFLOPS | Apple | SoC | Mobile | A | A17 Pro | n/a | APL1V02 | 2023 | iPhone 15 Pro, iPhone 15 Pro Max | TSMC N3B | 6 cores (2 performance + 4 efficiency) | Apple-designed 6-core | 16-core Neural Engine | LPDDR5-6400 | 51.2 GB/s | n/a | - 8GB LPDDR5
35 TOPS | n/a | Apple | SoC | Mobile | A | A18 | n/a | n/a | 2024 | iPhone 16 | TSMC N3P | 6 cores (2 performance + 4 efficiency) | Apple-designed 5-core | 16-core Neural Engine | n/a | n/a | n/a | n/a
35 TOPS | n/a | Apple | SoC | Mobile | A | A18 Pro | n/a | n/a | 2024 | iPhone 16 Pro | TSMC N3P | 6 cores (2 performance + 4 efficiency) | Apple-designed 6-core | 16-core Neural Engine | n/a | n/a | n/a | n/a
11 TOPS | 2.6 TFLOPS | Apple | SoC | Mobile, PC | M | M1 | n/a | APL1102 | 2020 | n/a | TSMC N5 | high-performance “Firestorm” + energy-efficient “Icestorm” | n/a | n/a | LPDDR4X-4266 | 68.3 GB/s | n/a | n/a
11 TOPS | 10.4 TFLOPS | Apple | SoC | Mobile, PC | M | M1 Max | n/a | APL1105 | 2021 | n/a | TSMC N5 | n/a | n/a | n/a | LPDDR5-6400 | 409.6 GB/s | n/a | n/a
11 TOPS | n/a | Apple | SoC | Mobile, PC | M | M1 Pro | n/a | APL1103 | 2021 | n/a | TSMC N5 | n/a | n/a | n/a | LPDDR5-6400 | 204.8 GB/s | n/a | n/a
22 TOPS | 21 TFLOPS | Apple | SoC | Mobile, PC | M | M1 Ultra | n/a | APL1W06 | 2022 | n/a | TSMC N5 | The M1 Ultra consists of two M1 Max units connected with UltraFusion Interconnect with a total of 20 CPU cores and 96 MB system level cache (SLC). | n/a | n/a | LPDDR5-6400 | 819.2 GB/s | n/a | n/a
15.8 TOPS | 2.863 TFLOPS, 3.578 TFLOPS | Apple | SoC | Mobile, PC | M | M2 | n/a | APL1109 | 2022 | n/a | TSMC N5P | high-performance @3.49 GHz “Avalanche” + energy-efficient @2.42 GHz “Blizzard” | n/a | n/a | LPDDR5-6400 | 102.4 GB/s | n/a | n/a
15.8 TOPS | 10.736 TFLOPS, 13.599 TFLOPS | Apple | SoC | Mobile, PC | M | M2 Max | n/a | APL1111 | 2023 | n/a | TSMC N5P | n/a | n/a | n/a | LPDDR5-6400 | 409.6 GB/s | n/a | n/a
15.8 TOPS | 5.726 TFLOPS, 6.799 TFLOPS | Apple | SoC | Mobile, PC | M | M2 Pro | n/a | APL1113 | 2023 | n/a | TSMC N5P | n/a | n/a | n/a | LPDDR5-6400 | 204.8 GB/s | n/a | n/a
31.6 TOPS | 21.473 TFLOPS, 27.199 TFLOPS | Apple | SoC | Mobile, PC | M | M2 Ultra | n/a | APL1W12 | 2023 | n/a | TSMC N5P | n/a | n/a | n/a | LPDDR5-6400 | 819.2 GB/s | n/a | n/a
18 TOPS | 2.826 TFLOPS, 3.533 TFLOPS | Apple | SoC | Mobile, PC | M | M3 | n/a | APL1201 | 2023 | MacBook Pro | TSMC N3B | n/a | n/a | n/a | LPDDR5-6400 | 102.4 GB/s | n/a | n/a
18 TOPS | 10.598 TFLOPS, 14.131 TFLOPS | Apple | SoC | Mobile, PC | M | M3 Max | n/a | APL1204 | 2023 | n/a | TSMC N3B | n/a | n/a | n/a | LPDDR5-6400 | 307.2 GB/s, 409.6 GB/s | n/a | n/a
18 TOPS | 4.946 TFLOPS, 6.359 TFLOPS | Apple | SoC | Mobile, PC | M | M3 Pro | n/a | APL1203 | 2023 | n/a | TSMC N3B | n/a | n/a | n/a | LPDDR5-6400 | 153.6 GB/s | n/a | n/a
38 TOPS | 3.763 TFLOPS | Apple | SoC | Mobile, PC | M | M4 | n/a | APL1206 | 2024 | iPad Pro (7th generation) | TSMC N3E | 10 cores (4 performance + 6 efficiency) | Apple-designed 10-core | 16-core Neural Engine | LPDDR5X-7500 | 120 GB/s | n/a | n/a
38 TOPS | n/a | Apple | SoC | Mobile, PC | M | M4 Max | n/a | n/a | 2024 | MacBook Pro M4 Max | TSMC N3E | 14 cores (10 performance + 4 efficiency); 16 cores (12 performance + 4 efficiency) | Apple-designed 16-core; Apple-designed 20-core | 16-core Neural Engine | LPDDR5X-8533 | 409.6 GB/s (36GB), 546 GB/s (48GB, 64GB, 128GB) | n/a | n/a
38 TOPS | n/a | Apple | SoC | Mobile, PC | M | M4 Pro | n/a | n/a | 2024 | MacBook Pro M4 Pro, Mac mini M4 Pro | TSMC N3E | 12 cores (8 performance + 4 efficiency); 14 cores (10 performance + 4 efficiency) | Apple-designed 32-core; Apple-designed 40-core | 16-core Neural Engine | LPDDR5X-8533 | 273 GB/s | n/a | n/a
n/a | n/a | Google | SoC | Datacenter (Infra) | GCP CPU | Axion | n/a | Axion | 2024 | GCP Compute Engine ??? | n/a | ?? Neoverse V2 core | n/a | n/a | n/a | n/a | n/a | n/a
1.6 TOPS | n/a | Google | SoC | Mobile | Google Tensor (Edge TPU) | G1 | 1 | Whitechapel | 2021 | Pixel 6, Pixel 6 Pro, Pixel 6a | Samsung 5 nm LPE | Octa-core: 2.8 GHz Cortex-X1 (2×), 2.25 GHz Cortex-A76 (2×), 1.8 GHz Cortex-A55 (4×) | Mali-G78 MP20 at 848 MHz | Google Edge TPU | LPDDR5 | 51.2 GB/s | n/a | n/a
n/a | n/a | Google | SoC | Mobile | Google Tensor (Edge TPU) | G2 | 2 | Cloudripper | 2022 | Pixel 7, Pixel 7 Pro, Pixel 7a, Pixel Fold, Pixel Tablet | Samsung 5 nm LPE | Octa-core: 2.85 GHz Cortex-X1 (2×), 2.35 GHz Cortex-A78 (2×), 1.8 GHz Cortex-A55 (4×) | Mali-G710 MP7 at 850 MHz | Google Edge TPU | LPDDR5 | 51.2 GB/s | n/a | n/a
27 TOPS | n/a | Google | SoC | Mobile | Google Tensor (Edge TPU) | G3 | 3 | Zuma (Dev Board: Ripcurrent) | 2023 | Pixel 8, Pixel 8 Pro, Pixel 8a | Samsung 4nm LPP | Nona-core: 2.91 GHz Cortex-X3 (1×), 2.37 GHz Cortex-A715 (4×), 1.7 GHz Cortex-A510 (4×) | Mali-G715 MP10 at 890 MHz | Google Edge TPU (Rio) | LPDDR5X | 68.2 GB/s | n/a | n/a
45 TOPS | n/a | Google | SoC | Mobile | Google Tensor (Edge TPU) | G4 | 4 | Zuma Pro | 2024 | Pixel 9, Pixel 9 Pro | Samsung 4nm LPP | Octa-core: 3.1 GHz Cortex-X4 (1×), 2.6 GHz Cortex-A720 (3×), 1.92 GHz Cortex-A520 (4×) | Mali-G715 MP10 at 940 MHz | n/a | LPDDR5X | n/a | n/a | - 8Gen3 = 45 TOPS, D9300 = 48 TOPS
n/a | n/a | Google | SoC | Mobile | Google Tensor (Edge TPU) | G5 | 5 | Laguna Beach (Dev Board: Deepspace) | 2025 | Pixel 10, Pixel 10 Pro | TSMC N3 + InFO-POP packaging | n/a | n/a | n/a | n/a | n/a | n/a | n/a
23 TOPS | n/a | Google | SoC | Datacenter (AI inference) | TPU | TPUv1 | 1 | n/a | 2015 | n/a | 28nm | n/a | n/a | n/a | DDR3-2133 | 34 GB/s | 75.0 | - The core of TPU: Systolic Array - Matrix Multiply Unit (MXU): a big systolic array - PCIe Gen3 x16
45 TOPS | 3 TFLOPS | Google | SoC | Datacenter (AI inference) | TPU | TPUv2 | 2 | n/a | 2017 | n/a | 16nm | n/a | n/a | n/a | n/a | 600 GB/s | 280.0 | - 16GB HBM - BF16
123 TOPS | 4 TFLOPS | Google | SoC | Datacenter (AI inference) | TPU | TPUv3 | 3 | n/a | 2018 | n/a | 16nm | n/a | n/a | n/a | n/a | 900 GB/s | 220.0 | n/a
275 TOPS | n/a | Google | SoC | Datacenter (AI inference) | TPU | TPUv4 | 4 | n/a | 2021 | n/a | 7nm | n/a | n/a | n/a | n/a | 1,200 GB/s | 170.0 | - 32GB HBM2
393 TOPS | n/a | Google | SoC | Datacenter (AI inference) | TPU | TPUv5e | 5 | n/a | 2023 | n/a | n/a | n/a | n/a | n/a | n/a | 819 GB/s | n/a | n/a
918 TOPS | n/a | Google | SoC | Datacenter (AI inference) | TPU | TPUv5p | 5 | n/a | 2023 | n/a | n/a | n/a | n/a | n/a | n/a | 2,765 GB/s | n/a | n/a
n/a | n/a | Google | SoC | Datacenter (AI inference) | TPU | TPUv6? Trillium? | 6 | n/a | 2024 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
n/a | 31 TFLOPS | Graphcore | SoC | Datacenter | Colossus | Colossus MK1 GC2 IPU | 1 | n/a | 2017 | n/a | TSMC 16nm | 1216 processor cores | n/a | n/a | n/a | 45,000 GB/s | n/a | n/a
n/a | 62 TFLOPS | Graphcore | SoC | Datacenter | Colossus | Colossus MK2 GC200 IPU | 2 | n/a | 2020 | n/a | TSMC 7nm | 1472 processor cores | n/a | n/a | n/a | 47,500 GB/s | n/a | n/a
n/a | n/a | Graphcore | SoC | Datacenter | Colossus | Colossus MK3 (TBD) | 3 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
n/a | n/a | Intel | SoC | HP Mobile, PC | n/a | n/a | n/a | Arrow Lake | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
120 TOPS | n/a | Intel | SoC | LP Mobile | Core Ultra | Core Ultra | Series 2 | Lunar Lake | 2024 | n/a | TSMC N3B (Compute tile), TSMC N6 (Platform controller tile) | P-core: Lion Cove, E-core: Skymont | Xe2 | NPU 4 | n/a | n/a | n/a | - Total 120 TOPS (48 TOPS from NPU 4 + 67 TOPS from GPU + 5 TOPS from CPU).
34 TOPS | n/a | Intel | SoC | Mobile | Core Ultra | Core Ultra | Series 1 | Meteor Lake | 2023 | n/a | Intel 4 (7nm EUV, Compute tile), TSMC N5 (Graphics tile), TSMC N6 (SoC tile, I/O extender tile) | P-core: Redwood Cove, E-core: Crestmont | Xe-LPG | NPU 3720 | n/a | n/a | n/a | - Total 34 TOPS (11 TOPS from NPU + 18 TOPS from GPU + 5 TOPS from CPU).
0.5 TOPS | n/a | Intel | NPU | n/a | n/a | NPU 1 | 1 | n/a | 2018 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
7 TOPS | n/a | Intel | NPU | n/a | n/a | NPU 2 | 2 | n/a | 2021 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
11.5 TOPS | n/a | Intel | NPU | n/a | n/a | NPU 3 | 3 | n/a | 2023 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
48 TOPS | n/a | Intel | NPU | n/a | n/a | NPU 4 | 4 | n/a | 2024 | Lunar Lake | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
n/a | n/a | MediaTek | SoC | Mobile | Dimensity 天璣 | Dimensity 9000 天璣 9000 | 9000 | n/a | 2021 | Redmi K50 Pro, OPPO Find X5 Pro (Dimensity edition), vivo X80 / X80 Pro (Dimensity edition) | TSMC N4 | 1× Cortex-X2 @ 3.05 GHz, 3× Cortex-A710 @ 2.85 GHz, 4× Cortex-A510 @ 1.8 GHz | Mali-G710 MP10 @ 850 MHz | MediaTek APU 590 | n/a | n/a | n/a | - 5G NR Sub-6GHz, LTE
n/a | n/a | MediaTek | SoC | Mobile | Dimensity 天璣 | Dimensity 9000+ 天璣 9000+ | 9000 | n/a | 2022 | Xiaomi 12 Pro (Dimensity edition), ASUS ROG Phone 6D Ultimate, iQOO Neo 7, OPPO Find N2 Flip | TSMC N4 | 1× Cortex-X2 @ 3.2 GHz, 3× Cortex-A710 @ 2.85 GHz, 4× Cortex-A510 @ 1.8 GHz | Mali-G710 MC10 | MediaTek APU 590 | n/a | n/a | n/a | - 5G NR Sub-6GHz, LTE
n/a | n/a | MediaTek | SoC | Mobile | Dimensity 天璣 | Dimensity 9200 天璣 9200 | 9000 | n/a | 2022 | vivo X90, vivo X90 Pro, OPPO Find X6, OPPO Find N3 Flip | TSMC N4 | 1× Cortex-X3 @ 3.05GHz, 3× Cortex-A715 @ 2.85GHz, 4× Cortex-A510 @ 1.8GHz | Mali-Immortalis-G715 MP11 @ 981 MHz | MediaTek APU 690 | n/a | n/a | n/a | - 5G NR Sub-6 GHz, 5G mmWave, LTE
n/a | n/a | MediaTek | SoC | Mobile | Dimensity 天璣 | Dimensity 9200+ 天璣 9200+ | 9000 | n/a | 2023 | iQOO Neo8 Pro, vivo X90s, Redmi K60 Ultra | TSMC N4 | 1× Cortex-X3 @ 3.35 GHz, 3× Cortex-A715 @ 3.0 GHz, 4× Cortex-A510 @ 2.0 GHz | Mali-Immortalis-G715 MC11 | MediaTek APU 690 | n/a | n/a | n/a | - 5G NR Sub-6 GHz, 5G mmWave, LTE
n/a | n/a | MediaTek | SoC | Mobile | Dimensity 天璣 | Dimensity 9300 天璣 9300 | 9000 | n/a | 2023 | vivo X100, vivo X100 Pro, OPPO Find X7 | TSMC N4P | 1× Cortex-X4 @ 3.25 GHz, 3× Cortex-X4 @ 2.85 GHz, 4× Cortex-A720 @ 2.0 GHz | Mali-Immortalis-G720 MC12 @ 1300 MHz | MediaTek APU 790 | n/a | n/a | n/a | - 5G NR (Sub-6 GHz & mmWave), 4G LTE, quad-band GNSS (BeiDou, Galileo, GLONASS, GPS, NavIC, QZSS), Bluetooth 5.4, Wi-Fi 7 (2x2)
n/a | n/a | MediaTek | SoC | Mobile | Dimensity 天璣 | Dimensity 9300+ 天璣 9300+ | 9000 | n/a | 2024 | vivo X100S, vivo X100X Pro | TSMC N4P | 1× Cortex-X4 @ 3.4 GHz, 3× Cortex-X4 @ 2.85 GHz, 4× Cortex-A720 @ 2.0 GHz | Mali-Immortalis-G720 MC12 @ 1300 MHz | MediaTek APU 790 | n/a | n/a | n/a | - 5G NR (Sub-6 GHz & mmWave), 4G LTE, quad-band GNSS (BeiDou, Galileo, GLONASS, GPS, NavIC, QZSS), Bluetooth 5.4, Wi-Fi 7 (2x2)
n/a | n/a | MediaTek | SoC | Mobile | Dimensity 天璣 | Dimensity 9400 天璣 9400 | 9000 | n/a | 2024 | vivo X200, OPPO Find X8 / Pro | TSMC N3 | 1× Cortex-X925 @ 3.63 GHz, 3× Cortex-X4 @ 2.8 GHz, 4× Cortex-A725 @ 2.1 GHz | Mali-Immortalis-G925 MC12 @ ??? MHz | n/a | n/a | n/a | n/a | n/a
n/a | n/a | Microsoft | SoC | Datacenter (Infra) | Azure Cobalt | Cobalt 100 | 1 | n/a | 2024 | Azure VM Dpsv6, Dplsv6, Epsv6 | n/a | 128 Neoverse V2 core | n/a | n/a | LPDDR5 ??? | n/a | n/a | - PCIe gen5 - CXL 1.1 - from project start to silicon in 13 months.
1,600 TOPS | n/a | Microsoft | SoC | Datacenter (AI inference) | Azure Maia | Maia 100 | 1 | n/a | 2024 | Microsoft Copilot | TSMC N5 + CoWoS-S | n/a | n/a | n/a | n/a | 18,000 GB/s ??? | 500.0 | - 32Gb/s PCIe gen5x8 - Design to TDP = 700W - Provision TDP = 500W
n/a | 15.1 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4060 | n/a | AD107-400 | 2023 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6 | 272 GB/s | 115.0 | - PCIe 4.0 x8
n/a | 22.1 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4060 Ti | n/a | AD106-351 | 2023 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6 | 288 GB/s | 160.0 | - PCIe 4.0 x8
n/a | 29.1 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4070 | n/a | AD104-250 | 2023 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6X | 504 GB/s | 200.0 | - PCIe 4.0 x16
n/a | 35.48 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4070 Super | n/a | AD104-350 | 2024 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6X | 504 GB/s | 220.0 | - PCIe 4.0 x16
n/a | 40.1 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4070 Ti | n/a | AD104-400 | 2023 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6X | 504 GB/s | 285.0 | - PCIe 4.0 x16
n/a | 44.10 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4070 Ti Super | n/a | AD103-275 | 2024 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6X | 672 GB/s | 285.0 | - PCIe 4.0 x16
n/a | 48.7 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4080 | n/a | AD103-300 | 2022 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6X | 717 GB/s | 320.0 | - PCIe 4.0 x16
n/a | 52.22 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4080 Super | n/a | AD103-400 | 2024 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6X | 736 GB/s | 320.0 | - PCIe 4.0 x16
n/a | 82.6 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4090 | n/a | AD102-300 | 2022 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6X | 1008 GB/s | 450.0 | - PCIe 4.0 x16
n/a | 73.5 TFLOPS | NVIDIA | GPU | Desktop | GeForce RTX 40 | GeForce RTX 4090 D | n/a | AD102-250 | 2023 | n/a | TSMC N4 | n/a | n/a | n/a | GDDR6X | 1008 GB/s | 425.0 | - PCIe 4.0 x16
n/a | 124.96 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | A10 | Ampere | n/a | 2021 | n/a | n/a | n/a | 1× GA102-890-A1 | n/a | GDDR6 | 600 GB/s | n/a | n/a
624 TOPS | 312.0 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | A100 | Ampere | n/a | 2020 | n/a | TSMC N7 | n/a | 1× GA100-883AA-A1 | n/a | HBM2 | 1555 GB/s | 400.0 | n/a
n/a | 73.728 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | A16 | Ampere | n/a | 2021 | n/a | n/a | n/a | 4× GA107 | n/a | GDDR6 | 4x 200 GB/s | n/a | n/a
n/a | 18.124 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | A2 | Ampere | n/a | 2021 | n/a | n/a | n/a | 1× GA107 | n/a | GDDR6 | 200 GB/s | 60.0 | n/a
n/a | 165.12 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | A30 | Ampere | n/a | 2021 | n/a | n/a | n/a | 1× GA100 | n/a | HBM2 | 933.1 GB/s | n/a | n/a
n/a | 149.68 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | A40 | Ampere | n/a | 2020 | n/a | n/a | n/a | 1× GA102 | n/a | GDDR6 | 695.8 GB/s | n/a | n/a
3500 TOPS (3.5 POPS) | n/a | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | B100 (SXM6 card) | Blackwell | n/a | 2024 | n/a | TSMC 4NP (custom N4P) | n/a | n/a | n/a | HBM3E | 8000 GB/s | 700.0 | n/a
4500 TOPS (4.5 POPS) | n/a | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | B200 (SXM6 card) | Blackwell | n/a | 2024 | n/a | TSMC 4NP (custom N4P) | n/a | n/a | n/a | HBM3E | 8000 GB/s | 1000.0 | n/a
n/a | 756.449 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | H100 (PCIe card) | Hopper | n/a | 2022 | n/a | TSMC 4N (custom N4) | n/a | 1× GH100 | n/a | HBM2E | 2039 GB/s | n/a | n/a
1980 TOPS (1.98 POPS) | 989.43 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | H100 (SXM5 card) | Hopper | n/a | 2022 | n/a | TSMC 4N (custom N4) | n/a | 1× GH100 | n/a | HBM3 | 3352 GB/s | 700.0 | n/a
1980 TOPS (1.98 POPS) | 67 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | H200 (SXM5 card) | Hopper | n/a | 2023 | n/a | TSMC 4N (custom N4) | n/a | n/a | n/a | HBM3E | 4800 GB/s | 1000.0 | n/a
n/a | 121.0 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | L4 | Ada Lovelace | n/a | 2023 | n/a | n/a | n/a | 1x AD104 | n/a | GDDR6 | 1563 GB/s | n/a | n/a
n/a | 362.066 TFLOPS | NVIDIA | GPU | Datacenter | Nvidia Data Center GPUs (Nvidia Tesla) | L40 | Ada Lovelace | n/a | 2022 | n/a | n/a | n/a | 1× AD102 | n/a | GDDR6 | 2250 GB/s | n/a | n/a
n/a | 2.774 TFLOPS | Qualcomm | SoC | Mobile | Snapdragon 8 | Snapdragon 8 Gen 3 | 8 | n/a | 2023 | n/a | TSMC N4P | 1× 3.30 GHz Kryo Prime (Cortex-X4) + 3× 3.15 GHz Kryo Gold (Cortex-A720) + 2× 2.96 GHz Kryo Gold (Cortex-A720) + 2× 2.27 GHz Kryo Silver (Cortex-A520) | Adreno 750 @ 903 MHz | n/a | LPDDR5X | 76.8 GB/s | n/a | n/a
n/a | 1.689 TFLOPS | Qualcomm | SoC | Mobile | Snapdragon 8 | Snapdragon 8s Gen 3 | 8 | n/a | 2024 | n/a | TSMC N4P | 1× 3.0 GHz Kryo Prime (Cortex-X4) + 4× 2.8 GHz Kryo Gold (Cortex-A720) + 3× 2.0 GHz Kryo Silver (Cortex-A520) | Adreno 735 @ 1100 MHz | n/a | LPDDR5X | 76.8 GB/s | n/a | n/a
45 TOPS | 4.6 TFLOPS | Qualcomm | SoC | PC | Snapdragon X | Snapdragon X Elite | X | n/a | 2023 | n/a | TSMC N4 | Oryon | Adreno X1 | Hexagon | LPDDR5X-8448 @ 4224 MHz | 135 GB/s | n/a | - Total 75 TOPS (45 TOPS from NPU).
45 TOPS | 3.8 TFLOPS | Qualcomm | SoC | PC | Snapdragon X | Snapdragon X Plus | X | n/a | 2024 | n/a | TSMC N4 | Oryon | Adreno X1-45 1107 MHz (1.7 TFLOPS); Adreno X1-45 (2.1 TFLOPS); Adreno X1-85 1250 MHz (3.8 TFLOPS) | Hexagon | LPDDR5X-8448 @ 4224 MHz | 135 GB/s | n/a | n/a
45 TOPS | n/a | Qualcomm | NPU | n/a | Hexagon | Hexagon | n/a | n/a | n/a | Snapdragon X Plus | n/a | n/a | n/a | n/a | n/a | n/a | n/a | - Hexagon is the brand name for a family of digital signal processor (DSP) and later neural processing unit (NPU) products by Qualcomm. Hexagon is also known as QDSP6, standing for “sixth generation digital signal processor.”
n/a | 2.1 TFLOPS | Qualcomm | GPU | n/a | Adreno | Adreno X1-45 | X | Adreno 726 | n/a | n/a | TSMC N4 | n/a | n/a | n/a | LPDDR5X-8448 @ 4224 MHz or LPDDR5X-8533 @ 4266.5 MHz | 125.1 GB/s or 136.5 GB/s | n/a | - The Adreno X1-45 is internally called the Adreno 726, suggesting it’s a scaled-up version of the Adreno 725 from the Snapdragon 7+ Gen 2.
n/a | 4.6 TFLOPS | Qualcomm | GPU | n/a | Adreno | Adreno X1-85 | X | Adreno 741 | n/a | Snapdragon X Plus | TSMC N4 | n/a | n/a | n/a | LPDDR5X-8448 @ 4224 MHz or LPDDR5X-8533 @ 4266.5 MHz | 125.1 GB/s or 136.5 GB/s | n/a | - The Adreno X1-85 is internally called the Adreno 741, suggesting it’s a scaled-up version of the Adreno 730 from the Snapdragon 8 Gen 1/8+ Gen 1.
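
Since energy efficiency matters alongside raw TOPS, the table's “INT8 Ops” and “TDP Base” columns can be combined into a rough TOPS-per-watt figure. The sketch below does this for four rows copied from the table. The helper name is hypothetical, and the result is only indicative: it divides a theoretical peak by a rated power, not by measured consumption, and the Ryzen figure is the whole-SoC total rather than the NPU alone.

```python
# (product, peak INT8 TOPS, TDP base in watts), copied from the comparison table.
ROWS = [
    ("AMD Ryzen AI 9 HX 370", 80.0, 28.0),
    ("Google TPUv4", 275.0, 170.0),
    ("NVIDIA A100", 624.0, 400.0),
    ("NVIDIA H100 (SXM5 card)", 1980.0, 700.0),
]


def tops_per_watt(tops: float, tdp_watts: float) -> float:
    """Peak TOPS divided by rated TDP; a rough efficiency indicator only."""
    return tops / tdp_watts


# Rank the selected chips from most to least efficient on this metric.
ranked = sorted(ROWS, key=lambda row: tops_per_watt(row[1], row[2]), reverse=True)
for name, tops, tdp in ranked:
    print(f"{name}: {tops_per_watt(tops, tdp):.2f} TOPS/W")
```

On this metric a 28 W laptop chip lands close to a 700 W datacenter GPU, which is a reminder that the ranking by raw TOPS and the ranking by efficiency can look very different.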
