Tachyum's Prodigy Processor to Deliver 50 exaFLOPS Performance

Category Technology

tldr #

Tachyum® announced that it has accepted a major purchase order from a US company to build a 50 exaFLOPS supercomputer powered by its Prodigy® Universal Processor chip. The 5nm Prodigy chip draws 10x lower power and costs 1/3 as much as competing products, delivering 8 zettaflops of AI training for large language models and 16 zettaflops of image and video processing capacity. The system is built with hundreds of petabytes of DRAM and exabytes of flash-based primary storage, with 4-socket nodes connected to 400G RoCE Ethernet. Tachyum's proprietary TPU® AI Inference IP supports the Tachyum AI (TAI) data type and provides up to 6x performance for AI applications.


content #

Tachyum® announced that it has accepted a major purchase order from a US company to build a large-scale system based on its 5nm Prodigy® Universal Processor chip, delivering more than 50 exaflops of performance and far exceeding the computational capabilities of the fastest inference or generative AI supercomputers available anywhere in the world today. When complete, the Prodigy-powered system will deliver 25x the performance of the world's fastest conventional supercomputer built just this year, and will support AI capacity 25,000x that of ChatGPT4-class models.

The system performance with Prodigy chips will be 25x faster than the world’s fastest conventional supercomputer built this year

Tachyum has developed the world's first Universal Processor, which combines the functions of a CPU, GPU, and TPU into a single homogeneous processor architecture that is faster, 10x lower power, and 1/3 the cost of competing products. The Tachyum solution delivers never-before-seen performance and efficiency to a wide range of applications, including hyperscale, HPC, and AI.

The Prodigy system, with hundreds of petabytes of DRAM, will have 100x more memory than is needed for human-brain-level compute. Installation of this Prodigy-enabled solution will begin in 2024 and reach full capacity in 2025. Among the deliverables are:

* 8 zettaflops of AI training for large language models
* 16 zettaflops of image and video processing
* Ability to fit more than 100,000x PALM2 530B parameter models, or 25,000x ChatGPT4 1.7T parameter models with base memory and 100,000x ChatGPT4 with 4x the base DRAM
* Upgradable memory in the base model system
* Hundreds of petabytes of DRAM and exabytes of flash-based primary storage
* 4-socket, liquid-cooled nodes connected to 400G RoCE Ethernet, with the capability to double to an 800G all non-blocking and non-overprovisioned switching fabric

Tachyum's proprietary TPU® AI Inference IP supports the Tachyum AI (TAI) data type and provides breakthrough efficiency for video and large language model data formats that would otherwise require excessive power and expensive multipliers in matrix multiplication.
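As a rough sanity check of the capacity claims above, the DRAM needed to hold that many model copies can be estimated from the parameter counts. This is a back-of-the-envelope sketch, not Tachyum's sizing method; the 2-bytes-per-parameter figure (FP16/BF16 weights) is an assumption on our part.

```python
# Back-of-the-envelope check of the memory claims in the deliverables list.
# Assumption (not from Tachyum): 2 bytes per parameter (FP16/BF16 weights).
BYTES_PER_PARAM = 2
PB = 1e15  # bytes in a petabyte

def total_dram_bytes(num_models, params_per_model, bytes_per_param=BYTES_PER_PARAM):
    """DRAM needed to hold `num_models` copies of a model's weights."""
    return num_models * params_per_model * bytes_per_param

# 25,000 copies of a 1.7T-parameter (ChatGPT4-scale) model:
gpt4_pb = total_dram_bytes(25_000, 1.7e12) / PB
print(f"25,000x 1.7T-param models: {gpt4_pb:.0f} PB")   # ~85 PB

# 100,000 copies of a 530B-parameter (PALM2-scale) model:
palm_pb = total_dram_bytes(100_000, 530e9) / PB
print(f"100,000x 530B-param models: {palm_pb:.0f} PB")  # ~106 PB
```

Under that assumption, both figures land in the "hundreds of petabytes" range the announcement quotes for base DRAM, so the claims are internally consistent.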

The Prodigy chip is 5nm in size, 10x lower power and 1/3 the cost of competing products

As a Universal Processor offering utility for all workloads, Prodigy-powered data center servers can seamlessly and dynamically switch between computational domains (such as AI/ML, HPC, and cloud) on a single architecture. By eliminating the need for expensive dedicated AI hardware and dramatically increasing server utilization, Prodigy reduces CAPEX and OPEX significantly while delivering unprecedented data center performance, power, and economics. Prodigy integrates 192 high-performance custom-designed 64-bit compute cores to deliver up to 4.5x the performance of the highest-performing x86 processors for cloud workloads, up to 3x that of the highest-performing GPU for HPC, and 6x for AI applications.

The system has 8 Zettaflops AI training for big language models and 16 Zettaflops of image and video processing capacity
