Tachyum, which focuses on the high-performance computing (HPC) and research markets, began offering Prodigy, its latest processor family for servers and data centers, to select customers this week. The company bills it as “the world’s first universal processor,” capable of handling CPU, GPU, and artificial intelligence tasks on a single chip and built with some of the most advanced technologies available today.
Tachyum Prodigy has 128 cores and 950 W power consumption
With Prodigy, Tachyum is targeting solutions like the recently launched Nvidia H100 and AMD Instinct MI250X. The big difference is that it can act as a CPU, GPU, or TPU (Tensor Processing Unit) for AI, depending on the code it is running. For that reason, the company calls the chip a “universal processor” that lets data centers replace complex setups of multiple CPUs and GPUs with a single chip.
The manufacturer promises 4x the performance of the Intel Xeon Platinum 8380 (3rd Gen Ice Lake, the latest generation) in HPC tasks like database processing and simulation, and 6x the performance of the Nvidia H100 in AI and inference workloads. Overall, Tachyum claims the Prodigy family will deliver 10 times the performance of the competition while consuming the same amount of power.
The series has 8 models with different core configurations, but the highlight is the T16128-AIX, the top-of-the-line model with 128 cores and the most extreme specifications. Manufactured on TSMC’s 5nm N5P process, it runs all 128 cores at 5.7GHz and supports up to 32TB of DDR5-7200 RAM spread across 16 channels and 64 DIMMs, with up to 2TB/s of bandwidth and 64 PCIe 5.0 lanes.
The strong specs translate into 950 W of power consumption, an impressive figure for a single processor, but ultimately well below the combined power consumption of a CPU and a dedicated GPU. Interested companies can also combine 2 or 4 Prodigy chips in a 2U rack server (a server “drawer” approximately 90mm thick) to take this further, reaching up to 512 cores and 3,600 W of power consumption.
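As a rough check on the multi-chip figures, the sketch below scales the quoted per-chip numbers (128 cores, 950 W) linearly. Linear scaling is an assumption made here for illustration, and the function name `scale_linearly` is hypothetical; the article does not say how Tachyum arrives at its totals.

```python
# Per-chip figures quoted in the article for the T16128-AIX.
CORES_PER_CHIP = 128
WATTS_PER_CHIP = 950


def scale_linearly(chips):
    """Return (total cores, total watts), assuming plain linear scaling."""
    return chips * CORES_PER_CHIP, chips * WATTS_PER_CHIP


for chips in (1, 2, 4):
    cores, watts = scale_linearly(chips)
    print(f"{chips} chip(s): {cores} cores, {watts} W")
```

Under this assumption, four chips give 512 cores, which matches the quoted figure, and 3,800 W, which is slightly above the quoted 3,600 W. The article does not explain the difference.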
One of the most interesting things about the Tachyum Prodigy series is that, in addition to performing CPU, GPU, and TPU tasks, the chips can all run instructions for Tachyum’s native architecture or translate code written for the x86, ARM, and RISC-V architectures, reinforcing the “general purpose” positioning of the components. However, there is no information on how this translation will affect performance.
To achieve this feat, each core has two 1024-bit vector units and one 4096-bit matrix processor, and supports computation in the FP64, FP32, TF32, BF16, INT8, FP8, and TAI formats. Support for quantization to low-precision data formats, along with “scatter” and “gather” operations, is meant to simplify loading arrays for AI workloads.
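For readers unfamiliar with the terms, gather and scatter are general indexed memory operations. The sketch below illustrates the concept in plain Python. It is not Tachyum's instruction semantics, and the helper names and sample data are made up for illustration.

```python
def gather(source, indices):
    """Collect source[i] for each index, in index order."""
    return [source[i] for i in indices]


def scatter(values, indices, size, fill=0):
    """Write each value to its target index in a new list of length `size`."""
    out = [fill] * size
    for value, index in zip(values, indices):
        out[index] = value
    return out


rows = [10, 20, 30, 40, 50]          # hypothetical array in memory
picked = gather(rows, [4, 0, 2])     # read scattered elements into a dense list
print(picked)                        # [50, 10, 30]

restored = scatter(picked, [4, 0, 2], size=5)  # write them back to their slots
print(restored)                      # [10, 0, 30, 0, 50]
```

Hardware support for these patterns matters in AI workloads because sparse or indexed data can then be packed into dense vectors without a per-element loop like the one above.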
Testing and the future of more cores at 3nm
The company released some benchmarks run with Prodigy in the SPECrate 2017 test, where the posted results show it outperforming competitors such as the H100, the Xeon 8380, and the AMD EPYC 7763 “Milan” with Zen 3 cores. The lead is largest in high-precision FP64 computing, where the chip is claimed to be no less than 30 times faster.
In fact, the T16128 is said to deliver 90 TFLOPs of compute power in FP64, while a rack with 4 of these chips and air cooling would achieve 6.2 PFLOPs (PetaFLOPs, or 6,200 TFLOPs). By comparison, Nvidia’s DGX H100 server, with 4 new H100 GPUs, promises 960 TFLOPs. Even more is expected from the liquid-cooled version of the Prodigy rack, which is said to deliver 12.9 PFLOPs.
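To put the rack figures on a common scale, the sketch below converts the quoted PFLOPs values to TFLOPs and divides them by the quoted DGX H100 number. It only restates the article's arithmetic. Whether the underlying figures were measured at the same precision is not stated in the source, so the ratios are indicative at best. The helper names are hypothetical.

```python
TFLOPS_PER_PFLOPS = 1_000  # 1 PFLOPs = 1,000 TFLOPs

# Figures as quoted in the article.
PRODIGY_RACK_AIR_PFLOPS = 6.2
PRODIGY_RACK_LIQUID_PFLOPS = 12.9
DGX_H100_TFLOPS = 960


def pflops_to_tflops(pflops):
    """Convert PFLOPs to TFLOPs."""
    return pflops * TFLOPS_PER_PFLOPS


def ratio_vs_dgx(pflops):
    """Quoted rack throughput divided by the quoted DGX H100 figure."""
    return pflops_to_tflops(pflops) / DGX_H100_TFLOPS


for label, pflops in (("air-cooled", PRODIGY_RACK_AIR_PFLOPS),
                      ("liquid-cooled", PRODIGY_RACK_LIQUID_PFLOPS)):
    print(f"{label}: {pflops_to_tflops(pflops):.0f} TFLOPs, "
          f"{ratio_vs_dgx(pflops):.1f}x the quoted DGX H100 figure")
```

On the quoted numbers, the air-cooled rack comes out at roughly 6.5 times the DGX H100 figure and the liquid-cooled rack at roughly 13.4 times.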
Tachyum didn’t disclose pricing because this is an enterprise product, but said that the racks will reach customers as evaluation units so that any necessary adjustments can be made. Interested companies must pre-order and will receive servers within 6 to 9 months; component shortages are the reason for the delay.
Along with the release of Prodigy, the company also published a roadmap for the next 2 years, in which Prodigy 2 is expected to debut in 2024. Few technical details of the successor have been announced, but the new generation is expected to move to a 3 nm process, with more cores, HBM memory for increased bandwidth, and new CXL 3.0 and PCIe 6.0 buses.
Source: WCCFTech, Tachyum