Interconnect for Performance-Intensive Computing – No Longer a Hobson’s Choice
In many areas, performance-intensive computing has seen considerable innovation over the past few decades. Central processing units (CPUs) have evolved from single processors to multiple processors on a single integrated circuit. Increasing clock speeds, core counts and FLOPS enabled serial applications to be converted into parallel processes and to tackle much larger and more complex workloads. The advent of graphics processing units (GPUs), data processing units (DPUs) and other specialized processors has also increased system performance and efficiency by leaps and bounds.
Storage has seen similar advances, moving from tape drives to disk drives, and then off magnetic media entirely to high-performance, flash-based solid-state drives. This has allowed compute nodes to move larger volumes of data in and out of servers more quickly, enabling much faster workload completion times, even for complex problems. Concepts such as tiering, where data is placed on faster or slower media based on how frequently it is accessed, have also produced significant performance gains.
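As a rough illustration of how frequency-based tiering works, here is a minimal Python sketch. The class, the threshold and the tier names are illustrative assumptions, not any vendor’s implementation:

```python
from collections import Counter

# A toy model of frequency-based tiering: objects accessed often are kept
# on flash, rarely touched objects stay on slower media. The threshold is
# an assumption for illustration only.
HOT_THRESHOLD = 100  # accesses before an object is promoted to flash (assumed)

class TieredStore:
    """Tracks per-object access counts and assigns each object a tier."""

    def __init__(self) -> None:
        self.access_counts: Counter = Counter()
        self.tier: dict[str, str] = {}  # object id -> "flash" or "disk"

    def record_access(self, obj_id: str) -> None:
        # Count the access, then re-evaluate which tier the object belongs on.
        self.access_counts[obj_id] += 1
        self.tier[obj_id] = (
            "flash" if self.access_counts[obj_id] >= HOT_THRESHOLD else "disk"
        )

store = TieredStore()
for _ in range(150):
    store.record_access("checkpoint-0042")  # hot: promoted to flash
store.record_access("archive-1999")         # cold: stays on disk
print(store.tier)  # {'checkpoint-0042': 'flash', 'archive-1999': 'disk'}
```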
These advances, along with a plethora of suppliers offering their own versions of storage and compute, meant that architects had lots of options to consider when building data centers for performance-intensive workloads.
Rethinking centralized architecture
Networking, by comparison, has been a Hobson’s choice – architects could pick any technology they wanted, so long as it was based on centralized switching in a spine-and-leaf architecture. In almost all cases, this meant picking InfiniBand.
And since there was one deployment model and no real competition amongst vendors, innovation in networking has remained stagnant. For more than 30 years, centralized design has been a limiting factor on what can be accomplished within a cluster, and the choice of networking solution a mere checkbox for architects after they’d determined the best mix of compute and storage resources to meet their application objectives.
But “the times, they are a-changin’,” as Mr. Dylan sang. The white paper, Key Considerations for Architecting a High-Performance Fabric, describes in detail how the introduction of a modern, distributed networking architecture for performance-intensive applications breaks down the barriers inherent in the traditional spine-and-leaf approach and gives data-center architects a lot more to think about as they plan their clusters, including:
- bisection efficiency (see the sketch after this list)
- power consumption
- scalability
- workload composition
- resiliency
- accurately measuring cluster performance
- protocol familiarity
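To make the first consideration concrete: bisection efficiency in a two-tier spine-and-leaf fabric is often estimated as the ratio of a leaf switch’s uplink capacity to its host-facing capacity. The Python sketch below shows that common back-of-the-envelope calculation; the function name and parameters are illustrative assumptions, not taken from the white paper:

```python
def bisection_efficiency(hosts_per_leaf: int, host_link_gbps: float,
                         uplinks_per_leaf: int, uplink_gbps: float) -> float:
    """Ratio of a leaf switch's uplink capacity to its host-facing capacity.

    1.0 means a non-blocking (full bisection bandwidth) design; 0.5 means
    the fabric is 2:1 oversubscribed at the leaf layer.
    """
    host_capacity = hosts_per_leaf * host_link_gbps
    uplink_capacity = uplinks_per_leaf * uplink_gbps
    return min(1.0, uplink_capacity / host_capacity)

# 32 hosts at 100 Gb/s with 16 x 200 Gb/s uplinks: non-blocking.
print(bisection_efficiency(32, 100, 16, 200))  # 1.0
# Halving the uplinks gives a 2:1 oversubscribed fabric.
print(bisection_efficiency(32, 100, 8, 200))   # 0.5
```

A real design review would weigh this ratio alongside the other factors listed above, but even this simple calculation shows how quickly oversubscription erodes usable bandwidth.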
The network moves from an afterthought to a top-level consideration for cluster design, on par with compute and storage, and one with a significant impact on performance, efficiency and total cost of ownership.
With an introduction from Addison Snell, leading HPC industry analyst at Intersect360, the white paper provides guidance to help you plan your next cluster build.