Big Iron Will Always Cost Big Bucks


As early as the late 1980s, when Sun Microsystems was ascendant in the datacenter and Hewlett Packard was its main rival in Unix-based systems, market forces compelled IBM to finally and vigorously field open systems machines of its own to fight Sun, HP, and the others behind the Unix movement. IBM was the dominant supplier of proprietary systems at the time – fully vertically integrated stacks, from the CPU up through the application development tools and runtimes – and the rise of Unix was cutting into sales of those machines.

Here we are, more than three decades later, and IBM accounts for about half of the revenue in the non-X86 server market. It holds this relatively large share despite the rise of single-socket Arm-based server machines at selected hyperscalers and cloud builders, which are eating into the growth of X86 servers but are also pushing the non-X86 slice of the pie up more than it has risen in about a decade. Sun and HP left the RISC/Unix server battlefield a long time ago – about half a decade ago, if you are being generous – and AMD has no interest in building machines with four, eight, or more sockets based on its Epyc architecture. Quite the contrary. AMD is the flag bearer for the single-socket server and has made it a core of its strategy since the launch of the “Naples” Epyc 7001 CPUs four years ago – the company’s re-entry into the server market. Google caught the same religion with the Tau instances on its public cloud, but those are mainly aimed at combating the single-socket Graviton2 instances at Amazon Web Services.

While machines with two or fewer sockets will likely dominate server shipments for the foreseeable future and will continue to dominate revenue, there is still a healthy appetite for machines with four or more sockets, and these machines drive a considerable share of revenue and probably the vast majority of the profits in the server business.

The reason is simple: when you have a big job that requires lots of memory and lots of cores and threads, a large NUMA server is the easiest way for most enterprise IT shops to get it done. These NUMA machines look and feel like one gigantic workstation and are relatively easy to program and scale, at least compared to distributed computing clusters, which have a much looser coupling between compute and memory.

IBM reminded its customers and partners of that revenue stream in its briefings for the “Denali” Power E1080 server launched two weeks ago, the first and largest of the big iron servers Big Blue is expected to launch this year and next based on the Power10 processor, which has been in development for more than three years. As with every Power chip since the Power4 debuted in 2001, IBM has focused on building a beefy core that can do a lot of work, and then scaling the machine up across many sockets.

What this diagram doesn’t show is socket scalability over time. In 1997, the RS/6000 Unix line and the proprietary AS/400 line were both based on the 64-bit “Northstar” PowerPC processor and could scale to 8 or 16 sockets, and IBM also sold machines with the prior generation of “Apache” PowerPC processors that scaled to 8 sockets. In 2000, the single-core “I-Star” PowerPC chips debuted and IBM boosted the top-end scale by 50 percent to 24 sockets, and in early 2001 IBM put its single-core “S-Star” PowerPC chips into essentially the same machines with a 20 percent clock speed increase and a range of microarchitecture tweaks.

In late 2001, IBM launched the Power4, the first dual-core processor and the first chip to break the 1 GHz barrier in the server racket, and Big Blue started beating up on Sun and HP – and kept doing it for the next half decade or more, so much so that both eventually abandoned the Unix market, adopted Intel Xeon processors, and embraced Linux. In any event, IBM scaled a single chassis with Power4 chips to just 8 sockets, but thanks to improvements in the NUMA interconnect could push the total machine to 32 sockets – a 33 percent increase over the S-Star machines – and with two cores per socket could boost throughput significantly.

With Power5, IBM renamed the AS/400 line the iSeries and the RS/6000 Unix line the pSeries, and the core and socket counts stayed the same at two cores per chip. With Power5+, IBM built its first dual-chip module (DCM), which put two whole processors into a single socket, doubling the core count of the systems, and added two threads per core with simultaneous multithreading (SMT); but because clock speeds were lower, system throughput only rose by about 60 percent. Although this roadmap doesn’t show it, IBM also had DCMs with the Power6+ chip. Again, because of core microarchitecture and NUMA improvements, IBM could do more work with fewer cores with the Power5 and then Power6 machines, and then add DCMs halfway between generations for a nice, smooth performance increase curve.
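The rough arithmetic behind that Power5+ figure can be sketched as follows; the clock reduction here is an assumed number chosen to match the roughly 60 percent uplift cited above, not a published IBM specification:

```python
# Hypothetical back-of-the-envelope for the Power5+ DCM generation:
# doubling cores while dropping clocks by an assumed ~20 percent.
cores_factor = 2.0    # DCM puts two chips (twice the cores) in one socket
clock_factor = 0.8    # assumed clock reduction on the DCM parts

throughput_gain = cores_factor * clock_factor
print(f"Aggregate throughput vs. Power5: {throughput_gain:.2f}x")  # 1.60x
```

With SMT improving per-core utilization on threaded workloads, the real uplift would vary by workload; the point is simply that doubled cores at lower clocks nets out well below 2X.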

The Power5 high-end machines were based on NUMA interconnect technology that IBM acquired from Sequent, and they laid the groundwork for the multi-socket-node, four-node system designs that IBM uses for its big iron to this day – including the Power E1080. (IBM had one big, bad 32-socket machine based on a single backplane NUMA interconnect, up through the Power 795, much like the 24-socket I-Star and S-Star machines, but the Power 795 was the last machine of its kind that IBM built this way.)

With the Power 770 and Power 780 enterprise-class machines, based on derivatives of that Sequent NUMA interconnect and the Power7 and then Power7+ processors, IBM switched from two-socket nodes to four-socket nodes and capped the high end at 16-socket systems. That has not changed, and it will not change in the future. Queuing theory and communication overhead make it very difficult to lash more than 16 sockets together efficiently. If you go to 32 or 64 sockets, the extra performance is never realized because the cores end up waiting on memory and I/O accesses across the NUMA cluster. Or you partition the machine, and at that point you may as well skip paying for the NUMA electronics overhead and just buy smaller, cheaper physical machines. With NUMA there are diminishing marginal returns – and they diminish quickly. It is better to couple the wires inside the socket ever more tightly, and then to couple the sockets themselves ever more tightly. The result is that over the past two decades, IBM has increased the throughput of its top-of-the-line NUMA systems by a factor of 15.4, between the 16-core iSeries 890 or pSeries 690 and the 240-core Power E1080.
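Those diminishing returns can be illustrated with Neil Gunther’s Universal Scalability Law; the contention (alpha) and coherency (beta) coefficients below are assumed values chosen for illustration, not measurements of any IBM machine:

```python
# Universal Scalability Law (Gunther): a simple model of why NUMA speedup
# flattens and then falls as socket counts grow. Coefficients are assumed.
def usl_speedup(n, alpha=0.03, beta=0.001):
    """Relative throughput of an n-socket system vs. a single socket.
    alpha models contention (serialization), beta models coherency
    (crosstalk) overhead, which grows quadratically with socket count."""
    return n / (1 + alpha * (n - 1) + beta * n * (n - 1))

for n in (1, 2, 4, 8, 16, 32, 64):
    print(f"{n:2d} sockets -> {usl_speedup(n):5.1f}x")
```

With these assumed coefficients, going from 16 to 32 sockets buys well under 20 percent more throughput, and a 64-socket box actually models out slower than a 32-socket one – exactly the kind of retrograde behavior that makes partitioning, or just buying smaller machines, the more economical choice.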

That consistent, predictable, and reliable increase in NUMA scale is why IBM has around 14,000 Power-based machines with four or more sockets in the field. (To be precise, IBM has told its business partners that it has over 10,000 customers using its “enterprise scale-up servers” based on Power7, Power7+, and Power8 processors, and we estimate there are another 4,000 customers worldwide with Power Systems irons with four or more sockets based on Power9 processors.) That may not sound like a lot of customers or machines (and many customers have at least two systems, with high availability clustering across the boxes), but such machines are expensive – from hundreds of thousands to tens of millions of dollars, depending on the configuration – and the money definitely adds up.

Here is how IBM is actually forecasting the market for machines with four or more sockets:

Given the processor transitions under way in IBM’s own System z and Power Systems lines, it wouldn’t be surprising to see a drop in big NUMA server sales from 2020 into 2021, but as you can see in the chart above, much of the decline came from sales of four-socket boxes, which is interesting. This could be a stall in demand or a more aggressive pricing environment as Intel peddles “Skylake,” “Cascade Lake,” and “Cooper Lake” Xeon SP systems against Power9 iron – particularly in China for various workloads, and especially for supporting SAP HANA in-memory databases and their application stacks.

It is also interesting that IBM believes this market will be remarkably stable and growing in the years to come, with the forecast for eight-socket machines expected to spike in 2025. IBM is telling resellers and other business partners that it wants a 50 percent share of the high-end market (where machines cost over $250,000) and 10 percent of the midrange market (where machines cost between $25,000 and $250,000). The Power E1080 will address the high-end segment, and the future Power E1050, due sometime in the second quarter of next year, will address the midrange segment.

What is really surprising is that IBM is not taking a more aggressive stance in the four-socket market, since it represents half the opportunity and the majority of server shipments in this upper tier of the NUMA system space. It can only be about reach and profitability. HPE will be very aggressive in protecting its turf, including selling 16-socket Superdome Flex 280 machines running Linux. Dell doesn’t have much appetite for eight-socket Xeon SP machines, but is very aggressive with four-socket boxes – as are Lenovo, HPE, Inspur, and Cisco Systems. Ultimately, market share begets market share. Sun and HP customers who were used to RISC/Unix machines moved to IBM because they could have a familiar big iron experience. Customers who are used to HPE and Dell four-socket boxes will sometimes consider Lenovo, Inspur, and Cisco gear, but in general they flip-flop between the two largest OEMs in the world and pit them against each other.

Much depends on which two-socket infrastructure servers companies typically buy and which baseboard management controller (BMC) they have invested their time and money in learning how to use. Remember, enterprise servers are pets that generally run one workload per physical machine, and we kid you not, the BMC is one of the single biggest factors in server buying decisions. If a standard BMC – like the one the Open Compute crowd has been trying to cobble together – ever took hold, all hell could break loose in the corporate datacenter.

Which could actually be fun.
