Specialized Compute Infrastructure, Part 1
Today’s compute needs are vastly different from those of even a decade ago. The amount of data is expanding exponentially, the type of data is shifting, and the sort of processing needed by enterprises and individuals is changing. Rapid software and hardware advances, including fast-paced advances in specialized compute, have kept up with these evolving demands. To appreciate the scale and velocity of recent advances, let’s look at what the options for compute infrastructure were until recently.
Commodity Hardware, OEMs, ODMs
For most of the history of modern computing, CPUs have powered all workloads. Multi-functional, interchangeable, and cross-compatible, CPUs are designed and built for general-purpose computation and perform many different compute operations relatively well. The same commodity approach applies elsewhere in the server: for example, a RAID (redundant array of independent disks) combines commodity hard disks for high-volume data storage at the server level.
CPUs power enterprise and consumer compute products, including on-premises data centers, cloud data centers, and PCs. Most companies that design chips use OEMs and ODMs to integrate their chips into products. OEMs and ODMs engage with different stages of this process. Lenovo embeds and integrates chips into servers, computers, and smart devices, whereas GigaIPC integrates chips into industrial motherboards, embedded systems, and smart display modules. This is still how most enterprise and consumer hardware is manufactured, though the chips used are developing fast.
The Evolution of the CPU
From the late 1980s, the performance and transistor density of CPUs increased steadily. Intel led the way with progressively faster processors and its own chip fabrication. This vertical integration was an advantage in Intel’s early days. Though there are still many foundries globally, the overwhelming majority produce chips using less advanced processes. It is increasingly difficult to keep pace with the market-leading fabs, which has resulted in consolidation at the top end of the foundry market.
The past decade has seen increasing competition in the CPU market, though AMD and Intel still dominate it. The most significant recent development in CPU architecture came from Apple’s M1 chip, a custom ARM-based processor released in 2020. The M1 chip brought marked improvements in performance, memory, and power efficiency, and reduced Apple’s dependence on other chip companies. Apple silicon has continued to improve in performance and power efficiency. By increasing chip performance, Apple increases device performance and maintains its competitive edge.
Apple’s M1 chip represents two important trends in CPU development. First, chip fabrication and chip design have segmented into increasingly separate industries, as both design tools and fabrication processes become more specialized and complex. Second, as a result, the past decade has seen hyperscale software and product companies buy up smaller hardware design companies in an attempt to bring chip design in-house. These moves are attempts to optimize CPUs for product-specific needs and are part of a much larger trend towards specialized compute hardware.
From Homogeneous to Heterogeneous Compute
Alongside rapid advancements in CPU architecture, another important trend is the shift from homogeneous to heterogeneous compute. Until less than a decade ago, almost all computing systems were homogeneous: they performed one type of compute, so many systems could easily be strung together to increase the overall compute power available for certain tasks. Commodity hardware still provides important functionality for the workloads of many enterprises.
However, commodity hardware cannot keep up with the workloads of Artificial Intelligence, Machine Learning, and Deep Learning applications. These applications require specialized hardware to provide the necessary acceleration and functionality; even the most advanced CPUs are not sufficient.
GPUs were originally designed to carry out the faster processing needed for gaming, where video interacts with users in real time. Nvidia first developed a GPU to accelerate gaming graphics. Since then, the applications of the parallel computing enabled by GPUs have become numerous, from scientific computing in the early 2000s to AI, Machine Learning, and Deep Learning more recently. GPUs have become mainstream hardware solutions for accelerating different sorts of compute, including applications like computer vision in self-driving cars.
In the past 15 years, there’s been a shift towards GPGPU (General Purpose GPU), where technologies such as CUDA or OpenCL use the GPU’s parallel processing for general-purpose computing. Like CPUs, GPUs are becoming ever more specialized and optimized for particular workloads.
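To make the GPGPU idea concrete, here is a minimal sketch of the programming model that technologies like CUDA and OpenCL expose: a small function (a "kernel") computes a single output element, and a launch runs it once per thread index. This is an emulation in plain Python that runs serially on the CPU; the names saxpy_kernel, launch, and saxpy and the block size are hypothetical choices for illustration, not part of any CUDA or OpenCL API.

```python
# Sketch of the GPGPU programming model, emulated serially on the CPU.
# All names here (saxpy_kernel, launch, saxpy) are hypothetical and are
# not CUDA or OpenCL APIs.


def saxpy_kernel(i, a, x, y, out):
    """Work done by one 'thread': compute a single output element."""
    if i < len(x):  # guard: a launch may start more threads than elements
        out[i] = a * x[i] + y[i]


def launch(kernel, n_threads, *args):
    """Stand-in for a GPU launch: run the kernel once per thread index.

    A real GPU would run these invocations in parallel. This loop runs them
    one after another, which gives the same result because no invocation
    depends on any other.
    """
    for i in range(n_threads):
        kernel(i, *args)


def saxpy(a, x, y, block_size=4):
    """Compute a * x[i] + y[i] for every i using the kernel/launch pattern."""
    out = [0.0] * len(x)
    # Round the thread count up to a whole number of blocks, as GPU launches do.
    n_blocks = (len(x) + block_size - 1) // block_size
    launch(saxpy_kernel, n_blocks * block_size, a, x, y, out)
    return out
```

For instance, calling saxpy(2.0, x, y) on two equal-length lists returns a new list with each element doubled and added to its counterpart in y. The point of the pattern is that every element is computed independently, which is what lets a GPU spread the work across thousands of threads.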
Unlike CPUs or GPUs, ASICs are chips with a narrow use case geared towards a single application. The increasing need for heterogeneous, accelerated compute has led to a proliferation of ASICs. For the period 2021-2024, the ASIC market is projected to grow by over $9B. Recent examples of ASICs deployed at scale include the Google Tensor Processing Unit (TPU), used to accelerate Machine Learning functions, and the ASICs SpaceX uses for orbital maneuvering, which moves rockets to a new orbit at a different altitude. ASICs are at the forefront of a range of new industries including networking, robotics, IoT, and blockchain.
Despite fast-paced development of silicon for compute, the majority of compute is still powered by CPUs. One-size-fits-all computing, which worked well from the first chips in the Walkman until today, is no longer sufficient. Over just a few years there has been huge growth in compute hardware built to serve use cases CPUs cannot address. The trend towards increasingly specialized compute is only accelerating.
Read part 2 for how specialized compute hardware is reshaping the future of compute infrastructure.