Interview: Michael Kagan, chief technology officer, Nvidia | Computer Weekly

“While everything in computing gets smaller and smaller, the 21st century computer is something that scales from a smartwatch all the way up to the hyperscale datacentre,” says Michael Kagan, CTO of Nvidia.  

“The datacentre is the computer and Nvidia is building the architecture for the datacentre. We are building pretty much everything needed, from silicon and frameworks all the way up to tuning applications for efficient execution on this 21st century machinery.” 

Based in the Haifa district of Israel, Kagan joined Nvidia three years ago as the company’s CTO through the acquisition of Mellanox Technologies. Jensen Huang, Nvidia’s founder and CEO, told Kagan that he would oversee the architecture of all systems.

Beyond Moore’s Law 

The well-known Moore’s Law comes from a paper Gordon Moore wrote in 1965, called Cramming more components onto integrated circuits. In the paper, Moore, who went on to become the CEO of Intel, predicted that technology and economics would conspire to allow the semiconductor industry to squeeze twice as many transistors into the same amount of space every year. He said this would go on for the next 10 years.

This prediction, which became known as Moore’s Law, was revised 10 years later. In 1975, Moore said that the doubling would happen roughly every two years, instead of every year. He also said it would continue for the foreseeable future. In fact, chip manufacturers benefitted from this doubling until around 2005, when they could no longer count on economics and the laws of physics to squeeze twice as many transistors into the same amount of space every two years. There simply wasn’t any more room left between transistors.

Since then, chip manufacturers have found other ways to increase computing power. One way was to increase the number of cores. Another was to improve communications between multiple chips and between processors and memory by connecting the different components more directly to one another using a network, instead of a shared bus, which was prone to bottlenecks.
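For a rough feel of the multicore route, here is a minimal Python sketch (not from the interview) that runs the same CPU-bound work first on one core and then spread across all available cores; the workload and sizes are invented purely for illustration.

```python
# Minimal sketch of the "more cores" route to more computing power: the same work
# run serially and then spread across CPU cores. Workload and sizes are illustrative.
from concurrent.futures import ProcessPoolExecutor
import math

def heavy_task(n):
    # Stand-in for any CPU-bound job.
    return sum(math.sqrt(i) for i in range(n))

jobs = [2_000_000] * 8

if __name__ == "__main__":
    serial = [heavy_task(n) for n in jobs]        # one core, one job after another
    with ProcessPoolExecutor() as pool:           # all available cores at once
        parallel = list(pool.map(heavy_task, jobs))
    print(serial == parallel)                     # same answers, computed in parallel
```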

Semiconductor manufacturers also went further up the stack to invent new ways to deliver computing power. They looked at the algorithms, the accelerators and the way data was being processed. Accelerators are specialised components – usually chips – that perform specific tasks very quickly. When a system encounters such a task, it hands it off to the accelerator, thereby achieving gains in overall performance.
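As a rough illustration of that hand-off, the following Python sketch (not from the interview) runs the same matrix multiplication on the CPU and then offloads it to a GPU; it assumes the CuPy library and a CUDA-capable Nvidia GPU are available, and the array sizes are arbitrary.

```python
# Minimal sketch of handing a task off to an accelerator (here, a GPU via CuPy).
# Assumes an Nvidia GPU and the CuPy library are installed; sizes are illustrative.
import numpy as np
import cupy as cp

a = np.random.rand(4096, 4096).astype(np.float32)
b = np.random.rand(4096, 4096).astype(np.float32)

# CPU path: the general-purpose processor does the work itself.
c_cpu = a @ b

# Accelerator path: copy the data to the GPU, let it do the matrix multiply,
# then copy the result back to host memory.
a_gpu = cp.asarray(a)
b_gpu = cp.asarray(b)
c_gpu = cp.asnumpy(a_gpu @ b_gpu)

print(np.allclose(c_cpu, c_gpu, rtol=1e-3))  # same result, computed on the accelerator
```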

Manufacturers looked in particular at artificial intelligence (AI), where data is processed in a fundamentally new way compared with the von Neumann architecture the computer industry was so used to.

“AI is based on neural networks,” explains Kagan. “That requires a very different kind of data processing than a von Neumann architecture, which is a serial machine that executes an instruction, looks at the result, and then decides what to do next.

“The neural network model of data processing was inspired by studies of the human brain. You feed the neural network data and it learns. It works similarly to showing a three-year-old kid dogs and cats. Eventually the kid learns to distinguish between the two. Thanks to neural networks, we can now solve problems that we didn’t know how to solve on the von Neumann machine.” 
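A toy illustration of that learn-from-examples idea, written in plain NumPy rather than anything Nvidia-specific: a tiny network is shown labelled points from two clusters (stand-ins for the cats and dogs) and gradually learns to tell them apart. The data, network size and learning rate are invented purely for illustration.

```python
# Toy sketch of "feed it data and it learns": a tiny neural network, NumPy only,
# learns to separate two clusters of points. Everything here is illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Two labelled clusters of 2-D points.
X = np.vstack([rng.normal(-1.0, 0.5, (100, 2)),   # class 0
               rng.normal(+1.0, 0.5, (100, 2))])  # class 1
y = np.concatenate([np.zeros(100), np.ones(100)])

# One hidden layer of 8 units.
W1, b1 = rng.normal(0, 0.5, (2, 8)), np.zeros(8)
W2, b2 = rng.normal(0, 0.5, (8, 1)), np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for step in range(500):
    # Forward pass: predict a probability for each point.
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2).ravel()

    # Backward pass: gradients of the cross-entropy loss.
    d_out = (p - y)[:, None] / len(y)
    dW2, db2 = h.T @ d_out, d_out.sum(0)
    d_h = (d_out @ W2.T) * (1 - h**2)
    dW1, db1 = X.T @ d_h, d_h.sum(0)

    # Gradient-descent update: the network adjusts itself from the examples.
    for param, grad in [(W1, dW1), (b1, db1), (W2, dW2), (b2, db2)]:
        param -= 0.5 * grad

accuracy = ((p > 0.5) == y).mean()
print(f"training accuracy after 500 steps: {accuracy:.2f}")
```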

AI and other new applications, such as digital twins, accelerated the need for computing performance and brought on the requirement for a new paradigm. In the past, software development required very little computing power, but running the resulting program required much more. By contrast, AI requires an enormous amount of compute to train neural networks, but much less to run them.

A single GPU or CPU is not enough to train a large AI model. For example, ChatGPT takes about 10,000 GPUs to train. All the GPUs work together in parallel, and naturally, they need to communicate. In addition to massive parallel processing, the new paradigm requires a new kind of specialised chip, called the data processing unit (DPU).
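The following NumPy-only sketch simulates that division of labour on a single machine: several “workers”, standing in for GPUs, each compute gradients on their own shard of the data, then average them, which is the communication step that, in a real cluster, runs across thousands of GPUs and their network. The model (a simple linear regression) and all sizes are illustrative.

```python
# Conceptual sketch of data-parallel training: each "worker" (standing in for a GPU)
# computes gradients on its own shard of the data, then the gradients are averaged.
# NumPy-only simulation; the model and sizes are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
n_workers = 4

# Synthetic dataset, split evenly across the workers.
X = rng.normal(size=(4000, 10))
w_true = rng.normal(size=10)
y = X @ w_true + 0.01 * rng.normal(size=4000)
shards = list(zip(np.array_split(X, n_workers), np.array_split(y, n_workers)))

w = np.zeros(10)  # every worker holds an identical copy of the model
for step in range(200):
    # Each worker computes a gradient on its local shard only.
    local_grads = [Xs.T @ (Xs @ w - ys) / len(ys) for Xs, ys in shards]

    # "All-reduce": average the gradients so all workers apply the same update.
    grad = np.mean(local_grads, axis=0)
    w -= 0.1 * grad

print("error vs true weights:", np.linalg.norm(w - w_true))
```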

Huang’s Law 

“The fastest machine in the world in 2003 was the Earth Simulator, which performed in teraflops,” says Kagan. “The fastest computer today is Frontier, which performs in exaflops, a million times more. In 20 years, we’ve gone from teraflops to exaflops.”

He adds: “During the 20 years between 1983 and 2003, compute performance increased a thousandfold, and in the next 20 years the performance of computing increased a millionfold. That phenomenon is what some have called ‘Huang’s Law’. Our CEO, Jensen Huang, observed that GPU-accelerated computing doubles its performance two times every other year.

“As a matter of fact, it is going even faster than two times every other year. Now we are talking about AI workloads and a new way of processing data. If you look at how much faster you can run an application on Nvidia Hopper versus Ampere, which is our current generation GPU versus previous generation, it’s more than 20 times.” 
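Reading “doubles its performance two times every other year” as a fourfold gain every two years, the arithmetic behind the figures Kagan quotes can be checked directly with a few lines of Python:

```python
# Quick check of the figures quoted above, using only the numbers in the article.
teraflop, exaflop = 1e12, 1e18
print(exaflop / teraflop)   # 1,000,000: teraflops to exaflops is a millionfold jump

# A fourfold gain every two years, compounded over the 20 years from 2003 to 2023:
print(4 ** (20 / 2))        # ~1.05 million, consistent with the millionfold figure
```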

Kagan says what makes computing faster now is primarily the algorithms and the accelerators: “With each new generation of GPUs, more accelerators – and better accelerators – are added, processing data in much more sophisticated ways.

“It’s all about how you partition functions between different parts. You now have three computer elements – GPU, CPU, and DPU – and a network connecting them, which also computes. At Mellanox, the company bought by Nvidia, we introduced in-network computing where you can make the data calculations as data flows through the network.” 
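A conceptual sketch of what computing in the network means, again in plain NumPy rather than any real switch API: instead of one endpoint collecting and summing everyone’s data, partial results are combined at each “switch” as they flow up a tree. The tree layout and data are invented for illustration.

```python
# Conceptual sketch of in-network computing: partial results are combined at each
# "switch" as data flows up a tree, so no single endpoint has to receive and sum
# everything itself. The tree layout and data are invented for illustration.
import numpy as np

rng = np.random.default_rng(2)

# Eight endpoints (e.g. GPUs) each hold a vector of partial results (e.g. gradients).
endpoint_data = [rng.normal(size=4) for _ in range(8)]

def switch_reduce(inputs):
    """A switch sums whatever arrives on its ports and forwards one result upstream."""
    return np.sum(inputs, axis=0)

# Leaf switches each serve two endpoints; a spine switch combines the leaf results.
leaf_outputs = [switch_reduce(endpoint_data[i:i + 2]) for i in range(0, 8, 2)]
spine_output = switch_reduce(leaf_outputs)

# Same answer as summing at a single endpoint, but the work happened in the network.
assert np.allclose(spine_output, np.sum(endpoint_data, axis=0))
print(spine_output)
```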

Moore’s Law counted on the number of transistors to double computing performance every two years; Huang’s Law counts on GPU-accelerated computing to double system performance every other year. But now, even Huang’s Law may not be able to keep up with the growing demand from AI applications, which need 10 times more computing power every year.
