Wednesday, August 11, 2010

Nvidia-based Cheap Supercomputing Coming to an End

Nvidia's CUDA has been hailed as "Supercomputing for the Masses," and with good reason. Amazing speedups on scientific and technical code have been reported, ranging from a mere 10X up through hundreds. It has become a darling of academic computing and a major player in DARPA's Exascale program, but performance alone is not the reason; it's price. For that computing power, CUDA GPUs are incredibly cheap. As Sharon Glotzer of UMich noted, "Today you can get 2GF for $500. That is ridiculous." It is indeed. And it's only possible because CUDA is subsidized by sinking the fixed costs of its development into the high volumes of Nvidia's mass-market low-end GPUs.

Unfortunately, that subsidy won't last forever; its end is now visible. Here's why:

Apparently ignored in the usual media fuss over Intel's next and greatest, Sandy Bridge, is the integration of Intel's graphics onto the same die as the processor.

The current best integration puts the two on the same package, as in the current best part, Clarkdale (a.k.a. Westmere), shown in the photo on the right. As illustrated, the processor is in 32nm silicon technology, while the graphics, with the memory controller, is in 45nm silicon technology. Yes, the graphics-and-memory-controller die is the larger chip.

Intel has not been touting higher graphics performance from this tighter integration. In fact, Intel's press releases for Clarkdale claimed that being on two dies wouldn't reduce performance because they were in the same package. But unless someone has changed the laws of physics as I know them, that's simply false; at a minimum, eliminating off-chip drivers will reduce latency substantially. Also, being on the same die as the processor implies the same process, so graphics (and memory control) goes all the way from 45nm to 32nm, the same as the processor, in one jump; this too will certainly increase performance. For graphics, this is a very loud Intel "Tock" in its "Tick-Tock" (silicon / architecture) alternation.

So I'll semi-fearlessly predict some demos of midrange games out of Intel when Sandy Bridge hits the streets, a date that hasn't been announced in detail beyond "sometime in 2011."

Probably not coincidentally, mid-2011 is when AMD's Llano processor sees daylight. Also in 32nm silicon, it incorporates enough graphics-related processing to be an apparently decent DX11 GPU, although to my knowledge the architecture hasn't been disclosed in detail.

Both of these are lower-end units, destined for laptops, and intent on keeping a tight power budget; so they're not going to run high-end games well or be a superior target for HPC. It seems that they will, however, provide at least adequate low-end, if not midrange, graphics.

Result: All of Nvidia's low-end market disappears by the end of next year.

As long as passable performance is provided, integration into the processor equates with "free," and you can't beat free. Actually, it equates with cheaper than free, since there's one less chip to put on the motherboard, eliminating socket space and wiring costs. The power supply will probably shrink slightly, too.

This means the end of the low-end graphics subsidy of high-performance GPGPUs like Nvidia's CUDA. It will have to pay its own way, with two results:

First, prices will rise. It will no longer have a huge advantage over purpose-built HPC gear. The market for that gear is certainly expanding. In a long talk at the 2010 ISC in Berlin, Intel's Kirk Skaugen (VP of Intel Architecture Group and GM, Data Center Group, USA) stated that HPC was now 25% of Intel's revenue, double the share of the HPC market I last heard a few years ago. But larger doesn't mean it has anywhere near the volume of low-end graphics.

DARPA has pumped more money in, with Nvidia leading a $25M chunk of DARPA's Exascale project. But that's not enough to stay alive. (Anybody remember Thinking Machines?)

The second result will be that Nvidia becomes a much smaller company.

But for users, it's the loss of that subsidy that will hurt the most. No more supercomputing for the masses, I'm afraid. Intel will have MIC (son of Larrabee); that will have a partial subsidy since it probably can re-use some X86 designs, but that's not the same as large low-end sales volumes.

So enjoy your "supercomputing for the masses," while it lasts.


Andrew Richards said...

NVIDIA makes its money out of high-end graphics. That's a mixture of graphics professionals (engineers, designers, video editors, artists...) and gamers. They have added a bit of HPC in there, with CUDA.

The graphics professionals and gamers aren't going to switch to the netbook Fusion chips or Intel's integrated graphics. Maybe some might want a netbook with Fusion instead of an Atom netbook. But Intel's IGP has a really bad reputation with anyone who is serious about graphics (which is where most of NVIDIA's income comes from).

You've got to remember that NVIDIA really has a lot of mindshare among software developers: especially games developers, professional graphics developers and HPC researchers. Those people (actually, me too) need to have a company like NVIDIA pushing technology forwards, whereas Intel with IGP and (to a much lesser extent) ATI have been following behind.

Fusion will push ATI/AMD ahead of NVIDIA in some senses, as long as NVIDIA don't have a secret x86 project up their sleeve and ready to go next year.

NVIDIA's Tegra has really disappointed in the marketplace. They haven't managed to get into a market now dominated by Qualcomm, Samsung and TI, despite having some great technology.

So, really, all of these projects are making it difficult for NVIDIA to sell their back-catalogue of chips. It doesn't block their ability to sell new GPUs.

What we're seeing is companies saying they can do a better job than NVIDIA, and then finding that it's actually quite hard. NVIDIA definitely have some tough times ahead, but they still have the high-margin end of the market almost to themselves (apart from the period when Cypress was king and Fermi was struggling).

Greg Pfister said...

Andrew, I must respectfully disagree.

About the effect of low-end sales, and the issue of high-end/low-end profits: rather than going through a long-winded explanation, I'll refer you to Linus Torvalds' long-winded but excellent explanation over on RWT.

He makes the case that even if all the profit is from the high end -- *zero* from the low end -- the low end still effectively subsidizes the high end by soaking up fixed costs. I agree with his analysis.
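That argument reduces to simple arithmetic. A toy model of it (all numbers are hypothetical, chosen only to show the shape of the effect):

```python
# Illustrative sketch (all numbers hypothetical): how low-end volume
# soaks up fixed costs, even at zero profit per low-end unit.

def fixed_cost_per_unit(fixed_costs, units):
    """Share of fixed (R&D, masks, drivers) costs each unit must carry."""
    return fixed_costs / units

FIXED = 500_000_000  # hypothetical GPU-line development cost, in dollars

# With a mass low-end market, fixed costs spread over tens of millions
# of chips; without it, the same costs land on HPC volumes alone.
with_low_end = fixed_cost_per_unit(FIXED, 20_000_000)
hpc_only = fixed_cost_per_unit(FIXED, 200_000)

print(f"with low-end volume: ${with_low_end:,.0f}/unit")  # $25/unit
print(f"HPC volume only:     ${hpc_only:,.0f}/unit")      # $2,500/unit
```

Even if every low-end unit is sold at cost, its existence cuts the fixed-cost burden on the high end by two orders of magnitude in this sketch; remove the low end and that burden reappears in high-end prices.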

I agree Nvidia has a lot of mind share. At one point, so did Cell -- although Nvidia's probably exceeds Cell's peak by an order of magnitude. The question is whether that mindshare will last when customers have to pay the whole price for their hardware. Their high mindshare may result in a long fall.

Unfortunately, Intel merely has to be adequate for this to happen - not particularly good, just adequate. This isn't a mark they historically have met. With integration and a massive improvement in silicon technology, adequacy might just happen.


Yale Zhang said...

I'm not sure about NVIDIA getting most of its subsidies from the low end. I know NVIDIA gets a lot of revenue from its Tesla products. I'm using a Tesla C2050 ($2400) and architecturally it's the same as a GTX 470 ($500)

The thinking is basically: if you can't afford errors, then you can afford error-free hardware.

Anonymous said...

One way or another this is going to have an effect on NVIDIA's dynamics.
Since most of these units ship in PCs, and a great many end users don't really care about graphics performance, the question will be: can it run whatever?
There will always be a niche of people who want neat (gfx) hardware bought at a premium, but whether that will remain a viable financial proposition remains to be demonstrated. There's a long history of dead companies that also produced neat hardware, e.g. NeXT and the NeXTcube, or DEC and the Alpha; they too had a dedicated following and mindshare...

me said...

Your post has a good point, but I think that rather than being the end of "supercomputing for the masses", integration onto the CPU die is really the beginning of it going mainstream.

CPUs are optimized for branching, and as a result the percentage of die area devoted to cache is approaching 90%. It's very hard to increase throughput under those conditions.

GPUs are optimized for throughput, with 80-90% of die area devoted to ALUs. That is perfect for uniform data processing: photos, video, number crunching, etc.
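The throughput point is easy to see in code. A small sketch (NumPy on a CPU, but the branch-free, same-operation-on-every-element shape is exactly what a GPU's ALU array exploits; the function and numbers here are illustrative):

```python
import numpy as np

# Uniform data processing: one arithmetic recipe applied to every element.
# No per-element branching, so execution maps cleanly onto wide ALU arrays
# (GPU shader cores, or a CPU's SIMD units).

def brighten(pixels, gain=1.2):
    """Scale image samples and clamp to the 8-bit range -- branch-free."""
    return np.clip(pixels * gain, 0, 255).astype(np.uint8)

# A hypothetical 1080p grayscale frame: ~2 million identical operations.
frame = np.random.randint(0, 256, size=(1080, 1920), dtype=np.uint8)
out = brighten(frame)  # every pixel follows the identical instruction stream
```

Contrast that with pointer-chasing or heavily branchy per-element logic, which is the workload all that CPU cache exists to salvage.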

It will be nice to have both, and applications will surely use whatever processor is best. NVidia might have a tough time of it though.

Greg Pfister said...

@Uncle Joe - see the discussion by Linus Torvalds I referenced in the prior comment.

@anonymous - +1. Exactly.

@zork - I agree with you; it's a point I would have made myself, had I thought of it.

Greg Pfister said...

Oh, and by the way - Clarkdale is not, as I said above, a low-end laptop device; it's mainstream. Sorry. I'll edit the text to fix that.

Anonymous said...

Interesting point. It's worth mentioning here that the new xbox 360 comes already with the CPU/GPU in one package (chip?).

Anonymous said...

It seems that NVIDIA themselves are now seeing the beginning of the end and are eating their graphics card partners in an attempt to make revenue look better (though at the expense of their profit margins).

Greg Pfister said...

Hi, Curt. Thanks for that link. Amazing how many comments there are from people denying reality in this case.

