The Perils of Parallel: The Cloud Got GPUs

Monday, November 15, 2010

The Cloud Got GPUs

Amazon just announced, on the first full day of SC10 (SuperComputing 2010), the availability of Amazon EC2 (cloud) machine instances with dual Nvidia Fermi GPUs. According to Amazon's specification of instance types, this "Cluster GPU Quadruple Extra Large" instance contains:

22 GB of memory

33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core "Nehalem" architecture)

2 x NVIDIA Tesla "Fermi" M2050 GPUs

1690 GB of instance storage

64-bit platform

I/O Performance: Very High (10 Gigabit Ethernet)

So it looks like the future virtualization features of CUDA really are meant for using GPUs in the cloud, as I mentioned in my prior post.

One of these XXXXL instances costs $2.10 per hour for Linux; Windows users need not apply. Or, if you reserve an instance for a year – for $5630 – you then pay just $0.74 per hour during that year. (Prices quoted from Amazon's price list as of 11/15/2010; no doubt they will decrease over time.)

This became such hot news that GPU was a trending topic on Twitter for a while.

For those of you who don't watch such things, many of the Top500 HPC sites – the 500 supercomputers worldwide that are the fastest at the Linpack benchmark – have nodes featuring Nvidia Fermi GPUs. This year that list notoriously includes, in the top slot, the system causing the heaviest breathing at present: The Tianhe-1A at the National Supercomputer Center in Tianjin, in China.

I wonder how well this will do in the market. Cloud elasticity – the ability to add or remove nodes on demand – is usually a big cloud selling point for commercial use (expand for holiday rush, drop nodes after). How much it will really be used in HPC applications isn't clear to me, since those are usually batch mode, not continuously operating, growing and shrinking, like commercial web services. So it has to live on price alone. The price above doesn't feel all that inexpensive to me, but I'm not well calibrated on HPC costs these days, and don't know how it compares with, for example, the cost of running the same calculation on TeraGrid. Ad hoc, extemporaneous use of HPC is another possible use, but, while I'm sure it exists, I'm not sure how much exists.

Then again, how about services running games, including the rendering? I wonder if, for example, the communications secret sauce used by OnLive to stream rendered game video fast enough for first-person shooters can operate out of Amazon instances. Even if it can't, games that can tolerate a tad more latency may work. Games targeting small screens, which require less rendering effort, are another possibility. That could crater startup costs for companies offering games over the web.

Time will tell. For accelerators, we certainly are living in interesting times.

11 comments:

The Opportunistic Photographer said...: That seems pretty pricey. I'm not sure I could justify the $5630/year when I can buy the same system (or better) for ~$10k.; November 15, 2010 at 7:31 PM
Greg Pfister said...: Hi, Gary.

I agree. But if you only need it for a couple of hours, $2.10/hour is better than $10K.

Doing some arithmetic, the break-even point between the on-demand price and the reserved price comes when you need 4139 hours, or 24.6 weeks. So if you need to run the thing for half a year straight, the reserved instance makes sense.
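The arithmetic above can be sketched in a few lines of Python. This is only an illustration using the prices quoted in the post; the function and constant names are made up for the example, and it assumes the break-even being computed is the point where on-demand cost ($2.10/hour) equals reserved cost ($5630 up front plus $0.74/hour).

```python
# Prices quoted in the post (Amazon price list as of 11/15/2010).
ON_DEMAND_PER_HOUR = 2.10    # USD per hour, Linux, on demand
RESERVED_UPFRONT = 5630.00   # USD, one-year reservation fee
RESERVED_PER_HOUR = 0.74     # USD per hour once reserved
HOURS_PER_WEEK = 7 * 24


def on_demand_cost(hours):
    """Total cost of running one instance on demand for `hours` hours."""
    return ON_DEMAND_PER_HOUR * hours


def reserved_cost(hours):
    """Total cost of a reserved instance used for `hours` hours."""
    return RESERVED_UPFRONT + RESERVED_PER_HOUR * hours


def break_even_hours(upfront, cheap_rate, expensive_rate):
    """Hours at which paying `upfront` for the cheaper rate pays off."""
    return upfront / (expensive_rate - cheap_rate)


hours = break_even_hours(RESERVED_UPFRONT, RESERVED_PER_HOUR, ON_DEMAND_PER_HOUR)
weeks = hours / HOURS_PER_WEEK
print(f"{hours:.1f} hours, {weeks:.1f} weeks")  # 4139.7 hours, 24.6 weeks
```

Below that many hours the on-demand price is cheaper; above it, the reservation wins, provided the hours all fall within the one-year reservation.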

I'd say that's somewhat unlikely. Makes sense for a commercial server running 24 hours/day, but not for typical HPC work.

Security traders may have a different point of view, but also rather higher security requirements than Amazon is likely to be able to provide.

Greg; November 15, 2010 at 8:55 PM
Greg Pfister said...: Oh, and Gary, someone over on LinkedIn questioned whether you could get a node for $10k. The GPUs alone are around $2500 each, leaving $5k for a node with 22GB of memory. Seems a bit low. But even if the total is, say, $15k, your general point is reasonable.

Greg; November 15, 2010 at 9:20 PM
The Opportunistic Photographer said...: They're right. It was ~$11k... :)
http://www.velocitymicro.com/wizard.php?sr=0&iid=187

I understand that there are additional costs associated with owning your system (e.g., electricity), but there are also distinct advantages (security, faster I/O, etc.).; November 16, 2010 at 6:13 AM
Greg Pfister said...: And, given the "laser focus" of all the people wanting to make $M on clouds, it was inevitable that there's someone already claiming they can set up a fully secure XXXXL-node combo on AWS in 15 minutes. http://goo.gl/zx1pr

This is why I don't blog much about the cloud any more. There's nothing but firehose info streams, many of which I'm sure will evaporate shortly.; November 16, 2010 at 3:53 PM
Atul Kumthekar said...: All in all, $2.10/hour may be attractive for various kinds of testing before making a buy/no-buy decision.; November 16, 2010 at 7:38 PM
Anonymous said...: I can see that being a very niche market. Movie studios, CGI, art houses, and other media would probably use it, but their patterns are irregular. Mass users I see a problem with.; November 16, 2010 at 7:47 PM
Unknown said...: Any idea what will happen in AWS (or other clouds) when the hybrid systems, Intel Sandy Bridge and AMD Fusion, arrive? These systems are much faster with algorithms where transfer times matter, and are OpenCL-only, no CUDA.

Vincent (StreamComputing); November 17, 2010 at 3:32 AM
Greg Pfister said...: Hi, Vincent. That's a good question.

I think there will still be Nvidia (or even other) accelerator instances in the long term, even after the current hype cycle peak trails off. Their features will target the HPC market, they'll be the highest performance units, and there will be a lot of legacy code written for them.

Will there also be integrated on-die units -- XXX1/2L nodes? -- probably, but it will take a while. Right now they're really "good enough" performance, low-price units, not designed for all-out high-end use. On the other hand, they will be cheap enough to matter, as well as having the transfer time advantages you mention.

They'll always be more of a niche, though, unless they get past some "I'm only for graphics" limitations, like lack of ECC in local memory and not-quite-standard floating point. Until that's fixed, they won't become the darlings of HPC that Nvidia cards are now.

Ultimately, as half-Moore's-law continues and device counts climb, those disadvantages will probably drop away.

For some render farms, though, those not requiring greater algorithmic generality, they may be the hardware of choice in the nearer term.

Greg; November 17, 2010 at 10:27 AM
Greg Pfister said...: And here's another "niche" application: cracking passwords http://goo.gl/Phsv4

Cracked a SHA1 password for $1.71. SHA1 is deprecated, but still.

Greg Pfister; November 17, 2010 at 7:59 PM
cloud storage said...: As a Dell employee I think IT departments that anticipate an enormous uptick in user load need not scramble to secure additional hardware and software with cloud computing.; January 12, 2012 at 3:10 AM
