Monday, August 15, 2011

IBM Dumps Blue Waters – Final Curtain on the Old Days

IBM has pulled out of the much-touted Blue Waters supercomputer project, a joint effort of IBM and the National Center for Supercomputing Applications at the University of Illinois that was supposed to produce one petaflops of sustained performance by the end of 2012. Googling “IBM Blue Waters” and selecting “news” will give you a bevy of reports on this, so I’m going to refrain from repeating what everybody else has said.

I don’t have any inside scoop on this, in the sense that I have no under-the-table secret contacts or communications channels back into IBM. However, I can make some connections between dots already out there, based on my experience leading one flashy HPC project (RP3) back in the 1980s (possibly the first IBM did), and being close to such projects after that. My conclusion: There has been a major change in IBM executive management’s attitude towards flashy HPC projects, a change that is probably the drop of the final shoe of the “good old days” of IT architecture research.

I deduce the attitude change from HPCwire’s call to Herb Schultz, marketing manager for IBM's Deep Computing unit, in which he said a while ago that “There is really no appetite in IBM anymore – with some of the leadership changes over the last few years – for revenue that has no profit with it.”

So, IBM wants to make money on its high performance computing products. What’s wrong with that? Nothing. As every IBM manager is taught in their first management training – at least I was – the purpose of IBM isn’t to advance technology, or make the world a better place, or be a good corporate citizen; it’s to make money. (Those were the multiple choices in a quiz, by the way.) It’s perfectly obvious that any company that doesn’t make money, and thereby stay in business, can’t do anything. It’s like the first and most important rule of breathing I was taught in Tai Chi, which was: Breathe. If you don’t do that, you won’t be around long.

But as everyone should also know, there’s a focus on making money now, directly, measurably; and there’s setting up to make more money in the future. The first is needed; but if done exclusively, without the second, your corporate lifetime is also being limited – rather like living on a tasty but unhealthy diet.

I recall distinctly the response of Ralph Gomory, then IBM Senior VP of Science and Technology, to a cadre of high-level development managers who were complaining about the cost of some HPC project and proposing to kill it. He told them “This will make you money in ways you can’t conceive of” (approximate quote). He was right. What such projects return isn’t money, directly; it’s column-inches on the front page of the New York Times and similar media.

This works. I’ve recounted in a much earlier post a case I was involved in where an IBM account rep absolutely owned the entire IT account of a large, conservative retailer in the Midwest – because an IBM RISC system was given the credit for beating Kasparov. (Winning Jeopardy! hardly has the same cachet.)

Also, while it may be hard to fathom now, there was a time when computer architecture and hardware development research was simply pursued for its own sake, primarily because we might find something out by doing it, without knowing what that might be.

This also works. My personal example of that is tree saturation[1] (a.k.a. congestion spreading, but in non-lossy networks), which Alan Norton and I serendipitously discovered in the RP3 project. I distinctly recall involuntarily standing and my whole body stiffening when I looked at the graphs revealing it, and realized what was happening. It was my own personal “eureka!” kind of moment. We’d no clue we’d find that, and it was the occasion of my only recursive award – an award from IBM Research for getting an award for the paper on it. Gomory (who, coincidentally, was Research Division president at the time) said that was exactly the kind of thing he had hoped to get from RP3.

However, two things have changed since then: There’s a much stronger focus on showing results today (which the IBM stock price rise duly reflects). And the cost of entry has become quite a bit higher, particularly for entries like Blue Waters.

Back when Gomory said what I recounted above, IBM was riding high on steady income from mainframes and their software. Those still bring in substantial money, particularly via drag of software along with them (which the hardware guys aren’t allowed to count… grrr…). Now, though, the software business has moved on to the much more competitive arena of stand-alone software products that run on a variety of platforms. Of course, there is also now the whole service business that practically didn’t exist back then.

In addition, the cost of entry has skyrocketed. Back when I was involved in RP3, we had a contract with DARPA that brought in a whole $1M or so, which paid something like half the real bill. Compare that with El Reg’s estimate that a single Blue Waters rack is an $8M proposition; with over 200 racks needed for the final configuration, you’re over $1B. Those are all rough numbers, and they’re retail, not cost (an impossible number to pin down from outside), but you can see that the table stakes have gotten beyond the stash of many of even the highest high rollers.
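For anyone who wants to check the back-of-envelope arithmetic, here is a minimal sketch using only the rough figures quoted above. The constant and function names are mine, invented for illustration; the dollars are nominal (no inflation adjustment), and it compares a retail price against a project bill, so treat the ratio as an order-of-magnitude indication and nothing more.

```python
# Back-of-envelope comparison using only the rough figures quoted in the post.
# All names here are illustrative; dollars are nominal, not inflation-adjusted.

RACK_PRICE_USD = 8_000_000      # El Reg's retail estimate for one Blue Waters rack
RACK_COUNT = 200                # "over 200 racks", so this gives a lower bound
DARPA_CONTRACT_USD = 1_000_000  # "a whole $1M or so"
DARPA_SHARE = 0.5               # "something like half the real bill"


def blue_waters_retail_floor() -> int:
    """Lower bound on the retail price of the final configuration."""
    return RACK_PRICE_USD * RACK_COUNT


def rp3_total_estimate() -> float:
    """Rough total RP3 bill implied by the DARPA contract and its share."""
    return DARPA_CONTRACT_USD / DARPA_SHARE


def cost_ratio() -> float:
    """How many RP3-sized bills fit into the Blue Waters retail floor."""
    return blue_waters_retail_floor() / rp3_total_estimate()


if __name__ == "__main__":
    print(f"Blue Waters retail floor: ${blue_waters_retail_floor():,}")
    print(f"RP3 total (rough):        ${rp3_total_estimate():,.0f}")
    print(f"Ratio:                    {cost_ratio():,.0f}x")
```

Even with generous error bars on every input, the gap is a matter of hundreds of times, which is the point of the comparison.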

So I’m going to label this pullout from Blue Waters as the final ringing down of the last curtain on an era of free-wheeling, profit-unconstrained research into computer architecture and systems.

It was fun while it lasted, but now, no matter what you do, the issue is where and when the profit comes out. That’s normal now, but I think we need to remember that it was not always so.

[1] I’d like to give a URL for that, but it was back in the early 80s, pre-web. There are lots of papers still out there about avoiding or fixing it (many wrong) that you can find by Googling “tree saturation”, though. I finally figured out how to fix it in InfiniBand. Complicated. Possibly not worth the effort. Added: Since someone asked, here's bibliographical information on the paper: "Hot spot" contention and combining in multistage interconnection networks. G. F. Pfister, V. Norton. IEEE Transactions on Computers C-34(10), 943-948, 1985.