Friday, January 23, 2009

Multi-Multicore Single System Image / Cloud Computing. A Good Idea? (3)

This is part 3 of a multi-post sequence on this topic which began here. This part recounts some implementations and history.

History and Examples

The earliest version of this concept that I'm aware of was the Locus project at UCLA, led by Jerry Popek. It had a long, and ultimately frustrating, history.

The research project was successful. It created a version of UNIX that ran across several computers (VAXen) and as a system didn't go down for something over a year of use by students. The project was funded by IBM, and that connection led to key elements being ported experimentally to IBM's AIX in the late 1980s; I saw AIX processes migrating between nodes in 1989-90. Then an IBM executive, in a fit of magnanimity and/or cluelessness, gifted all the IP back to the Locus project. They went off and founded Locus Computing Corporation (LCC). From there it was used on an Intel Paragon (a massively-parallel computing project), which says something about scalability. At one point it was also part of an attempt in IBM to make mainframes and desktops (PS/2s) together look like one big AIX (UNIX) system. Feel free to boggle at that concept. It did run.

But LCC ran into the boughts: They were bought by Tandem, where the code was productized as Tandem NonStop Clusters, which as far as I know shipped in small numbers. Tandem was bought by Compaq, which bought DEC, so the code was ported to DEC's Alpha, and almost used in DEC's brand of Unix before DEC was snuffed. Then Compaq was bought by HP, and shortly thereafter this code stream came perilously close to being part of HP-UX, HP's Unix: In 2004, there was a public announcement that it would be in the next major HP-UX release. About a month later, that was officially de-committed. It also saw use in a PRPQ (limited-edition product) from IBM that was called POWER4. At some point it also almost became part of SCO's UnixWare.

This is a portrait of frustration for all involved.

The Locus thread, however, isn't the only implementation or planned product of full single system image. There's an open-source implementation now called MOSIX and based on Linux that you can download and try, designed specifically to support HPC. It also has a long history, starting life as MOS in the late 70s at The Hebrew University of Jerusalem. It doesn't distribute absolutely every element of the OS, but does quite enough to be useful.

Sun Microsystems published a fair amount of material about a version for Solaris, which it called Full Moon, including a roadmap indicating complete single-system-image in 1999. It hasn't happened yet, obviously. The most recent news I found about it was that some current variation would go open source (CDDL) in 2007. The odd name, by the way, was chosen because full moons make wolf packs howl, and Microsoft's cluster support was codenamed Wolfpack. ("Wolfpack" referred to my analogy of clusters being like packs of dogs in In Search of Clusters. "Dogpack" wouldn't have had quite the same connotations. But there's no mythological multi-headed wolf to play the part of the multiprocessor in my analogy.)

ScaleMP today sells a multi-multicore single system image product, vSMP Foundation, aiming to ride on the speed of modern higher-performance interconnects like InfiniBand (and likely Data Center Ethernet or equivalent), but my understanding is that vSMP is hardware-assisted, a somewhat different subject. Similarly, Virtual Iron was at one point also pushing an ability to glue multiple multi-cores into one larger SMP-like system, with hardware assist, but that has apparently been dropped. Both of these, unlike prior efforts, have some flavor of virtualization to them and/or their implementation.

I wouldn't be at all surprised to learn that there are others who have heard the seductive Siren call of SSI.

So much for history. Implementation comments in the next post.
