So, is there some history and practice that can shed some light on the issue of helping people use parallel computers? There certainly is, in the High-Performance Computing (HPC) community.
The HPC community has been trying to flog the parallel programming language horse into life for almost forty years. These are the people who do scientific and technical computing (mostly), including things like fluid dynamics of combustion in jet engines, pricing of the now-infamous tranches of mortgage-backed securities, and so on. They’re motivated. Faster means more money to the financial guys (I’ve heard estimates of millions of dollars a minute if you’re a millisecond faster than the competition), fame and tenure to scientists, product deadlines for crash simulators, and so on. They include the guys who use (and get the funding for) the government-funded record-smashing installations at national labs like Los Alamos, Livermore, Sandia, etc., the places always reported on the front pages of newspapers as “the world’s fastest computer.” (They’re not, because they’re not one computer, but let that pass for now.)
For HPC, no computer has ever been fast enough and likely no computer ever will be fast enough. (That’s my personal definition of HPC, by the way.) As a result, the HPC community has always wanted to get multiple computers to gang up on problems, meaning operate in parallel, so they have always wanted to make parallel computers easier. Also, it’s not like those people are exactly stupid. We’re talking about a group that probably has the highest density of Ph.D.’s outside a University faculty meeting. Also, they’ve got a good supply of very highly motivated, highly intelligent, mostly highly skilled workers in the persons of graduate students and post-docs.
They have certainly succeeded in using parallelism, in massive quantities. Some of those installations have 1000s of computers, all whacking on the same problem in parallel, and are the very definition of the term “Massively Parallel Processing.”
So they’ve got the motivation, they’ve got the skills, they’ve got the tools, they’ve had the time, and they’ve made it work. What programming methods do they use, after banging on the problem since the late 1960s?
No parallel languages, to a very good approximation. Of course, there are some places that use some of them (Ken, don’t rag on me about HPF in
1. Message-Passing Interface (MPI), a subroutine package callable from Fortran, C, C++, and so on. It does basic “send this to that program on that parallel node” message passing, plus setup, plus collective operations [“Is everybody done with Phase 3?” (also known as a “barrier”); “Who’s got the largest error value, and is it small enough that we can stop?” (also known as a “reduction”)]. There are implementations of MPI exploiting special high-performance interconnect hardware like InfiniBand, Myrinet, and others, as well as plain old Ethernet LAN, but at its core you get “send X to Y,” and you manually write plain old serial code that calls procedures to do that.
2. As a distant second, OpenMP. It provides similarly basic use of multiple processors sharing memory, with somewhat more support at the language level. The big difference is the ability to do a FOR loop in parallel (FOR N=1 to some_big_number, do THIS (parameterized by N)).
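To make the MPI item concrete, here’s a minimal sketch of what that “send X to Y,” barrier, and reduction boil down to in C. This is my illustrative example, not anything from an actual production code: the local error values are made up, and you’d need an MPI installation (mpicc / mpirun) to build and run it.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* which process am I? */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* how many are there? */

    /* "Send X to Y": rank 0 ships a number to rank 1. */
    if (rank == 0 && size > 1) {
        double x = 42.0;
        MPI_Send(&x, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        double x;
        MPI_Recv(&x, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    /* "Is everybody done with Phase 3?" -- a barrier. */
    MPI_Barrier(MPI_COMM_WORLD);

    /* "Who's got the largest error value?" -- a reduction.
       local_error here is a made-up stand-in for real computation. */
    double local_error = 1.0 / (rank + 1);
    double max_error;
    MPI_Allreduce(&local_error, &max_error, 1, MPI_DOUBLE,
                  MPI_MAX, MPI_COMM_WORLD);

    if (rank == 0)
        printf("max error = %f\n", max_error);

    MPI_Finalize();
    return 0;
}
```

Note that everything between the MPI calls is plain old serial C; the library never touches your loops or data structures, which is exactly the point.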
That’s it. MPI is king, with some OpenMP. This is basic stuff: pass a message, do a loop. This seems a puny result after forty years of work. In fact, it wouldn’t be far from the truth to say that the major research result of this area is that parallel programming languages don’t work.
Why is this?
I effectively convened my own task force on this, in the open, by asking the question on a mailing list. A surprising (to me) number of luminaries in the HPC parallelism community chimed in; this is clearly a significant and/or sore point for everybody.
I’ll talk about that in my next post here.