Wednesday, September 24, 2008

101 Parallel Languages (Part 3, the last)

So why, after all the effort spent on parallel programming languages, have none seen broad use? I posed this question on the mailing list of the IEEE Technical Committee on Scalable Computing, and got a tremendously, gratifyingly high-quality response from people including major researchers in the field, a funding director of NSF, and programming language luminaries.

Nobody disagreed with the premise: MPI (message-passing interface) has a lock on parallel programming, with OpenMP (straightforward shared memory) a distant second. Saying the rest are in the noise level is being overly generous. The discussion was completely centered around reasons why that is the case.

After things settled down, I was requested to collect it for the on-line newsletter of the IEEE TCSC. You can read the whole thing here.

I also boiled down the results as much as I could; many reasons were provided, all of them good. Here’s that final summary, which appears at the end of the newsletter article, turned into prose instead of the PowerPoint bullets I originally used.

A key reason, and probably the key reason, is application longevity and portability. Applications outlive any given generation of hardware, so developers must create them in languages they are sure will be available over time – which is impossible except for broadly used, standard languages. You can’t assume you can just create a new compiler for wonderful language X on the next hardware generation, since good compilers are very expensive to develop. (I first proposed this as a reason; there was repeated agreement.) That longevity issue particularly scares independent software vendors (ISVs). (Pointed out by John D. McCalpin, from his experience at SGI.)

The investment issue isn’t just for developing a compiler; languages are just one link in an ecosystem, and all the links are needed for success. A quote: “the perfect parallel programming language will not succeed [without, at introduction] effective compilers on a wide range of platforms,… massive support for re-writing applications,” analysis, debugging, databases, etc. (Horst Simon, Lawrence Berkeley Laboratory Associate Lab Director CS; from a 2004 NRC report.)

Users are highly averse to using new languages. They have invested a lot of time and effort in the skills they use writing programs in existing languages, and as a result are reluctant to change. “I don't know what the dominant language for scientific programming will be in 10 years, but I do know it will be called Fortran.” (quoted by Allan Gottlieb, NYU). And there’s a Catch-22 in effect: A language will not be used unless it is popular, but it’s not popular unless it’s used by many people. (Edelman, MIT)

Since the only point of parallelism is performance, high-quality object code is required at the introduction of a new language; this is seldom the case, and of course it adds to the cost issue. (Several)

I can completely understand the motivation for research in new languages; my Ph.D. thesis was a new language. There is always frustration with the limitations of existing languages, and also hubris: Much more than most infrastructure development, when you do language development you are directly influencing how people think, how they structure their approach to problems. This is heady stuff. (I am now of the opinion that nobody should presume to work on a new language until they have written at least 200,000 lines of code themselves – a requirement that would have ditched my thesis – but that’s another issue.)

I also would certainly not presume to stop work on language development. Many useful ideas come out of it: ideas that their developers could not initially conceive of outside of a new language, but that often enough then become embedded in the few standard languages and systems. Key example: Smalltalk, the language in which object-oriented programming was conceived, later to be embedded in several other languages. (I suspect that language-based Aspect Oriented Programming is headed that way, too.)

But I’d be extremely surprised to find the adoption of some new parallel language – e.g., Sun’s Fortress, Intel’s Ct – as a significant element in taming parallelism. That’s searching under the wrong lamppost.

I think the way forward is to discover killer applications that are “embarrassingly parallel” – so clear and obvious in their parallelism that running them in parallel is not a Computer Science problem, and certainly not a programming language problem, because it’s so straightforward.

13 comments:

John said...

Greg - I was with you on all three parts (thanks for making the effort to write that long a piece) until the very end of part three.

I think a way I could restate your conclusion -- that we need to make use of parallel hardware by finding new killer applications -- such that it would be more directly useful would be to say that we need to find modes of expression, for the problems we want to solve with the aid of computers, that expose more parallelism or better parallelism (better for some measure of "better").

I'm thinking of modes of parallel work expression here as a transformation of the algorithm that reveals additional traits. In the same way, for example, that plotting a set of data on a log chart reveals trends not evident on a linear chart. This doesn't have to involve a new language per se: it is possible that there will be some intermediate representation that depicts the organization of work in a way that reveals new levels of parallelism (or provides more complete access to existing parallelism) and then subsequently reduces to an implementation in an existing language.

What does such a transformation look like? I have no clue, and it may not even exist.

Perhaps we are saying the same thing, but what I took away from your post was that we need to find new problems as killer apps for parallelism, rather than that we need to find ways to express current problems in ways that enable a better mapping between the machine and the algorithm.

Curt Sampson said...

The language list was interesting, but I think it may be a bit disingenuous to claim it's a list of "parallel" languages. Neither Ada nor Haskell, for example, was designed explicitly for parallel programming above anything else. I'm guessing that the same is true about many other languages on the list.

Addressing Haskell specifically (since that is a language I am now reasonably familiar with regarding both programming in it and its history), it was originally designed in the late 80s and early 90s as a pure, lazy functional language with a Hindley-Milner type system (A History of Haskell: being lazy with class). While parallel evaluation was discussed, this was in a context about whether certain language features might require it, rather than with the specific idea of exploiting it.

Later research on parallel programming (most notably Glasgow Parallel Haskell) did use Haskell as a platform for research, and clearly Haskell's purity provides some advantages here, but that still does not appear to have been a primary purpose of the language as originally designed. In fact, as always appears to be the case with concurrent and parallel programming, it's never as simple as one would hope.

That said, don't take this to mean I don't agree with the general thrust of your essay. The first few paragraphs of Part I caused me to laugh aloud ("LOL," I suppose we call it these days) at seeing yet another essential truth that we've known for a long time but still most people don't want to hear about.

Arkadius said...

We already know and use parallel languages.
Here's an example:

A - "Yeah, I saw Jenny today at the mall."
A - "She was asking me out, if you have a girlfriend and so on".
B - "I was there too"
B - "LOL"
B - "What did you say?"
A - "I told her you were planning to become a monk and that you aren't interested in women at all!"
A - "I bought a new phone today"
B - "Really?"
B - "You are a douche! LOL"

Is there an asynchronous "path" that helps us understand the context of each and every sentence prior and post-reading?

Probably. But what's more interesting, is that both friends respond asynchronously without losing track of the conversation and often referring to parts of the conversation "post mortem". There are parallel "spaces" being created. But in the end, all these calculations, as parallel as they are and as asynchronous, they do form a common space.

My point is, sometimes we misunderstand each other. We do get things wrong. We are certain that we got the right answer, while we didn't even understand the question.

The question is: "What is parallelism NOT".
It's not a DIALOG. It's finding that "common space". In computations, 2+2 = 4 and there will be 4 standing there as a result, no matter how many times we ask for the result. But a "cloudy" day is not the same as any other "cloudy day". It's the context. So yes, we may never find the perfect song, the perfect movie or the perfect sunset, but we will find many of them.

We must free computers. Free them from our rules. From our sado-masochistic tendency to be perfect, each and every time.

Software-synthesis is the "keyword" here. "The ability to make mistakes and learn from them".

Music IS "software-synthesis". We take "hardware", put a couple of "processors" together and let them create data in a "space".
Call it rhythm, sound, groove or mood, it's there. And they work together, in parallel.
The "space" may be clearly defined (an interpretation/interpolation of a Mozart piece), or pretty "loose", like Jazz.

So on some days, the algorithm will give us a "new" Mozart piece that no expert will be able to (stylistically) distinguish from an "original" and if it "feels like it", it will give us a Dvorak interpretation like Art Tatum "http://www.youtube.com/watch?v=qYcZGPLAnHA" or Free Jazz.

We already have a highly parallel language. Music. And it's the HPC equivalent of.......HPC, what our brain is capable of when PRODUCING or PERCEIVING music.

But it goes even further than this. Imagine producing a piece of music, as a collaborative effort. The "objects" are the bassline, the drum track, the synth line, the vocal track and so on.

And it is indeed the case, that you can create a "rock song", without any restraints, but still following the rules of "rock music". But how do you take that "mix of tracks" and form a SONG out of it? What if the guitar player does something in Em, but the bassline plays in G and so on.
It would sound horrible, mixing stuff together, that doesn't FIT together, because it wasn't meant to FIT TOGETHER in the first place, right?

The answer is "no!".

We can change the rhythm, the pitch and go even further these days. We can take a sequence of sound and "pull it apart", creating PARALLEL TRACKS, parallel processes.

This is how you "parallelize" musical "code".

http://www.youtube.com/watch?v=LQ5gaoVBf0g&feature=related

When solving problems, it's rarely the CHALLENGE that requires the most work, it's being able to see things from a different perspective.

Arkadius said...

The blog ate my uber-long comment.

30mins of my life are gone.

I guess I won't write it again and "let it be". Here's an example of a parallel language that we have already used for ages. Music.

And here's a REAL LIFE example of an algorithm that takes serial "code" and automatically translates it into parallel "code", that then can be altered IN CONTEXT.

http://www.youtube.com/watch?v=LQ5gaoVBf0g&feature=related

You guys are further down the "rabbit hole" than you might think, it's just that solving a problem isn't the biggest challenge. The biggest challenge is being able to look at the problem from another perspective.

For us musicians, this comes naturally.

Greg Pfister said...

@Arkadius,

Fortunately, the blog didn't permanently eat your comment. It just decided it was spam. Not sure which is worse in some sense :-), but at least it lets me fix the problem. I've no clue whether it's now permanently fixed, but I do keep an eye out for such things.

Thanks! for your several thoughtful comments, and I'm glad I could retrieve 30 minutes of your life. :-)

Greg Pfister

Greg Pfister said...

Also @Arkadius,

A few comments on what you wrote here.

First, a nit: I'd call your IM multi-conversation multithreaded, not parallel. (As I said, a nit.)

And yes, music is inherently parallel (I played piano for a long time, myself). There are adequate, but hardly optimal, languages for it. Dance is likewise parallel; not such a good language there (Labanotation isn't that widely understood or used, to my knowledge). Related -- I've been writing up descriptions of how to do Tai Chi (Taiji) motions, and found I had to invent a simple explicitly parallel notation to describe the moves. (Fork-join with barrier synch works in nearly all cases.)
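To make "fork-join with barrier synch" concrete, here is a minimal Python sketch using the standard `threading` module. It is only an illustration of the pattern, not the actual notation: the function name `perform` and the body-part and move labels in the usage note are hypothetical. Each body part is forked as its own thread, a barrier closes every step so that no part starts step n+1 before all parts have finished step n, and the final join ends the move.

```python
import threading


def perform(parts):
    """Fork one thread per body part; a barrier closes each step of the move.

    `parts` maps a (hypothetical) body-part name to its list of moves.
    Every part must have the same number of steps, or the barrier would hang.
    """
    if len({len(moves) for moves in parts.values()}) != 1:
        raise ValueError("every part needs the same number of steps")

    barrier = threading.Barrier(len(parts))
    log, log_lock = [], threading.Lock()

    def run(name, moves):
        for step, move in enumerate(moves):
            with log_lock:
                log.append((step, name, move))  # "perform" this part's move
            barrier.wait()  # nobody starts step n+1 until all finish step n

    threads = [threading.Thread(target=run, args=item) for item in parts.items()]
    for t in threads:
        t.start()  # fork
    for t in threads:
        t.join()   # join
    return log
```

Calling `perform({"left hand": ["raise", "press"], "right hand": ["lower", "push"]})` returns a log in which the order within a step may vary from run to run, but every step-0 entry comes before any step-1 entry.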

I actually think, in retrospect (wrote this blog entry a while ago), that the whole language thing is barking up a wrong tree. The main use cases that have made good use of parallelism are trivially / embarrassingly / pleasantly massively parallel in the problem itself (like Monte Carlo methods, or graphics rendering) or in straightforward representations of the solution.

Since for successful cases the parallelism is brute obvious, we don't need much language as such. We just need frameworks able to express that obvious massive parallelism with a minimum of error-prone programmer tedium. MapReduce comes to mind as an example of that.
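As a rough sketch of how little "language" such a case needs, here is a Monte Carlo estimate of pi in plain Python, using only the standard library's `concurrent.futures`. The names `count_hits` and `estimate_pi`, the task sizes, and the choice of problem are illustrative assumptions, not taken from MapReduce or any other framework. Each task is fully independent (the "map"), and the only combining step is a sum (the "reduce").

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(task):
    """Map step: count random points that land inside the unit quarter-circle."""
    seed, samples = task
    rng = random.Random(seed)  # private generator per task, so tasks share no state
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(num_tasks=8, samples_per_task=20_000):
    """Run the independent tasks, then reduce their results with a plain sum."""
    tasks = [(seed, samples_per_task) for seed in range(num_tasks)]
    # Threads keep the sketch simple; in CPython, swapping in ProcessPoolExecutor
    # is what would actually spread this CPU-bound work across cores.
    with ThreadPoolExecutor(max_workers=num_tasks) as pool:
        hits = sum(pool.map(count_hits, tasks))  # reduce step
    return 4.0 * hits / (num_tasks * samples_per_task)


if __name__ == "__main__":
    print(estimate_pi())
```

Nothing in `count_hits` knows it is running in parallel; the parallelism lives entirely in the one `pool.map` line, which is the sense in which the framework, rather than a new language, carries the load.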

Greg Pfister

Curt Sampson said...

If we leave aside some experimental stuff by John Cage and that crowd, music is without question entirely sequential and neither uses nor offers opportunities for parallelism in the sense that we use that word in computing science.

Polyphonic music may appear to be several parts happening in "parallel," but consider it more carefully. Is there any opportunity for more parallelism? I.e., can we further break down some of these sequential parts to be done by another voice at the same time, trading time for voices and finishing the performance faster? Certainly not! Is there any way to remove some parallelism and make things more sequential, trading voices for time? No; you can't have the trombone first play his own part, and then after he's done play the cello part, and have the same piece of music.

This becomes quite clear when you realize what a performance is: a single (very complex) waveform with only one specific amplitude at any specific point in time. (To avoid a lot of complexity I ignore stereophony for the moment, but I don't believe it would make a big difference to this argument.) Sure, we can break it down in our heads to, say, two violins, a viola and a cello, but we could just as well break the same waveform down into energy over time within thirty-odd frequency bands logarithmically distributed across the human hearing range. You wouldn't recognize melodies in that, but it is in fact something that sound engineers do all the time. And here, too, you can never trade time for bandwidth or vice versa.

By contrast, when you say that an 8-bit ALU does an add "in parallel," you're making a statement that the result would be just as valid as if you used a 1-bit ALU and had it repeat the add operation 8 times sequentially on each triplet of input bits.
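That equivalence is easy to check in a few lines of Python. The sketch below is an illustration only: the function names are made up, and `parallel_add8` is just a software stand-in for an 8-bit ALU. It adds two 8-bit values once in a single operation and once by reusing a 1-bit full adder eight times on successive (a bit, b bit, carry) triplets.

```python
def one_bit_add(a_bit, b_bit, carry_in):
    """A 1-bit full adder: takes a triplet of bits, returns (sum_bit, carry_out)."""
    total = a_bit + b_bit + carry_in
    return total & 1, total >> 1


def serial_add8(a, b):
    """Add two 8-bit values by reusing the 1-bit adder 8 times, low bit first."""
    result, carry = 0, 0
    for i in range(8):
        sum_bit, carry = one_bit_add((a >> i) & 1, (b >> i) & 1, carry)
        result |= sum_bit << i
    return result, carry


def parallel_add8(a, b):
    """Stand-in for the 8-bit ALU: one operation on all 8 bits at once."""
    total = a + b
    return total & 0xFF, total >> 8


if __name__ == "__main__":
    # Exhaustive check over every pair of 8-bit inputs.
    same = all(
        serial_add8(a, b) == parallel_add8(a, b)
        for a in range(256)
        for b in range(256)
    )
    print(same)
```

The two routes give the same sum and carry-out for every pair of inputs; only the time taken differs, which is the trade that a musical performance does not allow.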

You probably could convince me that there's concurrency in music, but as we all know, that's an entirely different thing. And life would be a lot simpler if we could live with just concurrency without parallelism.

Arkadius said...

Part 1

I will first reply to Curt's post, because Greg already re-saved the 30mins of my life. :)

Hi Curt, let me break down the answer by replying to the parts of your comment that I find the most interesting:

"Polyphonic music may appear to be several parts happening in "parallel," but consider it more carefully. Is there any opportunity for more parallelism? I.e., can we further break down some of these sequential parts to be done by another voice at the same time, trading time for voices and finishing the performance faster? Certainly not!"

You are assuming that music needs to be "performed" on a temporal "line". But what about the ability to CREATE said piece of music? To solve the „problem“ of creating said piece of music in the first place? A producer's job is often to listen to different parts of music and assemble them into one piece of music, in his brain.

It is not the most important part of the "equation" which part he hears first, or if certain groups of tracks play together or not (even though that makes the job easier). The most important part is being able to save the different heard parts in his memory and create spaces (plural/parallel) of context in which certain musical phrases fit together or not and then re-assemble them.

That is the performance part. By performance, you mean "getting the car from A to B".
But here's the problem (and the solution). The car doesn't need to get to B in one piece, all parts at the same time to still be considered a "car". The track doesn't need to get to B in one MIXED piece.

There are two obvious solutions to this problem. "Obvious" when you are composing, mixing AND collaborating on songs, like I do from time to time.

Can we split the individual sequences into "task groups"?
Yes! I can work on possible choruses, you can look at the bassline for verses 1 and 2, somebody else can create a drum track. It doesn't matter who gets finished first, as long as we are all musicians and are CREATIVE (as in, we are able to create more than ONE part for said song AND/OR consider several possible phrases while creating it AND after hearing the other parts of the song) and PRODUCTIVE (we are able to use our creativity to perform the aforementioned tasks AND use it at the same time to help out our friends, who are struggling/behind us in their „schedule“).

That is sequential parallelism. You are assigned a task, which you have to execute, but at the same time, you can use your creativity to look at the problem (not PERFORMING, but CREATING the different parts of the song) from another perspective; that of the drummer, guitar player or lyricist.
Yes, you will get a different song that way, but it doesn't mean that you knew exactly what said song will sound like when you are finished, doing it all sequentially, being the multi-instrumentalist that you are...............to be continued

Arkadius said...

Part 2

........That's the first kind of solution. The „create & conquer“ solution.

There's part two though. Can we work on a SONG, asynchronously, meaning first creating the chorus, while our partner creates the drum track for the verses and outro and a SPACE of variations for the chorus he hasn't heard yet (because you are working on it, while he's working on the drums for the verses). Again, yes. We do that all the time. Many people make cover versions of famous songs. Creating the „best song“ (not the „2+2=4, always and forever“ aka „the perfect song“) would be to take all the individual parts of said cover versions and assemble the „best“ cover version.

Which is the answer to the following:

„Is there any way to remove some parallelism and make things more sequential, trading voices for time? No; you can't have the trombone first play his own part, and then after he's done play the cello part, and have the same piece of music."

The solution is NOT wanting the same piece of music, each and every time you create music. What you want is being able to create 1000s of „Mozart-like“ pieces, instead of creating one which clearly says „well, he thought about the trombone first and then created the cello part“. That's the „2+2=4“ approach. The uniformity of „knowledge“, which can be done for some parts, but which shouldn't be the goal for the whole field of HPC, respectively AI. The HPC part is creating all possible answers, the AI part is getting the best guess without knowing them all. These two go hand in hand, in parallel so to speak.

There's this assumption in the world of (sci-fi?) physics, that even if we are able to create a time machine, we will never be able to go back before the time we created it/turned it on.
And that might be, for all I know, true. But there's a simple solution to that problem. Finding the alien race which created one BEFORE us. :)

Anthropomorphism lets you think that we are the only intelligent race that will be able to build that time machine. Uniformity lets us think that we all think/should look at a problem from the same perspective, or work on the same answers. Maybe that's the problem. We should make the machine „think“ like a schizophrenic, having different personalities AT THE SAME TIME. One of them is a great cook, the other knows how to build pans, the third one knows where to get the best roast beef, etc... The problem, we think, is to get that food on the plate as fast as we can. I think it's deciding what we want to eat in the first place. :)

I hope my answer wasn't too „meta“. For me HPC is there to be able to waste cycles in the future and AI will be there to guess where they should be wasted. What sense does it make to create the perfect weather model and calculate it, come to the conclusion that we are responsible for global warming, not knowing that other planets get warmer too, without humans living there? :)

Greg Pfister said...

@Curt, I'll take a whack at this at a less meta level.

Sound in general -- not just music -- is produced by parallel (not just concurrent) processes and consumed in our heads through parallel (not just concurrent) processes.

The form by which it is transmitted may be considered either a serial encoding, or a parallel one aided by the superposition principle. Or if you like, multiple simultaneous phonons hitting our eardrums - like the multiple photons in white light containing a (parallel) rainbow.

After all, since the parallel production process can produce sounds both from innocuous sources (leaves rustling, water flowing) and dangerous sources (roar of a predator), the ability to receive and process those parallel channels separately must be well selected-for in an evolutionary sense, and boy did I just write a long sentence.

Also, you seemed to indicate that if one can't find more and more parallelism indefinitely in some phenomenon, the phenomenon isn't parallel. That's just not true. There are limits to the parallelism available in anything.

Overall, I don't know about you, but when I play a chord on the piano I am definitely doing a parallel act (it's sort of SIMD, but still parallel). Likewise when playing a bass line with the left hand and a melody with the right.

Greg Pfister

Curt Sampson said...

Let's keep in mind that I'm not talking about "parallelism" in some big, generic, philosophical arm-waving sense; I'm talking about it in the sense of the computing science term of art.

So Arkadius' comment that "The solution is NOT wanting the same piece of music, each and every time you create music" is exactly my point: who out there in the scientific computation world wants a different answer every time he runs the same computation on the same body of data?

I'm going to leave the whole music production example alone for two reasons: first, it's changing the metaphor, and second, having been both a recording engineer and sometime music producer in a past life, and having done a fair amount of work in "experimental" and "avant-garde" music (including aleatoric music), I can see clearly how getting into a discussion about Arkadius' conception of how music could be made could turn into a hundred-thousand word extravaganza. Let's just say, it's not generally done that way.

In other words, yes, Arkadius, you are being too "meta." I just don't see the relevance of any of this to running computer programs or algorithms in parallel. (But I'll happily admit that I just missed it if you can produce a paper that shows a result related to this and survives reasonably well under comp-sci peer review.)

Greg, I understand what you're saying, but it just seems to me that applying it to multicomputing ignores the whole point of this blog. After all, by that standard, we've always been using parallel processors. Even without things like instruction pipelines and being able to work an integer and an FP instruction at the same time, you could go back to a processor as simple as a 4004 and find parallelism in it because, perhaps, it had a multi-line data bus, or updating the program counter happened in parallel with executing the just-loaded instruction. But that's rather small comfort when faced with a problem that takes a year to run sequentially on the fastest currently available CPU, isn't it?

Arkadius said...

Greg,

"likewise when playing a bass line with the left hand and a melody with the right".

Exactly, the piece of music is created by said "parallelism". On top of that, sometimes you can/need to use the left/right hand to help out with the part the other hand is playing, thus sharing resources to perform a common task.
The limit to this is either the number of notes the piece contains or the speed at which the hand can "jump" between positions. One hand uses the spare resources to play the part the other hand can't play, because it currently uses all of its resources to play certain notes.

The "schizophrenia" part of my "meta-post" would then be to use that principle to either a.) create a more complex piece (read: for more than two hands) or b.) play more than one piece of music on more than one piano. What Greg then says (as the analogy to the human being able to "tune in" to the important part of the "sound stream": leaves vs. the roar of a dangerous animal nearby) refers to you being able to "tune in" to the piece of music you want to listen to, placed in said concerto of many pieces "performed" simultaneously.

And sorry for my Bible-length post before, but I wanted to explain the meaning behind the analogies I was using and why I think music can to a certain extent provide a framework for what we call "parallelism". I guess if we all sat down in the same room and disassembled my previous post and then reassembled it, we would come up with a much shorter post. "But we ain't parallel like that" :)

Greg Pfister said...

Curt & Arkadius,

Yes, Arkadius has it right in what my comment was aiming at.

But also yes, Curt, I was talking about parallel in a generic sense, not in the specific issue of this blog in general: Where do we find sufficient parallelism in a killer app that will drive client volumes. I got hung up on your statement that music is purely serial -- which it isn't -- but I agree, music creation isn't it.

Not that music creation can't use parallelism; there are lots of FFTs hiding in there. It's just not got a wide enough audience.

I said a bit more on this general topic in answer to another of Arkadius' comments, over here: http://goo.gl/3RzHE

Greg
