So, now everybody's staring in rapt attention as Intel provides a peek at its upcoming eight-core chip. When they're not speculating about Larrabee replacing Cell on PlayStation 4, that is.
Sigh.
I often wish the guts of computers weren't so totally divorced from everyday human experience.
Just imagine if computers could be seen, heard, or felt as easily as, for example, cars. That would make what has gone on over the last few years instantly obvious; we'd actually understand it. It would be as if a guy from the computer (car) industry and a consumer had this conversation:
"Behold! The car!" says car (computer) industry guy. "It can travel at 15 miles per hour!"
"Oh, wow," says consumer guy, "that thing is fantastic. I can move stuff around a lot faster than I could before, and I don't have to scoop horse poop. I want one!"
Time passes.
"Behold!" says the industry guy again, "the 30 mile-an-hour car!"
"Great!" says consumer guy. "I can really use that. At 15 mph, it takes all day to get down to town. This will really simplify my life enormously. Gimme, gimme!"
Time passes once more.
"Behold!" says you-know-who, "the 60 mph car!"
"Oh, I need one of those. Now we can visit Aunt Sadie over in the other county, and not have to stay overnight with her 42 cats. Useful! I'll buy it!"
Some more time.
"Behold!" he says, "Two 62-mph cars!"
"Say what?"
"It's a dual car! It does more!"
"What is that supposed to mean? Look, where's my 120 mph car?"
"This is better! It's 124 mph. 62 plus 62."
"Bu… Wha… Are you nuts? Or did you just arrived from Planet Meepzorp? That's crazy. You can't add up speeds like that."
"Sure you can. One can deliver 62 boxes of muffins per hour, so the two together can deliver 124. Simple."
"Muffins? You changed what mph means, from speed to some kind of bulk transport? Did we just drop down the rabbit hole? Since when does bulk transport have anything to do with speed?"
"Well, of course the performance doubling doesn't apply to every possible workload or use. Nothing ever really did, did it? And this does cover a huge range. For example, how about mangos? It can do 124 mph on those, too. Or manure. It applies to a huge number of cases."
"Look, even if I were delivering mangos, or muffins, or manure, or even mollusks …"
"Good example! We can do those, too."
"Yeah, sure. Anyway, even if I were doing that, and I'm not saying I am, mind you, I'd have to hire another driver, make sure both didn't try to load and unload at the same time, pay for more oil changes, and probably do ten other things I didn't have to do before. If I don't get every one of them exactly right, I'll get less than your alleged 124 whatevers. And I have to do all that instead of just stepping on the gas. This is an enormous pain."
"We have your back on those issues. We're giving Jeb here – say hello, Jeb –"
"Quite pleased to meet you, I'm sure. Be sure to do me the honor of visiting my Universal Mango Loading Lab sometime."
"…a few bucks to get that all worked out for you."
"Hey, I'm sure Jeb is a fine fellow, but right down the road over there, Advanced Research has been working on massively multiple loading for about forty years. What can Jeb add to that?"
"Oh, that was for loading special High-Protein Comestibles, not every day mangos and muffins. HPC is a niche market. This is going to be used by everybody!"
"That is supposed to make it easier? Come on, give me my real 120 mile per hour car. That's a mile, not a munchkin, a monkey, a mattock, or anything else, just a regular, old, mile. That's what I want. In fact, that's what I really need."
"Sorry, the tires melt. That's just the way it is; there is no choice. But we'll have a Quad Car soon, and then eight, sixteen, thirty-two! We've even got a 128-car in our labs!"
"Oh, good grief. What on God's Green Earth am I going to do with a fleet of 128 cars?"
…
Yeah, yeah, I know, a bunch of separate computers (cars) isn't the same as a multi-processor. They're different kinds of things, like a pack of dogs is different from a single multi-headed dog, and the programming is very different. But parallel is still parallel, and anyway Microsoft and others will just virtualize each N-processor chip into N separate machines in servers. I'd bet the high-core-count multi-cores ultimately morph into a cluster-on-a-chip as time goes on anyway, passing through NUMA-on-a-chip on the way.
But it's still true that:
- Computers no longer go faster. We just get more of them. Yes, clock speeds still rise, but it's like watching grass grow compared to past rates of increase. Lots of software engineers really haven't yet digested this; they still expect hardware to bail them out like it used to.
- The performance metrics got changed out from under us. SPECrate is muffins per hour: it measures how many copies of a benchmark finish per unit time (throughput), not how fast any single one runs (see the sketch after this list).
- Various hardware vendors are funding labs at Berkeley, UIUC, and Stanford to work on using multi-core chips better, of course. Best of luck with your labs, guys, and I hope you manage to do a lot better than 40 years of DARPA/NSF funding achieved. Oh, but that was a niche.
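To make the muffin metric concrete, here's a minimal sketch, in Python, of throughput versus speed; every number in it is invented for illustration:

```python
# A minimal sketch of muffins-per-hour vs. miles-per-hour: adding
# processors multiplies throughput (jobs finished per hour) but does
# nothing for how long any single job takes. All numbers are invented.
JOB_TIME_HOURS = 2.0  # time for one single-threaded job (one "delivery")

def throughput(num_processors: int) -> float:
    """Jobs completed per hour across all processors (the SPECrate view)."""
    return num_processors / JOB_TIME_HOURS

def latency(num_processors: int) -> float:
    """Hours until one particular job finishes; extra processors don't help."""
    return JOB_TIME_HOURS

for n in (1, 2, 4):
    print(f"{n} processor(s): {throughput(n):.1f} jobs/hour; "
          f"one job still takes {latency(n):.1f} hours")
```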
My point in all of this is not to protest the rising of the tide. It's coming in. Our feet are already wet. "There is no choice" is a phrase I've heard a lot, and it's undeniably true. The tires do melt. (I sometimes wonder "Choice to do what?" but that's another issue.)
Rather, my point is this: We have to internalize the fact that the world has changed – not just casually admit it on a theoretical level, but really feel it, in our gut.
That internalization hasn't happened yet.
We should have reacted to multi-core systems like consumer guy seeing the dual car and hearing the crazy muffin discussion, instantly recoiling in horror, recognizing the marketing rationalizations as somewhere between lame and insane. Instead, we hide the change from ourselves, for example letting companies call a multi-core system "a processor" (singular) because it's packaged on one chip, when they should be laughed at so hard even their public relations people are too embarrassed to say it.
Also, we continue to casually talk in terms that suggest a two-processor system has the power of one processor running twice as fast – when they really can't be equated, except at a level of abstraction so high that miles are equated to muffins.
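The classic way to put a number on that gap is Amdahl's law: if only a fraction p of a program's work can be spread across n processors, the overall speedup is 1/((1-p) + p/n), no matter how many processors you throw at it. A minimal sketch, with illustrative guesses for p:

```python
# Amdahl's law: if a fraction p of the work parallelizes perfectly over
# n processors and the rest stays serial, overall speedup is
# 1 / ((1 - p) + p / n). The p values below are illustrative guesses.
def amdahl_speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

for p in (0.5, 0.9, 0.99):
    print(f"p = {p:.2f}: 2 procs -> {amdahl_speedup(p, 2):.2f}x, "
          f"128 procs -> {amdahl_speedup(p, 128):.2f}x")
# Even with 90% of the work parallel, 128 processors yield under 10x.
```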
We need to understand that we've gone down a rabbit hole. So many standard assumptions no longer hold that we can't even enumerate them.
To ground this discussion in real GHz and performance, here's an example of what I mean by breaking standard assumptions.
In a discussion on Real World Technologies' Forums about the recent "Intel Larrabee in Sony PS4" rumors, it was suggested that Sony could, for backward compatibility, just emulate the PS3's Cell processor on Larrabee. After all, Larrabee is several processor generations after Cell, and it has much higher performance. As I mentioned elsewhere, the Cell cranks out "only" 204 GFLOPS (peak), while public information about Larrabee puts it somewhere between 640 GFLOPS (peak) and 1280 GFLOPS (peak), depending on what assumptions you make; call it an even 1 TFLOPS.
With that kind of performance difference, making a Larrabee act like a Cell should be a piece of cake, right? All those old games will run just as fast as before. The emulation technology (just-in-time compiling) is there, and the inefficiency introduced (not much) will be covered up by the faster processor. No problem. Standard thing to do. Anybody competent would think of it.
Not so fast. That's pre-rabbit-hole thinking. Those are all flocks of muffins flying past, not simple speed. Down in the warrens where we are now, it's possible for Larrabee to be both faster and slower than Cell.
In simple speed, the newest Cell's clock rate is actually noticeably faster than expected for Larrabee. Cell has shipped for years at 3.2 GHz; the more recent PowerXCell version uses newer fabrication technology to lower power (heat), not to increase speed. Public Larrabee estimates say that when it ships (late 2009 or 2010) it will run somewhere around 2 GHz, so in that sense Cell is about 1.6X faster than Larrabee (both are in-order, and both count FLOPS double by having a multiply-add).
Larrabee is "faster" only because it contains much more stuff – many more transistors – to do more things at once than Cell does. This is true at two different levels. First, it has more processors: Cell has 8, while Larrabee at least 16 and may go up to 48. Second, while both Cell and Larrabee gain speed by lining up several numbers and operating on all of them at the same time (SIMD), Larrabee lines up more numbers at once than Cell: The GFLOPS numbers above assume Larrabee does 16 operations at once (512-bit vector registers), but Cell does only four operations at once (128-bit vector registers). To get maximum performance on both of them, you have to line up that many numbers at once. Unless you do, performance goes down proportionally.
This means that to match the Cell performance shipping today (and for the past several years), next year's Larrabee would have to not just emulate it, but extract more parallelism than is directly expressed in the program being emulated. It has to find more things to do at once than were there to begin with.
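To see why, do the lane arithmetic with the vector widths just cited; a minimal sketch, where everything beyond those two widths is illustrative:

```python
# Lane arithmetic for emulation, using the vector widths cited above.
GUEST_LANES = 4    # Cell: 128-bit vectors, four 32-bit operations at once
HOST_LANES = 16    # Larrabee: 512-bit vectors, sixteen at once

# If the emulator translates one guest vector op into one host vector op,
# most of the host's lanes sit idle:
naive_utilization = GUEST_LANES / HOST_LANES  # 0.25, three-quarters wasted

# To fill the hardware, it must find this many *independent* guest ops
# and fuse them into each host op; that's parallelism the original
# program never expressed directly:
ops_to_fuse = HOST_LANES // GUEST_LANES  # 4
print(f"naive lane use: {naive_utilization:.0%}; must fuse {ops_to_fuse} guest ops per host op")
```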
I'm not saying that's impossible; it's probably not. But it's certainly not at all as straightforward as it would have been before we went down the rabbit hole. (And I suspect that "not at all as straightforward" may be excessively delicate phrasing.)
Ah, but how many applications really use all the parallelism in Cell – get all its parts cranking at once? Some definitely do, and people figure out how to do more every day. But it's not a huge number, in part because Cell lacks the usual, nice, maximally convenient programming model exhibited by mainstream systems, and claimed for Larrabee; it traded away some of that convenience for all that speed. The idea was that Cell was not for "normal" programming; it was for game programming, with most of the action in intense, tight, hand-coded loops doing image creation from models. That happened, but certainly not all the time, and anecdotally not very often at all.
Question: Does that make the problem easier, or harder? Don't answer too quickly, and remember that we're talking about emulating from existing code, not rewriting from scratch.
A final thought about assumption breaking and Cell's notorious programmability issues compared with the usual simpler-to-use organizations: We may, one day, look back and say "It sure was nice back then, but we no longer have the luxury of using such nice, simple programming models." It'll be muffins all the way down. I just hope that we've merely gone down the rabbit hole, and not crossed the Mountains of Madness.