Thursday, July 29, 2010

Standards Are About the Money

Nonstandard Cloud
Standards for cloud computing are a never-ending topic of cloud buzz ranging all over the map: APIs (programming interfaces), system management, legal issues, and so on.

With a few exceptions where the motivation is obvious (like some legal issues in the EU), most of these discussions miss a key point: Standards are implemented and used if and only if they make money for their implementers.

Whether customers think they would like them is irrelevant – unless that liking is strong enough to clearly translate into increased sales, paying back the cost of defining and implementing appropriate standards. "Appropriate" always means "as close to my existing implementation as possible" to minimize implementation cost.

That is my experience, anyway, having spent a number of years as a company representative to the InfiniBand Trade Association and the PCI-SIG, along with some interaction with the PowerPC standard and observation of DMTF and IETF standards processes.

Right now there's an obvious tension, since cloud customers see clear benefits to having an industry-wide, stable implementation target that allows portability among cloud system vendors, a point well-detailed in the Berkeley report on cloud computing.

That's all very nice, but unless the cloud system vendors see where the money is coming from, standards aren't going to be implemented where they count. In particular, when there are major market leaders, like Amazon and Google right now, a standard has to be worth more to those leaders than the lock-in they get from proprietary interfaces. I've yet to see anything indicating that it is, so I'm not very positive about cloud standards at the present time.

But it could happen. The road to any given standard is very often devious, always political, regularly suffused with all kinds of nastiness, and of course ultimately driven throughout by good old capitalist greed. An example I'm rather familiar with is the way InfiniBand came to be, and semi-failed.

The beginning was a presentation by Justin Rattner at the 1998 Intel Developer Forum, in which he declared Intel's desire for their micros to grow up to be mainframes (mmmm… really juicy profit margins!). He thought they had everything except for IO. Busses were bad. He actually showed a slide with a diagram that could have come right out of an IBM Parallel Sysplex white paper, complete with channels and channel directors (switches) connecting banks of storage with banks of computers. That was where we needed to go, he said, at a commodity price point.

Shortly thereafter, Intel founded the Next Generation IO Forum (NGIO), inviting other companies to join in the creation of this new industry IO standard. That sounds fine, and rather a step better than IBM did when trying to foist Micro Channel architecture on the world (a dismal failure), until you read the fine print in the membership agreement. There you found a few nasties. Intel had 51% of every vote. Oh, and if you had any intellectual property (IP, i.e., patents) in the area, it now all belonged to Intel. Several companies did join, like Dell; they like to be "tightly integrated" with their suppliers.

A few folks with a tad of IP in the IO area, like IBM and Compaq (RIP), understandably declined to join. But they couldn't just let Intel go off and define something they would then have to license. So a collection of companies – initially Compaq, HP, and IBM – founded the rival Future IO Developer's Forum (FIO). Its membership agreement was much more palatable: One company, one vote; and if you had IP that was used, you had to promise to license it with terms that were "reasonable and nondiscriminatory," a phrase that apparently means something quite specific to IP lawyers.

Over the next several months, there was a steady movement of companies out of NGIO and into FIO. When NGIO was down to only Intel and Dell (still tightly integrated), the two merged as the InfiniBand Trade Association (IBTA). They even had a logo for the merger itself! (See picture.) The name "InfiniBand" was dreamed up by a multi-company collection of marketing people, by the way; when a technical group member told them he thought it was a great name (a lie), they looked worried. The IBTA had, in a major victory for the FIO crowd, the same key terms and conditions as FIO. In addition, Robert's Rules of Order were to be used, and most issues were to be decided by a simple majority (of companies).

Any more questions about where the politics comes in? Let's cover devious and nasty with a sub-story:

While on one of the IBTA groups, during a contentious discussion I happened to be leading for one side, I mentioned I was going on vacation for the next two weeks. The first day I was on vacation, a senior-level executive of a company on the other side of the dispute, an executive not at all directly involved in IBTA, sent an email to another senior-level executive in a completely different branch of IBM, a branch with which the other company did a very large amount of business. It complained that I "was not being cooperative" and that I had said on the IBTA mailing lists that certain IBM products were bad in some way. The obvious intent was that it be forwarded to my management chain through layers of people who didn't understand (or care) what was really going on, just that I had made this key customer unhappy and had dissed IBM products. At the very least, it would have chewed up my time disentangling the mess left after it wandered around in forwarded copies for two weeks (I was on vacation, remember?); at worst, it could have resulted in orders to me to be more "cooperative," and otherwise undermined me within my own company. Fortunately, and to my wife's dismay, I had taken my laptop on vacation and watched email; and a staff guy in the different division intercepted that email, forwarded it directly to me, and asked what was going on. As a result, I could nip it all in the bud.

It's sad and perhaps nearly unbelievable that precisely the same tactic – complain at a high level through an unrelated management chain – had been used by that same company against someone else who was being particularly effective against them.

Another, shorter, story: A neighbor of mine who was also involved in a similar inter-company dispute told me that, while on a trip (and he took lots of trips; he was a regional sales manager) he happened to return to his hotel room after checking out and found people going through his trash, looking for anything incriminating.

Standards can be nasty.

Anyway, after a lot of the dust settled and IB had taken on a fairly firm shape, Intel dropped development of its IB product. Exactly why was never explicitly stated, but the consensus I heard was that compared with others' implementations in progress it was not competitive. Without the veto power of NGIO, Intel couldn't shape the standard to match what it was implementing. With Intel out, Microsoft followed suit, and the end result was InfiniBand as we see it today: A great interconnect for high-end systems that pervades HPC, but not the commodity-volume server part the founders hoped that it would be. I suspect there are folks at Intel who think they would have been more successful at achieving the original purpose if they had their veto, since then it would have matched their inexpensive parts. I tend to doubt that, since in the meantime PCI has turned into a hierarchical switched fabric (PCI Express), eliminating many of the original problems stemming from it being a bus.

All this illustrates what standards are really about, from my perspective. Any relationship with pristine technical discussions or providing the "right thing" for customers is indirect, with all motivations leading through money – with side excursions through political, devious, and just plain nasty.

Thursday, July 15, 2010

OnLive Follow-Up: Bandwidth and Cost

As mentioned earlier in OnLive Works! First Use Impressions, I've tried OnLive, and it works quite well, with no noticeable lag and fine video quality. As I've discussed, this could affect GPU volumes, a lot, if it becomes a market force, since you can play high-end games with a low-end PC. However, additional testing has confirmed that users will run into bandwidth and data usage issues, and the cost is not what I'd like for continued use.

To repeat some background, for completeness: OnLive is a service that runs games on their servers up in the cloud, streaming the video to your PC or Mac. It lets you run the highest-end games on very inexpensive systems, avoiding the cost of a rip-roaring gamer system. I've noted previously that this could hurt the mass market for GPUs, since OnLive doesn't need much graphics on the client. But there were serious questions (see my post Twilight of the GPU?) as to whether they could overcome bandwidth and lag issues: Can OnLive respond to your inputs fast enough for games to be playable? And could its bandwidth requirements be met with a normal household ISP?

As I said earlier, and can re-confirm: Video, check. I found no problems there; no artifacts, including in displayed text. Lag, hence gameplay, is perfectly adequate, at least for my level of skill. Those with sub-millisecond reflexes might feel otherwise; I can't tell. There's confirmation of the low lag from Eurogamer, which measured it at "150ms - similar to playing … locally".


Bandwidth, on the other hand, does not present a pretty picture.

When I was playing or watching action, OnLive continuously ran at about 5.8%-6.4% utilization of a 100 Mb/sec LAN card. (OnLive won't run on Wi-Fi, only on a wired connection.) This rate is very consistent. Displayed image resolution didn't cause it to vary outside that range, whether it was full-screen on my 1600 x 900 laptop display, full-screen on my 1920 x 1080 monitor, or windowed to about half the laptop screen area (which was the window size OnLive picked without input from me). When looking at static text displays, like OnLive control panels, it dropped down to a much smaller amount, in the 0.01% range; but that's not what you want to spend time doing with a system like this.

I observed these values playing (Borderlands) and watching game trailers for a collection of "coming soon" games like Deus Ex, Drive, Darksiders, Two Worlds, Driver, etc. If you stand still in a non-action situation, it does go down to about 3% (of 100 Mb/sec) for me, but with action games that isn't the point.

6.4% of 100 Mb/sec is about 2.9 GB (bytes) per hour. That hurts.

My ISP, Comcast, considers over 250 GB/month "excessive usage" and grounds for terminating your account if you keep doing it regularly. That limit and OnLive's bandwidth together mean that over a 30-day period, Comcast customers can't play more than 3 hours a day without being considered "excessive."
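For the record, here is the arithmetic behind those two numbers, as a quick sketch; the utilization and cap figures are the ones quoted above:

```python
# Back-of-envelope check of the OnLive bandwidth numbers quoted above.
UTILIZATION = 0.064        # peak observed fraction of the LAN card's capacity
LINK_MBPS = 100            # LAN card speed, megabits per second
CAP_GB_PER_MONTH = 250     # Comcast's "excessive usage" threshold

mbits_per_hour = UTILIZATION * LINK_MBPS * 3600  # megabits streamed per hour
gb_per_hour = mbits_per_hour / 8 / 1000          # bits -> bytes -> gigabytes

hours_per_month = CAP_GB_PER_MONTH / gb_per_hour
hours_per_day = hours_per_month / 30             # over a 30-day period

print(f"{gb_per_hour:.2f} GB/hour")    # ~2.88 GB/hour, i.e., "about 2.9"
print(f"{hours_per_day:.1f} hours/day")  # ~2.9 hours/day before the cap
```

Round the per-day figure and you get the "no more than 3 hours a day" conclusion.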


I also found that prices are not a bargain, unless you're counting the money you save using a bargain PC – one that costs, say, what a game console costs.

First, you pay for access to OnLive itself. For now that can be free, but after a year it's slated to be $4.95 a month. That's scarcely horrible. But you can't play anything with just access; you need to also buy a GamePass for each game you want to play.

A Full GamePass, which lets you play a game forever (or, presumably, as long as OnLive carries it), is generally comparable to the price of the game itself, or more for the PC version. For example, the Borderlands Full GamePass is $29.99, and the game can be purchased for $30 or less (one site lists it for $3! (plus about $9 shipping)). F.E.A.R. 2 has a $19.99 GamePass, and the purchase price is $19-$12. Assassin's Creed II was a loser, with a GamePass for $39.99 and the purchased game available for $24-$17. The standalone game prices are from online sources and don't include shipping, so the real gap is somewhat smaller than it appears. And you can play it on a cheap PC, right? Hmmm. Or a console.

There are also, in many cases, 5-day and 3-day passes, typically $7-$9 for 5-day and $4-$6 for 3-day. As a try-before-you-buy, maybe those are OK, but 30-minute free demos are available too, making a reasonably adequate try available for free.

Not all the prices are that high. There's something called AAAAAAA, which seems to consist entirely of falling from tall buildings, with a full GamePass for $9.99; and Brain Challenge is $4.99. I'll bet Brain Challenge doesn't use much bandwidth, either.

The correspondence between Full GamePass and the retail price is obviously no coincidence. I wouldn't be surprised at all to find that relationship to be wired into the deals OnLive has with game publishers. Speculation, since I just don't know: Do the 5 or 3 day pass prices correspond to normal rental rates? I'd guess yes.

Simplicity & the Mac Factor

A real plus for OnLive is simplicity. Installation is just pure dead simple, and so is starting to play. Not only do you not have to acquire the game, there's no installation and no patching; you just select the game, get a GamePass (zero time with a required pre-registered credit card), and go. Instant gratification.

Then there's the Mac factor. If you have only Apple products – no console and no Windows PC – you are simply shut out of many games unless you pursue the major hassle of BootCamp, which also requires purchasing a copy of Windows and doing the Windows maintenance. But OnLive runs on Macs, so a wide game experience is available to you immediately, without a hassle.


To sum up:

Positive: great video quality, great playability, hassle-free instant gratification, and the Mac factor.

Negative: Marginally competitive game prices (at best) and bandwidth, bandwidth, bandwidth. The cost can be argued, and may get better over time, but your ISP cutting you off for excessive data usage is pretty much a killer.

So where does this leave OnLive and, as a consequence, the market for GPUs? I think the bandwidth issue says that OnLive will have little impact in the near future.

However, this might change. Locally, Comcast TV ads showing off their "Xfinity" rebranding had a small notice indicating that 105 Mb data rates would be available in the future. It seems those have disappeared, so maybe it won't happen. But a 10X data rate improvement wouldn't mean much if you also didn't increase the data usage cap, and a 10X usage cap increase would completely eliminate the bandwidth issue.

Or maybe the Net Neutrality guys will pick this up and succeed. I'm not sure on that one. It seems like trying to get water from a stone if the backbone won't handle it, but who knows?

The proof, however, is in the playing and its market share, so we can just watch to see how this works out. The threat is still there, just masked by bandwidth requirements.

(And I still think virtual worlds should evaluate this technology closely. Installation difficulty is a key inhibitor to several markets there, forcing extreme measures – like shipping laptops already installed – in one documented case; see Living In It: A Tale of Learning in Second Life.)

Monday, July 12, 2010

Who Does the Shoe Fit? Functionally Decomposed Hardware (GPGPUs) vs. Multicore.

This post is a long reply to the thoughtful comments on my post WNPoTs and the Conservatism of Hardware Development that were made by Curt Sampson and Andrew Richards. The issue is: Is functionally decomposed hardware, like a GPU, much harder to deal with than a normal multicore (SMP) system? (It's delayed. Sorry. For some reason I ended up in a mental deadlock on this subject.)

I agree fully with Andrew and Curt that using functionally decomposed hardware can be straightforward if the hardware performs exactly the function you need in the program. If it does not, massive amounts of ingenuity may have to be applied to use it. I've been there and done that, trying at one point to make some special-purpose highly-parallel hardware simulation boxes do things like chip wire routing or more general computing. It required much brain twisting and ultimately wasn't that successful.

However, GPU designers have been particularly good at making this match. Andrew made this point very well in a videoed debate over on Charlie Demerjian's SemiAccurate blog: last-minute changes that would be complete anathema to general-purpose designs are apparently par for the course with GPU designs.

The embedded systems world has been dealing with functionally decomposed hardware for decades. In fact, a huge part of their methodology is devoted to figuring out where to put a hardware-software split to match their requirements. Again, though, the hardware does exactly what's needed, often through last-minute FPGA-based hardware modifications.

However, there's also no denying that the mainstream of software development, all the guys who have been doing Software Engineering and programming system design for a long time, really don't have much use for anything that's not an obvious Turing Machine onto which they can spin off anything they want. Traditional schedulers have a rough time with even clock speed differences. So, for example, traditional programmers look at Cell SPUs, with their manually-loaded local memory, and think they're misbegotten spawn of the devil or something. (I know I did initially.)

This train of thought made me wonder: Maybe traditional cache-coherent MP/multicore actually is hardware specifically designed for a purpose, like a GPU. That purpose is, I speculate, transaction processing. This is similar to a point I raised long ago in this blog (IT Departments Should NOT Fear Multicore), but a bit more pointed.

Don't forget that SMPs have been around for a very long time, and practically from their inception in the early 1970s were used transparently, with no explicit parallel programming and code very often written by less-than-average programmers. Strongly enabling that was a transaction monitor like IBM's CICS (and lots of others). All code is written as relatively small chunks (debit this account; update the cash on hand and the total cash in the bank…). Each chunk is automatically surrounded by all the locking it needs, is called by the monitor when a customer implicitly invokes it, and can be backed out as needed, either by facilities built into the monitor or by a back-end database system.
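The division of labor can be sketched in a few lines. This is a toy illustration only, not CICS or any real monitor (the names and the snapshot-based back-out are mine; a real monitor locks at record granularity and journals changes), but it shows the key point: the programmer writes only the small chunk, and the monitor supplies the locking and the back-out around it.

```python
import threading

# A toy transaction-monitor sketch. The "chunk" below is the only code an
# application programmer would write; run_transaction supplies the rest.

accounts = {"A": 100, "B": 100}
_lock = threading.Lock()  # crude: one global lock, not per-record locking

def run_transaction(chunk, *args):
    """Surround a chunk with locking; back out its changes if it fails."""
    with _lock:
        snapshot = dict(accounts)      # crude back-out: save prior state
        try:
            chunk(*args)
        except Exception:
            accounts.clear()
            accounts.update(snapshot)  # restore state on failure
            raise

def debit(account, amount):
    # The programmer's chunk: no locks, no recovery code in sight.
    if accounts[account] < amount:
        raise ValueError("insufficient funds")
    accounts[account] -= amount

run_transaction(debit, "A", 30)        # succeeds; "A" drops from 100 to 70
```

A failed chunk (say, debiting more than the balance) raises an exception and leaves the accounts exactly as they were, which is the property that lets mediocre code run safely in parallel.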

It works, and it works very well right up to the present, even with programmers so bad it's a wonder they don't make the covers fly off the servers. (OK, only a few are that bad, but the point is that genius is not required.)

Of course, transaction monitors aren't a language or a traditional programming construct, and they got zero academic notice, except perhaps from Jim Gray. But they work, superbly well, on SMP / multicore. They can even work well across clusters (clouds) as long as all data is kept in a separate backend store (perhaps only logically separate), a model which, by the way, is the basis of a whole lot of cloud computing.

Attempts to make multicores/SMPs work in other realms, like HPC, have been fairly successful but have always produced cranky comments about memory bottlenecks, floating-point performance, how badly caches fit the requirements, etc., comments you don't hear from commercial programmers. Maybe this is because it was designed for them? That question is, by the way, deeply sarcastic; performance on transactional benchmarks (like the TPC's) is the guiding light and laser focus of most larger multicore / SMP designs.

So, overall, this post makes a rather banal point: If the hardware matches your needs, it will be easy to use. If it doesn't, well, the shoe just doesn't fit, and will bring on blisters. However, the observation that multicore is actually a special purpose device, designed for a specific purpose, is arguably an interesting perspective.

Tuesday, July 6, 2010

OnLive Works! First Use Impressions

I've tried OnLive, and it works. At least for the games I tried, it seems to work quite well, with no noticeable lag and fine video quality. But I'm not sure about the bandwidth issue yet, or the cost.

OnLive is a service that runs games on their servers up in the cloud, streaming the video to your PC or Mac. I've noted previously that this could hurt the mass market for GPUs, since it doesn't need much graphics on the client. But there were serious questions (see my post Twilight of the GPU?) as to whether they could overcome bandwidth and lag issues: Can OnLive respond to your inputs fast enough for games to be playable? And could its bandwidth requirements be met with a normal household ISP?

As I said above: Lag, check. Video, check. I found no problems there. Bandwidth, inconclusive. Cost, ditto. More data will answer those, but I've not had the chance to gather it yet. Here's what I did:

I somehow was "selected" from their wait-list as an OnLive founding member, getting me free access for a year – which doesn't mean I play totally free for a year; see below – and tried it out today, playing free 30-minute demos of Assassin's Creed II a little bit, and Borderlands enough for a good impression.

Assassin's Creed II was fine through initial cutscenes and minor initial movement. But when I reached the point where I was reborn as a player in medieval times, I ran into a showstopper. As an introduction to the controls, the game wanted me to press <squiggle_icon> to move my legs. <squiggle_icon>, unfortunately, corresponds to no key on my laptop. I tried everything plus shift, control, and alt variations, and nothing worked. In the process I accidentally created a brag clip, went back to the OnLive dashboard, and did some other obscure things I never did figure out, but never did move my legs. I moved my arms with about four different key combinations, but the game wasn't satisfied with that. So I ditched it. For all I know there's something on the OnLive web site explaining this, but I didn't look enough to find it.

I was much more successful with Borderlands, a post-apocalyptic first-person shooter. I completed the initial training mission, leveled up, and was enjoying myself when the demo time – 30 minutes, which I consider adequately generous – ran out. Targeting and firing seemed to be just as good as native games on my system. I played both in a window and in fullscreen mode, and at no time was there noticeable lag or any visual artifacts. It just played smoothly and nicely.

I wanted to try Dragon Age – I'm more of an RPG guy – but while it shows up on the web site, I couldn't find it among the games available for play on the live system.

This is not to say there weren't hassles and pains involved in getting going. Here are some details.

First, my environment: The system I used is a Sony Vaio VGN-2670N, with Intel Core Duo @ 2.66 GHz, a 1600x900 pixel display, with 4GB RAM and an Nvidia GeForce 9300M; but the Nvidia display adapter wasn't being used. For those of you wondering about speed-of-light delays, my location is just North of Denver, CO, so this was all done more than 1000 miles from the closest server farm they have (Dallas, TX). My ISP is Comcast cable, nominally providing 10 Mb/sec; I have seen it peak as high as 15 Mb/sec in spurts during downloads. My OS is 32-bit Windows Vista. (I know…)

There was a minor annoyance at the start, since their client installer refuses to even try using Google Chrome as the browser. IE, Firefox, and Safari are supported. But that only required me to use IE, which I shun, for the install; it's not used running the client.

The much bigger pain is that OnLive adamantly refuses to run over Wi-Fi. The launcher checks, gives you one option – exit – and points you to a FAQ, which pointer gets a 404 (page not found). I did find the relevant FAQ manually on the web site. There they apologize and say it "does indeed work well with good quality Wi-Fi connections, and in the future OnLive will support wireless" but initially they're scared of bad packet-dropping low-signal-strength crud. I can understand this; they're fighting an uphill battle convincing people this works at all, and do not need a multitude complaining it doesn't work when the problem is crummy Wi-Fi. (Or Wi-Fi in a coffee shop – a more serious issue; see the bandwidth discussion below.)

Nevertheless, this is a pain for me. I had to go down in the basement and set up a chair where my router is, next to my water heater, to get a wired connection. When I did go down there, after convincing Vista (I know!) to actually use the wired connection, things went as described above.

That leaves one question: Bandwidth. My ISP, Comcast, has a 250 GB/month limit beyond which I am an "excessive user" and apparently get a stern talking-to, followed by account termination if I don't mend my excessive ways. Up to now, this has been far from an issue. With OnLive, it may be a significant limitation.

Unfortunately, I didn't monitor my network use carefully when using OnLive, and ran out of time to go back and do better monitoring. I'll report more when I've done that. However, checking some numbers provided by Comcast after the fact, I can see the possibility that averaging four hours a day is all the OnLive I could do and not get terminated, since my hour of use may (just may) have sucked down 2 GB. This could be a significant issue, limiting OnLive to only very casual users, but I need better measurement to be sure.

This also points to a reason for not initially allowing Wi-Fi that they didn't mention: I doubt your local free Wi-Fi hot spot in a Starbucks or McDonald's is really up to the task of serving several OnLive players all day.

Finally, there's cost. What I have free is access to the OnLive system; after a year that's $4.95/month (which may be a "founding member" deal). But to play other than a free demo, I need to purchase a PlayPass for each game played. I didn't do that, and still need to check that cost. Sorry, time limitations again.

So where does this leave the market for GPUs? With the information I have so far, all I can say is that the verdict is inconclusive. I think they really have the lag and display issues licked; those just aren't a problem. If I'm wrong about the bandwidth (entirely possible), and the PlayPasses don't cost too much, it could over time deal a large blow to the mass market for GPUs, which among other problems would sink the volumes that make them relatively inexpensive for HPC use.

On the other hand, if the bandwidth and cost make OnLive suitable only for very casual gaming, there may actually be a positive effect on the GPU market, since OnLive could be used as a very good "try before you buy" facility. It worked for me; I've been avoiding first-person shooters in favor of RPGs, but found the Borderlands demo to be a lot more fun than I expected.

Finally, I'll just note that Second Life recently changed direction and is saying they're going to move to a browser-based client. They, and other virtual world systems, might do well to consider instead a system using this type of technology. It would expand the range of client systems dramatically, and, even though there is a client, simplify use dramatically.