Wednesday, September 24, 2008

IO, IO, We Really Need IO

Now for something completely different:

I had always wondered what the heck was going on with recent large disk systems. They're servers, for heaven's sake, with directly attached disks and, often, some kind of nonvolatile cache. IBM happily makes money selling dual Power systems inside theirs, while almost everybody else uses Intel. If your IO system were decent, shouldn't you just attach the disks directly? Is the whole huge, expensive infrastructure of adapters, fibre channel (soon Data Center Ethernet, probably copyright Cisco), switches, etc., really the result of the mechanical millisecond delays in disks?

Apparently I'm not the only person thinking this. According to "Is flash a cache or pretend disk drive?" by Chris Mellor, an interview with Rick White, one of the three founders of Fusion-io, you just plop your NAND flash directly onto your PCIe bus and use it as a big disk cache. You do have to use wear-leveling algorithms to spread out the writes so you don't wear out the NAND too fast, but lots of overhead just goes away.
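For the curious, here's a minimal sketch of that wear-leveling idea in Python. This is an illustration under my own assumptions, not Fusion-io's actual firmware: the device is modeled as fixed-size erase blocks, and every name here (FlashWearLeveler and so on) is hypothetical. Each logical write is simply redirected to the least-erased free physical block, which is about the simplest form of dynamic wear leveling:

```python
# Hypothetical sketch of dynamic wear leveling; not any vendor's real code.
import heapq

class FlashWearLeveler:
    """Maps logical block addresses (LBAs) to physical flash blocks,
    always writing to the least-erased free block so wear spreads evenly."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks
        self.logical_to_physical = {}  # LBA -> physical block
        # Min-heap of (erase_count, physical_block) for free blocks,
        # so the least-worn block is always on top.
        self.free = [(0, pb) for pb in range(num_blocks)]
        heapq.heapify(self.free)

    def write(self, lba):
        # Retire the old physical block: erase it (bumping its wear
        # count) and return it to the free pool.
        old = self.logical_to_physical.get(lba)
        if old is not None:
            self.erase_counts[old] += 1
            heapq.heappush(self.free, (self.erase_counts[old], old))
        # Redirect the write to the least-worn free block.
        _, pb = heapq.heappop(self.free)
        self.logical_to_physical[lba] = pb
        return pb

if __name__ == "__main__":
    wl = FlashWearLeveler(num_blocks=8)
    for _ in range(100):
        wl.write(lba=0)  # hammer a single logical block
    print(wl.erase_counts)  # erases end up spread across all 8 blocks
```

The point of the demo at the bottom: even if software pounds on one logical address forever, the physical erases get smeared evenly across the whole device instead of burning a hole in one spot.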

Of course, if you want to share that cache among multiple systems, there are a whole lot of other issues; it's a big block of shared memory. But start with "how the heck do you connect to it?" LAN/SAN time again, with processors on both ends to do the communication. Dang.

Relevance to the topic of this blog: as more compute power is concentrated in systems, you need more and faster IO, or there's a whole new reason why this show may have problems.
