I still have one IDF interview to transcribe (Joe Curley),
but I’m sick of doing transcriptions. So here are a few other random things I
observed at the 2011 Intel Developer Forum. It's nothing like comprehensive.
It’s also not yet the promised MIC dump; that will still come.
Exhibit Hall
I found very few products I had a direct interest in, but
then again I didn’t look very hard.
On the right, immediately as you enter, was a demo of a
Xeon/MIC combination clocking 600-700 GFLOPS (quite assuredly single precision) doing LU factorization. Questions
to the guys running the demo indicated: (1) They did part of it on the Xeon, and
there may have been two of those; they weren't sure (the diagram showed two).
(2) They really learned how to say “We don’t comment on competitors” and “We
don’t comment on unannounced products.”
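For context on how a number like that gets quoted: dense LU factorization of an n-by-n matrix costs roughly 2/3·n³ floating-point operations, and GFLOPS is just that count divided by wall-clock time. A minimal sketch of the arithmetic, with made-up illustrative values (the matrix size and timing below are not the demo's numbers):

    # Rough GFLOPS arithmetic for a dense LU factorization run.
    # The matrix size and time below are hypothetical, not the demo's figures.
    n = 20_000                       # matrix dimension
    seconds = 4.0                    # wall-clock time for the factorization
    flops = (2.0 / 3.0) * n ** 3     # standard LU operation count, ~2/3 * n^3
    gflops = flops / seconds / 1e9
    print(f"{gflops:.0f} GFLOPS")    # ~1333 for these made-up numbers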
A 6-legged robot controlled by an Atom, driven by a game
controller. I included this here only because it looked funky and I took a picture
(q. v.). Also, for some reason it was in constant slight motion, like it couldn’t
sit still, ever.
There were three things that were interesting to me in
the Intel Labs section:
One Tbit/sec memory stack: To understand why this is
interesting, you need to know that the semiconductor manufacturing processes
used to make DRAM and logic are quite different. Putting both on the same chip
requires compromises in one or the other. The logic that must exist on DRAM
chips isn’t quite as good as it could be, for example. In this project, they
separated the two onto separate chips in a stack: Logic is on one, the bottom
one, that interfaces with the outside world. On top of this are multiple pure
memory chips, multiple layers of pure DRAM, no logic. They connect by solder
bumps or something (I’m not sure), and there are many (thousands of) “through silicon
vias” that go all the way through the memory chips to allow connecting a whole
stack to the logic at the bottom with very high bandwidth. This whole idea eliminates
the need to compromise on semiconductor processes, so the DRAM can be dense
(and fast), and the logic can be fast (and low power). One result is that they
can suck 1 Tbit/sec of data out of one of these stacks. This just feels right
to me as a direction. Too bad they’re unlikely to use the new IBM/3M thermally conductive glue
to suck heat out of the stack.
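To put 1 Tbit/sec in perspective, here's the unit conversion plus a comparison of my own devising (the DDR3 channel yardstick is mine, not something from the demo):

    # Convert the stack's quoted bandwidth and compare it to a conventional
    # DRAM channel (the comparison point is mine, not Intel's).
    stack_bw = 1e12 / 8                 # 1 Tbit/s = 125 GB/s
    ddr3_1600_channel = 1600e6 * 8      # 1600 MT/s * 8 bytes/transfer = 12.8 GB/s
    print(stack_bw / 1e9)               # 125.0 GB/s
    print(stack_bw / ddr3_1600_channel) # ~9.8 channels' worth from one stack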
Stochastic Ray-Tracing:
What it says: Ray-tracing, but allows light to be probabilistically
scattered off surfaces, so, for example, shiny matte surfaces have realistically
blurred reflections on them, and produce more realistic color effects on other
surfaces to which they reflect. Shiny matte surfaces like the surface of the
golden dome in the center of the Austrian crown, reflecting the jewels in the
outer band, which was their demo image. I have a picture here, but it comes
nowhere near doing this justice. The large, high dynamic range monitor they
had, though – wow. Just wow. Spectacular. A guy was explaining this to me, pointing
to a normal monitor when I happened to glance up at the HDR one. I was like “shut
up already, I just want to look at that.” To run it they used a cluster of four Xeon-based nodes, each apparently about 4U high, and that was not in real time; several seconds were required per
update. But wow.
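If you want the gist of what "probabilistically scattered" means in code, here's a toy sketch of mine (certainly not Intel's renderer): instead of following one perfect mirror ray, you jitter many rays around the mirror direction, with the spread set by a roughness parameter, and average what they return. That averaging is what produces the blurred reflections.

    # Toy sketch of stochastic (glossy) reflection; my own illustration.
    import math, random

    def reflect(d, n):
        # Mirror-reflect direction d about unit normal n.
        dot = sum(di * ni for di, ni in zip(d, n))
        return tuple(di - 2.0 * dot * ni for di, ni in zip(d, n))

    def glossy_reflection(d, n, roughness, trace, samples=64):
        # Average the radiance of many rays jittered around the mirror direction.
        mirror = reflect(d, n)
        total = 0.0
        for _ in range(samples):
            jittered = tuple(m + roughness * random.uniform(-1.0, 1.0) for m in mirror)
            length = math.sqrt(sum(c * c for c in jittered)) or 1.0
            total += trace(tuple(c / length for c in jittered))  # trace() returns radiance along a ray
        return total / samples

    # Example with a fake trace() that returns 1.0 everywhere.
    print(glossy_reflection((0.0, 0.0, -1.0), (0.0, 0.0, 1.0), 0.1, lambda ray: 1.0))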
Real-Time Ray-Tracing: This has been done before; I saw
a demo of it on a Cell processor back in about 2006. This, however, was a much
more complex scene than I’d previously viewed. It had the usual shiny classic
car, but that was now in the courtyard of a much larger old palace-like
building, with lots of columns and crenellations and the like. It ran on a MIC,
of course – actually, several of them, all attached to the same Xeon system. Each had a complete copy of the scene
data in its memory, which is unrealistic but does serve to make the problem “pleasantly
parallel” (which is what I’m told is now the PC way to describe what used to be
called “embarrassingly parallel”). However, the demo was still fun. Here's a video of it I found. It was apparently shot at a different event, but it demonstrates the same technology. The intro is in Swedish, or something, but it switches to English at the demo. And yes, all the Intel Labs guys wore white lab coats. I teased them a bit about that.
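That "pleasantly parallel" structure is easy to see in miniature: if every worker has its own complete copy of the scene, the image just splits into tiles and nobody has to talk to anybody else until the pixels come back. A tiny sketch of that pattern (my illustration, nothing to do with Intel's actual code):

    # Each worker process gets its own copy of the (toy) scene; tiles are rendered
    # independently and only finished pixels come back. My illustration only.
    from multiprocessing import Pool

    SCENE = {"spheres": [((0.0, 0.0, -5.0), 1.0)]}   # stand-in for a full scene copy

    def render_tile(tile):
        x0, y0, x1, y1 = tile
        # A real renderer would trace a ray per pixel against SCENE; fake a flat shade here.
        return [(x, y, 0.5) for y in range(y0, y1) for x in range(x0, x1)]

    def render(width=64, height=64, tile=16, workers=4):
        tiles = [(x, y, x + tile, y + tile)
                 for y in range(0, height, tile) for x in range(0, width, tile)]
        with Pool(workers) as pool:   # each worker holds its own SCENE copy
            return [px for part in pool.map(render_tile, tiles) for px in part]

    if __name__ == "__main__":
        print(len(render()))          # 4096 pixels, with no inter-worker communication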
Keynotes
Otellini (CEO): Intel is going hot and heavy into
supporting the venerable Trusted Platform technology, a collection of
technologies which might well work, but upon which nobody has yet bitten. This security
emphasis clearly fits with the purchase of McAfee (everybody got a free
McAfee package at registration, good for three systems). “2011 may be the year the
industry got serious about security!” I remain unconvinced.
Mooley Eden (General Manager, Mobile Platforms): OK. Right
now, I have to say that this is the one time in the course of these IDF posts
that I am going to bow to Intel’s having paid for me to attend IDF, bite my
tongue rather than succumbing to my usual practice of biting the hand that
feeds me, and limit my comments to:
Mooley Eden must be an acquired taste.
To learn more of my personal opinions on this subject, you are going to
have to buy me a craft beer (dark & hoppy) in a very noisy bar. Since I don’t
like noisy bars, and that’s an unusual combination, I consider this unlikely.
Technically… Ultrabooks, ultrabooks, “beyond thin and
light.” More security. They had a lame Ninja-garbed guy on stage, trying to
hack into a trusted-platform-protected system, and of course failing. (Please see
this.) There was also a picture of a castle
with a moat, and a (deliberately) crude animation of knights trying to cross the moat and falling
in. (I mention this only because it’s relevant to something below.)
People never use hibernate, because it takes too long to
wake up. The solution is… to have the system wake up regularly. And run sync
operations. Eh what? Is this supposed to cause your wakeup to take less time
because the wakeup time is actually spent syncing? My own wakeup time is mostly wakeup. All I know is that
suspend/resume used to be really fast, reliable, and smart. Then it got transplanted to Windows from BIOS and has been unsatisfactory - slow and dumb - ever since.
This was my first time seeing Windows 8. It looks like
the Mango phone interface. Is making phones & PCs look alike supposed to help in
some way? (Like boost Windows Phone sales?) I’m quite a bit less than intrigued. It
means I’m going to have to buy another laptop before Win 8 becomes the
standard.
Justin Rattner (CTO): Some of his stuff I covered in my
first post on IDF. One I didn’t cover was the massive deal made of CERN and the
LHC (Large Hadron Collider) (“the largest machine human
beings have ever created”) (everybody please now go “ooooohhh”) using MICs. Look,
folks, the major high energy physics apps are embarrassingly parallel: You get
a whole lot, like millions or billions, of particle collisions, gather each one's
data, and do an astounding amount of floating-point computing on each completely independent
set of collision data. Separately. Hoping to find out that one is a Higgs boson or something. I saw people doing this in the late 1980s at
Fermilab on a homebrew parallel system. They even had a good software framework
for using it: Write your (serial) code for analyzing a collision your way, and
hand it to us; we run it many times in parallel, just handing out each event’s
data to an instance of your code. The only thing that would be interesting
about this would be if for some reason they actually couldn’t run HEP codes
very well indeed. But they can run them well. Which makes it a yawn for me. I’ve
no question that the LHC is ungodly impressive, of course. I just wish it were
in Texas and called something else.
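That framework pattern, by the way, is simple enough to sketch in a few lines: the physicist writes a serial analyze(event) function, and the framework just maps it over independent events in parallel. A minimal mock-up of the idea (mine, not Fermilab's or CERN's actual software):

    # Event-parallel analysis skeleton: the user supplies a serial per-event function,
    # and the framework fans events out across processes. My mock-up of the pattern.
    from concurrent.futures import ProcessPoolExecutor

    def analyze(event):
        # Stand-in for a physicist's serial analysis of one collision's data.
        return sum(event) > 10.0      # e.g. "is this event interesting?"

    def run_framework(events, workers=8):
        with ProcessPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(analyze, events))

    if __name__ == "__main__":
        fake_events = [[1.0, 2.0, 3.0], [5.0, 6.0, 7.0]]   # each list is one event's data
        print(run_framework(fake_events))                  # [False, True]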
Intel Fellows Panel
Some interesting questions asked and answered, many
questions lame. Like: “Will high-end Xeons pass mainframes?” Silly question.
Depends on what “pass” means. In the sense in which most people may mean, they
already have, and it doesn’t matter. Here are some others:
Q: Besides MIC, what else is needed for Exascale? A: We’re
having to go all the way down to device level. In particular, we’re looking at
subthreshold or near-threshold logic. We tried that before, but failed. Devices
turn out to be most efficient 20 mV above threshold. May have to run at 800 MHz.
[Implication: A whole lot of parallelism.] Funny how they talked about
near-threshold logic, and Justin Rattner just happened to have a demo of that
at the next day’s keynote.
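About that "whole lot of parallelism": the back-of-envelope version is that an exaflop machine clocked at 800 MHz needs on the order of a billion floating-point operations in flight every cycle.

    # How much concurrency an exaflop at 800 MHz implies (back-of-envelope).
    exaflops = 1e18          # target: operations per second
    clock = 800e6            # cycles per second
    print(exaflops / clock)  # 1.25e9 floating-point operations needed every cycle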
Q: Are you running out of rare metals? A: It’s a question
of cost. Yes, we always try to move off expensive materials. Rare earths
are needed, but not much; we only use them in layers like five atoms thick.
Q: Is Moore’s Law going to end? A: This was answered by Kelin
J. Kuhn, Fellow & Director of the Technology and Manufacturing Group – i.e.,
she really knows silicon. She noted that, by observation, at every given
generation it always looks like Moore’s Law ends in two generations. But it
never has. Every time we see a major impediment to the physics – several examples
given, going back to the 1980s and the end of Dennard scaling – something seems
to come along to avoid the problem. The exception seems to be right now: unlike prior eras, when it looked like the end was two generations out, there don't seem to be any clouds on this particular horizon at all. (While I personally know of no reason to dispute this, keep in mind that this is from Intel, whose whole existence seems tied to Moore's Law, and it's said by the woman who probably has the biggest responsibility to make it all come about.)
An aside concerning the question-taking woman with the microphone
on my side of the hall: I apparently reminded her of something she hates. She kept going elsewhere, even after standing right beside me for several minutes while I had my
hand raised. What I was going to ask was: This morning in the keynote we saw a
castle with a moat, and several knights dropping into the moat. The last two
days we also heard a lot about a knight which appears to take a ferry across
the moat of PCIe. Why are you strangling a TFLOP of computation with PCIe? Other accelerator vendors
don’t have a choice with their accelerators, but you guys own the whole
architecture. Surely something better could be done. Does this, perhaps, indicate
a lack of integration or commitment to the new architecture across the
organization?
Maybe she was fitted with a wiseass detection system.
Anyway, I guess I won’t find out this year.
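For the record, here is the arithmetic behind "strangling," assuming a PCIe 2.0 x16 link (the generation is my assumption): at roughly 8 GB/s per direction, a teraflop of single-precision compute gets fed only about one 4-byte operand for every 500 floating-point operations.

    # Bytes-per-flop across PCIe vs. on-card compute (PCIe 2.0 x16 assumed).
    pcie_bw = 8e9                  # ~8 GB/s per direction for PCIe 2.0 x16
    flops = 1e12                   # 1 TFLOP single precision
    bytes_per_flop = pcie_bw / flops
    print(bytes_per_flop)          # 0.008 bytes per flop
    print(4 / bytes_per_flop)      # ~500 flops per 4-byte operand delivered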