I hinted back before Tech Ed that Phillip, my big gaming machine, had died. Actually, it died only a couple of weeks after I put it together; I just didn't want to talk about it.
One thing I noticed about this machine when I built it was that it did run hot. My other machines can keep their temperatures under 35C with no load; this one, with no load, struggled to hold 45C. Add in SETI@Home working the processor at full bore and holding 45C requires at least 50% fan power. And then there are those darn video cards...
So one day I finally sat down to try out the full potential of the new machine with its lovely, top-of-the-line SLI video cards. I set up Half Life 2 running in 1920x1200 mode. It ran smoothly at around 100fps; in the really gnarly stuff it got as low as 70... obnoxious, innit?
Enjoying myself immensely, I set off on a campaign of maximum destruction in Half Life 2, taking in the view, when I felt the heat on my back. I turned around to see the temperature of the water loop hit 75C. Yeop, most of the way to boiling the water. I shut down HL2 to get rid of the load, but the damage was done. Within minutes the machine had died and wasn't coming back. Motherboard baked.
I went and did some math and discovered that each video card ran at about 80 watts. The processor generated only 55W! So when the video cards were working hard, the machine cooked.
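Here's that back-of-the-envelope math as a quick Python sketch. The wattages are the approximate figures quoted above; the function name is just something made up for illustration, and the sum ignores everything else dumping heat into the loop (the Northbridge, the pump), so treat it as a rough lower bound rather than a measurement.

```python
# Rough heat-load estimate for the water loop.
# Wattages are the approximate figures from the post; anything else
# heating the loop (Northbridge, pump) is deliberately left out.
GPU_WATTS = 80   # per video card, approximate
GPU_COUNT = 2    # SLI pair
CPU_WATTS = 55   # processor


def heat_load(gpu_watts=GPU_WATTS, gpu_count=GPU_COUNT, cpu_watts=CPU_WATTS):
    """Return (gpu_total_watts, total_watts, gpu_share_of_total)."""
    gpu_total = gpu_watts * gpu_count
    total = gpu_total + cpu_watts
    return gpu_total, total, gpu_total / total


gpu_total, total, share = heat_load()
print(f"Video cards: {gpu_total} W, CPU: {CPU_WATTS} W, total: {total} W")
print(f"Video cards account for {share:.0%} of the load")
```

With both cards working hard, the video cards alone put roughly three times as much heat into the loop as the processor does.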
Now I had two problems - the first was fixing the machine, which meant a motherboard transplant. This is not normally something I fear, but with water cooling it's much more difficult. Especially the water cooling on this machine, with two video cards and a Northbridge chip right between them. The hoses are short and twist all over the place. And the last thing you want to do is breach the water loop.
Here's Phillip sitting on the service desk. Notice I plugged a speaker in to get a listen to any BIOS error beeps. Unfortunately there were none, supporting my belief that the motherboard was cooked.
My first attempt at extraction was to pop both video cards out of their slots. I powered up again at this point in the hopes that perhaps the video cards were dead and now I'd get a missing video card beep pattern - alas, no luck, no noise, no nothing. The machine was still dead. I'd have to peel all the water cooling gear off.
The motherboard extracted. What you didn't see is that I had to unscrew the motherboard from the case and pull it clear of the jacks in the back, then lift the board up with the water gear still on it. The problem was the Northbridge chip, which had a pair of nylon nut-and-bolt sets holding it on. The only way to get those off was to get to the nuts under the board.
The CPU block was a bit tricky to remove just because there was so much surface area, but twisting and prying got it off.
With the motherboard free, I cleaned everything up and transferred the RAM and processor to the new motherboard. In addition I removed the Northbridge fan from the new board (and put it onto the old board so it would be stock again and RMA-able).
Here the new motherboard is all prepped with fresh thermal grease, ready to be installed. And yes, I would remember to clean off the CPU block before I mounted it on the CPU again.
Remounting the water gear on the board started with the Northbridge chip block, since it once again had to be bolted down from the back. I sat the board in the case at an angle so that I could get to the back of it, slid the block and bolts into the holes, then gently placed the nuts on the bolts until the threads caught. Then it's a process of turning each nut a bit at a time so that the block sits squarely over the Northbridge chip.
Once the Northbridge block was in place, everything else was lifted up so that the board could slide into place. The motherboard was screwed down and then the video cards went into place, allowing the CPU block to be replaced as well. Then came the power/reset switch plugs, power/hard drive LED plugs, USB plug for the Matrix Orbital controller, SATA plug for the hard drive, IDE plug for the two DVD drives, and then all the power plugs for the motherboard (there are three: main plug, secondary 12 volt and a molex for the video) plus the additional power plugs for the video cards.
A quick top-up of water into the reservoir and I was ready to power up again for the first time in more than a month.
And the beast lives! If you look closely, the screen is stopped on a BIOS error because there's no CPU fan. Which is a reasonable error since there is no CPU fan. Some quick BIOS tweaks took care of that.
So, remember when I said there were two problems? The first one is now resolved - the machine is back to life with a motherboard transplant. Problem number two is how to avoid cooking the motherboard again. Within minutes of powering up, running no high-load software (like SETI@Home), the machine is already at 44C. Add SETI@Home and the temperature immediately rises a couple of degrees, causing the fan controller to turn up the fan to cool it back down again.
Fire up Half Life 2... well, I wasn't going to do that again.
I found the answer at Sprite - the guys I get most of my gear from. For whatever reason, they happened to have an Innovatek RADI-Dual in stock. I don't know why, they'd never sell the thing... well, okay, maybe not never.
This radiator is twice the size of the ones that I use in the case, and has mountings for two 120mm fans. It wouldn't fit in the case, but it would offer a whole bunch more cooling. Would it be enough? With it immediately available, it was too easy not to try it.
I mounted a pair of ultra-quiet Vantec 120mm fans, directly powered... I've burned up a couple of these lovely fans with controllers, so I didn't want to take the chance. And besides, even at full power these fans only generate 28dB of noise, so you can't hear 'em at all.
To connect the radiator into the loop I disconnected the top-side connector of the existing radiator and moved it to the bottom feed on the new radiator, then added a new hose from the top connector of the new radiator down to the old radiator. I powered up and started adding water to the pump reservoir as fast as I could to fill that new radiator. Several ounces later, everything was full and ticking along.
It's a little on the Mad Max side of things, but it sure does work!
Check out the front view of the machine: you can see the temperature of the water - just below 31C!
When I fired up SETI@Home, the temperature didn't move at all. So then the real test: play some Half Life 2. After one hour of play, the water temperature got to 32C. Methinks the fix is in!
Obviously, the system can't stay like this. But I'm afraid the only real answer to this problem is going to be much more radical: converting to central water cooling. That would involve putting a set of pumps, radiators and reservoir inside the server closet and running hoses through the walls to the two workstation bays in the office. The same way that you have a wall plate for power and a wall plate for network access, there would be a wall plate with water input and output. Then you'd just plug the machines in.
There are a bunch of advantages to the central water cooling solution. The first is that there will be a lot more water, and that water will be chilled. So the ability to cool will increase substantially. The machines will be even quieter having no fans (except the whisper fans in the power supply), no pumps and no radiators. Another huge bonus will be that the heat of the machines will actually be taken out of the room, being dumped into the server closet with its great big AC unit.
The downside is that the machines would no longer be self-contained for cooling. When I had to service them, I'd need to use an external water cooling module, something like the CoolerMaster Aquagate or the Koolance Exos 2. All resolvable stuff.
So, for the moment, everything seems to be functioning here in water cooling land. I'm watching Phillip closely for any water leaks; I'm a bit concerned that the heat event might have damaged some seals. But so far, so good.