Rebuilding Cartman...

I've been slowly working my way through the server rack, upgrading all of my servers. Some of the machines are as much as five years old, and all the spinning gear (CPU fans, case fans, hard drives) is essentially a set of ticking time bombs. In addition, there is new hardware to be added to the rack, which means virtually everything in the rack has to move... the new configuration with eight servers completely fills the 30U rack.

What makes this especially challenging is that they ARE servers... they're constantly in use. I can take them down for a few minutes, but after a half hour the phone starts to ring. However, some servers are more sensitive to this than others - and Cartman is one of the least sensitive, since it's largely an internal-only server.

Cartman has a variety of tasks. Primarily he's a file server, but also a domain controller (one of two), DHCP and DNS server. As a file server, he has a 400GB RAID array... doesn't sound like much, but I built it in October of 2001. It's done with a Promise SX6000 controller and six 80GB hard drives. At the time, it was a monster. Since it's essentially been on since it was first built, those drives have over 30,000 hours of spin time... very scary.

Before tearing Cartman apart I used Acronis True Image to image the boot drives, and I backed the entire 400GB drive array up on a single external USB 400GB drive. And yes, I used xcopy with verify and double checked everything before I tore it down.
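For anyone who wants to double check a copy in a similar spirit, here is a minimal Python sketch. It is not what I ran (that was xcopy with verify); it simply hashes every file under a source folder and compares it against the same relative path in the backup. The function names `file_digest` and `compare_trees` are made up for this illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source: Path, backup: Path) -> list[str]:
    """List relative paths that are missing from, or differ in, the backup."""
    problems = []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        relative = src_file.relative_to(source)
        dst_file = backup / relative
        if not dst_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif file_digest(src_file) != file_digest(dst_file):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

Calling `compare_trees` with the array's root folder and the external drive's folder (whatever paths those happen to be on your machine) returns an empty list when every source file has an identical copy. It only checks one direction, so extra files on the backup side are not reported.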

This is what I saw after hauling Cartman out of the rack and popping the cover. Essentially identical to what I saw in October 2001 - one crammed case. You can see the six ATA/100 ribbon cables coming out of the Promise controller running to the two three-drive caddies holding the 80GB drives. In the middle are the two 17GB SCSI drives that are used as boot drives, which, along with the SCSI DVD drive, are run from the Adaptec 29160 SCSI controller. Oh, and an Exabyte external tape drive plugs in there too.

Disassembly of this beast starts with the metal bar running across the case that also supports the two SCSI hard drives (and a fan). Then the entire front drive array holding the DVD, floppy and two drive caddies was removed. Both the SCSI and RAID controllers were pulled as well, leaving the case pretty darn bare. With everything out I powered up the machine just to take a look and noticed that one of the CPU fans was barely spinning any more. I had planned on replacing them anyway; this was just extra incentive.

However, the motherboard is so busy that the fancy new Socket 370 cooling blocks I bought wouldn't even fit in the space! But I was able to keep using the old blocks by replacing their worn-out fans with the fans from the new blocks.

After a thorough cleaning, I installed a gigabit network card and began the rest of the reassembly. I'm retiring the Promise controller altogether, going to a SATA array using six Hitachi Deskstar 7K400 drives. Yep, that's right... from a 400GB array to 400GB drives, for a total of two terabytes! And to drive this puppy, I'd need a SATA controller, so I went back to Adaptec for their 2810SA controller.

It actually supports eight drives, but I only had space for six. You can see the controller card and the new caddies to hold the drives. SATA cables are much tidier than ATA cables, so I got a bunch of space back in the case.

Here you can see the Chenbro caddies with three SATA cables apiece. There's one power plug for all three drives (which is very nice) and each caddy also has a heavy blower fan pumping directly onto the drives.

The old 17GB Atlas V drives are replaced with shiny new 147GB Atlas 10Ks. More disk space!

With everything crammed back in the case, it was time to get things set up. Even before I started the install of Windows 2003 server I wanted to get the array set up. What was interesting is that every card installed in the machine had a boot BIOS in it - the SCSI controller, the RAID controller AND the gigabit network card! Getting the BIOS set up to boot from the right device took some fiddling.

Then I decided to start the array configuration from the BIOS, so I set up a RAID 5 array. Being a diligent geek, I went to the Adaptec web site to check for the latest drivers, BIOS updates, and so on. Adaptec had updates for both the 2810SA and the 29160, so I updated both BIOSes. What's stunningly annoying is that you HAVE to install BIOS updates from a floppy. The software is hard coded to read from drive A and nowhere else. Presumably I could set up a USB drive to do this, but this old SuperMicro motherboard ain't that smart.

I was glad I'd checked all this in advance: all over the readme files for the firmware were warnings that doing these upgrades would destroy the existing arrays, and that you'd need to back everything up. Since I had nothing on the drives, I had nothing to fear.

Feeling smug with all my firmware flashed, I headed off into the BIOS set up for the 2810SA to get my spiffy new drive array configured. Apparently I did it wrong because I selected “Clean” to start the array rather than “Build/Verify.”

But I didn't know this at the time - off it went, ticking away to itself. I thought it might take a long time to set up a two terabyte array, but it was done in about 15 minutes... well, almost done. It got to 99% and then said “Controller Kernel Stopped Running!” And then the machine would reboot. That didn't seem good.

Every time I restarted the machine and went back into the 2810SA BIOS, I'd get the same error and the machine would reboot.

In an effort to be positive about my situation, I ignored the failure and moved on - set up Windows 2003 Server. Once it was up and running, I tried to install the drivers for the controller card, but Windows wouldn't recognize the card. That can't be good either. I filed a tech support request with Adaptec, but I wouldn't hear back for 48 hours; by then I had solved it on my own.

I went to bed late, very grumpy. The next morning I woke up thinking maybe the firmware update was a mistake. So I reverted - got the old firmware, set up new floppies and attempted to install it. But it kept failing with the same error. Couldn't revert.

Then, in a flash of insight, I realized what was happening to the controller - it was crashing! And right at the point of completing the array. After it rebooted, the controller would restart, see the array almost finished configuring and attempt to finish it... crashing the controller again! So, how to stop the array from rebuilding? Pull all the hard drives out! That'll slow the bugger down.

Sure enough, as soon as I pulled the drives, I was able to revert the firmware. Why I still reverted the firmware, I'm not sure - I guess I had a course in mind and thinking wasn't going to divert it. With the firmware reverted, the array had died, so when I plugged the drives back in, nothing bad happened.

Now afraid of the BIOS configuration stuff, I booted back into Windows, and reverted the driver as well to match the firmware. If you've never done this, you're a happier person than me: reverting to an older driver is a bugger. Windows 2003 Server has a rollback driver option, but it doesn't work if you haven't previously installed the older driver. So I had to do this the hard way - uninstall the driver and then carefully locate all the backup copies of the DLLs and kill them by hand. Once I had them all, installing the old driver worked, AND it came up just fine.

Now I was able to set up the RAID 5 array from Adaptec's client for Windows, which was a whole bunch clearer about the right ways to do things. And that's when I discovered that correctly building a two terabyte array takes an entire day.

The next day I discovered that my two terabyte array is actually a 1.8TB array. And that Windows understands TB, it displays that way in Windows Explorer. Funny, huh? I wonder if they have PB (as in petabyte, a thousand terabytes) in there as well.
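The missing 0.2TB is most likely just units. Here is a small Python sketch of the arithmetic, assuming each drive holds exactly 400 billion bytes (the way drive makers count "400GB"), that RAID 5 gives up one drive's worth of space to parity, and that Windows counts a TB in powers of 1024. File system overhead is ignored, and the function name `raid5_usable_bytes` is made up for this illustration.

```python
DRIVE_COUNT = 6
DRIVE_BYTES = 400 * 10**9  # assumption: "400GB" means decimal gigabytes


def raid5_usable_bytes(drive_count: int, drive_bytes: int) -> int:
    """RAID 5 stores one drive's worth of parity, so usable space is (n - 1) drives."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drive_count - 1) * drive_bytes


usable = raid5_usable_bytes(DRIVE_COUNT, DRIVE_BYTES)
decimal_tb = usable / 10**12  # terabytes as marketed (powers of 1000)
binary_tb = usable / 2**40    # terabytes counted in powers of 1024

print(f"{decimal_tb:.1f} TB decimal, {binary_tb:.2f} TB binary")
# prints "2.0 TB decimal, 1.82 TB binary"
```

Under those assumptions the same two trillion bytes show up as roughly 1.8TB, which matches what Explorer displayed.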

The rest of the set up was uneventful, really... things got loaded back on, DHCP and DNS configured, and so on. The next level of excitement would come with the most dangerous update of all... converting an Exchange 2000 server to 2003!

Wednesday, February 23, 2005 6:52:24 PM (Pacific Standard Time, UTC-08:00)


Thursday, February 24, 2005 12:45:05 AM (Pacific Standard Time, UTC-08:00)
Richard, great rack!
What kind of applications are you running on the rack? I see you're running the Server and Exchange etc., but I was wondering: do you run Home Automation and Media Center type stuff?
BTW Well done on being co-host on dotnetrocks, and keep up the toyboy section on Mondays

Thursday, February 24, 2005 8:17:01 AM (Pacific Standard Time, UTC-08:00)
So far I've resisted Media Center, but the new 2005 edition looks so nice, I may just have to cave in... I've solved my PVR desires by hacking TiVo and ReplayTV.

I do a lot of load and failover testing, so having multiple servers is pretty much a must. There's about three different projects running in those servers at the moment, as well as my standard file, web and mail stuff.

Most of the house is wired with X10, but I've yet to really get computers involved in that sort of automation... it's on the list, but still a fair way down. When we renovate the upstairs, I think it'll get a boost.

Thanks for the kudos on DNR - I think it's going to be lots of fun. And the ToyBoy will continue.

Thursday, February 24, 2005 9:06:25 AM (Pacific Standard Time, UTC-08:00)
Curious on why you chose the Adaptec SATA Raid controller? I have been using the 3Wares for about 5 years now and love my new 9500S-12. Was there a feature in the Adaptec that drove you back to them?

I bailed after my debacle with an Adaptec AA-131 controller a few years back.


Thursday, February 24, 2005 9:10:48 AM (Pacific Standard Time, UTC-08:00)
MCE 2005 is a nice product; it's not just about the DVR side of it, it's how pictures, video, music and TV all come together. Plus there are lots of 3rd-party apps (like my RSS reader) designed for MCE.
Robert Scoble did a video with some of the MCE guys at Microsoft (more info here!1pG33U4bh6jzLG6yE4ZZG2Bg!364.entry)
Niveus have even done a Pocket PC remote control (via wifi) with which you can control multiple Media Center boxes around the house.
Maybe a good feature for a blog/podcast?
Oh I am going to be doing a MCE podcast soon :)
Thursday, February 24, 2005 10:26:47 AM (Pacific Standard Time, UTC-08:00)
Actually, I looked closely at the 3Ware controller... I couldn't see anything wrong with it, but then I couldn't see anything wrong with the Adaptec controller either. Really, the Adaptec's only advantage is that it was in stock and I could get it sooner. I haven't had a 3Ware controller, and the Promise controller I was happy to give up on; I was never terribly impressed with the product. And there is the Adaptec 29160 already in the machine that has served me well for many years.

I think I will give in on an MCE 2005 machine eventually, but I have to re-organize the stereo gear cupboard in the rec room first... its kinda full. Plus I think Stacy will be annoyed if we actually have four DVRs.
Thursday, February 24, 2005 10:46:32 AM (Pacific Standard Time, UTC-08:00)
4 DVRs... jerk. I haven't even gotten one to work yet. The closest thing I have to a DVR is Azureus+RSS Import+tvtorrents+Prismiq media Player to stream the downloaded shows to the tv/stereo setup. it works pretty well (except for the streaming part. 99% of the shows that come down are XViD and the prismiq transcoder just can't keep up with the high bitrates) and not EVERY show is capped, encoded and uploaded.
All content © 2023, Richard Campbell