Friday, November 10, 2006

Expand an existing raid 5 array

My server currently sits with 4x120gig sata drives in raid5. I purchased 4x320gig drives to replace the drives already in there that are now full. Two options for transferring the raid 5 array over to the larger disks immediately sprang to mind.

Option 1: In Parallel

Take the existing four disks out of the server and place the new 4x320's in. Boot up the server on a livecd and create a new raid 5 array and mount it. After mounting the raid device configure networking. Take the old 4x120 drives and place them in another computer. Boot up a linux livecd and re-assemble the raid array. Configure networking and rsync or copy the data across to the server.

Pros: Still have the original drive set untouched. If any problems arise with the 320 set, can easily revert back to the 120 set.

Cons: This requires a lot of drive moving around which is just a pain. It requires having another computer that can connect 4 sata drives up. Renders two computers unusable throughout this process. Server services are not available.

Option 2: Replace

Take out one 120gig drive at a time and put in a 320gig. Let the array re-sync itself with the new fourth disk. It takes about an hour to add a new drive into the array. Rinse & repeat until all four drives have been swapped over. At this point (approximately four hours later), the same array is sitting on the 4x320 set. This however is an exact replica of the existing array and does not make use of the extra drive space the 320gig drives provide. To use that space, the array can be grown with mdadm (the raid tool).
 # mdadm /dev/md5 -G -z max
Note, this is the same as
 # mdadm /dev/md5 --grow --size=max
Now that the array has grown, the file system needs to be expanded to match.
  If using ext2 you can use resize2fs
  If using reiserfs you can use resize_reiserfs
An example using reiserfs
 # resize_reiserfs /dev/md5
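The choice of resize tool above can be sketched as a small lookup. This is only an illustration: the function name resize_cmd_for is made up here, it prints the command rather than running it, and the ext3 case is an addition (resize2fs handles ext3 as well as ext2). The file system type is assumed to be known already, for example from blkid.

```shell
#!/bin/sh
# Print the resize command that matches a file system type.
# resize_cmd_for is a hypothetical helper, not part of any package.
resize_cmd_for() {
    fstype="$1"   # e.g. the output of: blkid -o value -s TYPE /dev/md5
    dev="$2"      # the raid device, e.g. /dev/md5
    case "$fstype" in
        ext2|ext3) echo "resize2fs $dev" ;;
        reiserfs)  echo "resize_reiserfs $dev" ;;
        *)         echo "no resize tool known for $fstype" >&2; return 1 ;;
    esac
}

resize_cmd_for reiserfs /dev/md5   # prints: resize_reiserfs /dev/md5
```

Printing the command instead of executing it keeps the sketch harmless to try on a live system.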
At this point the entire size should now be available.

Note: Problems can occur when you start mucking about with the superblock, so be careful. Resizing an array is risky, so do it at your own risk and make sure you have a backup. For added safety don't perform these operations with the array mounted. Unmount it first. If you cannot unmount it because it is in use, boot into a livecd and perform the operation there.

Pros: Still have the original drive set untouched. If any problems arise with the 320 set, can easily revert back to the 120 set. Only ties up one computer (the server) during this time. Minus 5mins down time in between swapping out drives, the server is still useable from 3/4 drives whilst it re-syncs the fourth drive. This means all services are still available.

Cons: If the main system exists on the raid array then you can't really perform the resizing operations while the system is running from it. It is still safer to boot into a livecd, re-assemble the array and grow/re-size it there. So, during this final operation you will have system downtime.
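For reference, one cycle of the drive-by-drive swap in option 2 could look like the sketch below. The post does not say which commands were used for each swap, so this is an assumption: it uses mdadm's --fail, --remove and --add, the helper name swap_member_cmds is made up, and /dev/sdd2 is just an example member. The function only prints the commands, so nothing is changed on the system.

```shell
#!/bin/sh
# Print (not run) the commands for swapping one member of an array.
# swap_member_cmds is a hypothetical helper; the device names are examples.
swap_member_cmds() {
    md="$1"     # array device, e.g. /dev/md5
    part="$2"   # member partition being replaced, e.g. /dev/sdd2
    # Mark the old member as failed and detach it before powering down.
    echo "mdadm $md --fail $part --remove $part"
    # After fitting and partitioning the new disk, add it to the array.
    echo "mdadm $md --add $part"
    # Check that the rebuild has finished before touching the next drive.
    echo "cat /proc/mdstat"
}

swap_member_cmds /dev/md5 /dev/sdd2
```

The important part is the last step: the next drive should only come out once the rebuild onto the new one has completed, since raid 5 survives only one missing member.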

Result

I chose option two as it meant I didn't have to render another computer unusable during this time. The initial phase of swapping out the four 120gig drives for the 320gig drives went smoothly. I did however make one change to the process I outlined: put in a 320gig drive and partition it, but instead of making the raid partition the same size as its 120gig counterpart, make it the larger size (ensuring it fills the disk), even though the raid would only be sitting on part of it for the moment. Creating the partitions at the larger size up front would cut out a possibly risky partition resize later.

So far so good. Four 320gig drives, partitioned appropriately, now carrying the same smaller raid array. All that was left was to grow the array to fill the extra space. I didn't think to perform this in a livecd environment, where I could unmount the array first. As such, I ran into some real problems and had to reboot, which left the superblock in bad shape. Having hosed the superblock (and relying on the actual data still being ok), I booted into a livecd and created a raid device node.
 # mknod /dev/md5 b 9 0
Re-assembling the raid device worked fine, though it had lost its personality (raid 5) and couldn't read the superblock properly.
 # mdadm --assemble /dev/md5 /dev/sda2 /dev/sdb2 \
      /dev/sdc2 /dev/sdd2
At this point it was obvious the superblock needed to be re-constructed. Re-creating the array (not formatting) can fix this.
 # mdadm -C /dev/md5 --level=5 --size=max \
   --raid-devices=4 /dev/sda2 /dev/sdb2 \
   /dev/sdc2 /dev/sdd2
Here the personality/raid level is specified (raid 5), as this could not originally be determined. The --size option is then used to make sure the array uses all available space (saving a grow command). Lastly, the devices from which to construct the array are listed. Note that re-creating only leaves the data intact if the devices are given in the same order, and with the same chunk size and layout, as the original array. Because an array already exists on these drives, you will be asked whether you still wish to create the array anyway. Answering yes, the array is re-created. I then checked whether I could mount the array and whether all existing data was still intact.
 # mount /dev/md5 /mnt/gentoo
I then proceeded to check that the array was in fact now recovering.
 # watch -n1 cat /proc/mdstat
The output of which confirmed that the array was recovering and would take a bit over three hours to complete. Taking a quick squiz through /mnt/gentoo, everything still seemed to be in order. The real test will be when this recovery has finished and I close down the livecd and reboot the system.

Edit: System is back up and running with no data loss. I did however have to reboot into a livecd to resize the file system after all to get it to recognise the extra space.
 # resize_reiserfs /dev/md5
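As a side note on watching the recovery in /proc/mdstat: rather than eyeballing the progress bar, the percentage can be pulled out with a few lines of shell. This is a rough sketch. The function name rebuild_percent is made up, and the sample text fed to it below is illustrative, not the output from this server.

```shell
#!/bin/sh
# Read /proc/mdstat-style text on stdin and print the rebuild percentage.
# Prints nothing when no recovery or resync progress line is present.
rebuild_percent() {
    awk '/recovery|resync/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /%$/) { sub(/%/, "", $i); print $i }
    }'
}

# Illustrative sample only; on a real system use: rebuild_percent < /proc/mdstat
rebuild_percent <<'EOF'
md5 : active raid5 sdd2[4] sdc2[2] sdb2[1] sda2[0]
      [==>..................]  recovery = 12.6% (37043392/292945152) finish=127.5min speed=33440K/sec
EOF
# prints: 12.6
```

An empty result means nothing is rebuilding, which is the state to wait for before rebooting or swapping another drive.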
One annoying thing is the amount of *actual* formatted space versus marketed disk space. Drives are sold in decimal gigabytes (10^9 bytes) while the system reports binary ones (2^30 bytes), so 320Gig drives format to 298Gig whilst 500Gig drives format to about 465Gig. So the total size of the 4x320 in raid 5 is actually 3x298=894. That's enough room to keep me out of trouble for a little while. By the time that is full I will just replace them with something bigger. 1TB disks anyone?
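The arithmetic behind those numbers, as a quick sketch. It assumes the gap comes purely from drives being sold in units of 10^9 bytes while the system reports units of 2^30 bytes, and it ignores file system overhead. The function names are made up for the example.

```shell
#!/bin/sh
# Convert a marketed size in GB (10^9 bytes) to whole GiB (2^30 bytes).
gb_to_gib() {
    echo $(( $1 * 1000000000 / 1073741824 ))
}

# Usable raid 5 capacity: one drive's worth of space goes to parity.
raid5_usable_gib() {
    drives="$1"
    per_drive_gib="$(gb_to_gib "$2")"
    echo $(( (drives - 1) * per_drive_gib ))
}

gb_to_gib 320            # prints 298
raid5_usable_gib 4 320   # prints 894
```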

6 Comments:

Anonymous pimaster said...

Just a little concerned about your approach to number 2. If all of your services are live whilst you replace one disk at a time, doesn't that mean that with any logs and data changes (this, that and the other) the original 120's would eventually be out of sync?

Glad to hear it all worked, but I just want to know if I have read this correctly.

11/10/2006 09:42:00 PM  
Blogger Joshua Hayes said...

I'm not entirely sure if I know what you mean but the system does have to be turned off for a few minutes to swap drives over if that's what you mean.

"Minus 5mins down time in between swapping out drives, the server is still useable from 3/4 drives whilst it re-syncs the fourth drive. This means all services are still available."

With regards to changes occurring whilst the fourth drive is re-syncing... The other 3 drives work as normal. Logs etc still get updated, and files here and there (as you put it :). I assume any changes that occur to the array during this process also get rectified (at the end possibly). Otherwise it would not 'add' it back into the array. It would be discarded as 'dirty'. By default, I believe the size difference of a partition can be < 1mb also.

I haven't looked into the specifics of 'how it resyncs', though I know of a few different algorithms you can set. But I have never had any problems with using it whilst syncing. There have been a couple of times when the power has gone out while it must have been reading/writing to/from the disk, causing the array to become degraded. I've ssh'd in from Uni, re-added the drive and re-syncd it without ever taking the array down.

Just working on some cron jobs at the moment for automated backup ;)

11/12/2006 12:46:00 AM  
Anonymous pimaster said...

Me English is Good
One of your pros is "Still have the original drive set untouched. If any problems arise with the 320 set, can easily revert back to the 120 set."
If at the point you have 2x120 installed and 2x320 you have to go back to the 120's for some reason.

Actually, this could get very difficult to explain unless I could use my fingers.

Don't worry about it :P

11/14/2006 03:37:00 PM  
Anonymous Anonymous said...

How to expand scsi raid? I have 3x scsi with adaptec 2120S and need to replace all drives to gain bigger space, but I don't want to reinstall the OS.

2/19/2007 07:38:00 PM  
Blogger john said...

Yes, it creates a huge loss when you lose all your data from your hard drive. One of my friends used raid 5 recovery services and he recovered his lost data.

5/21/2010 07:09:00 PM  
Anonymous raid 5 recovery said...

Data is divided into blocks, and each block is stored on a different drive. RAID 5 efficiently utilizes the capacity of the disks, reducing the capacity by one disk. In case of a single disk failure, the array can be rebuilt.

9/10/2010 09:48:00 PM  
