I've managed to set up my system with two drives and a RAID 1 mirror across them both, so I am protected in case of a failure, using the various howtos on this site. Many thanks for that.

Now what I really want to do is use this setup as a backup for when I do upgrades etc. So when I'm about to do an upgrade I will split the mirror / fail one of the drives, and do the upgrade on the other drive. If the upgrade works I will then add the old drive back in. Unless anyone can see any issues, I believe this is the easy part.

However, if the upgrade fails, I'd want to fail my current broken drive and add in the OLD backup disk. This is where I'm a bit unsure. Does failing a disk with mdadm make it useless, or is it still effectively a raid disk that can be used again? If so, how would I rebuild the array against it?

My other option is to power down, physically disconnect a drive, then do the upgrade. If it fails, I'd then have to wipe and disconnect the new, broken drive and connect my old one, before adding the clean drive back in. I don't see any issues with this working, but it's a right pain opening cases and disconnecting drives.

I thought this would be a common scenario, but I'm struggling to find a decent howto. If anyone does this, or has any idea how to fail and rebuild from a failed drive, that would be very useful. Thanks in advance.

P.S. Not sure what category this should go in, so can the mods please move it as required.
OK, I've been playing about with a VMware machine and the following seems to work:

Set up a system with /dev/md0 (sda1 and sdb1) as swap and /dev/md1 (sda2 and sdb2) as /. I used Fedora 11. Install the bootloader onto both drives, then boot as normal.

Check status and prove that both disks are used:
cat /proc/mdstat

Make a marker file:
touch /BOTH

Now fail drive B before we pretend to do the upgrade:
mdadm --manage /dev/md0 --fail /dev/sdb1
mdadm --manage /dev/md1 --fail /dev/sdb2
mdadm --manage /dev/md0 --remove /dev/sdb1
mdadm --manage /dev/md1 --remove /dev/sdb2

Now do our pretend upgrade; we just create a file we recognise:
touch /UPGRADE

cat /proc/mdstat
shows that only drive A is present. Reboot to check again that only drive A is used and that / contains UPGRADE:
cat /proc/mdstat
ls /

Now let's back out the upgrade. Boot from an openSUSE 11.1 live CD.
cat /proc/mdstat
shows we still have only one drive in use (A). Let's mount it to prove we currently have the upgraded system:
mkdir /mnt/temp
mount /dev/md1 /mnt/temp
ls /mnt/temp
should show the UPGRADE file we created earlier, so we know we are on the new, upgraded system. Unmount it again:
umount /dev/md1

Now let's stop the md devices (note: --stop is its own mode, not a --manage option):
mdadm --manage /dev/md0 --fail /dev/sda1
mdadm --manage /dev/md1 --fail /dev/sda2
mdadm --stop /dev/md0
mdadm --stop /dev/md1

cat /proc/mdstat
will show nothing, as we no longer have any md devices running.

Now let's rebuild against the B drive (the old data we want to restore):
mdadm --assemble /dev/md0 /dev/sdb1
mdadm --assemble /dev/md1 /dev/sdb2
mdadm --run /dev/md0
mdadm --run /dev/md1

Now cat /proc/mdstat shows the arrays built with sdb. Mount and check that it is the old, non-upgraded system:
mount /dev/md1 /mnt/temp
ls /mnt/temp
does NOT show the UPGRADE file created earlier, so we are back to before the upgrade. We can also check with:
mdadm --detail /dev/md1

Now we need to blank off the A disk, otherwise it seems to get added back into the array automatically on reboot, and appears to take precedence again.
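At each of the cat /proc/mdstat steps above, the thing to look for is the [U_] / [_U] marker: U means a mirror half is active, an underscore means it is missing or failed. A minimal sketch of checking this, run here against a made-up sample of degraded output (on a live system you would read the real /proc/mdstat instead):

```shell
# Made-up sample of a degraded /proc/mdstat after failing sdb2 out of md1;
# on a real system you would use: mdstat=$(cat /proc/mdstat)
mdstat='md1 : active raid1 sda2[0]
      10474496 blocks [2/1] [U_]'

# "[UU]" means both mirror halves are active; an underscore marks a missing member
case "$mdstat" in
  *'[UU]'*)            state=healthy ;;
  *'[U_]'* | *'[_U]'*) state=degraded ;;
  *)                   state=unknown ;;
esac
echo "md1 is $state"
```

The [2/1] count tells the same story: two members expected, one present.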
mdadm --zero-superblock /dev/sda1
mdadm --zero-superblock /dev/sda2

Now reboot into the system as normal.
ls /
will show that we still don't have an UPGRADE file, so we are on the old system as desired.
cat /proc/mdstat
will show that we are still using only sdb for the arrays.

So we can now add the A drives back in:
mdadm --manage /dev/md0 --add /dev/sda1
mdadm --manage /dev/md1 --add /dev/sda2

This will do a full rebuild onto the A drives (check progress with watch cat /proc/mdstat).

Do we now need to redo the grub installation? I'm not sure, but we probably should to be safe: grub install etc.

Now one final reboot to ensure we are all done.
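On the grub question: to stay bootable no matter which half of the mirror survives, the bootloader needs to be on both disks, so reinstalling it after a rebuild seems sensible. A hedged sketch of that step (assuming Fedora 11 / grub legacy, where grub-install takes a disk device); it runs in dry-run mode here and only prints what it would do:

```shell
# Reinstall the bootloader on both halves of the mirror so either disk
# can boot alone. DRY_RUN=1 (the default here) only prints the commands;
# set DRY_RUN=0 on a real system, as root.
DRY_RUN=${DRY_RUN:-1}
planned=""
for disk in /dev/sda /dev/sdb; do
  if [ "$DRY_RUN" -eq 1 ]; then
    echo "would run: grub-install $disk"
    planned="$planned$disk "
  else
    grub-install "$disk"
  fi
done
```

The disk names /dev/sda and /dev/sdb match the setup in this thread; adjust them for your own layout.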
I'm still not confident enough to try it on my real system as yet, as I need to check out a few things. What exactly does mdadm --zero-superblock /dev/sda1 do? Does this totally destroy the data on that mirror? I'd prefer to keep it intact but unused at that point, in case anything went wrong, but without doing this it seems to treat the A drive as the master and we end up back with the upgraded system we don't want. It also seems a bit brute-force to have to actually drop the md volume and recreate it. At this point I was hoping there was a 'rebuild using A' option instead. Has anyone who knows mdadm got any comments, or a better/safer way of doing this? Thanks
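For what it's worth, --zero-superblock only erases mdadm's own metadata block on that partition (a version 0.90 superblock, as Fedora 11 would create, lives near the end of the device); the filesystem data itself is left alone, but the partition will no longer auto-assemble into an array. Before zeroing, it's worth recording what the superblock says with mdadm --examine /dev/sda1. A sketch of pulling the array UUID out of that output, run here against a made-up, abridged sample (the UUID below is hypothetical):

```shell
# Abridged, made-up sample of `mdadm --examine /dev/sda1` output;
# on a real system: examine=$(mdadm --examine /dev/sda1)
examine='/dev/sda1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 3b2f7c2e:0d1a9c44:5e8b1f09:aa42c7de
     Raid Level : raid1'

# Split on " : " (colon with surrounding spaces) so the colons inside
# the UUID itself are left intact
uuid=$(printf '%s\n' "$examine" | awk -F' : ' '/UUID/ {print $2}')
echo "array UUID on this member: $uuid"
```

Members of the same array share this UUID, so comparing it across sda1 and sdb1 shows whether a partition still "belongs" to the array before you decide to zero it.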
I did something similar when I moved a server to another datacenter and wanted minimal downtime. In my scenario I had two servers with identical hardware, so I removed one of the raid disks from the working server, added it to the identical server, and later re-added the second disk.

First, be sure that you have grub installed on both drives. How to do that is described here: http://howtoforge.com/forums/showthread.php?p=80473#post80473

Now you can power down your machine and remove one of the disks. On the remaining one you do the upgrades. If it works, delete the partitions on the removed drive and poke it back into the raid. If the upgrades mess up your server, simply swap the drives, delete the partitions on the messed-up disk, and add it back into the raid. The best way to do that is described here: http://www.howtoforge.com/replacing_hard_disks_in_a_raid1_array
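The replacing_hard_disks_in_a_raid1_array howto linked above copies the partition table from the surviving disk onto the fresh one with sfdisk before re-adding it. A sketch of that step, with a sanity check run here against a made-up sample dump (the start/size numbers are illustrative only) instead of a live disk:

```shell
# The actual copy from the linked howto is (as root, with sdb the blank disk):
#   sfdisk -d /dev/sda | sfdisk /dev/sdb
# Before re-adding, it's worth confirming both partitions in the dump are
# Id=fd (Linux raid autodetect) so the kernel and mdadm will pick them up.
# Made-up sample of `sfdisk -d /dev/sda` output:
dump='/dev/sda1 : start=       63, size=  1060227, Id=fd
/dev/sda2 : start=  1060290, size= 20948832, Id=fd'

raid_parts=$(printf '%s\n' "$dump" | grep -c 'Id=fd')
echo "raid-type partitions in dump: $raid_parts"
```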
Thanks for the reply, lano. That's pretty much what I've done then, I think. The only (and main) difference is that I want to do this regularly, so I really want to do it all with mdadm or through the command line rather than by physically unplugging drives. Otherwise I will need to keep opening cases etc., and my computers aren't that accessible. It looks like I'm getting there; I just need a bit more confidence in exactly what everything is doing. It's one thing playing with VMware, quite another with real data!
Hi, I need to upgrade (better: reinstall) a hosted server, so I tried such a solution on my own. The alternative is running two hosted servers for a month, which would cost me about 300+ EUR. Has anyone tested whether this method works in production? Sure, I know ... it needs testing on my own as well. Thanks in advance, best regards, Florian Lagg.