Hi,

I've been setting up a RAID1 (Ubuntu software RAID) fileserver on Ubuntu Server 12.04.5 with Samba. As a test I yanked drive 0 out to see if the other disk would keep working and take over (to test the RAID1). Then I shut the system down and installed the disk again.

The RAID still works, but when I check it with the commands below I get these funny results: md1's state shows "clean, degraded", and md1 also shows "removed" where "active sync /dev/sdb2" should have been.

The disk I "removed" (physically, not with a command) was disk 0, which holds part of both md0 and md1. On each disk I made a 2 GB partition for swap and a 495 GB ext4 (bootable) partition for the data. (Building a fileserver.)

I re-added the disk (physically, not with a command) and have read that it should re-sync itself:

".....Sometimes a disk can change to a faulty state even though there is nothing physically wrong with the drive. It is usually worthwhile to remove the drive from the array then re-add it. This will cause the drive to re-sync with the array. If the drive will not sync with the array, it is a good indication of hardware failure..."
https://help.ubuntu.com/12.04/serverguide/advanced-installation.html#raid-maintenance

Code:
# mdadm -D /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Sun Aug 24 06:03:45 2014
     Raid Level : raid1
     Array Size : 1950656 (1905.26 MiB 1997.47 MB)
  Used Dev Size : 1950656 (1905.26 MiB 1997.47 MB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Sun Aug 24 06:44:28 2014
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : whitley:0
           UUID : 712d496a:18bddce6:1248aafd:cab10a66
         Events : 17

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1

# mdadm -D /dev/md1
/dev/md1:
        Version : 1.2
  Creation Time : Sun Aug 24 06:04:08 2014
     Raid Level : raid1
     Array Size : 483267392 (460.88 GiB 494.87 GB)
  Used Dev Size : 483267392 (460.88 GiB 494.87 GB)
   Raid Devices : 2
  Total Devices : 1
    Persistence : Superblock is persistent

    Update Time : Sun Aug 24 11:11:11 2014
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : whitley:1
           UUID : 7e1f4b1b:d9351351:7ee5ff14:5bf84c84
         Events : 158

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       0        0        1      removed

Code:
# watch -n1 cat /proc/mdstat
Every 1.0s: cat /proc/mdstat                    Sun Aug 24 11:32:17 2014

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sda1[0] sdb1[1]
      1950656 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sda2[0]
      483267392 blocks super 1.2 [2/1] [U_]

unused devices: <none>

Hope you can help me. If you need more info, please don't hesitate to say so, and I will provide it.

NB! The sync was completed.
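For anyone else hitting the same symptom: before adding the member back with a command, it can be worth checking whether the re-inserted partition still carries an md superblock and how its event counter compares with the live array. This is just a minimal sketch, assuming the re-inserted member is /dev/sdb2 and the degraded array is /dev/md1; substitute your own device names.

Code:
# Show the md superblock on the re-inserted partition (if one is present);
# its "Events" counter can be compared with the one the array reports.
sudo mdadm --examine /dev/sdb2

# The same array from the kernel's point of view; the missing slot shows "removed".
sudo mdadm --detail /dev/md1

# Quick overview of all arrays; a degraded mirror shows [U_] instead of [UU].
cat /proc/mdstat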
Solved the problem. The md array needed to be rebuilt.

Hi,

OK, I solved it myself. I had to read a bit, but then I found the solution.

I think this is what caused it: I did an unsafe ejection of the disk drive (disk 0), and that degraded the array.

Using the commands

Code:
# mdadm -D /dev/md0

and

Code:
# mdadm -D /dev/md1

I could see that it was md1 we needed to fix, and that sda2 was all right, which meant that sdb2 had to be the one missing.

The Server documentation says:

Code:
If a disk fails and needs to be removed from an array enter:

sudo mdadm --remove /dev/md0 /dev/sda1

Change /dev/md0 and /dev/sda1 to the appropriate RAID device and disk.

Similarly, to add a new disk:

sudo mdadm --add /dev/md0 /dev/sda1

I knew sdb2 was missing, but I tried the remove command first anyway:

Code:
mdadm --remove /dev/md1 /dev/sdb2

And I got an error because it wasn't there in the first place. I expected that!

Then I added the missing member with this command:

Code:
mdadm --add /dev/md1 /dev/sdb2

since md1 is the degraded array and sdb2 is the member it is missing.

After doing this, here's how it looks. Look at the last line: rebuilding... Nice word to see right now:

Code:
# mdadm -D /dev/md1
/dev/md1:
        Version : 1.2
  Creation Time : Sun Aug 24 06:04:08 2014
     Raid Level : raid1
     Array Size : 483267392 (460.88 GiB 494.87 GB)
  Used Dev Size : 483267392 (460.88 GiB 494.87 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Sun Aug 24 12:07:44 2014
          State : clean, degraded, recovering
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

 Rebuild Status : 0% complete

           Name : whitley:1
           UUID : 7e1f4b1b:d9351351:7ee5ff14:5bf84c84
         Events : 260

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       2       8       18        1      spare rebuilding   /dev/sdb2

And after some time the rebuild is complete and everything works perfectly again:

Code:
# mdadm -D /dev/md1
/dev/md1:
        Version : 1.2
  Creation Time : Sun Aug 24 06:04:08 2014
     Raid Level : raid1
     Array Size : 483267392 (460.88 GiB 494.87 GB)
  Used Dev Size : 483267392 (460.88 GiB 494.87 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Sun Aug 24 15:17:41 2014
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : whitley:1
           UUID : 7e1f4b1b:d9351351:7ee5ff14:5bf84c84
         Events : 485

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       2       8       18        1      active sync   /dev/sdb2

Fact: when the Server documentation says "...It is usually worthwhile to remove the drive from the array then re-add it...", it means a "remove" command and an "add" command, not just physically removing the drive and plugging it back in.
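For readers landing here later, the whole fix condenses to a few commands. This is a sketch assuming md1 is the degraded array and /dev/sdb2 is the member that went missing; substitute your own device names. The --re-add step is something I'm adding here as an option, not something used in this thread: if the member's metadata still matches the array, mdadm may accept it straight back, otherwise a plain --add (as used above) triggers a full rebuild.

Code:
# Confirm which array is degraded and which slot is empty.
cat /proc/mdstat               # a degraded mirror shows [U_] instead of [UU]
sudo mdadm --detail /dev/md1   # the missing member is listed as "removed"

# Try --re-add first; if it is refused, fall back to --add for a full rebuild.
sudo mdadm /dev/md1 --re-add /dev/sdb2 || sudo mdadm /dev/md1 --add /dev/sdb2

# Watch the rebuild until the array shows [UU] again.
watch -n1 cat /proc/mdstat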
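As an optional follow-up (my own suggestion, not from the serverguide page quoted above): adding a write-intent bitmap to the array means that a member which briefly disappears and comes back can usually be re-synced from the bitmap instead of needing the multi-hour full rebuild shown above.

Code:
# Add an internal write-intent bitmap to md1 (it can be removed again with --bitmap=none).
sudo mdadm --grow /dev/md1 --bitmap=internal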
Dan,

Thanks a lot for this! Your solution saved me a lot of hours and grief!

I just had to replace a failing disk in my Synology DS413, and after more than 90% of the rebuild the disk group crashed/degraded again without a proper error, with all disks reporting "normal". After reading your post I identified the failing RAID array (/dev/md3 in my case) and the missing volume in it, did the re-add, and it started to re-sync again. Hope it will finish now without further hiccups.

Thanks a lot,
Kerry