RAID1 - State : clean, degraded

Discussion in 'Installation/Configuration' started by danhansen@denmark, Aug 24, 2014.

  1. danhansen@denmark

    danhansen@denmark Member HowtoForge Supporter

    Hi,

    I've been setting up a RAID1 array (Ubuntu software RAID) for a file server running Ubuntu Server 12.04.5 with Samba.

    I ran a test where I yanked drive 0 out to see if the other disk would keep working / take over (to test the RAID1). Then I shut the system down and installed the disk again. The RAID still works, but when I check it with the commands below I get these odd results: md1's state shows "clean, degraded" !?!?, and md1 also shows "removed" where "active sync /dev/sdb2" should have been.
    The disk I "removed" (physically, not with a command) was disk 0, which holds part of both md0 and md1. I made a 2 GB swap partition and a 495 GB ext4 (bootable) data partition on each disk, i.e. 2 x 2 GB and 2 x 495 GB in total. (Building a file server.)

    I re-added the disk (physically, not with a command) and had read that it should re-sync itself:
    ".....Sometimes a disk can change to a faulty state even though there is nothing physically wrong with the drive. It is usually worthwhile to remove the drive from the array then re-add it. This will cause the drive to re-sync with the array. If the drive will not sync with the array, it is a good indication of hardware failure..."
    https://help.ubuntu.com/12.04/serverguide/advanced-installation.html#raid-maintenance
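
    (As it turns out further down in this thread, the "remove then re-add" the guide describes is done with mdadm commands, not by unplugging the disk and plugging it back in. Roughly like this, using my device names as an example; I had not run any of these at this point:)
    Code:
    # mark the member as failed (if it is still listed as active),
    # take it out of the array, then add it back so it re-syncs
    mdadm /dev/md1 --fail /dev/sdb2
    mdadm /dev/md1 --remove /dev/sdb2
    mdadm /dev/md1 --add /dev/sdb2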

    Code:
    # mdadm -D /dev/md0
    /dev/md0:
            Version : 1.2
      Creation Time : Sun Aug 24 06:03:45 2014
         Raid Level : raid1
         Array Size : 1950656 (1905.26 MiB 1997.47 MB)
      Used Dev Size : 1950656 (1905.26 MiB 1997.47 MB)
       Raid Devices : 2
      Total Devices : 2
        Persistence : Superblock is persistent
    
        Update Time : Sun Aug 24 06:44:28 2014
              State : clean
     Active Devices : 2
    Working Devices : 2
     Failed Devices : 0
      Spare Devices : 0
    
               Name : whitley:0
               UUID : 712d496a:18bddce6:1248aafd:cab10a66
             Events : 17
    
        Number   Major   Minor   RaidDevice State
           0       8        1        0      active sync   /dev/sda1
           1       8       17        1      active sync   /dev/sdb1
    
    
    
    # mdadm -D /dev/md1
    /dev/md1:
            Version : 1.2
      Creation Time : Sun Aug 24 06:04:08 2014
         Raid Level : raid1
         Array Size : 483267392 (460.88 GiB 494.87 GB)
      Used Dev Size : 483267392 (460.88 GiB 494.87 GB)
       Raid Devices : 2
      Total Devices : 1
        Persistence : Superblock is persistent
    
        Update Time : Sun Aug 24 11:11:11 2014
              State : clean, degraded
     Active Devices : 1
    Working Devices : 1
     Failed Devices : 0
      Spare Devices : 0
    
               Name : whitley:1
               UUID : 7e1f4b1b:d9351351:7ee5ff14:5bf84c84
             Events : 158
    
        Number   Major   Minor   RaidDevice State
           0       8        2        0      active sync   /dev/sda2
           1       0        0        1      removed
    
    Code:
    # watch -n1 cat /proc/mdstat
    
    Every 1.0s: cat /proc/mdstat                            Sun Aug 24 11:32:17 2014
    
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md0 : active raid1 sda1[0] sdb1[1]
          1950656 blocks super 1.2 [2/2] [UU]
    
    md1 : active raid1 sda2[0]
          483267392 blocks super 1.2 [2/1] [U_]
    
    unused devices: <none>
    

    Hope you can help me. If you need more info, please don't hesitate to say so, and I will provide it for you ;)

    NB! The sync was completed ;)

  2. danhansen@denmark

    danhansen@denmark Member HowtoForge Supporter

    Solved the problem. The array needed to be rebuilt ;)

    Hi,


    OK, I solved it myself ;) Had to read a bit, but then I found the solution.

    I think this is what caused it: I did an unsafe ejection of the disk drive (disk 0), which left the array degraded.

    Using the command
    Code:
    # mdadm -D /dev/md0
    and
    Code:
    # mdadm -D /dev/md1
    I could see that it was md1 that needed fixing, and that sda2 was all right, which meant sdb2 had to be the missing one.
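    If you want to double-check which partition is the missing member, you can also look directly at the md superblocks on the component partitions (just a quick sketch with my device names; --examine reads the RAID superblock of a single partition):
    Code:
    # inspect the md superblock on the partition that should be part of md1
    mdadm --examine /dev/sdb2
    # and compare with the healthy member
    mdadm --examine /dev/sda2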
    The server documentation says:

    Code:
    If a disk fails and needs to be removed from an array enter:
    sudo mdadm --remove /dev/md0 /dev/sda1
    
    Change /dev/md0 and /dev/sda1 to the appropriate RAID device and disk.
    
    Similarly, to add a new disk:
    sudo mdadm --add /dev/md0 /dev/sda1
    I knew sdb2 was missing, but I tried the remove command first anyway:
    Code:
    mdadm --remove /dev/md1 /dev/sdb2
    And I got an error, because it wasn't there in the first place. I expected that!
    Then I tried to add the missing partition back using this command:
    Code:
    mdadm --add /dev/md1 /dev/sdb2
    because md1 is the array that is missing a member, and sdb2 is the partition that is missing.
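    While the rebuild runs, you can follow the progress with the same watch command as earlier:
    Code:
    # recovery progress shows up as a percentage in /proc/mdstat
    watch -n1 cat /proc/mdstat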

    After doing this, here's how it looks. Look at the last line: rebuilding... a nice word to see right now:
    Code:
    # mdadm -D /dev/md1
    /dev/md1:
            Version : 1.2
      Creation Time : Sun Aug 24 06:04:08 2014
         Raid Level : raid1
         Array Size : 483267392 (460.88 GiB 494.87 GB)
      Used Dev Size : 483267392 (460.88 GiB 494.87 GB)
       Raid Devices : 2
      Total Devices : 2
        Persistence : Superblock is persistent
    
        Update Time : Sun Aug 24 12:07:44 2014
              State : clean, degraded, recovering
     Active Devices : 1
    Working Devices : 2
     Failed Devices : 0
      Spare Devices : 1
    
     Rebuild Status : 0% complete
    
               Name : whitley:1
               UUID : 7e1f4b1b:d9351351:7ee5ff14:5bf84c84
             Events : 260
    
        Number   Major   Minor   RaidDevice State
           0       8        2        0      active sync   /dev/sda2
           2       8       18        1      spare rebuilding   /dev/sdb2
    And after some time, the rebuild is complete and it works perfectly again:
    Code:
    # mdadm -D /dev/md1
    /dev/md1:
            Version : 1.2
      Creation Time : Sun Aug 24 06:04:08 2014
         Raid Level : raid1
         Array Size : 483267392 (460.88 GiB 494.87 GB)
      Used Dev Size : 483267392 (460.88 GiB 494.87 GB)
       Raid Devices : 2
      Total Devices : 2
        Persistence : Superblock is persistent
    
        Update Time : Sun Aug 24 15:17:41 2014
              State : clean
     Active Devices : 2
    Working Devices : 2
     Failed Devices : 0
      Spare Devices : 0
    
               Name : whitley:1
               UUID : 7e1f4b1b:d9351351:7ee5ff14:5bf84c84
             Events : 485
    
        Number   Major   Minor   RaidDevice State
           0       8        2        0      active sync   /dev/sda2
           2       8       18        1      active sync   /dev/sdb2
    

    Fact: when the server documentation says "...It is usually worthwhile to remove the drive from the array then re-add it...", it means a "remove" command and an "add" command, not just pulling the drive out and plugging it back in ;)
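
    A side note I have not tested here: mdadm also has a --re-add option, which tries to slot the old member back into its previous place; if the array has a write-intent bitmap, only the blocks that changed get resynced, so it can be much faster than a full --add:
    Code:
    # untested alternative for a disk that was only briefly out of the array
    mdadm /dev/md1 --re-add /dev/sdb2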

  3. KerryXEX

    KerryXEX New Member

    Dan,
    Thanks a lot for this! Your solution saved me a lot of hours and grief!
    I just had to replace a failing disk in my Synology DS413, and after more than 90% of the rebuild the disk group crashed/degraded again without a proper error, with all disks reporting "normal".
    After reading your post I identified the failing RAID array (/dev/md3 in my case) and the missing volume in it. I just did the re-add and it started to re-sync again.
    Hope it will finish now without further hiccups.

    Thanks a lot
    Kerry
     
