Badblocks in LVM

Discussion in 'Installation/Configuration' started by linuxidiot, Sep 9, 2008.

  1. linuxidiot

    linuxidiot New Member

    hi

    I have an LVM setup with 4 hard disks, 2 TB in total, using ReiserFS as the file system. Now to the problem.

    One of the hard disks has around 130 bad blocks. I ran badblocks against that disk, /dev/sda, and it wrote the numbers of the damaged blocks to a file.

    What is the best way to fix this? I have a huge database running on that server, and it refuses to start because of this issue. The bad blocks are on the sda4 partition in my case.

    After a little searching I found that if the problem is on a single hard drive with an ext2 file system, then

    fsck -t ext2 -l badblocks-logfile /dev/sda4

    should fix the issue.

    If it is ReiserFS, then

    reiserfsck -B badblocks-logfile /dev/sda4

    Should I do the same with LVM, and will it work? My worry is that running the commands above might disturb the current LVM setup. Is there a better way to do it?

    I also came across another document, but I could not get past its first step: http://smartmontools.sourceforge.net/BadBlockHowTo.txt

    Any help would be much appreciated.
     
  2. falko

    falko Super Moderator Howtoforge Staff

    Did you try the part starting with "From: Frederic BOITEUX" on that page?
     
  3. linuxidiot

    linuxidiot New Member

    I didn't try it, because smartctl failed when I started at the beginning of the page, so I left it.

    I will try from that part and get back to you.

    Thank you for your reply.
     
  4. linuxidiot

    linuxidiot New Member

    I carefully followed each step in the part by Frederic BOITEUX at http://smartmontools.sourceforge.net/BadBlockHowTo.txt, and here are the results:

    Step 1
    smartctl -l selftest /dev/sda

    === START OF READ SMART DATA SECTION ===
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short offline       Completed: read failure       90%        13365         227328439

    Step 2

    Disk /dev/sda: 60801 cylinders, 255 heads, 63 sectors/track
    Units = sectors of 512 bytes, counting from 0

    Device     Boot     Start        End   #sectors  Id  System
    /dev/sda1  *           63     401624     401562  83  Linux
    /dev/sda2          401625   12691349   12289725  82  Linux swap / Solaris
    /dev/sda3        12691350   24981074   12289725  83  Linux
    /dev/sda4        24981075  976768064  951786990  8e  Linux LVM

    Step 3

    (227328439 - 24981075) = 202347364
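
The subtraction in Step 3 can be sketched in Python. The partition table is copied from Step 2; the function name `locate_lba` is hypothetical, and the sketch assumes 512-byte sectors counted from 0, as that table states.

```python
# Partition table from Step 2: (name, first sector, last sector), 512-byte sectors.
PARTITIONS = [
    ("/dev/sda1", 63, 401624),
    ("/dev/sda2", 401625, 12691349),
    ("/dev/sda3", 12691350, 24981074),
    ("/dev/sda4", 24981075, 976768064),
]


def locate_lba(lba, partitions=PARTITIONS):
    """Return (partition name, sector offset inside it) for a whole-disk LBA."""
    for name, start, end in partitions:
        if start <= lba <= end:
            return name, lba - start
    raise ValueError(f"LBA {lba} is not inside any listed partition")


# LBA_of_first_error from the self-test log in Step 1.
name, offset = locate_lba(227328439)
print(name, offset)  # /dev/sda4 202347364
```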

    Step 4

    pvdisplay -c /dev/sda4 | awk -F: '{print $8}'
    4096

    pvdisplay reports the PE size in KB. To express it in LBA blocks (512 bytes, or 0.5 KB), multiply by 2: 4096 * 2 = 8192 blocks for each PE.

    The offset of the first PE from the start of the partition (pe_start, in 512-byte sectors) can be found in /etc/lvm/backup:
    # grep pe_start $(grep -l /dev/sda4 /etc/lvm/backup/*)
    pe_start = 384

    Step 5
    Next, find which PE contains the bad block by computing the PE number
    of the faulty block within the partition:
    bad block number within the partition / sizeof(PE) =

    202347364 / 8192 = 24700.6059
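
Steps 4 and 5 can be sketched as follows. The function names are hypothetical. The sketch also shows the variant that first subtracts pe_start, on the assumption that PE 0 begins pe_start sectors into the partition; for this bad block both variants land in the same PE.

```python
SECTOR_BYTES = 512


def pe_size_sectors(pe_size_kb):
    """Convert the PE size reported by pvdisplay (in KB) to 512-byte sectors."""
    return pe_size_kb * 1024 // SECTOR_BYTES


def pe_index(partition_sector, pe_sectors, pe_start=0):
    """Return (PE number, sector offset inside that PE).

    pe_start is the number of sectors that precede PE 0 on the partition.
    """
    return divmod(partition_sector - pe_start, pe_sectors)


pe_sectors = pe_size_sectors(4096)
print(pe_sectors)                                     # 8192
print(pe_index(202347364, pe_sectors))                # (24700, 4964)
print(pe_index(202347364, pe_sectors, pe_start=384))  # (24700, 4580)
```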


    Step 6

    server# lvdisplay --maps
    --- Logical volume ---
    LV Name /dev/vg/lv
    VG Name vg
    LV UUID zkcUSw-Dpum-aIXr-jE2Y-ob8Z-eqlF-AKQWnP
    LV Write Access read/write
    LV Status available
    # open 2
    LV Size 1.81 TB
    Current LE 473886
    Segments 4
    Allocation inherit
    Read ahead sectors 0
    Block device 253:0

    --- Segments ---
    Logical extent 0 to 119233:
    Type linear
    Physical volume /dev/sdb1
    Physical extents 0 to 119233

    Logical extent 119234 to 238467:
    Type linear
    Physical volume /dev/sdd1
    Physical extents 0 to 119233

    Logical extent 238468 to 354651:
    Type linear
    Physical volume /dev/sda4
    Physical extents 0 to 116183

    Logical extent 354652 to 473885:
    Type linear
    Physical volume /dev/sde1
    Physical extents 0 to 119233
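
The segment list above can be used to look up which logical extent a given physical extent belongs to. This is a minimal sketch, assuming linear segments in which logical and physical extents advance together; the name `pe_to_le` is hypothetical and the tuples are copied from the lvdisplay output.

```python
# Segments from lvdisplay --maps: (first LE, last LE, PV, first PE, last PE).
SEGMENTS = [
    (0, 119233, "/dev/sdb1", 0, 119233),
    (119234, 238467, "/dev/sdd1", 0, 119233),
    (238468, 354651, "/dev/sda4", 0, 116183),
    (354652, 473885, "/dev/sde1", 0, 119233),
]


def pe_to_le(pv, pe, segments=SEGMENTS):
    """Map a physical extent on a PV to the logical extent of the LV."""
    for le_first, _le_last, seg_pv, pe_first, pe_last in segments:
        if seg_pv == pv and pe_first <= pe <= pe_last:
            return le_first + (pe - pe_first)
    raise ValueError(f"PE {pe} on {pv} is not mapped by any segment")


# PE 24700 is the whole-number part of the Step 5 result.
print(pe_to_le("/dev/sda4", 24700))  # 263168
```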


    Step 7

    * bad block number for the filesystem :
    ---------------------------------------

    On /dev/sda4 the segment's physical extents start at 0, so the segment begins at this sector of the partition:

    (0 * 8192) + 384 = 384

    fs block = (202347364 - 384) / (sizeof(fs block) / 512)
             = 202346980 / (4096 / 512)
             = 202346980 / 8
             = 25293372.5

    As the lvdisplay output shows, the physical extents on every disk start from 0, so the formula based on the physical extent does not work for me. I need a formula that gives the right block using the logical extent range of the /dev/sda4 segment, 238468 to 354651. The PE number from Step 5 (24700) does fall inside that segment's physical extent range (0 to 116183). I tried the Step 7 formula with the logical extent values instead, but the result is negative.

    Logical extent 238468 to 354651:

    (238468 * 8192) + 384 = 1953529856 + 384 = 1953530240

    (202347364 - 1953530240) / 8 = -1751182876 / 8 = -218897859.5
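
A possible explanation for the negative result, offered as a sketch rather than a confirmed fix: the Step 7 formula appears to give a position relative to the start of the LV segment that lives on /dev/sda4, whereas a read from /dev/vg/lv counts from logical extent 0. Assuming linear segments are laid end to end in logical extent order (which the lvdisplay --maps output suggests), the segment's first logical extent is added to that offset instead of being subtracted from it. The function name `lv_fs_block` and its parameters are hypothetical; all numbers come from Steps 3 to 7.

```python
def lv_fs_block(partition_sector, pe_start, pe_sectors,
                seg_first_pe, seg_first_le, fs_block_bytes=4096):
    """Return (fs block number on the LV, sector offset inside that block).

    Assumes a linear segment: logical and physical extents advance together.
    """
    sectors_per_fs_block = fs_block_bytes // 512
    # First sector of the segment, counted from the start of the partition.
    seg_start_on_pv = seg_first_pe * pe_sectors + pe_start
    # Position of the bad sector inside the segment.
    offset_in_segment = partition_sector - seg_start_on_pv
    # Position of the bad sector counted from the start of the LV.
    lv_sector = seg_first_le * pe_sectors + offset_in_segment
    return divmod(lv_sector, sectors_per_fs_block)


block, sector_in_block = lv_fs_block(
    partition_sector=202347364,  # Step 3
    pe_start=384,                # Step 4
    pe_sectors=8192,             # Step 4
    seg_first_pe=0,              # /dev/sda4 segment, Step 6
    seg_first_le=238468,         # /dev/sda4 segment, Step 6
)
print(block, sector_in_block)  # 269484604 4
```

Under these assumptions the block to try with dd in Step 8 would be 269484604 rather than 25293372. The remainder of 4 sectors places the bad sector in the second half of that 4096-byte block.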


    Step 8

    * Test of the fs bad block :

    dd if=/dev/vg/lv of=block25293372 bs=4096 count=1 skip=25293372

    This read completes without error, which suggests the calculated block number is wrong for the LV.
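
The same single-block read can be sketched in Python, where a failing read surfaces as an OSError. The function name `read_fs_block` is hypothetical, and the demonstration runs on a scratch file rather than on /dev/vg/lv so that it is safe to try anywhere.

```python
import os
import tempfile


def read_fs_block(path, block, block_bytes=4096):
    """Read one block at the given block index; a read failure raises OSError."""
    with open(path, "rb", buffering=0) as dev:
        dev.seek(block * block_bytes)
        return dev.read(block_bytes)


# Scratch file with two 4096-byte blocks, standing in for the real device.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"A" * 4096 + b"B" * 4096)
    scratch = tmp.name

try:
    data = read_fs_block(scratch, 1)
    print(len(data), data[:1])  # 4096 b'B'
finally:
    os.unlink(scratch)
```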

    # smartctl -A /dev/sda
    smartctl version 5.37 [i386-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
    Home page is http://smartmontools.sourceforge.net/

    === START OF READ SMART DATA SECTION ===
    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always       -       0
      3 Spin_Up_Time            0x0003   218   217   021    Pre-fail  Always       -       6100
      4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       98
      5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
      7 Seek_Error_Rate         0x000f   200   200   051    Pre-fail  Always       -       0
      9 Power_On_Hours          0x0032   081   081   000    Old_age   Always       -       14034
     10 Spin_Retry_Count        0x0013   100   253   051    Pre-fail  Always       -       0
     11 Calibration_Retry_Count 0x0012   100   253   051    Old_age   Always       -       0
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       97
    194 Temperature_Celsius     0x0022   253   253   000    Old_age   Always       -       44
    196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
    197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       3
    198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline      -       1
    199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
    200 Multi_Zone_Error_Rate   0x0009   200   200   051    Pre-fail  Offline      -       0

    The output above shows 3 currently pending sectors and 1 offline uncorrectable sector, while badblocks reports around 100 bad blocks on /dev/sda.
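
The two counts can be pulled out of the attribute table with a short parser. This is a sketch that assumes the ten-column layout shown above, with RAW_VALUE as the tenth field; the function name `raw_values` is hypothetical and the sample rows are copied from the smartctl -A output.

```python
SAMPLE = """\
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
197 Current_Pending_Sector 0x0012 200 200 000 Old_age Always - 3
198 Offline_Uncorrectable 0x0010 200 200 000 Old_age Offline - 1
"""


def raw_values(smartctl_output):
    """Map attribute name to its RAW_VALUE string for each attribute row."""
    values = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row starts with "ID#" and is skipped by the isdigit test.
        if len(fields) >= 10 and fields[0].isdigit():
            values[fields[1]] = fields[9]
    return values


raw = raw_values(SAMPLE)
print(int(raw["Current_Pending_Sector"]), int(raw["Offline_Uncorrectable"]))  # 3 1
```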

    Mutt also has two mails from smartd, quoted below, which confirm that /dev/sda has bad blocks.

    The following warning/error was logged by the smartd daemon:

    Device: /dev/sda, 3 Currently unreadable (pending) sectors

    For details see host's SYSLOG (default: /var/log/messages).

    The following warning/error was logged by the smartd daemon:

    Device: /dev/sda, 1 Offline uncorrectable sectors

    For details see host's SYSLOG (default: /var/log/messages).



    Apart from the steps above, I ran reiserfsck --check /dev/vg/lv, which reported no corruption.

    debugreiserfs -B badblock.log /dev/vg/lv reported no blocks currently marked bad in the file system. This step is described at http://chichkin_i.zelnet.ru/bad-block-handling.html

    I really don't know how to fix this. Any help would be much appreciated.



    Thanks.

    Mohan.
     
    Last edited: Oct 8, 2008
