Backup directory /var/backup could not be unmounted

Discussion in 'Installation/Configuration' started by Stelios, Nov 24, 2020.

  1. Stelios

    Stelios Active Member HowtoForge Supporter

    Hi all,
    I've got 10 servers that all take backups in exactly the same way.
    They all mount the backup storage over ssh.
    Only on two of them, I'm getting the following warning every day after the backup runs:
    Backup directory /var/backup could not be unmounted
    The command
    /usr/local/ispconfig/server/scripts/backup_dir_umount.sh
    failed.


    Under server config I have selected the option that the backup directory is a remote mount. My fstab entry works fine, as I can mount and umount manually without a problem.
    This is the specific entry in fstab just for reference:
    [email protected]:/mnt/raid5/email1 /var/backup fuse.sshfs defaults 0 0

    This is what I got in my backup_dir_umount.sh

    Code:
    root@email1:~# cat /usr/local/ispconfig/server/scripts/backup_dir_umount.sh
    #!/bin/bash
    umount /var/backup
    Running backup_dir_mount.sh manually mounts the remote storage fine, and backup_dir_umount.sh unmounts it. It just throws that error in my email every day, but backups are stored fine on it.
    How on earth can I see why it throws that error?
    ISPconfig is 3.2

    Thanks

    PS: By the way, when will 3.2.1 be available for download?
     
  2. Taleman

    Taleman Well-Known Member HowtoForge Supporter

    When it is ready.
    You can follow developer info to find out more exactly. My guess is before end of month.
    Back to your problem.
    What kind of setup is this? What is in the script file backup_dir_mount.sh ?
    umount usually fails because some process is still using that mount as its working directory. Try man lsof and something like
    Code:
    lsof /var/backup
    before the umount to see which processes have which files open.
     
  3. Stelios

    Stelios Active Member HowtoForge Supporter

    The backup_dir_mount.sh just mounts the directory and all it has is:
    Code:
    #!/bin/bash
    if ! mount | grep -q backup; then
        mount /var/backup || echo "Couldn't mount" && exit
    fi
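A side note on the script above: in bash, `a || b && c` groups as `(a || b) && c`, so `exit` runs whether or not the mount succeeded, and it exits with status 0 in both cases. Also, `grep -q backup` matches any line of `mount` output that contains "backup", not only /var/backup. A minimal sketch of a stricter variant, assuming `mountpoint` from util-linux is installed; `ensure_mounted` is an illustrative name, not something ISPConfig provides:

```shell
#!/bin/bash
# Sketch only: ensure_mounted is an illustrative helper, not part of ISPConfig.

# Mount the directory via its fstab entry unless it is already a mount point.
# Returns 0 if the directory is mounted afterwards, 1 if mounting failed.
ensure_mounted() {
    local dir="$1"
    if mountpoint -q "$dir"; then
        return 0                          # already mounted, nothing to do
    fi
    if ! mount "$dir"; then
        echo "Couldn't mount $dir" >&2    # report on stderr
        return 1                          # and pass the failure to the caller
    fi
    return 0
}
```

Ending backup_dir_mount.sh with `ensure_mounted /var/backup` would make the script's exit status reflect whether the mount actually worked.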
    How do I know when the backup is about to finish, so that I can run lsof /var/backup?
    It runs at 3am and, as mentioned in my post, when I run the scripts manually, including the backup, no error is produced and it mounts/umounts fine.
     
  4. Jesse Norell

    Jesse Norell Well-Known Member Staff Member Howtoforge Staff

    Put the command in the unmount script.

    You could try adding a sleep command to wait a little before unmounting. Once you can identify what processes are using the mount, loop and wait on them to finish before unmounting.

    Also if you would, post here what you find, I'd be curious what is using the mount, if it's something from how the backup system now works, or just something local.
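The "loop and wait" idea could look roughly like this. It is only a sketch, assuming `lsof` and `mountpoint` are installed; `wait_until_idle`, `mount_in_use`, `MAX_WAIT` and `POLL` are made-up names and the timings are arbitrary:

```shell
#!/bin/bash
# Sketch only: names and timings are illustrative, not ISPConfig defaults.
BACKUP_DIR="${BACKUP_DIR:-/var/backup}"
MAX_WAIT="${MAX_WAIT:-3600}"    # give up after this many seconds
POLL="${POLL:-30}"              # seconds between checks

# Succeeds while lsof reports at least one open file on the mount.
# lsof errors also count as "not in use" here.
mount_in_use() {
    lsof "$1" >/dev/null 2>&1
}

# Poll until the mount is idle or the time budget is used up.
# Returns 0 once idle, 1 on timeout.
wait_until_idle() {
    local dir="$1" max_wait="$2" poll="$3" waited=0
    while mount_in_use "$dir"; do
        if [ "$waited" -ge "$max_wait" ]; then
            return 1
        fi
        sleep "$poll"
        waited=$((waited + poll))
    done
    return 0
}

# Do nothing when the directory is not mounted (for example on a repeated call).
if mountpoint -q "$BACKUP_DIR"; then
    if wait_until_idle "$BACKUP_DIR" "$MAX_WAIT" "$POLL"; then
        umount "$BACKUP_DIR"
    else
        echo "$BACKUP_DIR still busy after ${MAX_WAIT}s:" >&2
        lsof "$BACKUP_DIR" >&2
        false    # leave a non-zero status so the timeout is not hidden
    fi
fi
```

The timeout matters: if the process holding the mount is itself waiting for this script to return, the loop can never see the mount go idle and would otherwise wait forever.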
     
  5. Stelios

    Stelios Active Member HowtoForge Supporter

    @Jesse Norell I had sleep 5m and it was still the same. I will raise the value and also include the lsof to see what happens.
     
  6. Stelios

    Stelios Active Member HowtoForge Supporter

    @Jesse Norell I've added a sleep of 45m and it is still the same: the storage ends up unmounted, but the warning email is still sent.
    The functionality works as expected.
     
  7. Jesse Norell

    Jesse Norell Well-Known Member Staff Member Howtoforge Staff

    Just clarifying, with a 45min sleep time, the storage is unmounted, but you also get an email with the warning "Backup directory /var/backup could not be unmounted
    The command /usr/local/ispconfig/server/scripts/backup_dir_umount.sh failed.
    " ?

    Were you able to tell what processes were using it? (Via lsof, or even just browsing through ps output?)
     
  8. Stelios

    Stelios Active Member HowtoForge Supporter

    Yes @Jesse Norell, that is exactly what is happening. I changed the backup time yesterday to run earlier so I could monitor the process, but it didn't run. I guess I have to run the ISPConfig cron or something else before it picks up the new value?
    Once I switched to a later time it worked.
     
  9. Stelios

    Stelios Active Member HowtoForge Supporter

    I made some progress: on one of the servers that had been sending the email, I didn't receive it today. I logged in and saw that /var/backup was still mounted.

    Code:
    root@email1:~# lsof /var/backup
    COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    php     31042 root    5r   DIR   0,46     4096   84 /var/backup/mail6
    
    root@email1:~# ps ax | grep 31042
    14208 pts/0    S+     0:00 grep 31042
    31042 ?        S      0:01 /bin/php -q -d disable_classes= -d disable_functions= -d open_basedir= /usr/local/ispconfig/server/cron.php
    
     
  10. Stelios

    Stelios Active Member HowtoForge Supporter

    @Jesse Norell shall I report this to gitlab?
    For some reason the ISPConfig cron is still using the mount while there is nothing left to back up.
     
  11. Jesse Norell

    Jesse Norell Well-Known Member Staff Member Howtoforge Staff

    Would you have multiple backups set up to run at different times, where a later one fires off while an earlier one hasn't finished yet?
     
  12. Stelios

    Stelios Active Member HowtoForge Supporter

    Yes I do. I can change the times if you think that is causing the problem.
    Today one more problem appeared. I got the warning email from the email3 server, but when I logged in the remote path wasn't mounted and the backup had finished OK.
    On the email1 server, on the other hand, I got no email, but the remote storage is still mounted and held by the same process (cron), which looks like it is still running (it is not taking any backup, as that started 9 hours ago and took just an hour or so to finish).
    This is the output:

    Code:
    root@email1:~# lsof /var/backup
    COMMAND PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    php     318 root    5r   DIR   0,46     4096   84 /var/backup/mail6
    root@email1:~# ps ax | grep 318
      318 ?        S      0:00 /bin/php -q -d disable_classes= -d disable_functions= -d open_basedir= /usr/local/ispconfig/server/cron.php
    23133 pts/0    S+     0:00 grep 318
    root@email1:~# 
     
  13. Jesse Norell

    Jesse Norell Well-Known Member Staff Member Howtoforge Staff

    It would probably be worth filing an rfe in the issue tracker for the server to track ongoing backups and not run the unmount script if any other backups have not completed. For now, either change your script to only run if nothing is using the mount point (eg. based on lsof output), or just kill the error (add " 2>/dev/null") and the last running unmount should get the job done.
    Ie. if I read that correctly, a backup finished and 8 hours later the php process hasn't exited? See what shows up in both ispconfig.log and cron.log if you turn up the debug level in both server config and $conf['log_priority'] in /usr/local/ispconfig/server/lib/config.inc.php.

    Also see what other processes would be related, eg. "ps --ppid 318", or look up the process group id for the running php daemon ("ps -p 318 -o pgrp") and see what else is in that group ("ps -wg ####").
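The "only run if nothing is using the mount point" variant might be sketched as follows. It assumes `lsof` and `mountpoint` are installed and that ISPConfig judges the script by its exit status, which this thread does not confirm; `try_umount` is a made-up name:

```shell
#!/bin/bash
# Sketch only: try_umount and BACKUP_DIR are illustrative names.
BACKUP_DIR="${BACKUP_DIR:-/var/backup}"

# Unmount only when that can succeed; otherwise leave it to a later run.
# Prints what it decided. Returns 0 unless umount itself fails.
try_umount() {
    local dir="$1"
    if ! mountpoint -q "$dir"; then
        echo "not mounted"        # an earlier run already unmounted it
        return 0
    fi
    if lsof "$dir" >/dev/null 2>&1; then
        echo "busy, skipped"      # something still has files open there
        return 0
    fi
    umount "$dir" && echo "unmounted"
}

# Discard the status text; redirect it to a log file instead if that is useful.
try_umount "$BACKUP_DIR" >/dev/null
```

The trade-off is that a mount which stays busy after the last backup is left mounted without any warning, so this hides the symptom rather than explaining it.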
     
  14. Jesse Norell

    Jesse Norell Well-Known Member Staff Member Howtoforge Staff

  15. jeensg

    jeensg Member

    Hey there, especially Stelios,
    did you find out anything else?
    I do have the same problem and don't find a reason why this happens. Everything works as expected, every morning I have a look - backups are done and backup is unmounted ... but the error email is there.
    lsof /var/backup in /usr/local/ispconfig/server/scripts/backup_dir_umount.sh doesn't show anything (before umounting, yes).
    I'm not fully up to date with ISPConfig right now (3.2.5, will update soon), but I figure nothing has changed there since the bug report hasn't been closed.
    Thank you for your hints.
     
  16. jeensg

    jeensg Member

    Just checked right now, after the mail came again. The normal backup processes are still running; in particular I see gzip/mysqldump for all websites, one after another (webxxx). Did anybody follow up on this, or shall I ignore it by using the hint from Jesse Norell (add " 2>/dev/null")? If so, do I add it to the line "umount /var/backup"? Thanks again for your help.
     
  17. Stelios

    Stelios Active Member HowtoForge Supporter

    @jeensg sorry for the late reply.
    This is what I got in my backup_dir_umount.sh and seems to work ok.

    Code:
    #!/bin/bash
    sleep 2m
    umount /var/backup
     
  18. jeensg

    jeensg Member

    Thanks for your answer! Unfortunately this doesn't work for me. I tried exactly the same content, but it won't work.
    I also tried sleep 60m, but then the same email just arrives one hour later than normal, although nothing is using the mount point anymore. I don't get it.
    Also, the "lsof /var/backup" output doesn't appear in the mail, although the mount point is still shown as in use when I log in.

    Next thing I'll try is the "2>/dev/null" - but actually I'm not satisfied at all with that, because I will not get any error messages then :-(
     
  19. jeensg

    jeensg Member

    So, another try :) I put the following in my umount script file (yes, deliberately misspelling the last mount point):
    Code:
    #!/bin/bash
    sleep 2m
    lsof /var/backup
    echo test
    umount /var/backu
    This resulted in 2 (!) error-mails, the last one 8 minutes after the first one (and the mount-point is NOT unmounted, as expected). If I see it right, somehow the script is triggered twice, right? But why? Does anybody know how I could check this?
    btw: the lsof-command has no output in the email ... how can I produce this output?
     
  20. jeensg

    jeensg Member

    Next try, I'll stay on it :) I corrected the lsof part in the umount script so that I really get an email:
    Code:
    #!/bin/bash
    lsof /var/backup | mail -s lsof-check root
    umount /var/backup
    -> approximately 4 min after the backup starts at night, the lsof-check email arrives and tells me that a php process is still running (backup of a mail account/domain); at the same time an email arrives saying that the mount point could not be unmounted
    -> 10 min after the backup starts, another lsof-check mail arrives saying that nothing is using the mount point anymore, so this time the unmount worked as expected

    The question remains why the unmount script is triggered too soon ... and therefore twice. I hope somebody can help me. (Would it be better to open another thread? Please tell me. Or another bug report somewhere, if it is one?)

    btw: does anybody know how to find out which domains/accounts are associated with the folder "mailxy"? For the websites there is an ID in ISPConfig, for mails unfortunately not.
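One way to check whether the script really is triggered twice, and by what, is to write a timestamped record to a file on every call instead of relying on the warning mail, which in this thread did not include the script's own output. A rough sketch, assuming `lsof`, `ps` and `mountpoint` are installed; `log_invocation` and the log path are made up for illustration:

```shell
#!/bin/bash
# Sketch only: LOG_FILE, BACKUP_DIR and log_invocation are illustrative names.
BACKUP_DIR="${BACKUP_DIR:-/var/backup}"
LOG_FILE="${LOG_FILE:-/var/log/backup_dir_umount.log}"

# Append one record per call: when we ran, who called us, what holds the mount.
log_invocation() {
    local dir="$1" log="$2"
    {
        echo "=== $(date '+%F %T') pid=$$ ppid=$PPID"
        echo "caller: $(ps -o args= -p "$PPID" 2>/dev/null)"
        lsof "$dir" 2>&1 || echo "lsof: nothing open on $dir"
    } >> "$log"
}

# A logging problem must never be the reason the unmount is skipped.
log_invocation "$BACKUP_DIR" "$LOG_FILE" || true

if mountpoint -q "$BACKUP_DIR"; then
    umount "$BACKUP_DIR"
fi
```

Two `===` records a few minutes apart would confirm the double trigger, and the `caller:` line shows which process started each run.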
     
