Cluster: unison and SQL replication broken

Discussion in 'General' started by electronico_nc, Apr 24, 2015.

  1. Hello,
    The setup:
    2 servers (Ubuntu 14.04.2)
    unison replicates /var/www and /var/vmail to Server2 (via a cron job).
    MySQL replication runs between the 2 servers.

    Server1 had a problem and had to be hard reset.
    Server2 did its job well, and services were not down for long.

    Server1: unison:
    Launching unison manually (/usr/bin/unison) outputs:
    Code:
    Failed [www/clients/client47/web41/dev]: Error in creating directory:
    Permission denied [mkdir(/var/www/clients/client47/web41/.unison.dev.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
    Failed [www/clients/client47/web55/dev]: Error in creating directory:
    Permission denied [mkdir(/var/www/clients/client47/web55/.unison.dev.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
    Failed [www/clients/client47/web41/bin]: Error in creating directory:
    Permission denied [mkdir(/var/www/clients/client47/web41/.unison.bin.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
    Failed [www/clients/client47/web41/lib]: Error in creating directory:
    Permission denied [mkdir(/var/www/clients/client47/web41/.unison.lib.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
    Failed [www/clients/client47/web41/lib64]: Error in creating directory:
    Permission denied [mkdir(/var/www/clients/client47/web41/.unison.lib64.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
    Failed [www/clients/client47/web41/usr]: Error in creating directory:
    Permission denied [mkdir(/var/www/clients/client47/web41/.unison.usr.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
    Failed [www/clients/client47/web55/bin]: Error in creating directory:
    Permission denied [mkdir(/var/www/clients/client47/web55/.unison.bin.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
    Failed [www/clients/client47/web55/lib]: Error in creating directory:
    Permission denied [mkdir(/var/www/clients/client47/web55/.unison.lib.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
    Failed [www/clients/client47/web55/lib64]: Error in creating directory:
    Permission denied [mkdir(/var/www/clients/client47/web55/.unison.lib64.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
    Failed [www/clients/client47/web55/usr]: Error in creating directory:
    Permission denied [mkdir(/var/www/clients/client47/web55/.unison.usr.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
    Failed [www/clients/client19/web19/log/20150424-access.log]: The source file /var/www/clients/client19/web19/log/20150424-access.log
    has been modified during synchronization.  Transfer aborted.
    I can still SSH from Server1 to Server2 without a password.

    Server1: MySQL:
    Code:
    mysql>  SHOW SLAVE STATUS \G
    *************************** 1. row ***************************
                   Slave_IO_State: Waiting for master to send event
                      Master_Host: IP2.IP2.IP2.IP2
                      Master_User: slaveuser
                      Master_Port: 3306
                    Connect_Retry: 60
                  Master_Log_File: mysql-bin.000239
              Read_Master_Log_Pos: 2173317
                   Relay_Log_File: mysqld-relay-bin.000617
                    Relay_Log_Pos: 2787604
            Relay_Master_Log_File: mysql-bin.000237
                 Slave_IO_Running: Yes
                Slave_SQL_Running: No
                  Replicate_Do_DB:
              Replicate_Ignore_DB:
               Replicate_Do_Table:
           Replicate_Ignore_Table:
          Replicate_Wild_Do_Table:
      Replicate_Wild_Ignore_Table:
                       Last_Errno: 1594
                       Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
                     Skip_Counter: 0
              Exec_Master_Log_Pos: 2790748
                  Relay_Log_Space: 52518640
                  Until_Condition: None
                   Until_Log_File:
                    Until_Log_Pos: 0
               Master_SSL_Allowed: No
               Master_SSL_CA_File:
               Master_SSL_CA_Path:
                  Master_SSL_Cert:
                Master_SSL_Cipher:
                   Master_SSL_Key:
            Seconds_Behind_Master: NULL
    Master_SSL_Verify_Server_Cert: No
                    Last_IO_Errno: 0
                    Last_IO_Error:
                   Last_SQL_Errno: 1594
                   Last_SQL_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
      Replicate_Ignore_Server_Ids:
                 Master_Server_Id: 2
    1 row in set (0.00 sec)
    So the status reports Relay_Master_Log_File: mysql-bin.000237, but I no longer have this file:
    Code:
    ls -lh /var/log/mysql
    total 6,2G
    -rw-r----- 1 mysql adm  80K avril 24 14:51 error.log
    -rw-r----- 1 mysql adm  12K avril 24 06:09 error.log.1.gz
    -rw-r----- 1 mysql adm  14K avril 23 06:25 error.log.2.gz
    -rw-r----- 1 mysql adm  15K avril 22 06:41 error.log.3.gz
    -rw-r----- 1 mysql adm  16K avril 21 06:25 error.log.4.gz
    -rw-r----- 1 mysql adm 7,2K avril 20 06:34 error.log.5.gz
    -rw-r----- 1 mysql adm  12K avril 19 06:19 error.log.6.gz
    -rw-r----- 1 mysql adm 8,0K avril 18 06:10 error.log.7.gz
    -rw-rw---- 1 mysql adm 148M avril 14 06:39 mysql-bin.000251
    -rw-rw---- 1 mysql adm 501M avril 15 04:00 mysql-bin.000252
    -rw-rw---- 1 mysql adm  29M avril 15 06:19 mysql-bin.000253
    -rw-rw---- 1 mysql adm 451M avril 16 06:27 mysql-bin.000254
    -rw-rw---- 1 mysql adm 501M avril 17 01:11 mysql-bin.000255
    -rw-rw---- 1 mysql adm 398M avril 17 06:27 mysql-bin.000256
    -rw-rw---- 1 mysql adm 501M avril 18 04:07 mysql-bin.000257
    -rw-rw---- 1 mysql adm 321M avril 18 06:36 mysql-bin.000258
    -rw-rw---- 1 mysql adm 502M avril 19 01:33 mysql-bin.000259
    -rw-rw---- 1 mysql adm 179M avril 19 06:32 mysql-bin.000260
    -rw-rw---- 1 mysql adm 501M avril 20 02:31 mysql-bin.000261
    -rw-rw---- 1 mysql adm  64M avril 20 06:36 mysql-bin.000262
    -rw-rw---- 1 mysql adm 461M avril 21 06:34 mysql-bin.000263
    -rw-rw---- 1 mysql adm 364M avril 22 06:41 mysql-bin.000264
    -rw-rw---- 1 mysql adm  29K avril 22 06:41 mysql-bin.000265
    -rw-rw---- 1 mysql adm 1,1M avril 22 06:41 mysql-bin.000266
    -rw-rw---- 1 mysql adm  306 avril 22 06:41 mysql-bin.000267
    -rw-rw---- 1 mysql adm  126 avril 22 06:41 mysql-bin.000268
    -rw-rw---- 1 mysql adm 1,1K avril 22 06:41 mysql-bin.000269
    -rw-rw---- 1 mysql adm  126 avril 22 06:41 mysql-bin.000270
    -rw-rw---- 1 mysql adm  48K avril 22 06:41 mysql-bin.000271
    -rw-rw---- 1 mysql adm 502M avril 23 05:46 mysql-bin.000272
    -rw-rw---- 1 mysql adm  37M avril 23 06:32 mysql-bin.000273
    -rw-rw---- 1 mysql adm 254M avril 23 13:45 mysql-bin.000274
    -rw-rw---- 1 mysql adm 6,3M avril 23 14:22 mysql-bin.000275
    -rw-rw---- 1 mysql adm 202M avril 23 20:48 mysql-bin.000276
    -rw-rw---- 1 mysql adm 343M avril 24 06:16 mysql-bin.000277
    -rw-rw---- 1 mysql adm  90M avril 24 14:55 mysql-bin.000278
    -rw-rw---- 1 mysql adm  896 avril 24 06:16 mysql-bin.index
    
    Thanks a lot for any help!
    Nicolas
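    Relay_Master_Log_File names a binary log on the master (Server2 here, according to Master_Host), so the listing to check is the one on the master rather than on the slave. A minimal sketch for testing whether a given binlog is still listed in a binlog index file follows; the function name binlog_in_index is hypothetical, and the index path in the usage comment is assumed from the listing above.

```shell
# Succeeds if the binlog name given as $1 appears in the binlog index
# read from stdin. Index entries may be bare names or full paths, so
# any directory part is stripped before comparing whole lines.
binlog_in_index() {
    sed 's|.*/||' | grep -Fxq -- "$1"
}

# Usage, run on the master (path assumed):
#   binlog_in_index mysql-bin.000237 < /var/log/mysql/mysql-bin.index \
#       && echo "still available" || echo "already purged"
```

    If the file has already been purged on the master, replication cannot resume from that point and a fresh dump is the usual fallback.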
     
  2. florian030


    For mysql try:
    Code:
    stop slave;
    reset slave;
    change master to master_log_file='mysql-bin.000239', master_log_pos=2790748;
    start slave;
    This restarts replication from the last working position. If this fails, you can also dump the replicated databases from one server and import them on the other (you will lose any SQL data that exists only in the databases of the second server).
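    The coordinates for this kind of restart are normally read from SHOW SLAVE STATUS: Exec_Master_Log_Pos is a position inside Relay_Master_Log_File (the last event the SQL thread executed), while Read_Master_Log_Pos belongs to Master_Log_File. In the status output earlier in this thread, that pairing gives mysql-bin.000237 with position 2790748, so it is worth double-checking the file name before running CHANGE MASTER. A minimal sketch that extracts the pair; the function name slave_exec_coords is hypothetical.

```shell
# Print "<Relay_Master_Log_File> <Exec_Master_Log_Pos>" from the output
# of SHOW SLAVE STATUS\G read on stdin. Fails if either field is missing.
slave_exec_coords() {
    awk -F': ' '
        /^ *Relay_Master_Log_File:/ { file = $2 }
        /^ *Exec_Master_Log_Pos:/   { pos  = $2 }
        END {
            if (file != "" && pos != "") print file, pos
            else exit 1
        }
    '
}

# Usage on the slave (assumes the mysql client can log in without prompting):
#   mysql -e 'SHOW SLAVE STATUS\G' | slave_exec_coords
```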

    For unison:
    It looks like a permission error on your filesystem. Did you run fsck?
     
  3. Thanks a lot!
    It seems that things have moved on Server2 since then, so the command you suggested did not succeed.
    Server2 showed:
    Code:
    show master status;
    +------------------+----------+--------------+------------------+
    | File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
    +------------------+----------+--------------+------------------+
    | mysql-bin.000239 | 43771583 |              |                  |
    +------------------+----------+--------------+------------------+
    1 row in set (0.00 sec)
    So I ran on Server1:
    Code:
    STOP SLAVE;
    RESET SLAVE;
    CHANGE MASTER TO MASTER_LOG_FILE = 'mysql-bin.000239', MASTER_LOG_POS = 43771583;
    START SLAVE;
    And thanks to you, replication is now OK!
    Many thanks!
    Nicolas
     
  4. florian030


    Sure, but you lost everything between log positions 2790748 and 43771583, because you simply set the position to a higher value.
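    To put a number on that gap: binlog positions are byte offsets within one binlog file, so the difference is the amount of binlog data the slave never applied. A minimal sketch, assuming (as this post does) that both positions refer to the same file; the function name skipped_bytes is hypothetical.

```shell
# Print the number of binlog bytes between two positions in the SAME file.
skipped_bytes() {
    from=$1
    to=$2
    if [ "$to" -lt "$from" ]; then
        echo "end position precedes start position" >&2
        return 1
    fi
    echo $(( to - from ))
}

skipped_bytes 2790748 43771583   # prints 40980835

# The skipped events could then be inspected on the master, provided both
# positions are valid event boundaries in that file (path assumed):
#   mysqlbinlog --start-position=2790748 --stop-position=43771583 \
#       /var/log/mysql/mysql-bin.000239
```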
     
  5. Fortunately, the only changes during the Server1 downtime were the creation and deletion of a client and a website.
    The databases are now OK.
    I forced an fsck on the Server2 filesystem and rebooted, but I still get the same kind of errors when running unison.
    Reading https://alliance.seas.upenn.edu/~bcpierce/wiki/?n=Main.UnisonFAQTroubleshooting :

    I can't even manually create the required directory on Server2 (logged in as root):
    Code:
    # LANG="en_US" mkdir /var/www/clients/client47/web41/dev
    mkdir: cannot create directory '/var/www/clients/client47/web41/dev': Permission denied
    Permissions look normal on the parent directory:
    Code:
    # ls -lha /var/www/clients/client47/web41
    total 44K
    drwxr-xr-x 11 root  root     4,0K mars   5 09:10 .
    drwxr-xr-x  6 root  root     4,0K févr. 27 09:27 ..
    drwxr-xr-x  2 web41 client47 4,0K déc.  24 09:07 cgi-bin
    drwxr-xr-x  6 root  root     4,0K mars   5 09:11 etc
    drwxr-xr-x  2 root  root     4,0K avril 25 08:28 log
    drwx--x---  2 web41 client47 4,0K déc.  24 09:07 private
    drwxr-xr-x  2 root  root     4,0K déc.  24 09:07 ssl
    drwxrwxrwx  2 web41 client47 4,0K déc.  24 09:07 tmp
    drwxr-xr-x  3 root  root     4,0K mars   5 09:10 var
    drwx--x--x 26 web41 client47 4,0K mars   5 11:19 web
    drwx--x---  2 web41 client47 4,0K déc.  24 09:07 webdav
    I have never seen this...
    Any thoughts on what to check next?
     
  6. florian030


    Remove the immutable attribute: chattr -i /var/www/clients/client47/web41
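    The immutable flag explains why even root gets "Permission denied": while it is set on a directory, no entries can be created or removed inside it, whatever the mode bits say. It does not show up in ls, only in lsattr. A minimal sketch for checking the flag before and after chattr; the function name is_immutable is hypothetical, and it only parses a line of lsattr output.

```shell
# Succeeds if a line of `lsattr -d` output ("<flags> <path>") has the
# immutable flag (i) set in its flags field.
is_immutable() {
    flags=${1%% *}          # keep everything before the first space
    case $flags in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Usage (as root):
#   if is_immutable "$(lsattr -d /var/www/clients/client47/web41)"; then
#       chattr -i /var/www/clients/client47/web41
#   fi
```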
     
  7. Hello Florian,
    Thanks a lot for the help!
    So I removed the immutable attribute from the Server2 folders with:
    Code:
    chattr -i /var/www/clients/client47/web41
    chattr -i /var/www/clients/client47/web55
    And launched /usr/bin/unison manually:
    Code:
    error            www/clients/client47/web41/dev/log
    error            www/clients/client47/web41/dev/null
    error            www/clients/client47/web41/dev/tty
    error            www/clients/client47/web41/dev/urandom
    error            www/clients/client47/web55/dev/log
    error            www/clients/client47/web55/dev/null
    error            www/clients/client47/web55/dev/tty
    error            www/clients/client47/web55/dev/urandom
    Failed [www/clients/client19/web19/log/20150427-access.log]: The source file /var/www/clients/client19/web19/log/20150427-access.log
    has been modified during synchronization.  Transfer aborted.
    Unison log shows:
    Code:
    [ERROR] Skipping www/clients/client47/web41/dev/log
      path /var/www/clients/client47/web41/dev/log has unknown file type
    [ERROR] Skipping www/clients/client47/web41/dev/null
      path /var/www/clients/client47/web41/dev/null has unknown file type
    [ERROR] Skipping www/clients/client47/web41/dev/tty
      path /var/www/clients/client47/web41/dev/tty has unknown file type
    [ERROR] Skipping www/clients/client47/web41/dev/urandom
      path /var/www/clients/client47/web41/dev/urandom has unknown file type
    [ERROR] Skipping www/clients/client47/web55/dev/log
      path /var/www/clients/client47/web55/dev/log has unknown file type
    [ERROR] Skipping www/clients/client47/web55/dev/null
      path /var/www/clients/client47/web55/dev/null has unknown file type
    [ERROR] Skipping www/clients/client47/web55/dev/tty
      path /var/www/clients/client47/web55/dev/tty has unknown file type
    [ERROR] Skipping www/clients/client47/web55/dev/urandom
      path /var/www/clients/client47/web55/dev/urandom has unknown file type
    [... successfully updated files listed here ...]
    Synchronization incomplete at 09:36:18  (8 items transferred, 8 skipped, 1 failed)
      skipped: www/clients/client47/web41/dev/log
      skipped: www/clients/client47/web41/dev/null
      skipped: www/clients/client47/web41/dev/tty
      skipped: www/clients/client47/web41/dev/urandom
      skipped: www/clients/client47/web55/dev/log
      skipped: www/clients/client47/web55/dev/null
      skipped: www/clients/client47/web55/dev/tty
      skipped: www/clients/client47/web55/dev/urandom
      failed: www/clients/client19/web19/log/20150427-access.log
    Thinking that files might need to be updated from Server2 to Server1, I also ran the chattr -i command on the web41 and web55 folders on Server1.
    But I still get the same errors with unison.
    Here is the dev folder on Server1 (it is empty on Server2):
    Code:
    # ls -lh /var/www/clients/client47/web41/dev
    total 0
    srw-rw-rw- 1 root root    0 avril 23 14:24 log
    crw-rw-rw- 1 root root 1, 3 mars   4 23:07 null
    crw-rw-rw- 1 root root 5, 0 mars   4 23:07 tty
    crw-rw-rw- 1 root root 1, 9 mars   4 23:07 urandom
    Thanks again for your help.
    Nicolas
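    The listing above shows why unison reports "unknown file type": log is a socket (s) and null, tty and urandom are character devices (c), which unison skips rather than copies. One possible way to silence these errors is to exclude such paths in the unison profile. A minimal sketch that generates "ignore = Path ..." lines for every socket, device node, and FIFO under a root; the function name unison_ignore_lines is hypothetical, the unison root is assumed to be /var, and paths containing glob characters would need escaping by hand.

```shell
# Print one unison "ignore = Path <relative path>" line for each socket,
# character/block device, or FIFO found under the directory given as $1.
unison_ignore_lines() {
    root=$1
    ( cd "$root" && find . \( -type s -o -type c -o -type b -o -type p \) -print ) |
        sed 's|^\./|ignore = Path |' |
        sort
}

# Usage (review the output before adding it to the profile):
#   unison_ignore_lines /var
```

    A single pattern line in the profile, such as one matching every jailed dev directory, would be shorter, but generating explicit paths makes it easy to see exactly what is being excluded.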
     
