Hello,

The setup: two servers (Ubuntu 14.04.2). Unison replicates /var/www and /var/vmail to Server2 via a cron job, and MySQL replication runs between the two servers.

Server1 had a problem and had to be hard reset. Server2 did its job well, and services were not down for long.

Server1, unison: when I launch unison manually (/usr/bin/unison), it outputs:

Code:
Failed [www/clients/client47/web41/dev]: Error in creating directory: Permission denied [mkdir(/var/www/clients/client47/web41/.unison.dev.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
Failed [www/clients/client47/web55/dev]: Error in creating directory: Permission denied [mkdir(/var/www/clients/client47/web55/.unison.dev.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
1% 00:18 ETA
Failed [www/clients/client47/web41/bin]: Error in creating directory: Permission denied [mkdir(/var/www/clients/client47/web41/.unison.bin.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
5% 00:05 ETA
Failed [www/clients/client47/web41/lib]: Error in creating directory: Permission denied [mkdir(/var/www/clients/client47/web41/.unison.lib.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
Failed [www/clients/client47/web41/lib64]: Error in creating directory: Permission denied [mkdir(/var/www/clients/client47/web41/.unison.lib64.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
43% 00:00 ETA
Failed [www/clients/client47/web41/usr]: Error in creating directory: Permission denied [mkdir(/var/www/clients/client47/web41/.unison.usr.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
45% 00:00 ETA
Failed [www/clients/client47/web55/bin]: Error in creating directory: Permission denied [mkdir(/var/www/clients/client47/web55/.unison.bin.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
48% 00:00 ETA
Failed [www/clients/client47/web55/lib]: Error in creating directory: Permission denied [mkdir(/var/www/clients/client47/web55/.unison.lib.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
Failed [www/clients/client47/web55/lib64]: Error in creating directory: Permission denied [mkdir(/var/www/clients/client47/web55/.unison.lib64.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
86% 00:00 ETA
Failed [www/clients/client47/web55/usr]: Error in creating directory: Permission denied [mkdir(/var/www/clients/client47/web55/.unison.usr.7558b044758fe1f2365309328acf2f4f.unison.tmp)]
99% 00:00 ETA
Failed [www/clients/client19/web19/log/20150424-access.log]: The source file /var/www/clients/client19/web19/log/20150424-access.log has been modified during synchronization. Transfer aborted.

I can still SSH from Server1 to Server2 without a password.

Server1, MySQL:

Code:
mysql> SHOW SLAVE STATUS \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: IP2.IP2.IP2.IP2
Master_User: slaveuser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000239
Read_Master_Log_Pos: 2173317
Relay_Log_File: mysqld-relay-bin.000617
Relay_Log_Pos: 2787604
Relay_Master_Log_File: mysql-bin.000237
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1594
Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
Skip_Counter: 0
Exec_Master_Log_Pos: 2790748
Relay_Log_Space: 52518640
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1594
Last_SQL_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
1 row in set (0.00 sec)

The status reports Relay_Master_Log_File: mysql-bin.000237, but I don't have this file anymore:

Code:
ls -lh /var/log/mysql
total 6,2G
-rw-r----- 1 mysql adm 80K avril 24 14:51 error.log
-rw-r----- 1 mysql adm 12K avril 24 06:09 error.log.1.gz
-rw-r----- 1 mysql adm 14K avril 23 06:25 error.log.2.gz
-rw-r----- 1 mysql adm 15K avril 22 06:41 error.log.3.gz
-rw-r----- 1 mysql adm 16K avril 21 06:25 error.log.4.gz
-rw-r----- 1 mysql adm 7,2K avril 20 06:34 error.log.5.gz
-rw-r----- 1 mysql adm 12K avril 19 06:19 error.log.6.gz
-rw-r----- 1 mysql adm 8,0K avril 18 06:10 error.log.7.gz
-rw-rw---- 1 mysql adm 148M avril 14 06:39 mysql-bin.000251
-rw-rw---- 1 mysql adm 501M avril 15 04:00 mysql-bin.000252
-rw-rw---- 1 mysql adm 29M avril 15 06:19 mysql-bin.000253
-rw-rw---- 1 mysql adm 451M avril 16 06:27 mysql-bin.000254
-rw-rw---- 1 mysql adm 501M avril 17 01:11 mysql-bin.000255
-rw-rw---- 1 mysql adm 398M avril 17 06:27 mysql-bin.000256
-rw-rw---- 1 mysql adm 501M avril 18 04:07 mysql-bin.000257
-rw-rw---- 1 mysql adm 321M avril 18 06:36 mysql-bin.000258
-rw-rw---- 1 mysql adm 502M avril 19 01:33 mysql-bin.000259
-rw-rw---- 1 mysql adm 179M avril 19 06:32 mysql-bin.000260
-rw-rw---- 1 mysql adm 501M avril 20 02:31 mysql-bin.000261
-rw-rw---- 1 mysql adm 64M avril 20 06:36 mysql-bin.000262
-rw-rw---- 1 mysql adm 461M avril 21 06:34 mysql-bin.000263
-rw-rw---- 1 mysql adm 364M avril 22 06:41 mysql-bin.000264
-rw-rw---- 1 mysql adm 29K avril 22 06:41 mysql-bin.000265
-rw-rw---- 1 mysql adm 1,1M avril 22 06:41 mysql-bin.000266
-rw-rw---- 1 mysql adm 306 avril 22 06:41 mysql-bin.000267
-rw-rw---- 1 mysql adm 126 avril 22 06:41 mysql-bin.000268
-rw-rw---- 1 mysql adm 1,1K avril 22 06:41 mysql-bin.000269
-rw-rw---- 1 mysql adm 126 avril 22 06:41 mysql-bin.000270
-rw-rw---- 1 mysql adm 48K avril 22 06:41 mysql-bin.000271
-rw-rw---- 1 mysql adm 502M avril 23 05:46 mysql-bin.000272
-rw-rw---- 1 mysql adm 37M avril 23 06:32 mysql-bin.000273
-rw-rw---- 1 mysql adm 254M avril 23 13:45 mysql-bin.000274
-rw-rw---- 1 mysql adm 6,3M avril 23 14:22 mysql-bin.000275
-rw-rw---- 1 mysql adm 202M avril 23 20:48 mysql-bin.000276
-rw-rw---- 1 mysql adm 343M avril 24 06:16 mysql-bin.000277
-rw-rw---- 1 mysql adm 90M avril 24 14:55 mysql-bin.000278
-rw-rw---- 1 mysql adm 896 avril 24 06:16 mysql-bin.index

Thanks a lot for any help!

Nicolas
For MySQL, try:

Code:
stop slave;
reset slave;
change master to master_log_file='mysql-bin.000239', master_log_pos=2790748;
start slave;

This restarts the replication from the last working position. If this fails, you can also dump the replicated databases from one server and import them on the 2nd (you will lose any SQL data that exists only in the databases of the 2nd server).

For unison: this looks like a permission error on your filesystem. Did you run fsck?
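A note on picking the coordinates: MySQL documents Exec_Master_Log_Pos as a position inside Relay_Master_Log_File (the master binary log the SQL thread last executed from), while Master_Log_File is the file the I/O thread is currently reading. In the status output posted earlier, position 2790748 therefore pairs with mysql-bin.000237 rather than mysql-bin.000239. The following is a minimal sketch, not a tested recovery tool: it parses pasted `SHOW SLAVE STATUS \G` text and builds the statement from that pair. The function names are hypothetical, and it assumes the master still holds that binary log (`SHOW BINARY LOGS` on the master would confirm).

```python
import re


def parse_slave_status(text):
    """Parse pasted `SHOW SLAVE STATUS \\G` output into a dict of field -> value."""
    fields = {}
    for line in text.splitlines():
        # Field lines look like "   Field_Name: value"; the value may be empty.
        match = re.match(r"\s*(\w+):\s?(.*)$", line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def resume_statement(fields):
    """Build CHANGE MASTER TO from the last executed master coordinates.

    Relay_Master_Log_File and Exec_Master_Log_Pos belong together; mixing
    Exec_Master_Log_Pos with Master_Log_File points at an unrelated offset.
    """
    log_file = fields["Relay_Master_Log_File"]
    log_pos = int(fields["Exec_Master_Log_Pos"])
    return (
        "CHANGE MASTER TO "
        f"MASTER_LOG_FILE='{log_file}', MASTER_LOG_POS={log_pos};"
    )
```

The sketch only prints a statement to review; running it between `STOP SLAVE; RESET SLAVE;` and `START SLAVE;` remains a manual step.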
Thanks a lot!

It seems that things have moved on Server2, so the command you gave wasn't successful. Server2 showed:

Code:
show master status;
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000239 | 43771583 |              |                  |
+------------------+----------+--------------+------------------+
1 row in set (0.00 sec)

So I ran on Server1:

Code:
STOP SLAVE;
RESET SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE = 'mysql-bin.000239', MASTER_LOG_POS = 43771583;
START SLAVE;

And thanks to you, replication is now OK! Many thanks!

Nicolas
Sure, but you lost everything between log positions 2790748 and 43771583, because you simply set the position further ahead.
Fortunately, the only changes during Server1's downtime were the creation and deletion of a client and website. The databases are now OK.

I forced an fsck on Server2's filesystem and rebooted, but I still get the same kind of errors when running unison.

Following https://alliance.seas.upenn.edu/~bcpierce/wiki/?n=Main.UnisonFAQTroubleshooting, I tried to create the required directory by hand on Server2, and even that fails (logged in as root):

Code:
# LANG="en_US" mkdir /var/www/clients/client47/web41/dev
mkdir: cannot create directory '/var/www/clients/client47/web41/dev': Permission denied

Permissions on the parent directory are normal:

Code:
# ls -lha /var/www/clients/client47/web41
total 44K
drwxr-xr-x 11 root root 4,0K mars 5 09:10 .
drwxr-xr-x 6 root root 4,0K févr. 27 09:27 ..
drwxr-xr-x 2 web41 client47 4,0K déc. 24 09:07 cgi-bin
drwxr-xr-x 6 root root 4,0K mars 5 09:11 etc
drwxr-xr-x 2 root root 4,0K avril 25 08:28 log
drwx--x--- 2 web41 client47 4,0K déc. 24 09:07 private
drwxr-xr-x 2 root root 4,0K déc. 24 09:07 ssl
drwxrwxrwx 2 web41 client47 4,0K déc. 24 09:07 tmp
drwxr-xr-x 3 root root 4,0K mars 5 09:10 var
drwx--x--x 26 web41 client47 4,0K mars 5 11:19 web
drwx--x--- 2 web41 client47 4,0K déc. 24 09:07 webdav

I have never seen this. Any thoughts on what to check next?
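When root is refused a mkdir although the mode bits look normal, one thing worth ruling out on ext2/3/4 filesystems is the immutable attribute, which `ls -l` does not display but `lsattr` does. The following is a minimal sketch under that assumption: it runs `lsattr -d` (from e2fsprogs) on a directory and checks the flag field for `i`. The function names are hypothetical, and the check only makes sense on filesystems that support these attributes.

```python
import subprocess


def is_immutable(lsattr_line):
    """Return True if one line of `lsattr -d` output carries the 'i' flag.

    A line looks like "----i--------e-- /some/path"; only the first
    whitespace-separated field holds the attribute flags.
    """
    parts = lsattr_line.split(None, 1)
    if not parts:
        return False
    return "i" in parts[0]


def check_directory(path):
    """Run `lsattr -d PATH` and report whether the directory itself is immutable."""
    result = subprocess.run(
        ["lsattr", "-d", path], capture_output=True, text=True, check=True
    )
    return is_immutable(result.stdout.strip())
```

For example, `check_directory("/var/www/clients/client47/web41")` would return True if the directory is flagged immutable, in which case `chattr -i` on that directory clears the flag.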
Hello Florian,

Thanks a lot for the help!

So I removed the immutable attribute on the Server2 folders with:

Code:
chattr -i /var/www/clients/client47/web41
chattr -i /var/www/clients/client47/web55

And launched /usr/bin/unison manually:

Code:
error www/clients/client47/web41/dev/log
error www/clients/client47/web41/dev/null
error www/clients/client47/web41/dev/tty
error www/clients/client47/web41/dev/urandom
error www/clients/client47/web55/dev/log
error www/clients/client47/web55/dev/null
error www/clients/client47/web55/dev/tty
error www/clients/client47/web55/dev/urandom
57% 00:00 ETA
Failed [www/clients/client19/web19/log/20150427-access.log]: The source file /var/www/clients/client19/web19/log/20150427-access.log has been modified during synchronization. Transfer aborted.

The unison log shows:

Code:
[ERROR] Skipping www/clients/client47/web41/dev/log path /var/www/clients/client47/web41/dev/log has unknown file type
[ERROR] Skipping www/clients/client47/web41/dev/null path /var/www/clients/client47/web41/dev/null has unknown file type
[ERROR] Skipping www/clients/client47/web41/dev/tty path /var/www/clients/client47/web41/dev/tty has unknown file type
[ERROR] Skipping www/clients/client47/web41/dev/urandom path /var/www/clients/client47/web41/dev/urandom has unknown file type
[ERROR] Skipping www/clients/client47/web55/dev/log path /var/www/clients/client47/web55/dev/log has unknown file type
[ERROR] Skipping www/clients/client47/web55/dev/null path /var/www/clients/client47/web55/dev/null has unknown file type
[ERROR] Skipping www/clients/client47/web55/dev/tty path /var/www/clients/client47/web55/dev/tty has unknown file type
[ERROR] Skipping www/clients/client47/web55/dev/urandom path /var/www/clients/client47/web55/dev/urandom has unknown file type
(the correctly updated files are listed here)
Synchronization incomplete at 09:36:18 (8 items transferred, 8 skipped, 1 failed)
skipped: www/clients/client47/web41/dev/log
skipped: www/clients/client47/web41/dev/null
skipped: www/clients/client47/web41/dev/tty
skipped: www/clients/client47/web41/dev/urandom
skipped: www/clients/client47/web55/dev/log
skipped: www/clients/client47/web55/dev/null
skipped: www/clients/client47/web55/dev/tty
skipped: www/clients/client47/web55/dev/urandom
failed: www/clients/client19/web19/log/20150427-access.log

Thinking that files might need to be updated from Server2 to Server1, I also ran the chattr -i command on the web41 and web55 folders on Server1, but I still got the same errors with Unison.

Here is the dev folder on Server1 (it is empty on Server2):

Code:
# ls -lh /var/www/clients/client47/web41/dev
total 0
srw-rw-rw- 1 root root 0 avril 23 14:24 log
crw-rw-rw- 1 root root 1, 3 mars 4 23:07 null
crw-rw-rw- 1 root root 5, 0 mars 4 23:07 tty
crw-rw-rw- 1 root root 1, 9 mars 4 23:07 urandom

Thanks again for your help.

Nicolas
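In the `ls -lh` listing of the dev folder, the leading `s` marks a socket and the leading `c` marks character devices, which matches the paths Unison skips with "unknown file type". As a hedged sketch (the function name is hypothetical), the following walks a tree and lists every entry that is a socket, device node, or FIFO, so the affected paths can be reviewed in one pass. If they are all of this kind, one option to consider is excluding them in the Unison profile with an `ignore = Path ...` line; the exact pattern would need adapting to the real layout.

```python
import os
import stat

# Predicates from the stat module, paired with a readable label.
SPECIAL_KINDS = [
    (stat.S_ISSOCK, "socket"),
    (stat.S_ISCHR, "character device"),
    (stat.S_ISBLK, "block device"),
    (stat.S_ISFIFO, "fifo"),
]


def special_files(root):
    """Yield (path, kind) for sockets, device nodes, and FIFOs under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(dirnames + filenames):
            path = os.path.join(dirpath, name)
            # lstat so that symlinks are inspected, not followed.
            mode = os.lstat(path).st_mode
            for predicate, kind in SPECIAL_KINDS:
                if predicate(mode):
                    yield path, kind


if __name__ == "__main__":
    for path, kind in special_files("/var/www/clients"):
        print(f"{kind}: {path}")
```

Regular files, directories, and symlinks are not reported, so an empty result would suggest the skipped items have another cause.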