I've lost the MySQL master-master replication of ISPConfig

Discussion in 'Installation/Configuration' started by voltron81, Sep 1, 2010.

  1. voltron81

    voltron81 New Member

    Hi all,
    I have 2 mailservers based on ISPConfig 3.0.1.6.
    Basically they are replicated servers, with master-master replication of MySQL and /var/vmail synced with GlusterFS. It worked perfectly for almost a year. If one of the servers went down for any reason, the replication kept working just fine.

    A few days ago, because of a hardware fault, one of the servers (SRV1) went down. Now it's up again, but I've lost the MySQL replication.
    I set up the MySQL replication following this how-to: http://www.howtoforge.com/mysql-5-master-master-replication-fedora-8

    Checking the slave status of MySQL on both servers, I have this situation:

    Code:
    SRV1: Slave_IO_Running  Slave_SQL_Running
                 NO                    NO
    
    SRV2: Slave_IO_Running  Slave_SQL_Running
                 NO                    YES


    What is going on? Any suggestions?
    Thanks
    Michele
     
  2. Mark_NL

    Mark_NL Member

    Is there anything in the Error field of the slave that stopped working?
     
  3. voltron81

    voltron81 New Member

    Where can I check it?
    This is the result of SHOW SLAVE STATUS:
    Code:
    | Slave_IO_State | Master_Host   | Master_User | Master_Port | Connect_Retry | Master_Log_File  | Read_Master_Log_Pos | Relay_Log_File     | Relay_Log_Pos | Relay_Master_Log_File | Slave_IO_Running | Slave_SQL_Running | Replicate_Do_DB | Replicate_Ignore_DB | Replicate_Do_Table  | Replicate_Ignore_Table | Replicate_Wild_Do_Table | Replicate_Wild_Ignore_Table | Last_Errno | Last_Error | Skip_Counter | Exec_Master_Log_Pos | Relay_Log_Space | Until_Condition | Until_Log_File | Until_Log_Pos | Master_SSL_Allowed | Master_SSL_CA_File | Master_SSL_CA_Path | Master_SSL_Cert | Master_SSL_Cipher | Master_SSL_Key | Seconds_Behind_Master |
    
    
    
    |                | xxx.xxx.xxx.xxx | slave2_user |        3306 |            60 | mysql-bin.000359 |            17135291 | slave-relay.001352 |      17077099 | mysql-bin.000359      | No               | No                |                 |                     | dbispconfig.mail_user,dbispconfig.cron,dbispconfig.spamfilter_users,dbispconfig.mail_domain,dbispconfig.test,dbispconfig.mail_content_filter,dbispconfig.mail_transport,dbispconfig.client_template,dbispconfig.mail_forwarding,dbispconfig.firewall,dbispconfig.spamfilter_wblist,dbispconfig.client,dbispconfig.spamfilter_policy,dbispconfig.mail_user_filter,dbispconfig.dns_rr,dbispconfig.mail_access,dbispconfig.dns_soa,dbispconfig.mail_traffic,dbispconfig.dns_template,dbispconfig.mail_mailman_domain,dbispconfig.mail_get,dbispconfig.mail_greylist |                        |                         |                             |          0 |            |            0 |            17135291 |               0 | None            |                |             0 | No                 |                    |                    |                 |                   |                |                  NULL |
    
    I had a look at the logs as well, but I cannot find anything...
    Thanks
    Michele
     
  4. Mark_NL

    Mark_NL Member

    Then just run START SLAVE on the slave that is not running (I couldn't see any errors).
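
    A rough sketch of how the fields in question could be checked from the pipe-delimited output pasted above. The function names are made up for illustration, the sample header and row are an abridged copy of a few columns from that output, and the parser assumes padded cells with no "|" inside a value.

```python
def parse_status_row(header_line, value_line):
    """Pair each column name with its value from pipe-delimited mysql output."""
    # Drop the outer '|' characters, split on the inner ones, trim the padding.
    names = [c.strip() for c in header_line.strip().strip("|").split("|")]
    values = [c.strip() for c in value_line.strip().strip("|").split("|")]
    if len(names) != len(values):
        raise ValueError("header and row have different column counts")
    return dict(zip(names, values))


def replication_problems(status):
    """Return a list of human-readable problems found in a status dict."""
    problems = []
    # Both replication threads have to report "Yes" for replication to work.
    for thread in ("Slave_IO_Running", "Slave_SQL_Running"):
        if status.get(thread, "").lower() != "yes":
            problems.append(f"{thread} is {status.get(thread)!r}")
    # Last_Errno is 0 when the slave has not recorded an error.
    if status.get("Last_Errno", "0") != "0":
        problems.append(f"error {status['Last_Errno']}: {status.get('Last_Error', '')}")
    return problems


# Abridged sample: only five of the columns from the output above.
header = "| Master_Log_File  | Slave_IO_Running | Slave_SQL_Running | Last_Errno | Last_Error |"
row = "| mysql-bin.000359 | No               | No                |          0 |            |"

status = parse_status_row(header, row)
print(replication_problems(status))
```

    With this sample, both threads are reported as stopped while Last_Errno stays 0, which matches the "no errors, but not running" picture.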
     
  5. voltron81

    voltron81 New Member

    Hi Mark.
    I've tried to start the slave and this is the result:
    Code:
    mysql> START SLAVE;
    ERROR 1201 (HY000): Could not initialize master info structure; more error messages can be found in the MySQL error log
    I'm trying to understand which log I need to check, but I'm a bit lost.
    In /etc/mysql/my.cnf, I can see this line:
    Code:
    log-bin = /var/log/mysql/mysql-bin.log
    but in /var/log/mysql/ I have just these files:
    Code:
    mysql-bin.000531  mysql-bin.000533  mysql-bin.000535  mysql-bin.000537  mysql-bin.000539  mysql-bin.000541  mysql-bin.000543  mysql-bin.index
    mysql-bin.000532  mysql-bin.000534  mysql-bin.000536  mysql-bin.000538  mysql-bin.000540  mysql-bin.000542  mysql-bin.000544
    
    Suggestions?
    Thanks
    Michele

    PS: I was thinking about a solution like the one suggested on this website: http://blog.bit-matrix.com/2008/11/19/mysql-replication-error-1201-could-not-initialize-master-info-structure/
    Do you know if I can do it even if the 2 databases are not the same anymore? (In the last few days I've changed some values in one of them.)
     
    Last edited: Sep 1, 2010
  6. voltron81

    voltron81 New Member

    OK, I have some news.
    I was following this website http://blogama.org/node/49 and I was able to start the slave.
    However, I got exactly the same error that the how-to mentions for LOAD DATA FROM MASTER;
    It explains how to solve it, but I don't know the commands...

    Now on both servers, if I run SHOW SLAVE STATUS, I get:
    Code:
    SRV1: Slave_IO_Running  Slave_SQL_Running
                 NO                    YES
    
    SRV2: Slave_IO_Running  Slave_SQL_Running
                 NO                    YES
    But still no replication...

    :confused:
     
  7. Mark_NL

    Mark_NL Member

    Ehm, nice to see you found a site with a possible solution, but that just doesn't seem right ..

    Replication takes place by sending all the queries that are entered on the master to the slave and executing them there as well. All those queries are saved in the binlog files; the slave reads that binlog and saves it on its own machine (as the relay log) to execute it.
    When a binlog reaches a certain size, the master closes the file and starts a new one. Depending on your configuration, it will start deleting old binlogs once it has more than (let's say) 4 files.

    Let's say your replication runs fine and your master is writing its incoming queries to mysql-bin.000001. Happy writing; when that file is full it starts writing to mysql-bin.000002, and so on. When it reaches mysql-bin.000005, it deletes mysql-bin.000001.

    so you have:
    mysql-bin.000002
    mysql-bin.000003
    mysql-bin.000004
    mysql-bin.000005

    If you stopped your replication (or replication crashed) when the master was still writing halfway into mysql-bin.000001, and you start it again when it's writing to mysql-bin.000005, you'll never be able to create a consistent replicated server, because you're missing half the queries in mysql-bin.000001: that file is not on the server anymore.
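
    The scenario above can be played through with a small simulation. This is only a sketch: the function names are invented, and the fixed count of four kept files just follows the example in this post (a real server's retention depends on its configuration).

```python
from collections import deque


def rotate_binlogs(total_files, keep):
    """Simulate a master that has created `total_files` binlogs and keeps the newest `keep`."""
    logs = deque(maxlen=keep)  # the oldest name drops off when a new one is appended
    for n in range(1, total_files + 1):
        logs.append(f"mysql-bin.{n:06d}")
    return list(logs)


def can_resume(slave_file, master_files):
    """A slave can only continue if the binlog it stopped in still exists on the master."""
    return slave_file in master_files


available = rotate_binlogs(total_files=5, keep=4)
print(available)
print(can_resume("mysql-bin.000001", available))
```

    The first print lists mysql-bin.000002 through mysql-bin.000005, and the second prints False, because the file the slave stopped in has already been deleted.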

    OK, long story about replication ;-)

    Short story: you need to stop all slaves, create a complete dump of the server on which the slave is running, load that dump on the master and correct the master.info on the broken slave. Then you'll be able to start the replication both ways again.
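
    A minimal sketch of the last part of that procedure (pointing the broken slave at fresh coordinates and starting it again). It assumes the CHANGE MASTER TO statement is used to update the slave's master information rather than editing master.info by hand, which is a swapped-in approach and not what the post literally describes. The helper names and the sample coordinates are hypothetical; the real file and position would come from the server the dump was taken from. The dump and import steps themselves are not covered here.

```python
import re


def change_master_sql(log_file, log_pos):
    """Build a CHANGE MASTER TO statement for the given binlog coordinates."""
    # Only accept plain binlog-style names so nothing needs quoting or escaping.
    if not re.fullmatch(r"[\w.-]+", log_file):
        raise ValueError(f"unexpected binlog file name: {log_file!r}")
    if not isinstance(log_pos, int) or log_pos < 0:
        raise ValueError("log_pos must be a non-negative integer")
    return f"CHANGE MASTER TO MASTER_LOG_FILE='{log_file}', MASTER_LOG_POS={log_pos};"


def resync_statements(log_file, log_pos):
    """Statements to run on the broken slave, in order, after the dump is loaded."""
    return ["STOP SLAVE;", change_master_sql(log_file, log_pos), "START SLAVE;"]


# Hypothetical coordinates, for illustration only.
for statement in resync_statements("mysql-bin.000544", 98):
    print(statement)
```

    Generating the statements instead of running them keeps the sketch free of any connection details; they would be pasted into the mysql client on the broken slave.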

    good luck!

    Edit: so check the master.info on the non-working slave and look at the binlog it was last reading when it stopped working (line 2 in the file). If that file does NOT exist on the working slave, then you need to create a new dump.
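
    That check could be sketched roughly as follows. The function names are invented, and the two sample texts are illustrative stand-ins for the contents of master.info on the broken slave and mysql-bin.index on the other server. Only line 2 of master.info is used, the other lines in the sample are placeholders.

```python
import posixpath


def last_master_binlog(master_info_text):
    """Return line 2 of master.info, the binlog the slave was last reading."""
    lines = master_info_text.splitlines()
    if len(lines) < 2:
        raise ValueError("master.info has fewer than two lines")
    # Keep only the file name in case a full path was stored.
    return posixpath.basename(lines[1].strip())


def binlogs_from_index(index_text):
    """Return the bare file names listed in a mysql-bin.index file."""
    return {posixpath.basename(line.strip())
            for line in index_text.splitlines() if line.strip()}


def needs_new_dump(master_info_text, index_text):
    """True when the binlog the slave stopped in is no longer on the other server."""
    return last_master_binlog(master_info_text) not in binlogs_from_index(index_text)


# Illustrative sample data, not real file contents.
sample_master_info = "15\nmysql-bin.000359\n17135291\n"
sample_index = "/var/log/mysql/mysql-bin.000543\n/var/log/mysql/mysql-bin.000544\n"

print(last_master_binlog(sample_master_info))
print(needs_new_dump(sample_master_info, sample_index))
```

    With this sample the slave stopped in mysql-bin.000359, which is not in the index, so the second print is True and a new dump would be needed.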
     
    Last edited: Sep 2, 2010