I have a server running innodb. It's about 40G in size and gets hit with reads and writes pretty often. I'd like to minimize its downtime, so I'm looking at ways to avoid taking it down and running mysqldump on a regular basis. As discussed here: mysql_database_replication

1. My basic plan is to set up replication and back up the slave's database. From what I've read online, there seems to be some school of thought that replication is not reliable enough, i.e. the slave and master will drift and the data will not be restored properly in the case of losing the master. Is this a valid concern?

2. Is it feasible to set up this plan on a single server running two mysqlds, replicating to itself over localhost? I would then copy the backups to a remote or external drive.

3. Perhaps I can take some shortcuts in my case. Let's say I'm not concerned about the data being exact. If I have 1 billion rows, I want to back up everything, but I'm only really concerned about the last 100k rows, say, as they are currently the most active. That is, the older rows are unlikely to change; and even if they did vary slightly, that would be okay. Is it feasible to have a script that passively dumps rows bit by bit in the background and rebuilds the database slowly over time? Perhaps dumping 100 rows every 5 minutes, say, until the whole database has been covered.
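Idea 3 can be sketched as a keyset-paginated dump loop: walk the table in primary-key order, a small chunk at a time, resuming from the highest key seen. This is only a sketch of the pattern; it uses sqlite3 as a stand-in for a MySQL connection (with MySQLdb or mysql.connector the loop would be the same), and the table and column names are made up for illustration.

```python
# Sketch of the "dump rows bit by bit" idea: walk a table in primary-key
# order, CHUNK rows per pass, so each query is short-lived. sqlite3 is a
# stand-in for the real MySQL connection; table/column names are invented.
import sqlite3

CHUNK = 100  # rows per pass; the plan above suggests 100 rows every 5 minutes

def dump_in_chunks(conn, table, pk, writer):
    last_pk = 0
    while True:
        rows = conn.execute(
            f"SELECT * FROM {table} WHERE {pk} > ? ORDER BY {pk} LIMIT ?",
            (last_pk, CHUNK),
        ).fetchall()
        if not rows:
            break
        for row in rows:
            writer(row)
        last_pk = rows[-1][0]  # resume after the highest pk seen
        # a real background script would time.sleep(300) here

# demo with an in-memory table of 250 rows
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
conn.executemany("INSERT INTO t (v) VALUES (?)", [("x",)] * 250)
dumped = []
dump_in_chunks(conn, "t", "id", dumped.append)
print(len(dumped))  # all 250 rows, fetched 100 at a time
```

Seeking by primary key rather than using OFFSET keeps each query cheap even a billion rows in, which matters if the dump is supposed to run passively for days.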
Do you think you could explain how to do this on the same host? That is, run two mysqlds and have the master and slave on the same machine?
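From what I've read, the two-instances-on-one-box setup mostly comes down to keeping the instances' resources separate. A sketch of what the second (slave) instance's my.cnf might look like; the paths, port, and server-id here are placeholders, not from any real setup:

```ini
# Hypothetical second mysqld acting as the slave on the same box.
# Each instance needs its own datadir, socket, port, and server-id.
[mysqld]
datadir   = /var/lib/mysql-slave
socket    = /var/lib/mysql-slave/mysql.sock
port      = 3307            ; master stays on 3306
server-id = 2               ; must differ from the master's server-id
relay-log = /var/lib/mysql-slave/relay-bin
```

The slave would then point its CHANGE MASTER at 127.0.0.1:3306.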
Actually, never mind. I'm just going to set up another machine as the slave. The first problem I encountered was this:
Code:
/usr/libexec/mysqld: File '/var/log/mysql-bin.index' not found (Errcode: 13)
091011 22:59:21 [ERROR] Aborting

I changed log-bin to /var/log/mysql/mysql-bin.log (rather than /var/log/mysql/, which I'd misread from the guide) and gave mysql permissions on /var/log/mysql. The alternative could be to use /var/lib/mysql...
I set up the master with those additional innodb settings. My data files now look something like this:
Code:
4.0K mysql-bin.000001
4.0K mysql-bin.000002
4.0K mysql-bin.000003
4.0K mysql-bin.000004
68K  mysql-bin.000005
4.0K mysql-bin.index

mysql-bin.000005 seems to be the only file growing; the index file isn't changing size. Here's my my.cnf:
Code:
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
old_passwords=1
max_connections=500
query-cache-type = 1
query-cache-size = 128M
set-variable=long_query_time=2
log-slow-queries=/var/log/log-slow-mysql.log

# lines below setup logging for master
binlog-do-db=db4120
server-id=1
innodb_flush_log_at_trx_commit = 1
#innodb_safe_binlog
sync-binlog = 1
log-bin = /var/lib/mysql/mysql-bin
log-bin-index = /var/lib/mysql/mysql-bin.index

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
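For anyone following along, the next step on the master is usually to grant a replication account and note the binlog coordinates for the slave. A sketch; the account name, password, and subnet here are placeholders:

```sql
-- Dedicated replication account on the master (name/password are made up)
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'10.0.0.%' IDENTIFIED BY 'secret';
FLUSH PRIVILEGES;

-- Note File (e.g. mysql-bin.000005) and Position for the slave's
-- CHANGE MASTER statement
SHOW MASTER STATUS;
```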
I've followed the howto and have the db imported into the slave, but I'm having trouble getting the slave to connect to the master. I don't have iptables running on the master. I added ALL: 10.0.0.103 to the master's hosts.allow. Here's the error I'm seeing: It also won't let me log into phpMyAdmin on the slave, reporting: I assume this is the problem: Is there something I need to do to tell 10.0.0.111 to listen on 3306? I don't have skip-networking set...
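If the master turns out not to be listening on 3306, the usual my.cnf suspects besides skip-networking are these; a sketch, assuming stock Linux defaults:

```ini
[mysqld]
; skip-networking would disable TCP entirely - make sure it's absent/commented
; bind-address=127.0.0.1 would restrict the server to localhost; to accept
; the slave's connection, bind more widely or comment it out:
bind-address = 0.0.0.0
port = 3306
```

`netstat -ltnp | grep 3306` on the master should then show mysqld listening.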
Okay, so it now connects fine, but I get some errors. The filenames and position numbers look correct: This sounds like an innodb issue... more digging...
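For the record, the place these errors show up is the slave status output, which is also where the file/position pair I'm checking comes from:

```sql
SHOW SLAVE STATUS\G
-- relevant fields: Slave_IO_Running, Slave_SQL_Running,
-- Last_Errno / Last_Error, and Master_Log_File / Read_Master_Log_Pos
```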
I can't seem to get the slave to start with innodb; those errors kill it. Here's the my.cnf on the slave: http://pastie.org/657206
Ack, it looks like this is my problem, missing a major step for innodb: http://dev.mysql.com/doc/refman/5.0/en/innodb-backup.html
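In case it helps anyone else: the gist of that page is that an InnoDB dump needs to be a consistent snapshot, with the binlog coordinates recorded in it. Something along these lines on the master (a sketch; check the linked manual page for the exact options for your version):

```
mysqldump --single-transaction --master-data=2 --all-databases > dump.sql
```

--single-transaction takes a consistent InnoDB snapshot without locking the tables, and --master-data writes the matching CHANGE MASTER coordinates into the dump as a comment.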
I copied over the .frm files from the master to the slave, but it's still failing, saying they have incorrect information. The file it's complaining about is the same on master and slave... and has not changed on the master since I did the mysqldump. Copying the .frm files also makes phpMyAdmin unable to load the database properly, so I'm guessing this isn't the correct path...
I ended up dropping the slave's db and reimporting the dump file. It worked this time. I'm not sure what changed since the first attempt as I tried so many things.
So I was forced to upgrade the slave server. Unfortunately, this broke my replication. I'm not sure how to get this going again without starting over from scratch:
Code:
100602 23:43:07 mysqld started
100602 23:43:07 [Warning] The syntax 'for replication startup options' is deprecated and will be removed in MySQL 6.0. Please use 'CHANGE MASTER' instead.
/usr/local/mysql/libexec/mysqld: Table 'mysql.plugin' doesn't exist
100602 23:43:07 [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
InnoDB: Error: log file /usr/local/mysql/var/ib_logfile0 is of different size 0 5242880 bytes
InnoDB: than specified in the .cnf file 0 16777216 bytes!
100602 23:43:07 [ERROR] Plugin 'InnoDB' init function returned error.
100602 23:43:07 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
100602 23:43:07 [ERROR] /usr/local/mysql/libexec/mysqld: unknown option '--skip-bdb'
100602 23:43:07 [ERROR] Aborting
100602 23:43:07 [Note] /usr/local/mysql/libexec/mysqld: Shutdown complete
100602 23:43:07 mysqld ended

I tried running upgrade as it says:
Code:
[/usr/local/mysql] # ./bin/mysql_upgrade
Looking for 'mysql' as: ./bin/mysql
Looking for 'mysqlcheck' as: ./bin/mysqlcheck
Running 'mysqlcheck' with connection arguments: '--port=3306' '--socket=/tmp/mysql.sock'
./bin/mysqlcheck: Got error: 2002: Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2) when trying to connect
FATAL ERROR: Upgrade failed

I can't run the server because it needs to upgrade, and I can't upgrade because it needs to run the server... hrm.
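Reading the errors again, one way out of the chicken-and-egg looks like: fix the config enough for the server to start, then run the upgrade against the running server. A sketch, with paths taken from the log above:

```
# 1. remove the obsolete skip-bdb line from my.cnf (BDB is gone in 5.1)
# 2. make innodb_log_file_size match the existing files (5242880 = 5M),
#    or move the old ib_logfile* aside after a *clean* shutdown
# 3. start the server, then point mysql_upgrade at its socket:
/usr/local/mysql/bin/mysql_upgrade --socket=/tmp/mysql.sock
```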
I found a thread about removing the ib_logfiles. I backed these up and tried this. This brings the server online, but I'm not sure yet if it's replicating:
Code:
100603 00:16:39 mysqld started
100603 0:16:39 [Warning] The syntax 'for replication startup options' is deprecated and will be removed in MySQL 6.0. Please use 'CHANGE MASTER' instead.
/usr/local/mysql/libexec/mysqld: Table 'mysql.plugin' doesn't exist
100603 0:16:39 [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
100603 0:16:39 InnoDB: Started; log sequence number 5 400666124
100603 0:16:39 [ERROR] Can't open and lock privilege tables: Table 'mysql.servers' doesn't exist
100603 0:16:39 [Warning] Neither --relay-log nor --relay-log-index were used; so replication may break when this MySQL server acts as a slave and has his hostname changed!! Please use '--relay-log=VaHBoMM-relay-bin' to avoid this problem.
100603 0:16:39 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000328' at position 158108412, relay log './VaHBoMM-relay-bin.000626' position: 812320
100603 0:16:39 [ERROR] Error in Log_event::read_log_event(): 'Event too small', data_len: 0, event_type: 0
100603 0:16:39 [ERROR] Error reading relay log event: slave SQL thread aborted because of I/O error
100603 0:16:39 [ERROR] Slave SQL: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Error_code: 1594
100603 0:16:39 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START".
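Since the relay log can always be re-fetched from the master's binlog, the usual fix for a corrupted relay log is to re-point the slave at its last executed master position. A sketch; the coordinates below are the ones from my log, and in general they'd come from Relay_Master_Log_File / Exec_Master_Log_Pos in SHOW SLAVE STATUS:

```sql
STOP SLAVE;
-- discards the damaged relay logs and re-downloads from the master
CHANGE MASTER TO
  MASTER_LOG_FILE = 'mysql-bin.000328',
  MASTER_LOG_POS  = 158108412;
START SLAVE;
```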
Code:
We stopped at log 'mysql-bin.000328' position 158108412
100603 0:16:39 [ERROR] Column count of mysql.db is wrong. Expected 22, found 20. Created with MySQL 50037, now running 50136. Please use mysql_upgrade to fix this error.
100603 0:16:39 [ERROR] mysql.user has no `Event_priv` column at position 29
100603 0:16:39 [ERROR] Cannot open mysql.event
100603 0:16:39 [ERROR] Event Scheduler: An error occurred when initializing system tables. Disabling the Event Scheduler.
100603 0:16:39 [Note] /usr/local/mysql/libexec/mysqld: ready for connections.
Version: '5.1.36-log' socket: '/tmp/mysql.sock' port: 3306 Source distribution
100603 0:16:39 [Note] Slave I/O thread: connected to master '[email protected]:3306',replication started in log 'mysql-bin.000331' at position 238320516

Given what SHOW SLAVE STATUS is returning, I'm guessing it's still offline.
Using this guide to step over corrupted logs, it appears to be slowly working: http://www.softwareprojects.com/resources/programming/t-how-to-recover-from-mysql-replication-event-too-sma-1859.html Should I expect a lot of these errors, or is something else making it fail? I've stepped over 5 of these errors manually so far, so I suspect something must be very wrong.
Yes, that's probably best. I have no idea what the state of the data on the slave is now really. After skipping about 10 corrupted bits of logs, it eventually ran clean for the remainder of the replication and is now working.