Hello, I've followed the tutorial on how to set up a load-balanced MySQL cluster and everything seemed to be working fine, but when I recently checked up on the services, one of the MySQL nodes wasn't being recognized by ndb_mgm. I've had this problem twice before; I thought I had misconfigured it, so I reinstalled the whole system on VMs. I thought I had solved it, but it seems to recur a few days after completing the setup.

Here is my configuration for the 5 machines (note: all VMs):

sql-1   172.30.0.7   (runs ndbd and mysql)
sql-2   172.30.0.8   (runs ndbd and mysql)
loadb-1 172.30.0.110 (runs lb1 and ndb_mgm) [active]
loadb-2 172.30.0.9   (runs lb2) [passive]
virtual IP for cluster: 172.30.0.111

I can ping the virtual IP, and I can access the MySQL databases directly on 0.7 and 0.8, but when I try from 0.111 I get a connection error. Here's the output from "show" in ndb_mgm:

Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=2 @172.30.0.7 (Version: 4.1.21, Nodegroup: 0, Master)
id=3 @172.30.0.8 (Version: 4.1.21, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1 @172.30.0.110 (Version: 4.1.21)

[mysqld(API)] 2 node(s)
id=4 (not connected, accepting connect from any host)
id=5 @172.30.0.8 (Version: 4.1.21)

I've restarted mysql on 0.7 and it seems to run fine, but ndb_mgm doesn't see it. And even though 0.8 is running fine, I still can't connect through the virtual IP. Everything worked last week when I completed the setup, and I don't know what else I could check to find out what is failing. loadb-1 is the active load balancer and it should direct queries to sql-2, but it doesn't seem to. I ran all the checks found on http://www.howtoforge.com/loadbalanced_mysql_cluster_debian_p8 and everything checks out fine, and the active loadb-1 has 172.30.0.111 as the virtual IP. If anyone has experienced this or could shed some light on what I might be doing wrong, that would be great.
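One way to watch for this recurring from the management node is to count unconnected API slots in the "show" output. A minimal sketch; the sample string below is taken from the output above, and in practice you'd feed it the live output of "ndb_mgm -e show" run on loadb-1 instead:

```shell
# Sample [mysqld(API)] lines from the 'show' output above;
# live, replace with: out=$(ndb_mgm -e show)
out='id=4 (not connected, accepting connect from any host)
id=5 @172.30.0.8 (Version: 4.1.21)'
missing=$(printf '%s\n' "$out" | grep -c 'not connected')
echo "$missing API slot(s) not connected"
```

Running this from cron and mailing yourself when the count is non-zero would at least tell you how soon after setup the slot drops out.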
As I said, everything worked 100% when I completed the initial install, and I even tested what happens when a single cluster node and a load balancer go down, and it worked as the tutorial stated.
Can you run the tests from http://www.howtoforge.com/loadbalanced_mysql_cluster_debian_p8 and post the results here? Also, are there any errors in the logs?
Command "ip addr sh eth0"

loadb-1:
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:0c:29:a7:30:cf brd ff:ff:ff:ff:ff:ff
    inet 172.30.0.110/24 brd 172.30.0.255 scope global eth0
    inet 172.30.0.111/24 brd 172.30.0.255 scope global secondary eth0

loadb-2:
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:0c:29:1f:46:fd brd ff:ff:ff:ff:ff:ff
    inet 172.30.0.9/24 brd 172.30.0.255 scope global eth0

Command "ldirectord ldirectord.cf status"

loadb-1:
ldirectord for /etc/ha.d/ldirectord.cf is running with pid: 919

loadb-2:
ldirectord is stopped for /etc/ha.d/ldirectord.cf

Command "ipvsadm -L -n"

loadb-1:
IP Virtual Server version 1.0.11 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port       Forward Weight ActiveConn InActConn
TCP  172.30.0.111:3306 wrr
  -> 172.30.0.8:3306          Route   0      0          0
  -> 172.30.0.7:3306          Route   0      0          0

loadb-2:
IP Virtual Server version 1.0.11 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port       Forward Weight ActiveConn InActConn

Command "/etc/ha.d/resource.d/LVSSyncDaemonSwap master status"

loadb-1:
master running (ipvs_syncmaster pid: 1046)

loadb-2:
master stopped

Everything seems to check out, but I'm still unable to connect. When I first installed and tested with ndb_mgm, both NDB nodes showed up, the MGM node showed up, and so did both MYSQLD nodes. Now when I run show, all I get is the following:

[ndbd(NDB)] 2 node(s)
id=2 @172.30.0.7 (Version: 4.1.21, Nodegroup: 0)
id=3 @172.30.0.8 (Version: 4.1.21, Nodegroup: 0, Master)

[ndb_mgmd(MGM)] 1 node(s)
id=1 @172.30.0.110 (Version: 4.1.21)

[mysqld(API)] 2 node(s)
id=4 @172.30.0.8 (Version: 4.1.21)
id=5 (not connected, accepting connect from any host)

You can see that the mysqld on 172.30.0.7 isn't showing up, even though it's running on 0.7 and I can access MySQL directly from it.
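One detail worth a second look in that ipvsadm listing is the Weight column: ldirectord (with its default quiescent behavior) takes a real server out of rotation by setting its LVS weight to 0 when the health check fails, and a VIP whose real servers all sit at weight 0 won't forward any connections. A small sketch that flags weight-0 entries; the sample is the loadb-1 listing above, and live you'd pipe in "ipvsadm -L -n":

```shell
# Sample real-server lines from 'ipvsadm -L -n' above; awk fields are:
# "->"(1) address(2) forward-method(3) weight(4) active(5) inactive(6)
sample='  -> 172.30.0.8:3306 Route 0 0 0
  -> 172.30.0.7:3306 Route 0 0 0'
quiesced=$(printf '%s\n' "$sample" | awk '$1 == "->" && $4 == 0 {print $2}')
printf 'weight 0 (health check failing?): %s\n' $quiesced
```

If both real servers show up here, the load balancer is doing exactly what it is told, and the real question becomes why ldirectord's MySQL check against sql-1 and sql-2 is failing.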
What's the output of
Code:
netstat -tap
and
Code:
df -h
on 172.30.0.7? Are there any errors in the logs on 172.30.0.7?
sql-1:~# netstat -tap
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address            Foreign Address      State       PID/Program name
tcp        0      0 *:mysql                  *:*                  LISTEN      27158/mysqld
tcp        0      0 *:www                    *:*                  LISTEN      813/apache2
tcp        0      0 *:ssh                    *:*                  LISTEN      800/sshd
tcp        0      0 sql-1.localdomain:2202   *:*                  LISTEN      27099/ndbd
tcp        0      0 sql-1.localdomain:35463  172.30.0.110:1186    ESTABLISHED 27098/ndbd
tcp        0      0 sql-1.localdomain:35466  172.30.0.110:1186    ESTABLISHED 27158/mysqld
tcp        0      0 sql-1.localdomain:mysql  172.30.0.110:56547   TIME_WAIT   -
tcp        0      0 sql-1.localdomain:2202   172.30.0.8:49152     ESTABLISHED 27099/ndbd
tcp        0      0 sql-1.localdomain:mysql  172.30.0.110:56521   TIME_WAIT   -
tcp        0    148 sql-1.localdomain:ssh    172.30.0.2:1800      ESTABLISHED 18132/0
tcp        0      0 sql-1.localdomain:35465  172.30.0.110:2202    ESTABLISHED 27099/ndbd
tcp        0      0 sql-1.localdomain:2202   172.30.0.8:49149     ESTABLISHED 27099/ndbd
tcp        0      0 sql-1.localdomain:35468  172.30.0.8:2202      ESTABLISHED 27158/mysqld

sql-1:~# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             883M  424M  412M  51% /
tmpfs                 126M     0  126M   0% /dev/shm

(The SQL data I'm storing will be < 1 MB in total; it's just users' FTP login information.) I've checked the logs and nothing seems out of place; there are no errors being thrown.
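For what it's worth, the netstat above does show mysqld on sql-1 holding an ESTABLISHED connection to the management server (172.30.0.110:1186), so the TCP side looks alive even while ndb_mgm reports the API slot as unconnected. A small sketch that pulls those lines out; the sample is taken from the paste above, and live you'd pipe "netstat -tap" into the same awk:

```shell
# 1186 is the ndb_mgmd port; sample lines from the netstat output above.
# awk fields: proto(1) recvq(2) sendq(3) local(4) foreign(5) state(6) pid/prog(7)
sample='tcp 0 0 sql-1.localdomain:35463 172.30.0.110:1186 ESTABLISHED 27098/ndbd
tcp 0 0 sql-1.localdomain:35466 172.30.0.110:1186 ESTABLISHED 27158/mysqld'
mgm_conns=$(printf '%s\n' "$sample" | awk '$5 ~ /:1186$/ && $6 == "ESTABLISHED" {print $7}')
echo "processes connected to ndb_mgmd:" $mgm_conns
```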
What's in /etc/fstab? I could imagine it's a problem with your disk space or memory as a MySQL cluster needs lots of memory...
sql-1:~# cat /etc/fstab
# /etc/fstab: static file system information.
#
# <file system> <mount point>   <type>  <options>                 <dump> <pass>
proc            /proc           proc    defaults                  0      0
/dev/sda1       /               ext3    defaults,errors=remount-ro 0     1
/dev/sda5       none            swap    sw                        0      0
/dev/hda        /media/cdrom0   iso9660 ro,user,noauto            0      0
/dev/fd0        /media/floppy0  auto    rw,user,noauto            0      0
You don't have much swap (only 126MB). And if your memory is low, that could cause a problem... What's the output of
Code:
cat /proc/meminfo
?
sql-1:~# cat /proc/meminfo
        total:     used:     free:   shared:  buffers:   cached:
Mem:  263208960 256610304   6598656        0  25546752  80146432
Swap:  82210816         0  82210816
MemTotal:       257040 kB
MemFree:          6444 kB
MemShared:           0 kB
Buffers:         24948 kB
Cached:          78268 kB
SwapCached:          0 kB
Active:          59388 kB
Inactive:       163688 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       257040 kB
LowFree:          6444 kB
SwapTotal:       80284 kB
SwapFree:        80284 kB

So you think I should bump up the memory? I give these VMs about 256 MB of RAM by default. I didn't think the cluster would require much, since it's not holding much information.
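That MemFree figure is worth comparing against what an NDB data node wants at startup. A rough sanity check, assuming the MySQL 4.1 defaults of DataMemory=80M and IndexMemory=18M in config.ini (substitute your own values if you set them); the free_kb below is the MemFree value from the paste, and live you'd read it from /proc/meminfo:

```shell
# MemFree taken from the /proc/meminfo paste above;
# live: free_kb=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)
free_kb=6444
need_kb=$(( (80 + 18) * 1024 ))   # DataMemory + IndexMemory, in kB
if [ "$free_kb" -lt "$need_kb" ]; then
    echo "low memory: ${free_kb} kB free, ndbd wants roughly ${need_kb} kB plus overhead"
fi
```

With only ~6 MB free, a restarted ndbd or mysqld could easily be failing or swapping even though the data set itself is tiny.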
So I bumped up the memory on both sql-1 and sql-2 to 512 MB of RAM.

sql-1:~# cat /proc/meminfo
        total:     used:     free:   shared:  buffers:   cached:
Mem:  528752640 223784960 304967680        0  14512128  72069120
Swap:  82210816         0  82210816
MemTotal:       516360 kB
MemFree:        297820 kB
MemShared:           0 kB
Buffers:         14172 kB
Cached:          70380 kB
SwapCached:          0 kB
Active:          40808 kB
Inactive:       161348 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       516360 kB
LowFree:        297820 kB
SwapTotal:       80284 kB
SwapFree:        80284 kB

Still no change.
So I've looked into the issue a bit more. When I try to access the connectioncheck table, I get the following message:

ERROR 1105 (HY000): Failed to open 'connectioncheck', error while unpacking from engine

Also, since I'm running VMs, I always ssh to the machine and didn't realize there was an error being printed to the console:

DBI connect('database=ldirectordb;host=172.30.0.140;port=3306','ldirector',...) failed: Unknown database 'ldirectordb' at /etc/ha.d/resource.d/ldirector line 1950

I saw that people were having this issue after restarting their cluster: http://forums.mysql.com/read.php?25,80009,80009 I wasn't sure if you've seen this before, because when you start from scratch it works, but after one reboot it seems the database somehow gets corrupted or something. I've tried dropping the database, but it still doesn't work.
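If the NDB dictionary and the on-disk .frm files have gotten out of sync after the restart (which matches the "error while unpacking from engine" symptom on these old 4.1 builds), dropping and recreating the check database from scratch is worth a try even if a plain drop didn't help. A hedged sketch; the ldirectordb/connectioncheck names come from the howtoforge tutorial, the single-column table layout is an assumption on my part, and the statements are only printed here for review rather than executed:

```shell
# Recreate the ldirectord check table in the NDB engine so both SQL
# nodes see it. Printed for review; pipe into mysql on ONE SQL node
# to actually run it (NDB replicates the table to the other node).
SQL='DROP DATABASE IF EXISTS ldirectordb;
CREATE DATABASE ldirectordb;
USE ldirectordb;
CREATE TABLE connectioncheck (i INT) ENGINE=NDBCLUSTER;
INSERT INTO connectioncheck VALUES (1);'
printf '%s\n' "$SQL"
# e.g.: printf '%s\n' "$SQL" | mysql -u root -p
```

If the CREATE TABLE itself fails with the same unpacking error, the stale definition is living in the data nodes, and a rolling restart of ndbd (or an initial restart, at the cost of reloading data) would be the next thing to look at.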
The Perl DBI modules are the latest version. This really sucks, because it works when I first set it up. It's only after a restart that the SQL database seems to get corrupted and the active load balancer starts throwing the error about the connectioncheck table.