MySQL Cluster HOWTO - Cannot load Virtual IP address

samu · Jan 10, 2007

Hi, I followed the MySQL Cluster HOWTO (many compliments to the author!).
The cluster nodes and the management node are set up correctly: from ndb_mgmd I can see all the nodes connected to the cluster manager.

I'm running Ubuntu 6.10 and installed the heartbeat-2 and ldirectord-2 packages from the Ubuntu repositories via apt-get (I need version 2 because my system hangs with the packages available from Ultra Monkey).

My problem comes when I try to configure the load balancer: my system doesn't show the virtual IP address when I run ip addr sh.

My configuration files are listed below:

ha.cf
*******************************
logfacility local0
auto_failback off
bcast eth0
mcast eth0 225.0.0.1 694 1 0
node ron
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster

haresources
*********************************
ron \
LVSSyncDaemonSwap::master \
ldirectord::ldirectord.cf \
IPaddr2::192.168.1.65/24/eth0/192.168.1.255

authkeys
*********************************
auth 3
3 md5 myauthpassword

ldirectord.cf
*********************************
# Global Directives
checktimeout=10
checkinterval=2
autoreload=no
logfile="local0"
quiescent=yes

virtual=192.168.1.65:3306
        service=mysql
        real=192.168.0.62:3306 gate
        real=192.168.0.100:3306 gate
        checktype=negotiate
        login="root"
        passwd="mysqlrootpassword"
        database="ldirectord"
        request="SELECT * FROM connectioncheck"
        scheduler=wrr

I also set net.ipv4.ip_forward=1 in /etc/sysctl.conf.

I have the cluster manager on 192.168.1.61 and I want the virtual IP to be 192.168.1.65.
But I'm not able to see any virtual IP address.

#ip addr sh
...
2: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:02:3f:be:13:95 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.61/24 brd 192.168.1.255 scope global eth0
inet6 fe80::202:3fff:febe:1395/64 scope link
valid_lft forever preferred_lft forever
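
The check above can also be scripted by searching the ip addr output for the expected address. This is only a sketch: the helper name has_vip is made up for illustration, and the address and interface are the ones from this post.

```shell
#!/bin/sh
# has_vip ADDRESS: read `ip addr` output on stdin and succeed if an
# "inet ADDRESS/<prefix>" entry is present. (Hypothetical helper name.)
has_vip() {
    grep -F -q "inet $1/"
}

# Typical use on the load balancer (only runs if the ip tool is installed):
if command -v ip >/dev/null 2>&1; then
    if ip addr show dev eth0 2>/dev/null | has_vip 192.168.1.65; then
        echo "virtual IP is configured on eth0"
    else
        echo "virtual IP is NOT configured on eth0"
    fi
fi
```

The trailing slash in the pattern keeps 192.168.1.6 from matching 192.168.1.65.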

Does anyone have any idea where the problem is?

Thanks for any help,
Samuele.

sohaileo · Jan 10, 2007

Dear samuele,

First of all, check the ha-log file, which you will find in the log directory, i.e. /var/log. Whenever heartbeat starts up, it writes very useful information to its logs, which you can use to diagnose the problem.
Restart heartbeat and paste the output of tail -n 50 on that file here, so we can see what is going on with your configuration.
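
The suggestion above could look roughly like this. It is a sketch, assuming the log is written to /var/log/ha-log and that heartbeat is restarted separately as root; hb_problems is just an illustrative name.

```shell
#!/bin/sh
# hb_problems FILE [N]: print WARN/ERROR/CRIT entries among the last N
# (default 50) lines of a heartbeat log.
hb_problems() {
    tail -n "${2:-50}" "$1" | grep -E 'WARN|ERROR|CRIT'
}

# Run this after restarting heartbeat; the log path is an assumption.
log=/var/log/ha-log
if [ -r "$log" ]; then
    hb_problems "$log" || echo "no warnings or errors in the last 50 lines"
fi
```

Filtering on the severity keywords makes it easier to spot the one line that matters among the routine info messages.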

Regards,

samu · Jan 10, 2007

Thanks for your answer.
I checked /var/log but there's no ha-log file.

I started heartbeat:
root@ron:/etc/ha.d# /etc/init.d/heartbeat start
Starting High-Availability services:
2007/01/10_13:41:29 INFO: IPaddr2 Resource is stopped
Done.

And tail -f /var/log/messages says:
Jan 10 13:41:28 localhost logd: [7047]: info: logd started with default configuration.
Jan 10 13:41:28 localhost logd: [7047]: WARN: Core dumps could be lost if multiple dumps occur
Jan 10 13:41:28 localhost logd: [7047]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
Jan 10 13:41:28 localhost logd: [7048]: info: G_main_add_SignalHandler: Added signal handler for signal 15
Jan 10 13:41:28 localhost logd: [7047]: info: G_main_add_SignalHandler: Added signal handler for signal 15
Jan 10 13:41:29 localhost heartbeat: [7199]: WARN: Core dumps could be lost if multiple dumps occur
Jan 10 13:41:29 localhost heartbeat: [7199]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
Jan 10 13:41:29 localhost heartbeat: [7199]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
Jan 10 13:41:29 localhost heartbeat: [7199]: info: **************************
Jan 10 13:41:29 localhost heartbeat: [7199]: info: Configuration validated. Starting heartbeat 2.0.7
Jan 10 13:41:29 localhost heartbeat: [7200]: info: heartbeat: version 2.0.7
Jan 10 13:41:29 localhost heartbeat: [7200]: info: Heartbeat generation: 18
Jan 10 13:41:29 localhost heartbeat: [7200]: info: G_main_add_TriggerHandler: Added signal manual handler
Jan 10 13:41:29 localhost heartbeat: [7200]: info: G_main_add_TriggerHandler: Added signal manual handler
Jan 10 13:41:29 localhost heartbeat: [7200]: info: Removing /var/run/heartbeat/rsctmp failed, recreating.
Jan 10 13:41:29 localhost heartbeat: [7200]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
Jan 10 13:41:29 localhost heartbeat: [7200]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
Jan 10 13:41:29 localhost heartbeat: [7200]: info: glib: UDP multicast heartbeat started for group 225.0.0.1 port 694 interface eth0 (ttl=1 loop=0)
Jan 10 13:41:29 localhost heartbeat: [7200]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Jan 10 13:41:29 localhost heartbeat: [7200]: info: Comm_now_up(): updating status to active
Jan 10 13:41:29 localhost heartbeat: [7200]: info: Local status now set to: 'active'

I tried these commands to check if heartbeat is running:

root@ron:/etc/ha.d# cl_status hbstatus
Heartbeat is stopped on this machine.

root@ron:/etc/ha.d# /etc/ha.d/resource.d/LVSSyncDaemonSwap master eth0 status
master stopped

root@ron:/etc/ha.d# ldirectord /etc/ha.d/ldirectord.cf status
ldirectord is stopped for /etc/ha.d/ldirectord.cf

It seems that heartbeat is not running... but why? I've just started it...

sohaileo · Jan 10, 2007

Well, check this; it is the ha.cf I'm using for my MySQL HA setup.
Code:
#
debugfile /var/log/ha-debug
#
logfile /var/log/ha-log
#
logfacility local0
#
keepalive 2
#
deadtime 30
#
ucast eth1 10.0.0.156
#
auto_failback off
#
node db1
node db2
#
ping RouterIP
#
respawn hacluster /usr/lib/heartbeat/ipfail
#
I hope this helps with your configuration.
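
To see at a glance where a given ha.cf sends its logs, the logging directives can be pulled out of the file. A minimal sketch, assuming the file lives at /etc/ha.d/ha.cf (adjust if yours is elsewhere); the function name hb_log_targets is invented for this example.

```shell
#!/bin/sh
# hb_log_targets FILE: print the logging-related directives found in an
# ha.cf-style file. Comment lines never match because their first field
# starts with "#".
hb_log_targets() {
    awk '$1 == "debugfile" || $1 == "logfile" || $1 == "logfacility" { print $1, $2 }' "$1"
}

conf=/etc/ha.d/ha.cf   # assumed location
if [ -r "$conf" ]; then
    hb_log_targets "$conf"
fi
```

If no logfile or debugfile line is printed, heartbeat is logging only through syslog, and no ha-log file will appear in /var/log.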

Regards,

samu · Jan 10, 2007

I've added the following lines to my ha.cf:
debugfile /var/log/ha-debug
logfile /var/log/ha-log

And now I have the heartbeat log files ha-log and ha-debug in /var/log.

I restarted heartbeat and this is the content of the log files:

ha-log:
-------

heartbeat[13819]: 2007/01/10_22:44:42 WARN: Core dumps could be lost if multiple dumps occur
heartbeat[13819]: 2007/01/10_22:44:42 WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
heartbeat[13819]: 2007/01/10_22:44:42 WARN: Logging daemon is disabled --enabling logging daemon is recommended
heartbeat[13819]: 2007/01/10_22:44:42 info: **************************
heartbeat[13819]: 2007/01/10_22:44:42 info: Configuration validated. Starting heartbeat 2.0.7
heartbeat[13820]: 2007/01/10_22:44:42 info: heartbeat: version 2.0.7
heartbeat[13820]: 2007/01/10_22:44:42 info: Heartbeat generation: 21
heartbeat[13820]: 2007/01/10_22:44:42 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[13820]: 2007/01/10_22:44:42 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[13820]: 2007/01/10_22:44:42 info: Removing /var/run/heartbeat/rsctmp failed, recreating.
heartbeat[13820]: 2007/01/10_22:44:42 info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
heartbeat[13820]: 2007/01/10_22:44:42 info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
heartbeat[13820]: 2007/01/10_22:44:42 info: glib: UDP multicast heartbeat started for group 225.0.0.1 port 694 interface eth0 (ttl=1 loop=0)
heartbeat[13820]: 2007/01/10_22:44:42 info: G_main_add_SignalHandler: Added signal handler for signal 17
heartbeat[13820]: 2007/01/10_22:44:42 info: Comm_now_up(): updating status to active
heartbeat[13820]: 2007/01/10_22:44:42 info: Local status now set to: 'active'
heartbeat[13820]: 2007/01/10_22:44:42 ERROR: socket_wait_conn_new: trying to create in /var/run/heartbeat/register bind:: No such file or directory
heartbeat[13823]: 2007/01/10_22:44:44 CRIT: Emergency Shutdown: Master Control process died.
heartbeat[13823]: 2007/01/10_22:44:44 CRIT: Killing pid 13820 with SIGTERM
heartbeat[13823]: 2007/01/10_22:44:44 CRIT: Killing pid 13824 with SIGTERM
heartbeat[13823]: 2007/01/10_22:44:44 CRIT: Killing pid 13825 with SIGTERM
heartbeat[13823]: 2007/01/10_22:44:44 CRIT: Killing pid 13826 with SIGTERM
heartbeat[13823]: 2007/01/10_22:44:44 CRIT: Killing pid 13827 with SIGTERM
heartbeat[13823]: 2007/01/10_22:44:44 CRIT: Emergency Shutdown(MCP dead): Killing ourselves.

(The content of the ha-debug file is the same.)
There's an error, and that's why heartbeat doesn't start...
But I have no idea what kind of error it is...
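
The ERROR line names a path under /var/run/heartbeat, so one thing worth checking is whether that directory exists at all. A small sketch (the function name missing_dirs is invented here, and whether a missing directory is the whole story is an assumption):

```shell
#!/bin/sh
# missing_dirs DIR...: print every argument that is not an existing directory.
missing_dirs() {
    for d in "$@"; do
        [ -d "$d" ] || printf '%s\n' "$d"
    done
}

# Path taken from the ERROR line in ha-log:
missing=$(missing_dirs /var/run/heartbeat)
if [ -n "$missing" ]; then
    echo "missing: $missing"
fi
```

An empty result means the directory is there and the bind failure has some other cause, such as permissions.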

samu · Jan 10, 2007

OK, searching on Google I found a patch to apply to the /etc/init.d/heartbeat file so that it creates the directories heartbeat needs in /var/run.

I've added these lines to /etc/init.d/heartbeat, inside the StartHA() function:
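if [ ! -d "$RUNDIR/heartbeat" ]; then
    # ccm and crm are spelled out: {ccm,crm} is a bash feature and is not expanded by a plain sh
    mkdir -p "$RUNDIR/heartbeat/ccm" "$RUNDIR/heartbeat/crm"
    chown -R hacluster:haclient "$RUNDIR/heartbeat"
    chmod -R 750 "$RUNDIR/heartbeat"
fi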

OK, now when I restart heartbeat I get no errors in the ha-log file.
root@ron:/home/sam# /etc/init.d/heartbeat start
Starting High-Availability services:
2007/01/10_23:13:17 INFO: IPaddr2 Resource is stopped
Done.

And now running:
root@ron:/home/sam# ps aux | grep heartbeat
root 14866 0.0 2.4 12516 12516 ? SLs 23:13 0:00 heartbeat: master control process
nobody 14869 0.0 1.1 5920 5920 ? SL 23:13 0:00 heartbeat: FIFO reader
nobody 14870 0.0 1.1 5916 5916 ? SL 23:13 0:00 heartbeat: write: bcast eth0
nobody 14871 0.0 1.1 5916 5916 ? SL 23:13 0:00 heartbeat: read: bcast eth0
nobody 14872 0.0 1.1 5916 5916 ? SL 23:13 0:00 heartbeat: write: mcast eth0
nobody 14873 0.0 1.1 5916 5916 ? SL 23:13 0:00 heartbeat: read: mcast eth0
113 14874 0.0 0.2 4196 1424 ? S 23:13 0:00 /usr/lib/heartbeat/ipfail
root@ron:/home/sam# cl_status hbstatus
Heartbeat is running on this machine.

But the problem is still not solved:
root@ron:/home/sam# /etc/ha.d/resource.d/LVSSyncDaemonSwap master eth0 status
master stopped
root@ron:/home/sam# ldirectord ldirectord.cf status
ldirectord is stopped for /etc/ha.d/ldirectord.cf
root@ron:/home/sam# ip addr sh eth0
2: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:02:3f:be:13:95 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.61/24 brd 192.168.1.255 scope global eth0
inet6 fe80::202:3fff:febe:1395/64 scope link
valid_lft forever preferred_lft forever

And this is the output of tail -f /var/log/messages:
Jan 10 23:13:17 localhost heartbeat: [14865]: WARN: Core dumps could be lost if multiple dumps occur
Jan 10 23:13:17 localhost heartbeat: [14865]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
Jan 10 23:13:17 localhost heartbeat: [14865]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
Jan 10 23:13:17 localhost heartbeat: [14865]: info: **************************
Jan 10 23:13:17 localhost heartbeat: [14865]: info: Configuration validated. Starting heartbeat 2.0.7
Jan 10 23:13:17 localhost heartbeat: [14866]: info: heartbeat: version 2.0.7
Jan 10 23:13:18 localhost heartbeat: [14866]: info: Heartbeat generation: 23
Jan 10 23:13:18 localhost heartbeat: [14866]: info: G_main_add_TriggerHandler: Added signal manual handler
Jan 10 23:13:18 localhost heartbeat: [14866]: info: G_main_add_TriggerHandler: Added signal manual handler
Jan 10 23:13:18 localhost heartbeat: [14866]: info: Removing /var/run/heartbeat/rsctmp failed, recreating.
Jan 10 23:13:18 localhost heartbeat: [14866]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
Jan 10 23:13:18 localhost heartbeat: [14866]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
Jan 10 23:13:18 localhost heartbeat: [14866]: info: glib: UDP multicast heartbeat started for group 225.0.0.1 port 694 interface eth0 (ttl=1 loop=0)
Jan 10 23:13:18 localhost heartbeat: [14866]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Jan 10 23:13:18 localhost heartbeat: [14866]: info: Comm_now_up(): updating status to active
Jan 10 23:13:18 localhost heartbeat: [14866]: info: Local status now set to: 'active'
Jan 10 23:13:18 localhost heartbeat: [14866]: info: Starting child client "/usr/lib/heartbeat/ipfail" (113,117)
Jan 10 23:13:18 localhost heartbeat: [14866]: info: Local status now set to: 'up'
Jan 10 23:13:18 localhost heartbeat: [14874]: info: Starting "/usr/lib/heartbeat/ipfail" as uid 113 gid 117 (pid 14874)
Jan 10 23:13:19 localhost heartbeat: [14866]: info: Link ron:eth0 up.
Jan 10 23:13:22 localhost ipfail: [14874]: info: Link Status update: Link ron/eth0 now has status up

I can't understand why it does not work...

These are the config files:

ha.cf
****
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
auto_failback off
bcast eth0
mcast eth0 225.0.0.1 694 1 0
node ron
respawn hacluster /usr/lib/heartbeat/ipfail

haresources
*********
ron IPaddr2::192.168.1.65/24/eth0/192.168.1.255 LVSSyncDaemonSwap::master::eth0 ldirectord::ldirectord.cf

ldirectord.cf
*********
checktimeout=10
checkinterval=2
autoreload=no
logfile="local0"
quiescent=yes
virtual=192.168.1.65:3306
        service=mysql
        real=192.168.0.62:3306 gate
        real=192.168.0.100:3306 gate
        checktype=negotiate
        login="root"
        passwd="mysqlrootpassword"
        database="ldirectord"
        request="SELECT * FROM connectioncheck"
        scheduler=wrr

Any ideas to make it work?

sohaileo · Jan 11, 2007

OK, it looks like heartbeat is not acquiring the resources listed in the haresources file, because when you start heartbeat it reports that IPaddr2 is stopped. Try taking over the resources manually by running the following command:

/usr/lib/heartbeat/hb_takeover all

Heartbeat uses this script to take over resources. Then see what happens, and check the logs too.
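
To see what heartbeat is supposed to acquire, the haresources entry for a node can be flattened into one resource per line. A minimal sketch, assuming the file is /etc/ha.d/haresources (the thread only shows the file name, so the full path is an assumption) and using an illustrative function name, list_resources:

```shell
#!/bin/sh
# list_resources FILE NODE: print the resources listed for NODE, one per
# line, joining lines that end in a backslash.
list_resources() {
    awk -v node="$2" '
        /^[ \t]*#/ { next }                 # skip comment lines
        {
            cont = sub(/\\[ \t]*$/, "")     # drop a trailing backslash
            buf = buf " " $0
            if (cont) next                  # entry continues on the next line
            n = split(buf, f, " ")
            if (n > 0 && f[1] == node)
                for (i = 2; i <= n; i++) print f[i]
            buf = ""
        }
    ' "$1"
}

# Heartbeat identifies the local node by the output of uname -n.
res=/etc/ha.d/haresources   # assumed location
if [ -r "$res" ]; then
    list_resources "$res" "$(uname -n)"
fi
```

If this prints nothing for the local node, the first field in haresources does not match what uname -n reports, and heartbeat would have no resources to take over on this machine.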

Regards,
