I've configured ISPConfig 3 in cluster mode, following the guide on HowtoForge. Everything works fine, but I have a problem with glusterfs. I'm using two virtual machines on VMware ESXi. The problem is that when I unplug the network to test failover, both partitions, /var/www and /var/vmail, become inaccessible on both servers. Any idea about that?
Hi there, I have two servers, 172.16.46.244 and 172.16.46.243. When I unplug .243, nothing appears in the logs.

[2012-04-04 13:03:20] N [server-protocol.c:6788:notify] server: 172.16.46.243:1023 disconnected
[2012-04-04 13:03:21] N [server-protocol.c:6788:notify] server: 172.16.46.243:1022 disconnected
[2012-04-04 13:03:21] N [server-protocol.c:6788:notify] server: 172.16.46.243:1019 disconnected
[2012-04-04 13:03:21] N [server-helpers.c:842:server_connection_destroy] server: destroyed connection of cluster02-777-2012/04/02-22:59:06:137852-remote1-vmail
[2012-04-04 13:03:24] N [server-protocol.c:5852:mop_setvolume] server: accepted client from 172.16.46.243:1023
[2012-04-04 13:03:30] N [server-protocol.c:5852:mop_setvolume] server: accepted client from 172.16.46.243:1022
[2012-04-04 13:03:30] N [server-protocol.c:5852:mop_setvolume] server: accepted client from 172.16.46.243:1019

It seems that when I unplug the connection, gluster waits through a long timeout before it recognizes the loss of connection. In fact, only when I reconnect does it recognize the disconnection and re-establish the connection. Is that true?
If you followed the tutorial, then glusterfs has two connections for each volume: one to the local server and one to the remote server. If you disconnect the remote server, glusterfs will use the connection to the local server to access the volume data, so the volume does not get disconnected. Maybe there is something wrong with your glusterfs config so that the volume runs on the remote connection only.
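One way to check for the "remote connection only" case is to read the client volfile and list which hosts each replicate volume actually points at. The following is a minimal sketch, not part of glusterfs: `parse_volfile` and `replica_hosts` are hypothetical helper names, the parser only understands the simple `volume` / `type` / `option` / `subvolumes` / `end-volume` layout used in the tutorial-style configs, and it treats everything after a `#` as a comment.

```python
def parse_volfile(text):
    """Parse a legacy-style volfile into {name: {"type", "options", "subvolumes"}}.

    Assumption: one directive per line, '#' starts a comment.
    """
    volumes = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        words = line.split()
        if words[0] == "volume" and len(words) == 2:
            current = {"type": None, "options": {}, "subvolumes": []}
            volumes[words[1]] = current
        elif words[0] == "end-volume":
            current = None
        elif current is None:
            continue  # ignore directives outside a volume block
        elif words[0] == "type" and len(words) >= 2:
            current["type"] = words[1]
        elif words[0] == "option" and len(words) >= 3:
            current["options"][words[1]] = " ".join(words[2:])
        elif words[0] == "subvolumes":
            current["subvolumes"] = words[1:]
    return volumes


def replica_hosts(volumes):
    """Map each cluster/replicate volume to the remote-host of its client subvolumes."""
    result = {}
    for name, vol in volumes.items():
        if vol["type"] != "cluster/replicate":
            continue
        hosts = []
        for sub in vol["subvolumes"]:
            child = volumes.get(sub, {})
            if child.get("type") == "protocol/client":
                hosts.append(child["options"].get("remote-host"))
        result[name] = hosts
    return result


# Sample input modelled on the tutorial layout; addresses are the two from this thread.
SAMPLE = """
volume remote1-www
  type protocol/client
  option transport-type tcp
  option remote-host 172.16.46.244
  option remote-subvolume brick-www
end-volume

volume remote2-www
  type protocol/client
  option transport-type tcp
  option remote-host 172.16.46.243
  option remote-subvolume brick-www
end-volume

volume replicate-www
  type cluster/replicate
  subvolumes remote1-www remote2-www
end-volume
"""

hosts = replica_hosts(parse_volfile(SAMPLE))
print(hosts)
```

If a replicate volume lists only one host, or neither address belongs to the machine the file lives on, that would match the misconfiguration described above. On a real server the file content would be read from the client volfile instead of `SAMPLE`.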
Here are my two config files.

glusterfsd.vol
Code:
# Configuration for the vmail server volume
volume posix-vmail
  type storage/posix
  option directory /data/export-vmail
end-volume

volume locks-vmail
  type features/locks
  subvolumes posix-vmail
end-volume

volume brick-vmail
  type performance/io-threads
  option thread-count 8
  subvolumes locks-vmail
end-volume

# Configuration for the www server volume
volume posix-www
  type storage/posix
  option directory /data/export-www
end-volume

volume locks-www
  type features/locks
  subvolumes posix-www
end-volume

volume brick-www
  type performance/io-threads
  option thread-count 8
  subvolumes locks-www
end-volume

# export all volumes
volume server
  type protocol/server
  option transport-type tcp
  subvolumes brick-vmail brick-www

  # Authentication options for the vmail volume
  option auth.addr.brick-vmail.allow 172.16.46.244,172.16.46.243
  option auth.login.brick-vmail.allow user-vmail
  option auth.login.user-vmail.password XXXX

  # Authentication options for www
  option auth.addr.brick-www.allow 172.16.46.244,172.16.46.243
  option auth.login.brick-www.allow user-www
  option auth.login.user-www.password XXXX
end-volume

glusterfs-www.vol
Code:
volume remote1-www
  type protocol/client
  option transport-type tcp
  option remote-host 172.16.46.244
  option remote-subvolume brick-www
  option username user-www
  option password XXXX
end-volume

volume remote2-www
  type protocol/client
  option transport-type tcp
  option remote-host 172.16.46.243
  option remote-subvolume brick-www
  option username user-www
  option password XXXX
end-volume

volume replicate-www
  type cluster/replicate
  subvolumes remote1-www remote2-www
end-volume

volume writebehind-www
  type performance/write-behind
  option window-size 1MB
  subvolumes replicate-www
end-volume

volume cache-www
  type performance/io-cache
  option cache-size 256MB
  subvolumes writebehind-www
end-volume
Strange thing... At about 15:00 I unplugged host .244. After some minutes (maybe 10-15 minutes) glusterfs on .243 started to work correctly. The strange thing is that the online node (.243) recognized the earlier disconnection only 2 hours later:

[2012-04-04 17:02:39] N [server-protocol.c:6788:notify] server: 172.16.46.244:1020 disconnected
[2012-04-04 17:06:16] N [server-protocol.c:6788:notify] server: 172.16.46.244:1018 disconnected

What's going on? Some timeouts? Retries? Something else?
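To put a number on how late the server noticed the unplugged peer, the log timestamps can be compared with the time the cable was pulled. The sketch below is only an illustration: `parse_events` and `detection_delays` are hypothetical helpers, the regular expression is written against the log lines quoted in this thread (other glusterfs versions may format them differently), and the unplug time of 15:00 is an assumed reading of "about 15" in the post, not a measured value.

```python
import re
from datetime import datetime

# Matches the "disconnected" and "accepted client from" lines quoted in this thread.
LOG_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]\s+\w\s+\[[^\]]+\]\s+server:\s+"
    r"(?:(\d+\.\d+\.\d+\.\d+):(\d+) disconnected"
    r"|accepted client from (\d+\.\d+\.\d+\.\d+):(\d+))"
)


def parse_events(log_text):
    """Return (timestamp, kind, host, port) tuples for connect/disconnect entries."""
    events = []
    for m in LOG_RE.finditer(log_text):
        when = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        if m.group(2):
            events.append((when, "disconnected", m.group(2), int(m.group(3))))
        else:
            events.append((when, "accepted", m.group(4), int(m.group(5))))
    return events


def detection_delays(events, host, unplugged_at):
    """Time between the unplug and each later 'disconnected' entry for that host."""
    return [
        when - unplugged_at
        for when, kind, h, _port in events
        if kind == "disconnected" and h == host and when >= unplugged_at
    ]


LOG = """
[2012-04-04 17:02:39] N [server-protocol.c:6788:notify] server: 172.16.46.244:1020 disconnected
[2012-04-04 17:06:16] N [server-protocol.c:6788:notify] server: 172.16.46.244:1018 disconnected
"""

# Assumption: the host was unplugged at roughly 15:00 on the same day.
unplugged_at = datetime(2012, 4, 4, 15, 0, 0)
delays = detection_delays(parse_events(LOG), "172.16.46.244", unplugged_at)
for d in delays:
    print(d)
```

Under that assumption the two entries come out a little over two hours after the unplug. Repeating the test with an exactly noted unplug time would show whether the delay is constant, which is what one would expect if a fixed timeout is involved.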