High Availability Samba cluster - DRBD + Heartbeat

Discussion in 'Server Operation' started by djalex, Aug 18, 2006.

  1. djalex

    djalex New Member

    Hello everyone,

    This is my first experience with Linux, and I am trying to set up a high-availability Samba cluster with DRBD and Heartbeat.

    E N V I R O N M E N T _ D E T A I L S

    Primary server
    Server name: test02
    IP address: 192.168.50.152
    Subnet mask: 255.255.255.0
    OS: CentOS 4.3 (Kernel version: 2.6.9-34.EL)

    Applications installed: DRBD version 0.7.9, Heartbeat version 2.0.7, SAMBA version 3.0.10-1.4E.6

    Secondary server
    Server name: test01
    IP address: 192.168.50.151
    Subnet mask: 255.255.255.0
    OS: CentOS 4.3 (Kernel version: 2.6.9-34.EL)

    Applications installed: DRBD version 0.7.9, Heartbeat version 2.0.7, SAMBA version 3.0.10-1.4E.6

    Client system
    System name: test03
    IP address: 192.168.50.153
    OS: Windows XP Professional sp2

    SAMBA is served on the service IP address 192.168.50.195

    Configuration files are as follows:

    drbd.conf (test01/test02)

    resource r0
    {

    protocol A;
    incon-degr-cmd "halt -f";

    startup
    {
    degr-wfc-timeout 120; # 2 minutes
    }

    disk
    {
    on-io-error detach;
    }

    net
    {

    }

    syncer
    {
    rate 10M;
    group 1;
    al-extents 257;
    }

    on test01
    {
    device /dev/drbd0;
    disk /dev/hda5;
    address 192.168.50.151:7789;
    meta-disk internal;
    }

    on test02
    {
    device /dev/drbd0;
    disk /dev/hda5;
    address 192.168.50.152:7789;
    meta-disk internal;
    }
    }

    ha.cf (test01/test02)

    logfacility local0
    logfile /var/log/ha-log
    debug 1
    bcast eth0
    keepalive 2
    deadtime 10
    auto_failback off
    node test01
    node test02
    ping test01
    ping test02
    #respawn hacluster /user/lib/heartbeat/ipfail

    haresources (test01/test02)
    test02 IPaddr::192.168.50.195
    test02 drbddisk::r0 Filesystem::/dev/drbd0 smb

    authkeys (test01/test02)
    auth 3
    3 md5 goose

    smb.conf (test01)
    [global]
    workgroup = Workgroup
    server string = SAMBA_TEST
    admin users = root
    share modes = yes
    browseable = yes
    username map = /etc/samba/smbusers
    interfaces = 192.168.50.195

    [goose01]
    path = /mnt/goose01
    writeable = yes
    guest ok = yes

    smb.conf (test02)
    [global]
    workgroup = Workgroup
    server string = SAMBA_TEST
    admin users = root
    share modes = yes
    browseable = yes
    username map = /etc/samba/smbusers
    interfaces = 192.168.50.195

    [goose02]
    path = /mnt/goose02
    writeable = yes
    guest ok = yes

    smbusers (test01/test02)
    # Unix_name = SMB_name1 SMB_name2 ...
    # root = administrator admin
    # nobody = guest pcguest smbguest
    root = root

    P R O B L E M

    When client test03 attempts to access SAMBA services on 192.168.50.195, the primary server reboots.

    T R O U B L E S H O O T I N G

    The steps taken (to the point of failure) are as follows:

    1. Started drbd on test02 (primary)
    2. Started drbd on test01 (secondary)
    3. Ran the command drbdadm primary all on test02
    4. Ran the command mount /dev/drbd0 /mnt/goose02 on test02
    5. Started samba on test02 (primary)
    6. Created test files hello and world in the /mnt/goose02 share. (SAMBA was already configured with the /mnt/goose02 folder.)
    7. I then tried accessing the share from the Windows system using the service IP address 192.168.50.195. When the server does not crash immediately, I can browse the files on 192.168.50.195 for a moment; then the primary server reboots without warning.
    8. After the primary server crashed, I ran the command drbdadm primary all on the secondary server (test01) so that the DRBD block device could be mounted there.
    9. Then I ran the command mount /dev/drbd0 /mnt/goose01 on test01. (SAMBA was already configured with the /mnt/goose01 folder.)
    10. Started the samba service on test01.
    11. The files are accessible from the windows system on service IP 192.168.50.195

    I tried to review the logs present in /var/log but I was not able to find any
    conclusive evidence for the cause of the crash. High availability seems to be
    working... but the tasks are manual as described in the above steps.
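
    The manual recovery in steps 8 to 10 above could be collected into a small script. This is only a sketch under stated assumptions: takeover and run are hypothetical helper names, DRY_RUN is an assumed switch that defaults to printing the commands instead of executing them, and service smb start assumes the stock CentOS init script for Samba.

```shell
#!/bin/sh
# Sketch only: replays the manual takeover steps from this post.
# DRY_RUN (assumed switch) defaults to 1, so commands are printed, not run.
DRY_RUN="${DRY_RUN:-1}"

# run CMD ARGS...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# takeover MOUNTPOINT: promote DRBD, mount the device, start Samba.
# Each step runs only if the previous one succeeded.
takeover() {
    run drbdadm primary all &&
    run mount /dev/drbd0 "$1" &&
    run service smb start
}
```

    On test01 this would be called as takeover /mnt/goose01. With DRY_RUN=0 the commands are really executed, so that should only be tried once the printed sequence looks right.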

    O B S E R V A T I O N

    I suspect that Heartbeat may be the problem, specifically the virtual IP address. I have noticed that when I start Heartbeat, both the primary and the secondary server hold the virtual IP address 192.168.50.195 for an initial period. After some time, the virtual IP disappears from the secondary server (giving me the impression that it takes a while for Heartbeat to settle), but then the Windows system is not able to ping the virtual IP address. Only after making manual entries for the IP address on both servers is it possible to ping the service address from the Windows client. (The manual entries are made by running /etc/ha.d/resource.d/IPaddr 192.168.50.195 start on the primary server and /etc/ha.d/resource.d/IPaddr 192.168.50.195 stop on the secondary server.)
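
    To see which node actually holds the virtual IP at a given moment, a check along these lines might help. It is a sketch, not part of Heartbeat: has_vip is a hypothetical helper name, and it assumes the one-line-per-address output format of ip -o -4 addr show from the iproute package.

```shell
#!/bin/sh
# has_vip ADDRESS: reads `ip -o -4 addr show` style output on stdin and
# succeeds if ADDRESS is assigned to any interface.
# -F treats the dots literally; the trailing "/" keeps 192.168.50.19
# from matching 192.168.50.195.
has_vip() {
    grep -qF "inet $1/"
}
```

    Running ip -o -4 addr show | has_vip 192.168.50.195 && echo "VIP is on $(hostname)" on both test01 and test02 would show whether the address is held by one node, both, or neither.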

    I need help with the following issues:
    1. Feedback on the cause of the server crash and how to avoid it.
    2. Suggestions to automate these manual tasks.
    3. Feedback on the cluster configuration and scope for improvement.

    Regards,
    Alex
     
    Last edited: Aug 18, 2006
  2. falko

    falko Super Moderator Howtoforge Staff

    Did you check this tutorial? Sounds like something is wrong with your heartbeat configuration. Please compare your heartbeat configuration with the one from the tutorial.
     
  3. djalex

    djalex New Member

    Hi Falko,

    Extremely pleased to see your response to this thread. I have read your articles on high availability installations and was very impressed with the step-by-step explanation of how it was implemented (I have used your articles as a guide for my installation).

    In reference to the high availability samba setup, I checked the following tutorial links:
    http://www.linux-ha.org/ha.cf
    http://www.linux-ha.org/haresources
    http://www.linux-ha.org/authkeys

    However, I was not able to find any option there that seemed applicable to my existing Heartbeat configuration. If you have any suggestions for changes to the existing Heartbeat configuration files (based on your experience), I shall try them out. It's just that I have tried every possibility I could think of, to no avail. Your expert guidance could provide the breakthrough I need to complete the setup successfully.
    Awaiting your response.....

    Cheers,
    Alex
     
  4. falko

    falko Super Moderator Howtoforge Staff

    Seems I forgot the link in my previous post: http://www.howtoforge.com/high_availability_nfs_drbd_heartbeat

    What's in /etc/heartbeat/ha.cf and /etc/heartbeat/haresources?

    Just found haresources in your first post:
    Code:
    test02 IPaddr::192.168.50.195
    test02 drbddisk::r0 Filesystem::/dev/drbd0 smb
    This should be just one line. In my tutorial, it's

    Code:
    server1  IPaddr::192.168.0.174/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/data::ext3 nfs-kernel-server
     
    Last edited: Aug 22, 2006
  5. djalex

    djalex New Member

    Hi Falko,

    The updated configuration files are as follows (changes: initdead 30 added to ha.cf; haresources merged into a single line, with /24/eth0 appended to the service IP address)

    ha.cf (test01/test02)

    logfacility local0
    logfile /var/log/ha-log
    debug 1
    bcast eth0
    keepalive 2
    deadtime 10
    initdead 30
    auto_failback off
    node test01
    node test02
    ping test01
    ping test02
    #respawn hacluster /user/lib/heartbeat/ipfail

    haresources (test01/test02)

    test02 IPaddr::192.168.50.195/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0 smb

    However, the primary Linux server still crashes when the Windows client tries to
    access the files through Samba. Please help.

    Regards,
    Alex
     
  6. falko

    falko Super Moderator Howtoforge Staff

    Are there any errors in the logs?
     
  7. djalex

    djalex New Member

    Hi Falko,

    I haven't been able to find any particular errors that point to the cause of the crash. I guess I wouldn't be struggling so much if I had to tackle this problem on Windows. Nevertheless, I consider this Linux issue a challenging learning experience. Moreover, I am extremely glad that you are willing to offer your guidance on this problem. If there are any specific logs that you require, I am happy to post them for your review.

    Regards,
    Alex
     
  8. falko

    falko Super Moderator Howtoforge Staff

    Usually the logs are in /var/log.
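
    A minimal sketch for narrowing the search to Heartbeat's own messages, assuming the logfile path /var/log/ha-log set in the ha.cf posted earlier in this thread. ha_errors is a hypothetical helper name, not a Heartbeat tool.

```shell
#!/bin/sh
# ha_errors [LOGFILE]: print only the ERROR and WARN lines of a Heartbeat log.
# LOGFILE defaults to /var/log/ha-log (the path from the ha.cf in this thread).
ha_errors() {
    grep -E ' (ERROR|WARN): ' "${1:-/var/log/ha-log}"
}
```

    Calling ha_errors on each node right after a crash, and comparing the timestamps with the time of the reboot, would show whether Heartbeat logged anything just before the server went down.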
     
  9. djalex

    djalex New Member

    Hi Falko,

    I made the following changes in drbd.conf (protocol changed from A to C; incon-degr-cmd commented out)

    drbd.conf (test01/test02)

    resource r0
    {

    protocol C;
    #incon-degr-cmd "halt -f";

    startup
    {
    degr-wfc-timeout 120; # 2 minutes
    }

    disk
    {
    on-io-error detach;
    }

    net
    {

    }

    syncer
    {
    rate 10M;
    group 1;
    al-extents 257;
    }

    on test01
    {
    device /dev/drbd0;
    disk /dev/hda5;
    address 192.168.50.151:7789;
    meta-disk internal;
    }

    on test02
    {
    device /dev/drbd0;
    disk /dev/hda5;
    address 192.168.50.152:7789;
    meta-disk internal;
    }
    }

    Then I cleared the files in /var/log and started DRBD, Heartbeat and Samba on test01 and test02. While I was trying to access the Samba files on the service address 192.168.50.195, the primary server crashed again. The fresh logs are attached to this post. Kindly help.

    Regards,
    Alex

    View attachment 30 Aug test01.zip

    View attachment 30 Aug test02.zip
     
  10. falko

    falko Super Moderator Howtoforge Staff

    Please post the logs here directly instead of attaching them.
     
  11. djalex

    djalex New Member

    /var/log/acpid (test02)
    ==========
    [Wed Aug 30 10:50:21 2006] starting up
    [Wed Aug 30 10:50:21 2006] 1 rule loaded

    /var/log/boot.log (test02)
    ============
    Aug 30 10:50:18 test02 syslog: syslogd startup succeeded
    Aug 30 10:50:18 test02 syslog: klogd startup succeeded
    Aug 30 10:50:18 test02 irqbalance: irqbalance startup succeeded
    Aug 30 10:50:18 test02 portmap: portmap startup succeeded
    Aug 30 10:50:19 test02 nfslock: rpc.statd startup succeeded
    Aug 30 10:50:19 test02 rpcidmapd: rpc.idmapd startup succeeded
    Aug 30 10:50:19 test02 netfs: Mounting other filesystems: succeeded
    Aug 30 10:50:19 test02 rc: Starting lm_sensors: succeeded
    Aug 30 10:50:19 test02 autofs: automount startup succeeded
    Aug 30 06:48:57 test02 rc.sysinit: -e
    Aug 30 10:50:21 test02 smartd: smartd startup succeeded
    Aug 30 06:48:59 test02 start_udev: Starting udev: succeeded
    Aug 30 06:49:05 test02 rc.sysinit: -e
    Aug 30 06:49:15 test02 sysctl: net.ipv4.ip_forward = 0
    Aug 30 10:50:21 test02 acpid: acpid startup succeeded
    Aug 30 06:49:15 test02 sysctl: net.ipv4.conf.default.rp_filter = 1
    Aug 30 06:49:15 test02 sysctl: net.ipv4.conf.default.accept_source_route = 0
    Aug 30 06:49:15 test02 sysctl: kernel.sysrq = 0
    Aug 30 06:49:15 test02 sysctl: kernel.core_uses_pid = 1
    Aug 30 06:49:15 test02 rc.sysinit: Configuring kernel parameters: succeeded
    Aug 30 10:49:15 test02 date: Wed Aug 30 10:49:15 EDT 2006
    Aug 30 10:49:15 test02 rc.sysinit: Setting clock (localtime): Wed Aug 30 10:49:15 EDT 2006 succeeded
    Aug 30 10:49:15 test02 rc.sysinit: Loading default keymap succeeded
    Aug 30 10:49:15 test02 rc.sysinit: Setting hostname test02: succeeded
    Aug 30 10:49:20 test02 rc.sysinit: Checking root filesystem succeeded
    Aug 30 10:49:21 test02 rc.sysinit: Remounting root filesystem in read-write mode: succeeded
    Aug 30 10:49:21 test02 lvm.static:
    Aug 30 10:49:21 test02 lvm.static: No volume groups found
    Aug 30 10:49:21 test02 rc.sysinit: Setting up Logical Volume Management: succeeded
    Aug 30 10:49:22 test02 rc.sysinit: Checking filesystems succeeded
    Aug 30 10:49:22 test02 rc.sysinit: Mounting local filesystems: succeeded
    Aug 30 10:49:22 test02 rc.sysinit: Enabling local filesystem quotas: succeeded
    Aug 30 10:49:23 test02 rc.sysinit: Enabling swap space: succeeded
    Aug 30 10:49:23 test02 microcode_ctl: microcode_ctl startup succeeded
    Aug 30 10:49:23 test02 readahead_early: Starting background readahead:
    Aug 30 10:49:24 test02 rc: Starting readahead_early: succeeded
    Aug 30 10:50:14 test02 kudzu: failed
    Aug 30 10:50:14 test02 kudzu: Hardware configuration timed out.
    Aug 30 10:50:14 test02 kudzu: Run '/usr/sbin/kudzu' from the command line to re-detect.
    Aug 30 10:50:14 test02 rc: Starting pcmcia: succeeded
    Aug 30 10:50:14 test02 sysctl: net.ipv4.ip_forward = 0
    Aug 30 10:50:14 test02 sysctl: net.ipv4.conf.default.rp_filter = 1
    Aug 30 10:50:14 test02 sysctl: net.ipv4.conf.default.accept_source_route = 0
    Aug 30 10:50:14 test02 sysctl: kernel.sysrq = 0
    Aug 30 10:50:14 test02 sysctl: kernel.core_uses_pid = 1
    Aug 30 10:50:14 test02 network: Setting network parameters: succeeded
    Aug 30 10:50:14 test02 network: Bringing up loopback interface: succeeded
    Aug 30 10:50:18 test02 network: Bringing up interface eth0: succeeded
    Aug 30 10:50:23 test02 cups: cupsd startup succeeded
    Aug 30 10:50:24 test02 sshd: succeeded
    Aug 30 10:50:24 test02 xinetd: xinetd startup succeeded
    Aug 30 10:52:24 test02 sendmail: sendmail startup succeeded
    Aug 30 10:53:24 test02 sendmail: sm-client startup succeeded
    Aug 30 10:53:25 test02 gpm: gpm startup succeeded
    Aug 30 10:53:25 test02 crond: crond startup succeeded
    Aug 30 10:53:26 test02 xfs: xfs startup succeeded
    Aug 30 10:53:26 test02 anacron: anacron startup succeeded
    Aug 30 10:53:26 test02 atd: atd startup succeeded
    Aug 30 10:53:27 test02 readahead: Starting background readahead:
    Aug 30 10:53:27 test02 rc: Starting readahead: succeeded
    Aug 30 10:53:27 test02 messagebus: messagebus startup succeeded
    Aug 30 10:53:28 test02 cups-config-daemon: cups-config-daemon startup succeeded
    Aug 30 10:53:28 test02 haldaemon: haldaemon startup succeeded

    /var/log/cron (test02)
    ==========

    Aug 30 10:53:25 test02 crond[2679]: (CRON) STARTUP (V5.0)
    Aug 30 10:53:26 test02 anacron[2712]: Anacron 2.3 started on 2006-08-30
    Aug 30 10:53:27 test02 anacron[2712]: Will run job `cron.daily' in 65 min.
    Aug 30 10:53:27 test02 anacron[2712]: Jobs will be executed sequentially
     
  12. djalex

    djalex New Member

    /var/log/dmesg (test02)
    ===========

    Linux version 2.6.9-34.EL (buildcentos@build-i386) (gcc version 3.4.5 20051201 (Red Hat 3.4.5-2)) #1 Wed Mar 8 00:07:35 CST 2006
    BIOS-provided physical RAM map:
    BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
    BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
    BIOS-e820: 0000000000100000 - 000000001fff0000 (usable)
    BIOS-e820: 000000001fff0000 - 000000001fff3000 (ACPI NVS)
    BIOS-e820: 000000001fff3000 - 0000000020000000 (ACPI data)
    BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
    0MB HIGHMEM available.
    511MB LOWMEM available.
    Using x86 segment limits to approximate NX protection
    zapping low mappings.
    On node 0 totalpages: 131056
    DMA zone: 4096 pages, LIFO batch:1
    Normal zone: 126960 pages, LIFO batch:16
    HighMem zone: 0 pages, LIFO batch:1
    DMI 2.3 present.
    ACPI: RSDP (v000 GBT ) @ 0x000f6910
    ACPI: RSDT (v001 GBT AWRDACPI 0x42302e31 AWRD 0x01010101) @ 0x1fff3000
    ACPI: FADT (v001 GBT AWRDACPI 0x42302e31 AWRD 0x01010101) @ 0x1fff3040
    ACPI: MADT (v001 GBT AWRDACPI 0x42302e31 AWRD 0x01010101) @ 0x1fff6980
    ACPI: DSDT (v001 GBT AWRDACPI 0x00001000 MSFT 0x0100000c) @ 0x00000000
    ACPI: PM-Timer IO Port: 0x4008
    Built 1 zonelists
    Kernel command line: ro root=LABEL=/ rhgb quiet
    Initializing CPU#0
    CPU 0 irqstacks, hard=c03e7000 soft=c03e6000
    PID hash table entries: 2048 (order: 11, 32768 bytes)
    Detected 2546.581 MHz processor.
    Using tsc for high-res timesource
    Console: colour VGA+ 80x25
    Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
    Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
    Memory: 514876k/524224k available (2117k kernel code, 8700k reserved, 669k data, 144k init, 0k highmem)
    Calibrating delay using timer specific routine.. 5096.02 BogoMIPS (lpj=2548012)
    Security Scaffold v1.0.0 initialized
    SELinux: Initializing.
    SELinux: Starting in permissive mode
    There is already a security framework initialized, register_security failed.
    selinux_register_security: Registering secondary module capability
    Capability LSM initialized as secondary
    Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
    CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000
    CPU: After vendor identify, caps: bfebfbff 00000000 00000000 00000000
    CPU: Trace cache: 12K uops, L1 D cache: 8K
    CPU: L2 cache: 512K
    CPU: After all inits, caps: bfebf3ff 00000000 00000000 00000080
    Intel machine check architecture supported.
    Intel machine check reporting enabled on CPU#0.
    CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
    CPU: Intel(R) Pentium(R) 4 CPU 2.53GHz stepping 07
    Enabling fast FPU save and restore... done.
    Enabling unmasked SIMD FPU exception support... done.
    Checking 'hlt' instruction... OK.
    ACPI: IRQ9 SCI: Level Trigger.
    checking if image is initramfs... it is
    Freeing initrd memory: 384k freed
    NET: Registered protocol family 16
    PCI: PCI BIOS revision 2.10 entry at 0xfa4c0, last bus=2
    PCI: Using configuration type 1
    mtrr: v2.0 (20020519)
    ACPI: Subsystem revision 20040816
    ACPI: Interpreter enabled
    ACPI: Using PIC for interrupt routing
    ACPI: PCI Root Bridge [PCI0] (00:00)
    PCI: Probing PCI hardware (bus 00)
    PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.1
    PCI: Transparent bridge - 0000:00:1e.0
    ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
    ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB0._PRT]
    ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 6 7 9 10 11 12 14 15)
    ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *9 10 11 12 14 15)
    ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11 *12 14 15)
    ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 *5 6 7 9 10 11 12 14 15)
    ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 9 10 11 *12 14 15)
    ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
    ACPI: PCI Interrupt Link [LNK0] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
    ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
    Linux Plug and Play Support v0.97 (c) Adam Belay
    usbcore: registered new driver usbfs
    usbcore: registered new driver hub
    PCI: Using ACPI for IRQ routing
    ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 5
    ACPI: PCI interrupt 0000:00:1d.0[A] -> GSI 5 (level, low) -> IRQ 5
    ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 5
    ACPI: PCI interrupt 0000:00:1d.1 -> GSI 5 (level, low) -> IRQ 5
    ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 12
    ACPI: PCI interrupt 0000:00:1d.2[C] -> GSI 12 (level, low) -> IRQ 12
    ACPI: PCI Interrupt Link [LNK1] enabled at IRQ 11
    ACPI: PCI interrupt 0000:00:1d.7[D] -> GSI 11 (level, low) -> IRQ 11
    ACPI: PCI interrupt 0000:00:1f.1[A] -> GSI 12 (level, low) -> IRQ 12
    ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 9
    ACPI: PCI interrupt 0000:00:1f.3 -> GSI 9 (level, low) -> IRQ 9
    ACPI: PCI interrupt 0000:00:1f.5 -> GSI 9 (level, low) -> IRQ 9
    ACPI: PCI interrupt 0000:01:00.0[A] -> GSI 5 (level, low) -> IRQ 5
    ACPI: PCI Interrupt Link [LNK0] enabled at IRQ 11
    ACPI: PCI interrupt 0000:02:02.0[A] -> GSI 11 (level, low) -> IRQ 11
    ACPI: PCI Interrupt Link [LNKE] enabled at IRQ 12
    ACPI: PCI interrupt 0000:02:08.0[A] -> GSI 12 (level, low) -> IRQ 12
    apm: BIOS version 1.2 Flags 0x07 (Driver version 1.16ac)
    apm: overridden by ACPI.
    audit: initializing netlink socket (disabled)
    audit(1156934928.050:1): initialized
    Total HugeTLB memory allocated, 0
    VFS: Disk quotas dquot_6.5.1
    Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
    SELinux: Registering netfilter hooks
    Initializing Cryptographic API
    ksign: Installing public key data
    Loading keyring
    - Added public key B4802E7A21D4FA03
    - User ID: CentOS (Kernel Module GPG key)
    pci_hotplug: PCI Hot Plug PCI Core version: 0.5
    ACPI: Processor [CPU0] (supports C1)
    Real Time Clock Driver v1.12
    Linux agpgart interface v0.100 (c) Dave Jones
    agpgart: Detected an Intel 845G Chipset.
    agpgart: Maximum main memory to use for agp memory: 439M
    agpgart: AGP aperture is 128M @ 0xd0000000
    serio: i8042 AUX port at 0x60,0x64 irq 12
    serio: i8042 KBD port at 0x60,0x64 irq 1
    Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled
    ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
    ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
    RAMDISK driver initialized: 16 RAM disks of 16384K size 1024 blocksize
    divert: not allocating divert_blk for non-ethernet device lo
    Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
    ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
    ICH4: IDE controller at PCI slot 0000:00:1f.1
    ACPI: PCI interrupt 0000:00:1f.1[A] -> GSI 12 (level, low) -> IRQ 12
    ICH4: chipset revision 2
    ICH4: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xcc00-0xcc07, BIOS settings: hda:DMA, hdb:DMA
    ide1: BM-DMA at 0xcc08-0xcc0f, BIOS settings: hdc:DMA, hdd:pio
    Probing IDE interface ide0...
    hda: Maxtor 32049H2, ATA DISK drive
    hdb: Maxtor 6Y060L0, ATA DISK drive
    Using cfq io scheduler
    ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
    Probing IDE interface ide1...
    hdc: AOPEN DVD1648/LKY, ATAPI CD/DVD-ROM drive
    hdd: IOMEGA ZIP 250 ATAPI, ATAPI FLOPPY drive
    ide1 at 0x170-0x177,0x376 on irq 15
    Probing IDE interface ide2...
    Probing IDE interface ide3...
    Probing IDE interface ide4...
    Probing IDE interface ide5...
    hda: max request size: 128KiB
    hda: 39062500 sectors (20000 MB) w/2048KiB Cache, CHS=38752/16/63, UDMA(33)
    hda: cache flushes not supported
    hda: hda1 hda2 hda3 hda4 < hda5 >
    hdb: max request size: 128KiB
    hdb: 120103200 sectors (61492 MB) w/2048KiB Cache, CHS=65535/16/63, UDMA(33)
    hdb: cache flushes supported
    hdb:
    hdc: ATAPI 48X DVD-ROM drive, 512kB Cache, UDMA(33)
     
  13. djalex

    djalex New Member

    /var/log/dmesg (test02) - contd
    ===========


    Uniform CD-ROM driver Revision: 3.20
    ide-floppy driver 0.99.newide
    hdd: No disk in drive
    hdd: 244736kB, 239/64/32 CHS, 4096 kBps, 512 sector size, 2941 rpm
    usbcore: registered new driver hiddev
    usbcore: registered new driver usbhid
    drivers/usb/input/hid-core.c: v2.0:USB HID core driver
    mice: PS/2 mouse device common for all mice
    input: AT Translated Set 2 keyboard on isa0060/serio0
    md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
    NET: Registered protocol family 2
    IP: routing cache hash table of 1024 buckets, 32Kbytes
    TCP: Hash tables configured (established 32768 bind 9362)
    Initializing IPsec netlink socket
    NET: Registered protocol family 1
    NET: Registered protocol family 17
    ACPI: (supports S0 S1 S4 S5)
    ACPI wakeup devices:
    SLPB PCI0 HUB0 USB0 USB1 USB2 USB3
    Freeing unused kernel memory: 144k freed
    EXT3-fs: INFO: recovery required on readonly filesystem.
    EXT3-fs: write access will be enabled during recovery.
    kjournald starting. Commit interval 5 seconds
    EXT3-fs: hda2: orphan cleanup on readonly fs
    ext3_orphan_cleanup: deleting unreferenced inode 196124
    ext3_orphan_cleanup: deleting unreferenced inode 196118
    EXT3-fs: hda2: 2 orphan inodes deleted
    EXT3-fs: recovery complete.
    EXT3-fs: mounted filesystem with ordered data mode.
    security: 3 users, 4 roles, 354 types, 26 bools
    security: 55 classes, 21833 rules
    SELinux: Completing initialization.
    SELinux: Setting up existing superblocks.
    SELinux: initialized (dev hda2, type ext3), uses xattr
    SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
    SELinux: initialized (dev selinuxfs, type selinuxfs), uses genfs_contexts
    SELinux: initialized (dev mqueue, type mqueue), not configured for labeling
    SELinux: initialized (dev hugetlbfs, type hugetlbfs), uses genfs_contexts
    SELinux: initialized (dev devpts, type devpts), uses transition SIDs
    SELinux: initialized (dev eventpollfs, type eventpollfs), uses genfs_contexts
    SELinux: initialized (dev pipefs, type pipefs), uses task SIDs
    SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
    SELinux: initialized (dev futexfs, type futexfs), uses genfs_contexts
    SELinux: initialized (dev sockfs, type sockfs), uses task SIDs
    SELinux: initialized (dev proc, type proc), uses genfs_contexts
    SELinux: initialized (dev bdev, type bdev), uses genfs_contexts
    SELinux: initialized (dev rootfs, type rootfs), uses genfs_contexts
    SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
    SELinux: initialized (dev usbfs, type usbfs), uses genfs_contexts
    inserting floppy driver for 2.6.9-34.EL
    Floppy drive(s): fd0 is 1.44M
    FDC 0 is a post-1991 82077
    e100: Intel(R) PRO/100 Network Driver, 3.4.8-k2-NAPI
    e100: Copyright(c) 1999-2005 Intel Corporation
    ACPI: PCI interrupt 0000:02:08.0[A] -> GSI 12 (level, low) -> IRQ 12
    divert: allocating divert_blk for eth0
    e100: eth0: e100_probe: addr 0xea010000, irq 12, MAC addr 00:20:ED:4E:85:04
    tg3.c:v3.43-rh (Oct 24, 2005)
    ACPI: PCI interrupt 0000:02:02.0[A] -> GSI 11 (level, low) -> IRQ 11
    divert: allocating divert_blk for eth1
    eth1: Tigon3 [partno(AC91002A1) rev 0105 PHY(5701)] (PCI:33MHz:32-bit) 10/100/1000BaseT Ethernet 00:09:5b:1f:26:06
    eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[0]
    eth1: dma_rwctrl[76ff000f]
    ip_tables: (C) 2000-2002 Netfilter core team
    ACPI: PCI interrupt 0000:00:1f.5 -> GSI 9 (level, low) -> IRQ 9
    PCI: Setting latency timer of device 0000:00:1f.5 to 64
    e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex
    intel8x0_measure_ac97_clock: measured 50054 usecs
    intel8x0: clocking to 48000
    hw_random: RNG not detected
    Evaluate _OSC Set fails. Status = 0x0005
    pciehp: Both _OSC and OSHP methods do not exist
    ACPI: PCI interrupt 0000:00:1d.7[D] -> GSI 11 (level, low) -> IRQ 11
    ehci_hcd 0000:00:1d.7: EHCI Host Controller
    PCI: Setting latency timer of device 0000:00:1d.7 to 64
    ehci_hcd 0000:00:1d.7: irq 11, pci mem e0898000
    SELinux: initialized (dev usbdevfs, type usbdevfs), uses genfs_contexts
    ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1
    PCI: cache line size of 128 is not supported by device 0000:00:1d.7
    ehci_hcd 0000:00:1d.7: USB 2.0 enabled, EHCI 1.00, driver 2004-May-10
    hub 1-0:1.0: USB hub found
    hub 1-0:1.0: 6 ports detected
    USB Universal Host Controller Interface driver v2.2
    ACPI: PCI interrupt 0000:00:1d.0[A] -> GSI 5 (level, low) -> IRQ 5
    uhci_hcd 0000:00:1d.0: UHCI Host Controller
    PCI: Setting latency timer of device 0000:00:1d.0 to 64
    uhci_hcd 0000:00:1d.0: irq 5, io base 0000b800
    uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 2
    hub 2-0:1.0: USB hub found
    hub 2-0:1.0: 2 ports detected
    ACPI: PCI interrupt 0000:00:1d.1 -> GSI 5 (level, low) -> IRQ 5
    uhci_hcd 0000:00:1d.1: UHCI Host Controller
    PCI: Setting latency timer of device 0000:00:1d.1 to 64
    uhci_hcd 0000:00:1d.1: irq 5, io base 0000b000
    uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 3
    hub 3-0:1.0: USB hub found
    hub 3-0:1.0: 2 ports detected
    ACPI: PCI interrupt 0000:00:1d.2[C] -> GSI 12 (level, low) -> IRQ 12
    uhci_hcd 0000:00:1d.2: UHCI Host Controller
    PCI: Setting latency timer of device 0000:00:1d.2 to 64
    uhci_hcd 0000:00:1d.2: irq 12, io base 0000b400
    uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 4
    hub 4-0:1.0: USB hub found
    hub 4-0:1.0: 2 ports detected
    md: Autodetecting RAID arrays.
    md: autorun ...
    md: ... autorun DONE.
    usb 4-1: new low speed USB device using address 2
    input: USB HID v1.00 Mouse [Logitech] on usb-0000:00:1d.2-1
    SELinux: initialized (dev ramfs, type ramfs), uses genfs_contexts
    NET: Registered protocol family 10
    Disabled Privacy Extensions on device c0378f60(lo)
    IPv6 over IPv4 tunneling driver
    divert: not allocating divert_blk for non-ethernet device sit0
    ACPI: Power Button (FF) [PWRF]
    ACPI: Sleep Button (CM) [SLPB]
    eth0: no IPv6 routers present
    EXT3 FS on hda2, internal journal
    device-mapper: 4.5.0-ioctl (2005-10-04) initialised: dm-devel@redhat.com
    cdrom: open failed.
    hdd: No disk in drive
    kjournald starting. Commit interval 5 seconds
    EXT3 FS on hda1, internal journal
    EXT3-fs: mounted filesystem with ordered data mode.
    SELinux: initialized (dev hda1, type ext3), uses xattr
    SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
    Adding 1333384k swap on /dev/hda3. Priority:-1 extents:1
    SELinux: initialized (dev binfmt_misc, type binfmt_misc), uses genfs_contexts
     
  14. djalex

    djalex New Member

    /var/log/ha-log (test02)
    ==============
    heartbeat[3724]: 2006/08/30_10:40:58 ERROR: Invalid user id name [hacluster]
    heartbeat[3724]: 2006/08/30_10:40:58 ERROR: Bad uid list [hacluster]
    heartbeat[3724]: 2006/08/30_10:40:58 ERROR: Invalid apiauth directive [ipfail uid=hacluster]
    heartbeat[3724]: 2006/08/30_10:40:58 info: Syntax: apiauth client [uid=uidlist] [gid=gidlist]
    heartbeat[3724]: 2006/08/30_10:40:58 info: Where uidlist is a comma-separated list of uids,
    heartbeat[3724]: 2006/08/30_10:40:58 info: and gidlist is a comma-separated list of gids
    heartbeat[3724]: 2006/08/30_10:40:58 info: One or the other must be specified.
    heartbeat[3724]: 2006/08/30_10:40:58 ERROR: Invalid user id name [hacluster]
    heartbeat[3724]: 2006/08/30_10:40:58 ERROR: Bad uid list [hacluster]
    heartbeat[3724]: 2006/08/30_10:40:58 ERROR: Invalid apiauth directive [ccm uid=hacluster]
    heartbeat[3724]: 2006/08/30_10:40:58 info: Syntax: apiauth client [uid=uidlist] [gid=gidlist]
    heartbeat[3724]: 2006/08/30_10:40:58 info: Where uidlist is a comma-separated list of uids,
    heartbeat[3724]: 2006/08/30_10:40:58 info: and gidlist is a comma-separated list of gids
    heartbeat[3724]: 2006/08/30_10:40:58 info: One or the other must be specified.
    heartbeat[3724]: 2006/08/30_10:40:58 ERROR: Invalid group name [haclient]
    heartbeat[3724]: 2006/08/30_10:40:58 ERROR: Bad gid list [haclient]
    heartbeat[3724]: 2006/08/30_10:40:58 ERROR: Invalid apiauth directive [ping gid=haclient]
    heartbeat[3724]: 2006/08/30_10:40:58 info: Syntax: apiauth client [uid=uidlist] [gid=gidlist]
    heartbeat[3724]: 2006/08/30_10:40:58 info: Where uidlist is a comma-separated list of uids,
    heartbeat[3724]: 2006/08/30_10:40:58 info: and gidlist is a comma-separated list of gids
    heartbeat[3724]: 2006/08/30_10:40:58 info: One or the other must be specified.
    heartbeat[3724]: 2006/08/30_10:40:58 ERROR: Invalid group name [haclient]
    heartbeat[3724]: 2006/08/30_10:40:58 ERROR: Bad gid list [haclient]
    heartbeat[3724]: 2006/08/30_10:40:58 ERROR: Invalid apiauth directive [anon gid=haclient]
    heartbeat[3724]: 2006/08/30_10:40:58 info: Syntax: apiauth client [uid=uidlist] [gid=gidlist]
    heartbeat[3724]: 2006/08/30_10:40:58 info: Where uidlist is a comma-separated list of uids,
    heartbeat[3724]: 2006/08/30_10:40:58 info: and gidlist is a comma-separated list of gids
    heartbeat[3724]: 2006/08/30_10:40:58 info: One or the other must be specified.
    heartbeat[3724]: 2006/08/30_10:40:58 info: AUTH: i=3: key = 0x9ea6260, auth=0x816934, authname=md5
    heartbeat[3724]: 2006/08/30_10:40:58 WARN: Logging daemon is disabled --enabling logging daemon is recommended
    heartbeat[3724]: 2006/08/30_10:40:58 info: **************************
    heartbeat[3724]: 2006/08/30_10:40:58 info: Configuration validated. Starting heartbeat 2.0.7
    heartbeat[3725]: 2006/08/30_10:40:58 info: heartbeat: version 2.0.7
    heartbeat[3725]: 2006/08/30_10:40:58 ERROR: change_logfile_ownship: entry for user hacluster not found
    heartbeat[3725]: 2006/08/30_10:40:59 info: Heartbeat generation: 65
    heartbeat[3725]: 2006/08/30_10:40:59 info: G_main_add_TriggerHandler: Added signal manual handler
    heartbeat[3725]: 2006/08/30_10:40:59 info: G_main_add_TriggerHandler: Added signal manual handler
    heartbeat[3725]: 2006/08/30_10:40:59 info: Removing /var/run/heartbeat/rsctmp failed, recreating.
    heartbeat[3725]: 2006/08/30_10:40:59 info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
    heartbeat[3725]: 2006/08/30_10:40:59 info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
    heartbeat[3725]: 2006/08/30_10:40:59 info: glib: ping heartbeat started.
    heartbeat[3725]: 2006/08/30_10:40:59 info: glib: ping heartbeat started.
    heartbeat[3725]: 2006/08/30_10:40:59 info: G_main_add_SignalHandler: Added signal handler for signal 17
    heartbeat[3725]: 2006/08/30_10:40:59 info: Local status now set to: 'up'
    heartbeat[3725]: 2006/08/30_10:41:00 info: Status update for node test01: status ping
    heartbeat[3725]: 2006/08/30_10:41:00 info: Link test02:eth0 up.
    heartbeat[3725]: 2006/08/30_10:41:00 info: Status update for node test02: status ping
    heartbeat[3725]: 2006/08/30_10:41:08 info: Status update for node test01: status init
    heartbeat[3725]: 2006/08/30_10:41:08 info: Status update for node test01: status up
    harc[3740]: 2006/08/30_10:41:08 info: Running /etc/ha.d/rc.d/status status
    heartbeat[3725]: 2006/08/30_10:41:08 info: Exiting status process 3740 returned rc 0.
    harc[3751]: 2006/08/30_10:41:08 info: Running /etc/ha.d/rc.d/status status
    heartbeat[3725]: 2006/08/30_10:41:08 info: Exiting status process 3751 returned rc 0.
    heartbeat[3725]: 2006/08/30_10:41:29 WARN: node test01: is dead
    heartbeat[3725]: 2006/08/30_10:41:29 info: Comm_now_up(): updating status to active
    heartbeat[3725]: 2006/08/30_10:41:29 info: Local status now set to: 'active'
    heartbeat[3725]: 2006/08/30_10:41:29 WARN: No STONITH device configured.
    heartbeat[3725]: 2006/08/30_10:41:29 WARN: Shared disks are not protected.
    heartbeat[3725]: 2006/08/30_10:41:29 info: Resources being acquired from test01.
    harc[3767]: 2006/08/30_10:41:29 info: Running /etc/ha.d/rc.d/status status
    mach_down[3788]: 2006/08/30_10:41:29 info: /usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
    mach_down[3788]: 2006/08/30_10:41:29 info: mach_down takeover complete for node test01.
    heartbeat[3725]: 2006/08/30_10:41:29 info: AnnounceTakeover(local 0, foreign 1, reason 'T_RESOURCES' (0))
    heartbeat[3725]: 2006/08/30_10:41:29 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (0))
    heartbeat[3725]: 2006/08/30_10:41:29 info: Initial resource acquisition complete (T_RESOURCES(us))
    heartbeat[3725]: 2006/08/30_10:41:29 info: mach_down takeover complete.
    heartbeat[3725]: 2006/08/30_10:41:29 info: AnnounceTakeover(local 1, foreign 1, reason 'mach_down' (1))
    heartbeat[3725]: 2006/08/30_10:41:29 info: STATE 1 => 3
    heartbeat[3725]: 2006/08/30_10:41:29 info: Exiting status process 3767 returned rc 0.
    IPaddr[3805]: 2006/08/30_10:41:29 INFO: IPaddr Resource is stopped
    req_resource[3781]: 2006/08/30_10:41:29 debug: in /usr/lib/heartbeat/req_resource IPaddr::192.168.50.195/24/eth0
    req_resource[3781]: 2006/08/30_10:41:29 debug: dont_ask: yes nice_failback: yes
    heartbeat[3768]: 2006/08/30_10:41:29 info: 1 local resources from [/usr/lib/heartbeat/ResourceManager listkeys test02]
    heartbeat[3768]: 2006/08/30_10:41:29 info: Local Resource acquisition completed.
    heartbeat[3768]: 2006/08/30_10:41:29 info: FIFO message [type resource] written rc=79
    heartbeat[3725]: 2006/08/30_10:41:29 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[3725]: 2006/08/30_10:41:29 info: Exiting req_our_resources process 3768 returned rc 0.
    heartbeat[3725]: 2006/08/30_10:41:29 info: AnnounceTakeover(local 1, foreign 1, reason 'req_our_resources' (1))
    harc[3939]: 2006/08/30_10:41:29 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
    ip-request-resp[3939]: 2006/08/30_10:41:30 received ip-request-resp IPaddr::192.168.50.195/24/eth0 OK yes
    ResourceManager[3954]: 2006/08/30_10:41:30 info: Acquiring resource group: test02 IPaddr::192.168.50.195/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0 smb
    IPaddr[3978]: 2006/08/30_10:41:30 INFO: IPaddr Resource is stopped
    ResourceManager[3954]: 2006/08/30_10:41:30 info: Running /etc/ha.d/resource.d/IPaddr 192.168.50.195/24/eth0 start
    IPaddr[4176]: 2006/08/30_10:41:30 INFO: eval /sbin/ifconfig eth0:0 192.168.50.195 netmask 255.255.255.0 broadcast 192.168.50.255
    IPaddr[4176]: 2006/08/30_10:41:30 INFO: Sending Gratuitous Arp for 192.168.50.195 on eth0:0 [eth0]
    IPaddr[4176]: 2006/08/30_10:41:30 INFO: /usr/lib/heartbeat/send_arp -i 500 -r 10 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-192.168.50.195 eth0 192.168.50.195 auto 192.168.50.195 ffffffffffff
    IPaddr[4094]: 2006/08/30_10:41:30 INFO: IPaddr Success
     
  15. djalex

    djalex New Member

    /var/log/ha-log (test02) - contd 1
    ==========

    ResourceManager[3954]: 2006/08/30_10:41:30 info: Running /etc/ha.d/resource.d/drbddisk r0 start
    ResourceManager[3954]: 2006/08/30_10:41:30 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 start
    ResourceManager[3954]: 2006/08/30_10:41:30 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[3954]: 2006/08/30_10:41:30 CRIT: Giving up resources due to failure of Filesystem::/dev/drbd0
    ResourceManager[3954]: 2006/08/30_10:41:30 info: Releasing resource group: test02 IPaddr::192.168.50.195/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0 smb
    ResourceManager[3954]: 2006/08/30_10:41:30 info: Running /etc/init.d/smb stop
    ResourceManager[3954]: 2006/08/30_10:41:30 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[3954]: 2006/08/30_10:41:30 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[3954]: 2006/08/30_10:41:31 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[3954]: 2006/08/30_10:41:32 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[3954]: 2006/08/30_10:41:32 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[3954]: 2006/08/30_10:41:33 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[3954]: 2006/08/30_10:41:33 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[3954]: 2006/08/30_10:41:33 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[3954]: 2006/08/30_10:41:34 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[3954]: 2006/08/30_10:41:34 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[3954]: 2006/08/30_10:41:34 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[3954]: 2006/08/30_10:41:35 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[3954]: 2006/08/30_10:41:35 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[3954]: 2006/08/30_10:41:35 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[3954]: 2006/08/30_10:41:36 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[3954]: 2006/08/30_10:41:36 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[3954]: 2006/08/30_10:41:36 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[3954]: 2006/08/30_10:41:37 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[3954]: 2006/08/30_10:41:37 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[3954]: 2006/08/30_10:41:37 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    heartbeat[3725]: 2006/08/30_10:41:38 WARN: 1 lost packet(s) for [test01] [19:21]
    heartbeat[3725]: 2006/08/30_10:41:38 info: Status update for node test01: status active
    heartbeat[3725]: 2006/08/30_10:41:38 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[3725]: 2006/08/30_10:41:38 info: No pkts missing from test01!
    heartbeat[3725]: 2006/08/30_10:41:38 info: remote resource transition completed.
    heartbeat[3725]: 2006/08/30_10:41:38 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[3725]: 2006/08/30_10:41:38 ERROR: Both machines own our resources!
    heartbeat[3725]: 2006/08/30_10:41:38 ERROR: Both machines own foreign resources!
    heartbeat[3725]: 2006/08/30_10:41:38 info: other_holds_resources: 3
    heartbeat[3725]: 2006/08/30_10:41:38 info: remote resource transition completed.
    heartbeat[3725]: 2006/08/30_10:41:38 info: Local Resource acquisition completed. (none)
    heartbeat[3725]: 2006/08/30_10:41:38 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[3725]: 2006/08/30_10:41:38 ERROR: Both machines own our resources!
    heartbeat[3725]: 2006/08/30_10:41:38 ERROR: Both machines own foreign resources!
    heartbeat[3725]: 2006/08/30_10:41:38 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(them)' (1))
    heartbeat[3725]: 2006/08/30_10:41:38 info: STATE 3 => 4
    heartbeat[3725]: 2006/08/30_10:41:38 ERROR: Both machines own our resources!
    heartbeat[3725]: 2006/08/30_10:41:38 ERROR: Both machines own foreign resources!
    ResourceManager[3954]: 2006/08/30_10:41:38 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[3954]: 2006/08/30_10:41:38 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[3954]: 2006/08/30_10:41:38 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    heartbeat[3725]: 2006/08/30_10:41:38 info: other_holds_resources: 3
    heartbeat[3725]: 2006/08/30_10:41:38 ERROR: Both machines own our resources!
    heartbeat[3725]: 2006/08/30_10:41:38 ERROR: Both machines own foreign resources!
    heartbeat[3725]: 2006/08/30_10:41:38 info: remote resource transition completed.
    heartbeat[3725]: 2006/08/30_10:41:38 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[3725]: 2006/08/30_10:41:38 ERROR: Both machines own our resources!
    heartbeat[3725]: 2006/08/30_10:41:38 ERROR: Both machines own foreign resources!
    heartbeat[3725]: 2006/08/30_10:41:38 info: other_holds_resources: 3
    heartbeat[3725]: 2006/08/30_10:41:38 ERROR: Both machines own our resources!
    heartbeat[3725]: 2006/08/30_10:41:38 ERROR: Both machines own foreign resources!
    ResourceManager[3954]: 2006/08/30_10:41:39 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[3954]: 2006/08/30_10:41:39 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[3954]: 2006/08/30_10:41:39 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[3954]: 2006/08/30_10:41:40 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[3954]: 2006/08/30_10:41:40 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[3954]: 2006/08/30_10:41:40 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[3954]: 2006/08/30_10:41:41 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[3954]: 2006/08/30_10:41:41 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[3954]: 2006/08/30_10:41:41 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[3954]: 2006/08/30_10:41:41 ERROR: Resource script for Filesystem::/dev/drbd0 probably not LSB-compliant.
    ResourceManager[3954]: 2006/08/30_10:41:41 WARN: it (Filesystem::/dev/drbd0) MUST succeed on a stop when already stopped
    ResourceManager[3954]: 2006/08/30_10:41:41 WARN: Machine reboot narrowly avoided!
    ResourceManager[3954]: 2006/08/30_10:41:41 info: Running /etc/ha.d/resource.d/drbddisk r0 stop
    ResourceManager[3954]: 2006/08/30_10:41:41 info: Running /etc/ha.d/resource.d/IPaddr 192.168.50.195/24/eth0 stop
    IPaddr[4961]: 2006/08/30_10:41:41 INFO: /sbin/route -n del -host 192.168.50.195
    IPaddr[4961]: 2006/08/30_10:41:41 INFO: /sbin/ifconfig eth0:0 192.168.50.195 down
    IPaddr[4961]: 2006/08/30_10:41:41 INFO: IP Address 192.168.50.195 released
    IPaddr[4879]: 2006/08/30_10:41:41 INFO: IPaddr Success
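
The warnings near the end of this excerpt say that a stop action MUST succeed when the resource is already stopped. A minimal sketch of that behaviour for a mount-style resource might look like the following. It is not Heartbeat's own Filesystem script: `stop_fs` and `MOUNTS_FILE` are hypothetical names introduced only for illustration, and the check assumes the mount table lists the device in its first column.

```shell
#!/bin/sh
# Hypothetical sketch of an idempotent "stop" for a mounted device.
# Assumption: the mount table (default /proc/mounts) has one line per
# mount, with the device name as the first space-separated field.
stop_fs() {
    dev="$1"
    mounts="${MOUNTS_FILE:-/proc/mounts}"

    # Nothing mounted from this device: report success instead of failing,
    # so a stop on an already stopped resource returns 0.
    if ! grep -q "^$dev " "$mounts"; then
        echo "already stopped"
        return 0
    fi

    # Device is mounted: the exit status of umount becomes the result.
    umount "$dev"
}
```

Called as `stop_fs /dev/drbd0` on a node where the device is not mounted, the function returns 0 rather than 1, which is the result ResourceManager is asking for in the warnings above.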
     
  16. djalex

    djalex New Member

    /var/log/ha-log (test02) - contd 2
    ==========
    heartbeat[3725]: 2006/08/30_10:41:41 info: Exiting ip-request-resp process 3939 returned rc 0.
    heartbeat[3725]: 2006/08/30_10:41:41 info: AnnounceTakeover(local 1, foreign 1, reason 'ip-request-resp' (1))
    harc[4998]: 2006/08/30_10:41:41 info: Running /etc/ha.d/rc.d/status status
    heartbeat[3725]: 2006/08/30_10:41:41 info: Exiting status process 4998 returned rc 0.
    heartbeat[3725]: 2006/08/30_10:41:54 info: other_holds_resources: 3
    heartbeat[3725]: 2006/08/30_10:41:54 ERROR: Both machines own our resources!
    heartbeat[3725]: 2006/08/30_10:41:54 ERROR: Both machines own foreign resources!
    hb_standby[5017]: 2006/08/30_10:42:11 Going standby [foreign].
    heartbeat[3725]: 2006/08/30_10:42:12 info: test02 wants to go standby [foreign]
    heartbeat[3725]: 2006/08/30_10:42:12 info: i_hold_resources: 3
    heartbeat[3725]: 2006/08/30_10:42:12 info: New standby state: 1
    heartbeat[3725]: 2006/08/30_10:42:12 info: standby: test01 can take our foreign resources
    heartbeat[3725]: 2006/08/30_10:42:12 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[3725]: 2006/08/30_10:42:12 ERROR: Both machines own our resources!
    heartbeat[3725]: 2006/08/30_10:42:12 info: New standby state: 1
    heartbeat[5027]: 2006/08/30_10:42:12 info: give up foreign HA resources (standby).
    heartbeat[5027]: 2006/08/30_10:42:12 info: go_standby: who: 1 resource set: foreign
    heartbeat[5027]: 2006/08/30_10:42:12 info: go_standby: (query/action): (otherkeys/givegroup)
    heartbeat[5027]: 2006/08/30_10:42:12 info: foreign HA resource release completed (standby).
    heartbeat[5027]: 2006/08/30_10:42:12 info: FIFO message [type ask_resources] written rc=51
    heartbeat[3725]: 2006/08/30_10:42:12 info: Local standby process completed [foreign].
    heartbeat[3725]: 2006/08/30_10:42:12 info: New standby state: 3
    heartbeat[3725]: 2006/08/30_10:42:12 info: Exiting go_standby process 5027 returned rc 0.
    heartbeat[3725]: 2006/08/30_10:42:13 WARN: 1 lost packet(s) for [test01] [44:46]
    heartbeat[3725]: 2006/08/30_10:42:13 info: remote resource transition completed.
    heartbeat[3725]: 2006/08/30_10:42:13 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[3725]: 2006/08/30_10:42:13 ERROR: Both machines own our resources!
    heartbeat[3725]: 2006/08/30_10:42:13 info: other_holds_resources: 3
    heartbeat[3725]: 2006/08/30_10:42:13 ERROR: Both machines own our resources!
    heartbeat[3725]: 2006/08/30_10:42:13 info: No pkts missing from test01!
    heartbeat[3725]: 2006/08/30_10:42:13 info: Other node completed standby takeover of foreign resources.
    heartbeat[3725]: 2006/08/30_10:42:13 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[3725]: 2006/08/30_10:42:13 ERROR: Both machines own our resources!
    heartbeat[3725]: 2006/08/30_10:42:13 info: New standby state: 0
    heartbeat[3725]: 2006/08/30_10:42:13 info: other_holds_resources: 3
    heartbeat[3725]: 2006/08/30_10:42:13 ERROR: Both machines own our resources!
    heartbeat[3725]: 2006/08/30_10:42:24 info: test01 wants to go standby [foreign]
    heartbeat[3725]: 2006/08/30_10:42:24 info: standby: other_holds_resources: 3
    heartbeat[3725]: 2006/08/30_10:42:24 info: New standby state: 2
    heartbeat[3725]: 2006/08/30_10:42:24 info: New standby state: 2
    heartbeat[3725]: 2006/08/30_10:42:24 info: other_holds_resources: 1
    heartbeat[3725]: 2006/08/30_10:42:46 info: standby: acquire [foreign] resources from test01
    heartbeat[3725]: 2006/08/30_10:42:46 info: New standby state: 3
    heartbeat[5038]: 2006/08/30_10:42:46 info: acquire local HA resources (standby).
    heartbeat[5038]: 2006/08/30_10:42:46 info: go_standby: who: 2 resource set: local
    heartbeat[5038]: 2006/08/30_10:42:46 info: go_standby: (query/action): (ourkeys/takegroup)
    ResourceManager[5048]: 2006/08/30_10:42:46 info: Acquiring resource group: test02 IPaddr::192.168.50.195/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0 smb
    IPaddr[5072]: 2006/08/30_10:42:47 INFO: IPaddr Resource is stopped
    ResourceManager[5048]: 2006/08/30_10:42:47 info: Running /etc/ha.d/resource.d/IPaddr 192.168.50.195/24/eth0 start
    IPaddr[5270]: 2006/08/30_10:42:47 INFO: eval /sbin/ifconfig eth0:0 192.168.50.195 netmask 255.255.255.0 broadcast 192.168.50.255
    IPaddr[5270]: 2006/08/30_10:42:47 INFO: Sending Gratuitous Arp for 192.168.50.195 on eth0:0 [eth0]
    IPaddr[5270]: 2006/08/30_10:42:47 INFO: /usr/lib/heartbeat/send_arp -i 500 -r 10 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-192.168.50.195 eth0 192.168.50.195 auto 192.168.50.195 ffffffffffff
    IPaddr[5188]: 2006/08/30_10:42:47 INFO: IPaddr Success
    ResourceManager[5048]: 2006/08/30_10:42:47 info: Running /etc/ha.d/resource.d/drbddisk r0 start
    ResourceManager[5048]: 2006/08/30_10:42:47 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 start
    ResourceManager[5048]: 2006/08/30_10:42:47 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[5048]: 2006/08/30_10:42:47 CRIT: Giving up resources due to failure of Filesystem::/dev/drbd0
    ResourceManager[5048]: 2006/08/30_10:42:47 info: Releasing resource group: test02 IPaddr::192.168.50.195/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0 smb
    ResourceManager[5048]: 2006/08/30_10:42:47 info: Running /etc/init.d/smb stop
    ResourceManager[5048]: 2006/08/30_10:42:47 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[5048]: 2006/08/30_10:42:48 info: Retrying failed stop operation [smb]
    ResourceManager[5048]: 2006/08/30_10:42:48 info: Running /etc/init.d/smb stop
    ResourceManager[5048]: 2006/08/30_10:42:48 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[5048]: 2006/08/30_10:42:49 info: Retrying failed stop operation [smb]
    ResourceManager[5048]: 2006/08/30_10:42:49 info: Running /etc/init.d/smb stop
    ResourceManager[5048]: 2006/08/30_10:42:51 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[5048]: 2006/08/30_10:42:52 info: Retrying failed stop operation [smb]
    ResourceManager[5048]: 2006/08/30_10:42:52 info: Running /etc/init.d/smb stop
    ResourceManager[5048]: 2006/08/30_10:42:52 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[5048]: 2006/08/30_10:42:53 info: Retrying failed stop operation [smb]
    ResourceManager[5048]: 2006/08/30_10:42:53 info: Running /etc/init.d/smb stop
    ResourceManager[5048]: 2006/08/30_10:42:53 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[5048]: 2006/08/30_10:42:54 info: Retrying failed stop operation [smb]
    ResourceManager[5048]: 2006/08/30_10:42:54 info: Running /etc/init.d/smb stop
    ResourceManager[5048]: 2006/08/30_10:42:54 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[5048]: 2006/08/30_10:42:55 info: Retrying failed stop operation [smb]
    ResourceManager[5048]: 2006/08/30_10:42:55 info: Running /etc/init.d/smb stop
    ResourceManager[5048]: 2006/08/30_10:42:56 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[5048]: 2006/08/30_10:42:57 info: Retrying failed stop operation [smb]
    ResourceManager[5048]: 2006/08/30_10:42:57 info: Running /etc/init.d/smb stop
    ResourceManager[5048]: 2006/08/30_10:42:57 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[5048]: 2006/08/30_10:42:58 info: Retrying failed stop operation [smb]
     
  17. djalex

    djalex New Member

    /var/log/ha-log (test02) - contd 3
    ==============
    ResourceManager[5048]: 2006/08/30_10:42:58 info: Running /etc/init.d/smb stop
    ResourceManager[5048]: 2006/08/30_10:42:58 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[5048]: 2006/08/30_10:42:59 info: Retrying failed stop operation [smb]
    ResourceManager[5048]: 2006/08/30_10:42:59 info: Running /etc/init.d/smb stop
    ResourceManager[5048]: 2006/08/30_10:42:59 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[5048]: 2006/08/30_10:43:00 info: Retrying failed stop operation [smb]
    ResourceManager[5048]: 2006/08/30_10:43:00 info: Running /etc/init.d/smb stop
    ResourceManager[5048]: 2006/08/30_10:43:00 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[5048]: 2006/08/30_10:43:00 ERROR: Resource script for smb probably not LSB-compliant.
    ResourceManager[5048]: 2006/08/30_10:43:00 WARN: it (smb) MUST succeed on a stop when already stopped
    ResourceManager[5048]: 2006/08/30_10:43:00 WARN: Machine reboot narrowly avoided!
    ResourceManager[5048]: 2006/08/30_10:43:00 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[5048]: 2006/08/30_10:43:00 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[5048]: 2006/08/30_10:43:01 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[5048]: 2006/08/30_10:43:01 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[5048]: 2006/08/30_10:43:01 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[5048]: 2006/08/30_10:43:02 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[5048]: 2006/08/30_10:43:02 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[5048]: 2006/08/30_10:43:02 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[5048]: 2006/08/30_10:43:03 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[5048]: 2006/08/30_10:43:03 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[5048]: 2006/08/30_10:43:03 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[5048]: 2006/08/30_10:43:04 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[5048]: 2006/08/30_10:43:04 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[5048]: 2006/08/30_10:43:04 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[5048]: 2006/08/30_10:43:05 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[5048]: 2006/08/30_10:43:06 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[5048]: 2006/08/30_10:43:06 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[5048]: 2006/08/30_10:43:07 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[5048]: 2006/08/30_10:43:07 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[5048]: 2006/08/30_10:43:07 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[5048]: 2006/08/30_10:43:08 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[5048]: 2006/08/30_10:43:08 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[5048]: 2006/08/30_10:43:08 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[5048]: 2006/08/30_10:43:09 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[5048]: 2006/08/30_10:43:09 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[5048]: 2006/08/30_10:43:09 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[5048]: 2006/08/30_10:43:10 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[5048]: 2006/08/30_10:43:10 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[5048]: 2006/08/30_10:43:10 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[5048]: 2006/08/30_10:43:11 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[5048]: 2006/08/30_10:43:11 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[5048]: 2006/08/30_10:43:11 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[5048]: 2006/08/30_10:43:11 ERROR: Resource script for Filesystem::/dev/drbd0 probably not LSB-compliant.
    ResourceManager[5048]: 2006/08/30_10:43:11 WARN: it (Filesystem::/dev/drbd0) MUST succeed on a stop when already stopped
    ResourceManager[5048]: 2006/08/30_10:43:11 WARN: Machine reboot narrowly avoided!
    ResourceManager[5048]: 2006/08/30_10:43:11 info: Running /etc/ha.d/resource.d/drbddisk r0 stop
    ResourceManager[5048]: 2006/08/30_10:43:11 info: Running /etc/ha.d/resource.d/IPaddr 192.168.50.195/24/eth0 stop
    IPaddr[6559]: 2006/08/30_10:43:11 INFO: /sbin/route -n del -host 192.168.50.195
    IPaddr[6559]: 2006/08/30_10:43:11 INFO: /sbin/ifconfig eth0:0 192.168.50.195 down
    IPaddr[6559]: 2006/08/30_10:43:11 INFO: IP Address 192.168.50.195 released
    IPaddr[6477]: 2006/08/30_10:43:11 INFO: IPaddr Success
    heartbeat[5038]: 2006/08/30_10:43:11 info: local HA resource acquisition completed (standby).
    heartbeat[5038]: 2006/08/30_10:43:11 info: FIFO message [type ask_resources] written rc=51
    heartbeat[3725]: 2006/08/30_10:43:11 info: Standby resource acquisition done [foreign].
    heartbeat[3725]: 2006/08/30_10:43:11 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[3725]: 2006/08/30_10:43:11 info: New standby state: 0
    heartbeat[3725]: 2006/08/30_10:43:11 info: Exiting go_standby process 5038 returned rc 0.
    heartbeat[3725]: 2006/08/30_10:43:12 info: remote resource transition completed.
    heartbeat[3725]: 2006/08/30_10:43:12 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[3725]: 2006/08/30_10:43:12 info: other_holds_resources: 1
    heartbeat[3725]: 2006/08/30_10:43:12 info: other_holds_resources: 1
    hb_standby[6607]: 2006/08/30_10:43:41 Going standby [foreign].
    heartbeat[3725]: 2006/08/30_10:43:41 info: test02 wants to go standby [foreign]
    heartbeat[3725]: 2006/08/30_10:43:41 info: i_hold_resources: 1
    heartbeat[3725]: 2006/08/30_10:43:41 info: New standby state: 1
    heartbeat[3725]: 2006/08/30_10:43:42 info: standby: test01 can take our foreign resources
    heartbeat[3725]: 2006/08/30_10:43:42 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[3725]: 2006/08/30_10:43:42 info: New standby state: 1
    heartbeat[6618]: 2006/08/30_10:43:42 info: give up foreign HA resources (standby).
    heartbeat[6618]: 2006/08/30_10:43:42 info: go_standby: who: 1 resource set: foreign
    heartbeat[6618]: 2006/08/30_10:43:42 info: go_standby: (query/action): (otherkeys/givegroup)
    heartbeat[6618]: 2006/08/30_10:43:42 info: foreign HA resource release completed (standby).
    heartbeat[6618]: 2006/08/30_10:43:42 info: FIFO message [type ask_resources] written rc=51
    heartbeat[3725]: 2006/08/30_10:43:42 info: Local standby process completed [foreign].
    heartbeat[3725]: 2006/08/30_10:43:42 info: New standby state: 3
    heartbeat[3725]: 2006/08/30_10:43:42 info: Exiting go_standby process 6618 returned rc 0.
    heartbeat[3725]: 2006/08/30_10:43:42 WARN: 1 lost packet(s) for [test01] [98:100]
    heartbeat[3725]: 2006/08/30_10:43:42 info: remote resource transition completed.
    heartbeat[3725]: 2006/08/30_10:43:42 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[3725]: 2006/08/30_10:43:42 info: other_holds_resources: 1
    heartbeat[3725]: 2006/08/30_10:43:42 info: No pkts missing from test01!
    heartbeat[3725]: 2006/08/30_10:43:42 info: Other node completed standby takeover of foreign resources.
    heartbeat[3725]: 2006/08/30_10:43:42 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[3725]: 2006/08/30_10:43:42 info: New standby state: 0
    heartbeat[3725]: 2006/08/30_10:43:43 info: other_holds_resources: 1
    heartbeat[3725]: 2006/08/30_10:44:59 info: hb_giveup_resources(): current status: active
    heartbeat[3725]: 2006/08/30_10:44:59 info: Heartbeat shutdown in progress. (3725)
    heartbeat[6642]: 2006/08/30_10:44:59 info: Giving up all HA resources.
     
  18. djalex

    djalex New Member

    /var/log/ha-log (test02) - contd 4
    ==============
    ResourceManager[6652]: 2006/08/30_10:44:59 info: Releasing resource group: test02 IPaddr::192.168.50.195/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0 smb
    ResourceManager[6652]: 2006/08/30_10:44:59 info: Running /etc/init.d/smb stop
    ResourceManager[6652]: 2006/08/30_10:44:59 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[6652]: 2006/08/30_10:45:00 info: Retrying failed stop operation [smb]
    ResourceManager[6652]: 2006/08/30_10:45:00 info: Running /etc/init.d/smb stop
    ResourceManager[6652]: 2006/08/30_10:45:00 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[6652]: 2006/08/30_10:45:01 info: Retrying failed stop operation [smb]
    ResourceManager[6652]: 2006/08/30_10:45:01 info: Running /etc/init.d/smb stop
    ResourceManager[6652]: 2006/08/30_10:45:01 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[6652]: 2006/08/30_10:45:02 info: Retrying failed stop operation [smb]
    ResourceManager[6652]: 2006/08/30_10:45:02 info: Running /etc/init.d/smb stop
    ResourceManager[6652]: 2006/08/30_10:45:02 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[6652]: 2006/08/30_10:45:03 info: Retrying failed stop operation [smb]
    ResourceManager[6652]: 2006/08/30_10:45:03 info: Running /etc/init.d/smb stop
    ResourceManager[6652]: 2006/08/30_10:45:03 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[6652]: 2006/08/30_10:45:04 info: Retrying failed stop operation [smb]
    ResourceManager[6652]: 2006/08/30_10:45:04 info: Running /etc/init.d/smb stop
    ResourceManager[6652]: 2006/08/30_10:45:05 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[6652]: 2006/08/30_10:45:06 info: Retrying failed stop operation [smb]
    ResourceManager[6652]: 2006/08/30_10:45:06 info: Running /etc/init.d/smb stop
    ResourceManager[6652]: 2006/08/30_10:45:06 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[6652]: 2006/08/30_10:45:07 info: Retrying failed stop operation [smb]
    ResourceManager[6652]: 2006/08/30_10:45:07 info: Running /etc/init.d/smb stop
    ResourceManager[6652]: 2006/08/30_10:45:07 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[6652]: 2006/08/30_10:45:08 info: Retrying failed stop operation [smb]
    ResourceManager[6652]: 2006/08/30_10:45:08 info: Running /etc/init.d/smb stop
    ResourceManager[6652]: 2006/08/30_10:45:08 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[6652]: 2006/08/30_10:45:09 info: Retrying failed stop operation [smb]
    ResourceManager[6652]: 2006/08/30_10:45:09 info: Running /etc/init.d/smb stop
    ResourceManager[6652]: 2006/08/30_10:45:09 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[6652]: 2006/08/30_10:45:10 info: Retrying failed stop operation [smb]
    ResourceManager[6652]: 2006/08/30_10:45:10 info: Running /etc/init.d/smb stop
    ResourceManager[6652]: 2006/08/30_10:45:10 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[6652]: 2006/08/30_10:45:10 ERROR: Resource script for smb probably not LSB-compliant.
    ResourceManager[6652]: 2006/08/30_10:45:10 WARN: it (smb) MUST succeed on a stop when already stopped
    ResourceManager[6652]: 2006/08/30_10:45:10 WARN: Machine reboot narrowly avoided!
    ResourceManager[6652]: 2006/08/30_10:45:10 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[6652]: 2006/08/30_10:45:10 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[6652]: 2006/08/30_10:45:11 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[6652]: 2006/08/30_10:45:11 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[6652]: 2006/08/30_10:45:11 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[6652]: 2006/08/30_10:45:12 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[6652]: 2006/08/30_10:45:12 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[6652]: 2006/08/30_10:45:12 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[6652]: 2006/08/30_10:45:13 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[6652]: 2006/08/30_10:45:13 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[6652]: 2006/08/30_10:45:13 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[6652]: 2006/08/30_10:45:14 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[6652]: 2006/08/30_10:45:14 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[6652]: 2006/08/30_10:45:14 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[6652]: 2006/08/30_10:45:15 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[6652]: 2006/08/30_10:45:15 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[6652]: 2006/08/30_10:45:15 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[6652]: 2006/08/30_10:45:16 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[6652]: 2006/08/30_10:45:16 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[6652]: 2006/08/30_10:45:16 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[6652]: 2006/08/30_10:45:18 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[6652]: 2006/08/30_10:45:18 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[6652]: 2006/08/30_10:45:18 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[6652]: 2006/08/30_10:45:19 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[6652]: 2006/08/30_10:45:19 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[6652]: 2006/08/30_10:45:19 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[6652]: 2006/08/30_10:45:20 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[6652]: 2006/08/30_10:45:20 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[6652]: 2006/08/30_10:45:20 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[6652]: 2006/08/30_10:45:21 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[6652]: 2006/08/30_10:45:21 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[6652]: 2006/08/30_10:45:21 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[6652]: 2006/08/30_10:45:21 ERROR: Resource script for Filesystem::/dev/drbd0 probably not LSB-compliant.
    ResourceManager[6652]: 2006/08/30_10:45:21 WARN: it (Filesystem::/dev/drbd0) MUST succeed on a stop when already stopped
    ResourceManager[6652]: 2006/08/30_10:45:21 WARN: Machine reboot narrowly avoided!
    ResourceManager[6652]: 2006/08/30_10:45:21 info: Running /etc/ha.d/resource.d/drbddisk r0 stop
    ResourceManager[6652]: 2006/08/30_10:45:21 info: Running /etc/ha.d/resource.d/IPaddr 192.168.50.195/24/eth0 stop
    IPaddr[7687]: 2006/08/30_10:45:21 INFO: IPaddr Success
    heartbeat[6642]: 2006/08/30_10:45:21 info: All HA resources relinquished.
    heartbeat[6642]: 2006/08/30_10:45:21 info: FIFO message [type shutdone] written rc=27
    heartbeat[3725]: 2006/08/30_10:45:22 info: other_holds_resources: 0
    heartbeat[3725]: 2006/08/30_10:45:23 info: killing HBFIFO process 3732 with signal 15
    heartbeat[3725]: 2006/08/30_10:45:23 info: killing HBWRITE process 3733 with signal 15
    heartbeat[3725]: 2006/08/30_10:45:23 info: killing HBREAD process 3734 with signal 15
    heartbeat[3725]: 2006/08/30_10:45:23 info: killing HBWRITE process 3735 with signal 15
    heartbeat[3725]: 2006/08/30_10:45:23 info: killing HBREAD process 3736 with signal 15
    heartbeat[3725]: 2006/08/30_10:45:23 info: killing HBWRITE process 3737 with signal 15
    heartbeat[3725]: 2006/08/30_10:45:23 info: killing HBREAD process 3738 with signal 15
    heartbeat[3725]: 2006/08/30_10:45:23 info: Core process 3732 exited. 7 remaining
    heartbeat[3725]: 2006/08/30_10:45:23 info: Core process 3733 exited. 6 remaining
    heartbeat[3725]: 2006/08/30_10:45:23 info: Core process 3734 exited. 5 remaining
    heartbeat[3725]: 2006/08/30_10:45:23 info: Core process 3735 exited. 4 remaining
    heartbeat[3725]: 2006/08/30_10:45:23 info: Core process 3736 exited. 3 remaining
    heartbeat[3725]: 2006/08/30_10:45:23 info: Core process 3737 exited. 2 remaining
    heartbeat[3725]: 2006/08/30_10:45:23 info: Core process 3738 exited. 1 remaining
    heartbeat[3725]: 2006/08/30_10:45:23 info: test02 Heartbeat shutdown complete.
    heartbeat[7933]: 2006/08/30_10:45:54 ERROR: Invalid user id name [hacluster]
    heartbeat[7933]: 2006/08/30_10:45:54 ERROR: Bad uid list [hacluster]
    heartbeat[7933]: 2006/08/30_10:45:54 ERROR: Invalid apiauth directive [ipfail uid=hacluster]
    heartbeat[7933]: 2006/08/30_10:45:54 info: Syntax: apiauth client [uid=uidlist] [gid=gidlist]
    heartbeat[7933]: 2006/08/30_10:45:54 info: Where uidlist is a comma-separated list of uids,
    heartbeat[7933]: 2006/08/30_10:45:54 info: and gidlist is a comma-separated list of gids
    heartbeat[7933]: 2006/08/30_10:45:54 info: One or the other must be specified.
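
The apiauth errors above show that heartbeat could not resolve the hacluster user on test02; later lines of this log report the same for the haclient group. A minimal Python sketch for checking whether those accounts exist on a node, assuming a Unix system where the standard pwd and grp modules are available (the function name missing_accounts is made up for illustration and is not part of heartbeat):

```python
import grp
import pwd


def missing_accounts(users=("hacluster",), groups=("haclient",)):
    """Return (kind, name) pairs for accounts the system cannot resolve."""
    missing = []
    for name in users:
        try:
            pwd.getpwnam(name)  # raises KeyError when the user is unknown
        except KeyError:
            missing.append(("user", name))
    for name in groups:
        try:
            grp.getgrnam(name)  # raises KeyError when the group is unknown
        except KeyError:
            missing.append(("group", name))
    return missing


if __name__ == "__main__":
    for kind, name in missing_accounts():
        print(f"missing {kind}: {name}")
```

Running it on both test01 and test02 should print nothing when the accounts are present; any output names an account that would trigger the "Invalid user id name" or "Invalid group name" errors.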
     
  19. djalex

    djalex New Member

    /var/log/ha-log (test02) - contd 5
    ==============

    heartbeat[7933]: 2006/08/30_10:45:54 ERROR: Invalid user id name [hacluster]
    heartbeat[7933]: 2006/08/30_10:45:54 ERROR: Bad uid list [hacluster]
    heartbeat[7933]: 2006/08/30_10:45:54 ERROR: Invalid apiauth directive [ccm uid=hacluster]
    heartbeat[7933]: 2006/08/30_10:45:54 info: Syntax: apiauth client [uid=uidlist] [gid=gidlist]
    heartbeat[7933]: 2006/08/30_10:45:54 info: Where uidlist is a comma-separated list of uids,
    heartbeat[7933]: 2006/08/30_10:45:54 info: and gidlist is a comma-separated list of gids
    heartbeat[7933]: 2006/08/30_10:45:54 info: One or the other must be specified.
    heartbeat[7933]: 2006/08/30_10:45:54 ERROR: Invalid group name [haclient]
    heartbeat[7933]: 2006/08/30_10:45:54 ERROR: Bad gid list [haclient]
    heartbeat[7933]: 2006/08/30_10:45:54 ERROR: Invalid apiauth directive [ping gid=haclient]
    heartbeat[7933]: 2006/08/30_10:45:54 info: Syntax: apiauth client [uid=uidlist] [gid=gidlist]
    heartbeat[7933]: 2006/08/30_10:45:54 info: Where uidlist is a comma-separated list of uids,
    heartbeat[7933]: 2006/08/30_10:45:54 info: and gidlist is a comma-separated list of gids
    heartbeat[7933]: 2006/08/30_10:45:54 info: One or the other must be specified.
    heartbeat[7933]: 2006/08/30_10:45:54 ERROR: Invalid group name [haclient]
    heartbeat[7933]: 2006/08/30_10:45:54 ERROR: Bad gid list [haclient]
    heartbeat[7933]: 2006/08/30_10:45:54 ERROR: Invalid apiauth directive [anon gid=haclient]
    heartbeat[7933]: 2006/08/30_10:45:54 info: Syntax: apiauth client [uid=uidlist] [gid=gidlist]
    heartbeat[7933]: 2006/08/30_10:45:54 info: Where uidlist is a comma-separated list of uids,
    heartbeat[7933]: 2006/08/30_10:45:54 info: and gidlist is a comma-separated list of gids
    heartbeat[7933]: 2006/08/30_10:45:54 info: One or the other must be specified.
    heartbeat[7933]: 2006/08/30_10:45:54 info: AUTH: i=3: key = 0x9c79260, auth=0x2bb934, authname=md5
    heartbeat[7933]: 2006/08/30_10:45:54 WARN: Logging daemon is disabled --enabling logging daemon is recommended
    heartbeat[7933]: 2006/08/30_10:45:54 info: **************************
    heartbeat[7933]: 2006/08/30_10:45:54 info: Configuration validated. Starting heartbeat 2.0.7
    heartbeat[7934]: 2006/08/30_10:45:54 info: heartbeat: version 2.0.7
    heartbeat[7934]: 2006/08/30_10:45:54 ERROR: change_logfile_ownship: entry for user hacluster not found
    heartbeat[7934]: 2006/08/30_10:45:54 info: Heartbeat generation: 66
    heartbeat[7934]: 2006/08/30_10:45:54 info: G_main_add_TriggerHandler: Added signal manual handler
    heartbeat[7934]: 2006/08/30_10:45:54 info: G_main_add_TriggerHandler: Added signal manual handler
    heartbeat[7934]: 2006/08/30_10:45:54 info: Removing /var/run/heartbeat/rsctmp failed, recreating.
    heartbeat[7934]: 2006/08/30_10:45:54 info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
    heartbeat[7934]: 2006/08/30_10:45:54 info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
    heartbeat[7934]: 2006/08/30_10:45:54 info: glib: ping heartbeat started.
    heartbeat[7934]: 2006/08/30_10:45:54 info: glib: ping heartbeat started.
    heartbeat[7934]: 2006/08/30_10:45:54 info: G_main_add_SignalHandler: Added signal handler for signal 17
    heartbeat[7934]: 2006/08/30_10:45:54 info: Local status now set to: 'up'
    heartbeat[7934]: 2006/08/30_10:45:55 info: Link test02:eth0 up.
    heartbeat[7934]: 2006/08/30_10:45:55 info: Status update for node test01: status ping
    heartbeat[7934]: 2006/08/30_10:45:56 info: Status update for node test02: status ping
    heartbeat[7934]: 2006/08/30_10:46:00 info: Status update for node test01: status init
    heartbeat[7934]: 2006/08/30_10:46:00 info: Status update for node test01: status up
    harc[7949]: 2006/08/30_10:46:00 info: Running /etc/ha.d/rc.d/status status
    heartbeat[7934]: 2006/08/30_10:46:00 info: Exiting status process 7949 returned rc 0.
    harc[7960]: 2006/08/30_10:46:00 info: Running /etc/ha.d/rc.d/status status
    heartbeat[7934]: 2006/08/30_10:46:00 info: Exiting status process 7960 returned rc 0.
    heartbeat[7934]: 2006/08/30_10:46:25 WARN: node test01: is dead
    heartbeat[7934]: 2006/08/30_10:46:25 info: Comm_now_up(): updating status to active
    heartbeat[7934]: 2006/08/30_10:46:25 info: Local status now set to: 'active'
    heartbeat[7934]: 2006/08/30_10:46:25 WARN: No STONITH device configured.
    heartbeat[7934]: 2006/08/30_10:46:25 WARN: Shared disks are not protected.
    heartbeat[7934]: 2006/08/30_10:46:25 info: Resources being acquired from test01.
    harc[7977]: 2006/08/30_10:46:25 info: Running /etc/ha.d/rc.d/status status
    mach_down[8018]: 2006/08/30_10:46:25 info: /usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
    mach_down[8018]: 2006/08/30_10:46:25 info: mach_down takeover complete for node test01.
    heartbeat[7934]: 2006/08/30_10:46:25 info: AnnounceTakeover(local 0, foreign 1, reason 'T_RESOURCES' (0))
    heartbeat[7934]: 2006/08/30_10:46:25 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (0))
    heartbeat[7934]: 2006/08/30_10:46:25 info: Initial resource acquisition complete (T_RESOURCES(us))
    heartbeat[7934]: 2006/08/30_10:46:25 info: mach_down takeover complete.
    heartbeat[7934]: 2006/08/30_10:46:25 info: AnnounceTakeover(local 1, foreign 1, reason 'mach_down' (1))
    heartbeat[7934]: 2006/08/30_10:46:25 info: STATE 1 => 3
    heartbeat[7934]: 2006/08/30_10:46:25 info: Exiting status process 7977 returned rc 0.
    IPaddr[8012]: 2006/08/30_10:46:25 INFO: IPaddr Resource is stopped
    req_resource[7995]: 2006/08/30_10:46:25 debug: in /usr/lib/heartbeat/req_resource IPaddr::192.168.50.195/24/eth0
    req_resource[7995]: 2006/08/30_10:46:25 debug: dont_ask: yes nice_failback: yes
    heartbeat[7978]: 2006/08/30_10:46:25 info: 1 local resources from [/usr/lib/heartbeat/ResourceManager listkeys test02]
    heartbeat[7978]: 2006/08/30_10:46:25 info: Local Resource acquisition completed.
    heartbeat[7978]: 2006/08/30_10:46:25 info: FIFO message [type resource] written rc=79
    heartbeat[7934]: 2006/08/30_10:46:25 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[7934]: 2006/08/30_10:46:25 info: Exiting req_our_resources process 7978 returned rc 0.
    heartbeat[7934]: 2006/08/30_10:46:25 info: AnnounceTakeover(local 1, foreign 1, reason 'req_our_resources' (1))
    harc[8149]: 2006/08/30_10:46:25 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
    ip-request-resp[8149]: 2006/08/30_10:46:25 received ip-request-resp IPaddr::192.168.50.195/24/eth0 OK yes
    ResourceManager[8164]: 2006/08/30_10:46:25 info: Acquiring resource group: test02 IPaddr::192.168.50.195/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0 smb
    IPaddr[8188]: 2006/08/30_10:46:25 INFO: IPaddr Resource is stopped
    ResourceManager[8164]: 2006/08/30_10:46:25 info: Running /etc/ha.d/resource.d/IPaddr 192.168.50.195/24/eth0 start
    IPaddr[8386]: 2006/08/30_10:46:25 INFO: eval /sbin/ifconfig eth0:0 192.168.50.195 netmask 255.255.255.0 broadcast 192.168.50.255
    IPaddr[8386]: 2006/08/30_10:46:25 INFO: Sending Gratuitous Arp for 192.168.50.195 on eth0:0 [eth0]
    IPaddr[8386]: 2006/08/30_10:46:25 INFO: /usr/lib/heartbeat/send_arp -i 500 -r 10 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-192.168.50.195 eth0 192.168.50.195 auto 192.168.50.195 ffffffffffff
    IPaddr[8304]: 2006/08/30_10:46:25 INFO: IPaddr Success
    ResourceManager[8164]: 2006/08/30_10:46:26 info: Running /etc/ha.d/resource.d/drbddisk r0 start
    ResourceManager[8164]: 2006/08/30_10:46:26 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 start
    ResourceManager[8164]: 2006/08/30_10:46:26 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[8164]: 2006/08/30_10:46:26 CRIT: Giving up resources due to failure of Filesystem::/dev/drbd0
    ResourceManager[8164]: 2006/08/30_10:46:26 info: Releasing resource group: test02 IPaddr::192.168.50.195/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0 smb
    ResourceManager[8164]: 2006/08/30_10:46:26 info: Running /etc/init.d/smb stop
    ResourceManager[8164]: 2006/08/30_10:46:26 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[8164]: 2006/08/30_10:46:27 info: Retrying failed stop operation [smb]
    ResourceManager[8164]: 2006/08/30_10:46:27 info: Running /etc/init.d/smb stop
    ResourceManager[8164]: 2006/08/30_10:46:27 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[8164]: 2006/08/30_10:46:28 info: Retrying failed stop operation [smb]
    ResourceManager[8164]: 2006/08/30_10:46:28 info: Running /etc/init.d/smb stop
    ResourceManager[8164]: 2006/08/30_10:46:28 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[8164]: 2006/08/30_10:46:29 info: Retrying failed stop operation [smb]
    ResourceManager[8164]: 2006/08/30_10:46:29 info: Running /etc/init.d/smb stop
    ResourceManager[8164]: 2006/08/30_10:46:29 ERROR: Return code 1 from /etc/init.d/smb
     
  20. djalex

    djalex New Member

    /var/log/ha-log (test02) - contd 6
    ==============
    heartbeat[7934]: 2006/08/30_10:46:30 WARN: 1 lost packet(s) for [test01] [19:21]
    heartbeat[7934]: 2006/08/30_10:46:30 info: Status update for node test01: status active
    heartbeat[7934]: 2006/08/30_10:46:30 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[7934]: 2006/08/30_10:46:30 info: No pkts missing from test01!
    heartbeat[7934]: 2006/08/30_10:46:30 info: remote resource transition completed.
    heartbeat[7934]: 2006/08/30_10:46:30 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[7934]: 2006/08/30_10:46:30 ERROR: Both machines own our resources!
    heartbeat[7934]: 2006/08/30_10:46:30 ERROR: Both machines own foreign resources!
    heartbeat[7934]: 2006/08/30_10:46:30 info: other_holds_resources: 3
    heartbeat[7934]: 2006/08/30_10:46:30 info: remote resource transition completed.
    heartbeat[7934]: 2006/08/30_10:46:30 info: Local Resource acquisition completed. (none)
    heartbeat[7934]: 2006/08/30_10:46:30 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[7934]: 2006/08/30_10:46:30 ERROR: Both machines own our resources!
    heartbeat[7934]: 2006/08/30_10:46:30 ERROR: Both machines own foreign resources!
    heartbeat[7934]: 2006/08/30_10:46:30 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(them)' (1))
    heartbeat[7934]: 2006/08/30_10:46:30 info: STATE 3 => 4
    heartbeat[7934]: 2006/08/30_10:46:30 ERROR: Both machines own our resources!
    heartbeat[7934]: 2006/08/30_10:46:30 ERROR: Both machines own foreign resources!
    ResourceManager[8164]: 2006/08/30_10:46:30 info: Retrying failed stop operation [smb]
    ResourceManager[8164]: 2006/08/30_10:46:30 info: Running /etc/init.d/smb stop
    ResourceManager[8164]: 2006/08/30_10:46:30 ERROR: Return code 1 from /etc/init.d/smb
    heartbeat[7934]: 2006/08/30_10:46:30 info: other_holds_resources: 3
    heartbeat[7934]: 2006/08/30_10:46:30 ERROR: Both machines own our resources!
    heartbeat[7934]: 2006/08/30_10:46:30 ERROR: Both machines own foreign resources!
    heartbeat[7934]: 2006/08/30_10:46:30 info: remote resource transition completed.
    heartbeat[7934]: 2006/08/30_10:46:30 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
    heartbeat[7934]: 2006/08/30_10:46:30 ERROR: Both machines own our resources!
    heartbeat[7934]: 2006/08/30_10:46:30 ERROR: Both machines own foreign resources!
    heartbeat[7934]: 2006/08/30_10:46:30 info: other_holds_resources: 3
    heartbeat[7934]: 2006/08/30_10:46:30 ERROR: Both machines own our resources!
    heartbeat[7934]: 2006/08/30_10:46:30 ERROR: Both machines own foreign resources!
    ResourceManager[8164]: 2006/08/30_10:46:31 info: Retrying failed stop operation [smb]
    ResourceManager[8164]: 2006/08/30_10:46:31 info: Running /etc/init.d/smb stop
    ResourceManager[8164]: 2006/08/30_10:46:31 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[8164]: 2006/08/30_10:46:32 info: Retrying failed stop operation [smb]
    ResourceManager[8164]: 2006/08/30_10:46:32 info: Running /etc/init.d/smb stop
    ResourceManager[8164]: 2006/08/30_10:46:32 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[8164]: 2006/08/30_10:46:33 info: Retrying failed stop operation [smb]
    ResourceManager[8164]: 2006/08/30_10:46:33 info: Running /etc/init.d/smb stop
    ResourceManager[8164]: 2006/08/30_10:46:33 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[8164]: 2006/08/30_10:46:34 info: Retrying failed stop operation [smb]
    ResourceManager[8164]: 2006/08/30_10:46:34 info: Running /etc/init.d/smb stop
    ResourceManager[8164]: 2006/08/30_10:46:35 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[8164]: 2006/08/30_10:46:36 info: Retrying failed stop operation [smb]
    ResourceManager[8164]: 2006/08/30_10:46:36 info: Running /etc/init.d/smb stop
    ResourceManager[8164]: 2006/08/30_10:46:36 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[8164]: 2006/08/30_10:46:37 info: Retrying failed stop operation [smb]
    ResourceManager[8164]: 2006/08/30_10:46:37 info: Running /etc/init.d/smb stop
    ResourceManager[8164]: 2006/08/30_10:46:37 ERROR: Return code 1 from /etc/init.d/smb
    ResourceManager[8164]: 2006/08/30_10:46:37 ERROR: Resource script for smb probably not LSB-compliant.
    ResourceManager[8164]: 2006/08/30_10:46:37 WARN: it (smb) MUST succeed on a stop when already stopped
    ResourceManager[8164]: 2006/08/30_10:46:37 WARN: Machine reboot narrowly avoided!
    ResourceManager[8164]: 2006/08/30_10:46:37 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[8164]: 2006/08/30_10:46:37 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[8164]: 2006/08/30_10:46:38 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[8164]: 2006/08/30_10:46:38 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[8164]: 2006/08/30_10:46:38 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[8164]: 2006/08/30_10:46:39 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[8164]: 2006/08/30_10:46:39 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[8164]: 2006/08/30_10:46:39 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[8164]: 2006/08/30_10:46:40 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[8164]: 2006/08/30_10:46:40 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[8164]: 2006/08/30_10:46:40 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[8164]: 2006/08/30_10:46:41 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[8164]: 2006/08/30_10:46:41 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[8164]: 2006/08/30_10:46:41 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[8164]: 2006/08/30_10:46:42 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[8164]: 2006/08/30_10:46:42 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[8164]: 2006/08/30_10:46:42 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[8164]: 2006/08/30_10:46:43 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[8164]: 2006/08/30_10:46:43 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[8164]: 2006/08/30_10:46:43 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[8164]: 2006/08/30_10:46:44 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[8164]: 2006/08/30_10:46:44 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[8164]: 2006/08/30_10:46:44 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[8164]: 2006/08/30_10:46:45 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[8164]: 2006/08/30_10:46:45 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[8164]: 2006/08/30_10:46:45 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[8164]: 2006/08/30_10:46:46 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[8164]: 2006/08/30_10:46:46 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[8164]: 2006/08/30_10:46:46 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[8164]: 2006/08/30_10:46:47 info: Retrying failed stop operation [Filesystem::/dev/drbd0]
    ResourceManager[8164]: 2006/08/30_10:46:47 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 stop
    ResourceManager[8164]: 2006/08/30_10:46:47 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
    ResourceManager[8164]: 2006/08/30_10:46:47 ERROR: Resource script for Filesystem::/dev/drbd0 probably not LSB-compliant.
    ResourceManager[8164]: 2006/08/30_10:46:47 WARN: it (Filesystem::/dev/drbd0) MUST succeed on a stop when already stopped
    ResourceManager[8164]: 2006/08/30_10:46:47 WARN: Machine reboot narrowly avoided!
    ResourceManager[8164]: 2006/08/30_10:46:47 info: Running /etc/ha.d/resource.d/drbddisk r0 stop
    ResourceManager[8164]: 2006/08/30_10:46:48 info: Running /etc/ha.d/resource.d/IPaddr 192.168.50.195/24/eth0 stop
    IPaddr[9672]: 2006/08/30_10:46:48 INFO: /sbin/route -n del -host 192.168.50.195
    IPaddr[9672]: 2006/08/30_10:46:48 INFO: /sbin/ifconfig eth0:0 192.168.50.195 down
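
Because the retry loops bury the first failure, it may help to collapse the log into counts per distinct error. The following is a rough Python sketch, assuming every entry follows the "process[pid]: date_time LEVEL: message" layout seen above; the names LINE_RE and summarize are invented for this example, and lines that do not match the layout are simply skipped.

```python
import re
from collections import Counter

# Matches entries such as:
# ResourceManager[8164]: 2006/08/30_10:46:26 ERROR: Return code 1 from ...
LINE_RE = re.compile(
    r"^\s*(?P<proc>[\w-]+)\[(?P<pid>\d+)\]:\s+"
    r"(?P<ts>\d{4}/\d{2}/\d{2}_\d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Za-z]+):\s+(?P<msg>.*)$"
)


def summarize(lines, levels=("ERROR", "CRIT")):
    """Count distinct (process, message) pairs at the given log levels."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("level").upper() in levels:
            counts[(match.group("proc"), match.group("msg").strip())] += 1
    return counts


# A few entries copied from the log above.
sample = """\
ResourceManager[8164]: 2006/08/30_10:46:26 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 start
ResourceManager[8164]: 2006/08/30_10:46:26 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
ResourceManager[8164]: 2006/08/30_10:46:26 CRIT: Giving up resources due to failure of Filesystem::/dev/drbd0
ResourceManager[8164]: 2006/08/30_10:46:27 ERROR: Return code 1 from /etc/init.d/smb
ResourceManager[8164]: 2006/08/30_10:46:28 ERROR: Return code 1 from /etc/init.d/smb
""".splitlines()

if __name__ == "__main__":
    for (proc, msg), n in summarize(sample).most_common():
        print(f"{n:4d}  {proc}: {msg}")
```

To run it against the real file, pass an open file object instead of the sample, for example summarize(open("/var/log/ha-log")). In this log, the Filesystem start failure at 10:46:26 comes before the repeated smb and Filesystem stop failures.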
     
