Server randomly stops responding

Discussion in 'Server Operation' started by Jayock, Oct 2, 2007.

  1. Jayock

    Jayock New Member

    I am running an FC7 server set up with the Perfect Server guide and ISPConfig. The server randomly stops responding to all requests except telnet, which always goes through. This happens fairly frequently with Dovecot, Apache, and SSH. Restarting the services typically brings them back to life, but not always. Then they go down again, and I repeat the restarts while I look for a permanent fix. Nothing in the system logs points to a cause. Any ideas where to start looking?

    I'm wondering if it might be a PAM issue, or the PHP scripts on my website (which worked perfectly on a Debian server before migrating here).

    Thanks,
    Justin
     
  2. falko

    falko Super Moderator Howtoforge Staff

    Just a shot in the dark: is SELinux enabled?
     
  3. Jayock

    Jayock New Member

    Nope, it's off. The ISPConfig firewall is the only firewall, too. I've tried turning that on and off, with no difference.

    Had a large outage this afternoon, about an hour long. Restarting services, and even the entire server, did nothing. Pings were OK, but nothing else worked: SSH, FTP, HTTP, BIND, you name it.

    Very perplexing.
     
  4. Jayock

    Jayock New Member

    Oddly enough, I've noticed that on the local network, ping times go from 1 ms to 27-47 ms when the server isn't responding, but only on the primary IP. It never slows down on the secondary IP (on eth0:1).

    Munin doesn't show excessive load on anything that should be causing this (and the server has 2x quad-core Xeon, 8 GB RAM, Gig-E). So is it possible that a service is causing the eth0 address to respond slowly, but not eth0:1? Or is it a network issue? Outbound traffic from the server always works; only inbound fails. I've already replaced the switch, since I had a second one lying around, just to narrow it down.

    Any ideas with this new info?
     
  5. falko

    falko Super Moderator Howtoforge Staff

    Can you disable the firewall for testing purposes? Does it then happen again?
     
  6. Jayock

    Jayock New Member

    I've already tried. It does not seem to be the firewall.
     
  7. Jayock

    Jayock New Member

    More interesting developments. After pinging the eth0:1 address, the eth0 address seems to respond normally. So when it isn't working, I ping the eth0:1 address, and everything comes back to life. Very odd. Any ideas?
     
  8. falko

    falko Super Moderator Howtoforge Staff

    What's in /etc/sysconfig/network-scripts/ifcfg-eth0, /etc/sysconfig/network-scripts/ifcfg-eth0:0, /etc/sysconfig/network-scripts/ifcfg-eth0:1, etc.?
     
  9. Jayock

    Jayock New Member

    ifcfg-eth0:
    # Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet
    DEVICE=eth0
    BOOTPROTO=none
    BROADCAST=XX.XX.XX.239
    HWADDR=00:1c:23:ba:5d:1f
    IPADDR=XX.XX.XX.236
    NETMASK=255.255.255.248
    NETWORK=XX.XX.XX.232
    ONBOOT=yes
    GATEWAY=XX.XX.XX.233
    TYPE=Ethernet
    USERCTL=no
    IPV6INIT=no
    PEERDNS=yes

    No ifcfg-eth0:0 exists

    ifcfg-eth0:1
    # Please read /usr/share/doc/initscripts-*/sysconfig.txt
    # for the documentation of these parameters.
    GATEWAY=XX.XX.XX.233
    TYPE=Ethernet
    DEVICE=eth0:0
    BOOTPROTO=none
    NETMASK=255.255.255.248
    IPADDR=XX.XX.XX.237
    USERCTL=no
    IPV6INIT=no
    PEERDNS=yes
    ONPARENT=yes

    It was doing the same thing when it was just eth0. I added the eth0:1 to help diagnose. Didn't really end up helping.
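
One detail in the configuration above: the file named ifcfg-eth0:1 declares DEVICE=eth0:0. The thread never establishes whether this matters, and the problem reportedly predates the alias, but a mismatch like this is easy to check for. A minimal sketch, assuming the Fedora layout under /etc/sysconfig/network-scripts; the function name check_ifcfg_device is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): report ifcfg-* files whose
# DEVICE= value differs from the interface name in the filename.
check_ifcfg_device() {
    # Directory to scan; defaults to the Fedora network-scripts location.
    dir="${1:-/etc/sysconfig/network-scripts}"
    for f in "$dir"/ifcfg-*; do
        [ -f "$f" ] || continue
        # Interface name implied by the filename, e.g. "eth0:1".
        expected="${f##*/ifcfg-}"
        # Last DEVICE= assignment in the file, with optional quotes removed.
        actual=$(sed -n 's/^[[:space:]]*DEVICE=//p' "$f" | tail -n 1 | tr -d "\"'")
        if [ "$actual" != "$expected" ]; then
            echo "MISMATCH: ${f##*/} declares DEVICE=$actual"
        fi
    done
}
```

Calling `check_ifcfg_device` with no argument scans the default directory. Backup copies such as a hypothetical ifcfg-eth0.bak would also be reported, so the output needs a quick read rather than blind trust.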
     
  10. falko

    falko Super Moderator Howtoforge Staff

    I thought you might have reused the HWADDR of eth0 in the eth0:1 configuration, but that's not the case.

    Hm... Maybe it's a problem with the network card driver? What's the output of
    Code:
    lspci
    ?
     
  11. Jayock

    Jayock New Member

    I actually got it stable by switching over to eth1 and eth1:1, which are on the second built-in Gig-E port of the server (it has two from the factory). Same NIC, same driver, same chipset, same bus, so I have no idea why eth1 would work but eth0 wouldn't. Possibly a MAC address conflict with another device on the network. Anyway:

    [oconnell@server1 ~]$ /sbin/lspci
    00:00.0 Host bridge: Intel Corporation 5000X Chipset Memory Controller Hub (rev 12)
    00:02.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x4 Port 2 (rev 12)
    00:03.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x4 Port 3 (rev 12)
    00:04.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x4 Port 4 (rev 12)
    00:05.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x4 Port 5 (rev 12)
    00:06.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x8 Port 6-7 (rev 12)
    00:07.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x4 Port 7 (rev 12)
    00:10.0 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers (rev 12)
    00:10.1 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers (rev 12)
    00:10.2 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers (rev 12)
    00:11.0 Host bridge: Intel Corporation 5000 Series Chipset Reserved Registers (rev 12)
    00:13.0 Host bridge: Intel Corporation 5000 Series Chipset Reserved Registers (rev 12)
    00:15.0 Host bridge: Intel Corporation 5000 Series Chipset FBD Registers (rev 12)
    00:16.0 Host bridge: Intel Corporation 5000 Series Chipset FBD Registers (rev 12)
    00:1c.0 PCI bridge: Intel Corporation 631xESB/632xESB/3100 Chipset PCI Express Root Port 1 (rev 09)
    00:1d.0 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #1 (rev 09)
    00:1d.1 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #2 (rev 09)
    00:1d.2 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #3 (rev 09)
    00:1d.3 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #4 (rev 09)
    00:1d.7 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset EHCI USB2 Controller (rev 09)
    00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d9)
    00:1f.0 ISA bridge: Intel Corporation 631xESB/632xESB/3100 Chipset LPC Interface Controller (rev 09)
    00:1f.1 IDE interface: Intel Corporation 631xESB/632xESB IDE Controller (rev 09)
    01:00.0 PCI bridge: Intel Corporation 80333 Segment-A PCI Express-to-PCI Express Bridge
    01:00.2 PCI bridge: Intel Corporation 80333 Segment-B PCI Express-to-PCI Express Bridge
    02:0e.0 RAID bus controller: Dell PowerEdge Expandable RAID controller 5i
    04:00.0 PCI bridge: Broadcom EPB PCI-Express to PCI-X Bridge (rev c3)
    05:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
    06:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Upstream Port (rev 01)
    06:00.3 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express to PCI-X Bridge (rev 01)
    07:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Downstream Port E1 (rev 01)
    07:01.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Downstream Port E2 (rev 01)
    08:00.0 PCI bridge: Broadcom EPB PCI-Express to PCI-X Bridge (rev c3)
    09:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
    10:0d.0 VGA compatible controller: ATI Technologies Inc ES1000 (rev 02)
    [oconnell@server1 ~]$
     
  12. Jayock

    Jayock New Member

    Still stumped why one of the eth interfaces works and the other does not. Any ideas?
     
  13. falko

    falko Super Moderator Howtoforge Staff

    No, unfortunately not... :eek:
     
