Server sometimes (1 or 2 hrs) down :/

Discussion in 'Server Operation' started by edge, Apr 22, 2006.

  1. edge

    edge Active Member Moderator

    Hmm, I do not really want to leave the firewall off, and I cannot really test it, as this error sometimes does not happen for days or weeks!

    I've been googling 'sk98lin', but did not find any good info. What does it do?
     
  2. falko

    falko Super Moderator Howtoforge Staff

  3. dusti

    dusti New Member

    I have these strange issues with a Dell gigabit NIC as well...
    The NIC works fine most of the time, and then suddenly it all goes wrong.

    I ran some pings and arpings while capturing the output with tcpdump -v -v.
    That was interesting:
    When I pinged another server or the gateway, it would work for about 5 pings, then fail for about 50 pings, then work again for about 5 pings, and so on...
    It seemed that for some obscure reason the NIC forgot the MAC address of the machine it was talking to, so it started broadcasting, got its answer, pinged 5 times and then forgot it again?
    When I used arping, it just kept going...
    To me it seems like a driver issue, especially since the same configuration works flawlessly with a 3Com gigabit PCI NIC. I don't know if it's related to PCI-X or to the NIC itself...
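
    One way to check the "forgotten MAC" theory while the pings are running is to watch the kernel's ARP table. The following is only a minimal sketch: it assumes a Linux host that exposes /proc/net/arp, and the function names and the IP addresses used with it are made up for illustration.

```python
from pathlib import Path

ATF_COM = 0x02  # kernel flag: the ARP entry is complete (MAC address is known)


def parse_arp_table(text):
    """Return {ip: (mac, is_complete)} from /proc/net/arp-style text."""
    entries = {}
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore blank or malformed lines
        ip, _hw_type, flags, mac = fields[:4]
        entries[ip] = (mac, bool(int(flags, 16) & ATF_COM))
    return entries


def gateway_mac_known(gateway_ip, arp_path="/proc/net/arp"):
    """True if the kernel currently holds a complete ARP entry for gateway_ip."""
    entries = parse_arp_table(Path(arp_path).read_text())
    return entries.get(gateway_ip, ("", False))[1]
```

    Calling gateway_mac_known() once a second with the gateway's address and logging the result with a timestamp next to the ping output should show whether the entry really disappears whenever the pings start failing.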



    divert: allocating divert_blk for eth0
    eth0: Tigon3 [partno(BCM95721) rev 4101 PHY(5750)] (PCI Express) 10/100/1000BaseT Ethernet 00:15:c5:5f:03:96
    eth0: RXcsums[1] LinkChgREG[1] MIirq[1] ASF[1] Split[0] WireSpeed[1] TSOcap[1]
    eth0: dma_rwctrl[76180000]

    divert: allocating divert_blk for eth1
    eth1: 3Com Gigabit NIC (3C2000)

    uname -a:
    Linux elmo.domain.xx 2.6.9-34.ELsmp #1 SMP Wed Mar 8 00:27:03 CST 2006 i686 i686 i386 GNU/Linux

    I have no clue what or how I could test further to find out what the problem really is (and how I could solve it...)
     
  4. edge

    edge Active Member Moderator

    dusti,

    Are you using any virtual IPs / NICs? If so, remove all of them (use only the main NIC with one IP) and see if you still have the problem. If not, add one virtual IP / NIC and see if all is still okay, and so on. I think that in my case one of the virtual IPs was causing the problem.

    Another thing I did (as a backup) was to write a script that pings the gateway every 5 minutes. When it did not get a reply, it did a network restart. After the network restart the network was working fine again.
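
    A watchdog like that could look roughly like the sketch below. This is not the original script: the gateway address and the restart command are assumptions (a Red Hat-style "service network restart"), so adjust both for your own system.

```python
import subprocess

GATEWAY = "192.168.0.1"  # assumption: replace with your gateway's IP
RESTART_CMD = ["service", "network", "restart"]  # assumption: Red Hat-style init script


def gateway_reachable(host, run=subprocess.run):
    """Send 3 pings; True if ping exits with status 0 (a reply came back)."""
    result = run(
        ["ping", "-c", "3", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def check_once(host=GATEWAY, restart_cmd=RESTART_CMD, run=subprocess.run):
    """Ping the gateway and restart networking if there is no reply.

    Returns True if a restart was triggered, False otherwise.
    """
    if gateway_reachable(host, run=run):
        return False
    run(restart_cmd)
    return True
```

    Running check_once() from root's crontab every 5 minutes gives the same interval as above. Keep in mind that this only works around the symptom; it does not fix the cause.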

    By the way, I'm not using Fedora anymore (it's Debian now), and all is working fine with my Dell PowerEdge and its 16 IPs.
     
    Last edited: Jul 31, 2006
  5. dusti

    dusti New Member

    I am not using virtual IPs; at the moment I only need one.
    Later this week I'll try the driver RPM I got with the machine and see what that gives. I know Debian is sometimes better and sometimes Fedora is better;
    anyway, I just have to run Fedora/Red Hat-based here.
    I could also try adding a virtual NIC and see if that one does a better job.
     
  6. edge

    edge Active Member Moderator

    dusti,

    Please do not get me wrong about Fedora vs. Debian. I have another server running Fedora, and both are great OSes!

    The reason I installed Debian on that server was only to try another OS!
     
