CPU problems under Fedora 12 - Constantine

Discussion in 'Technical' started by K_meleonu, Apr 24, 2010.

  1. K_meleonu

    K_meleonu Member

    Hi all,
    I ran into a problem on a server running Fedora 12.
    Two days ago my server crashed. It just froze and I was not able to run any command on it. The only solution was a hard reboot (using the button). After that I saw that it uses only one of its 2 CPU cores.

    The CPU is an Intel(R) Pentium(R) 4 CPU 3.06GHz
    vendor_id : GenuineIntel
    cpu family : 15
    model : 4
    stepping : 9
    cpu MHz : 3066.590
    cache size : 1024 KB

    If I run mpstat -P ALL I get this:
    [root@CrazyDesign ~]# mpstat -P ALL
    Linux 2.6.32.11-99.fc12.i686 (CrazyDesign ) 04/24/2010 _i686_ (1 CPU)

    01:06:00 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
    01:06:00 AM all 66.23 0.00 4.84 5.01 1.45 2.50 0.00 0.00 19.97
    01:06:00 AM 0 66.23 0.00 4.84 5.01 1.45 2.50 0.00 0.00 19.97

    Until the reboot it showed (2 CPU), with rows for all, 0 and 1 and separate values for cores 0 and 1.
    I thought it might be a kernel problem, so I ran an update (yum update kernel, kernel-headers, etc.).
    I also restarted the server after this update, and nothing changed.
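
    For a quick check that does not depend on mpstat, the logical CPUs the running kernel exposes can be counted from /proc/cpuinfo. A minimal sketch, assuming a POSIX shell; count_cpus is a hypothetical helper name, not a standard command:

```shell
# Count the logical CPUs listed in a /proc/cpuinfo-style file.
# count_cpus is a hypothetical helper, not part of any package.
count_cpus() {
    # Every logical CPU contributes one line that starts with "processor".
    # With no argument, the function reads /proc/cpuinfo.
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Example: count_cpus
```

    On a machine in the state described here this would be expected to print 1, and 2 once both CPUs are visible again.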

    My uname -a shows:
    Linux CrazyDesign 2.6.32.11-99.fc12.i686 #1 SMP Mon Apr 5 16:32:08 EDT 2010 i686 i686 i386 GNU/Linux

    Hardware Information
    Processors
    Intel(R) Pentium(R) 4 CPU 3.06GHz
    CPU Speed: 3.07 GHz
    Cache Size: 1024.00 KiB
    System Bogomips: 6133
    Load Averages: 100%
    PCI Devices
    (5x) Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge
    Host bridge: VIA Technologies, Inc. PT890 Host Bridge
    PCI bridge: VIA Technologies, Inc. VT8237/VX700 PCI Bridge
    (2x) Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+
    IDE interface: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller
    IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE
    (4x) USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller
    USB Controller: VIA Technologies, Inc. USB 2.0
    ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [KT600/K8T800/K8T890 South]
    Multimedia audio controller: VIA Technologies, Inc. VT8233/A/8235/8237 AC97 Audio Controller
    VGA compatible controller: VIA Technologies, Inc. CN700/P4M800 Pro/P4M800 CE/VN800 [S3 UniChrome Pro]
    IDE Devices
    none
    SCSI Devices
    ATA WDC WD800JB-00JJ (Direct-Access)
    _NEC CD-RW NR-9500B (CD-ROM)
    USB Devices
    (4x) Linux Foundation 1.1 root hub
    Linux Foundation 2.0 root hub


    On another server with the same Intel CPU and an older kernel version, everything is OK.
    Its details are:
    [root@smotocica ~]# mpstat -P ALL
    Linux 2.6.31.12-174.2.3.fc12.i686 (smotocica.crazydesign) 04/24/2010 _i686_ (2 CPU)

    01:08:23 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
    01:08:23 AM all 41.04 0.68 7.56 0.59 0.07 0.82 0.00 0.00 49.26
    01:08:23 AM 0 40.66 0.82 2.00 1.01 0.01 0.08 0.00 0.00 55.43
    01:08:23 AM 1 41.41 0.54 13.07 0.17 0.13 1.56 0.00 0.00 43.13

    [root@smotocica ~]# uname -a
    Linux smotocica.crazydesign 2.6.31.12-174.2.3.fc12.i686 #1 SMP Mon Jan 18 20:22:46 UTC 2010 i686 i686 i386 GNU/Linux


    Does anyone have any ideas? I am guessing this is a kernel problem. I think that if this were a CPU problem, the server wouldn't start at all after the restart.
    Please post any links to similar problems or any other helpful information you have.

    Thank you all in advance.

    Best regards
     
    Last edited: Apr 24, 2010
  2. K_meleonu

    K_meleonu Member

    Hi to all again,
    I was searching a little through dmesg and found something about my CPU.

    x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
    initial memory mapped : 0 - 01000000
    ---------------------------------------------
    Using APIC driver default
    SFI: Simple Firmware Interface v0.7 http://simplefirmware.org
    Intel MultiProcessor Specification v1.4
    Virtual Wire compatibility mode.
    MPTABLE: OEM ID: OEM00000
    MPTABLE: Product ID: PROD00000000
    MPTABLE: APIC at: 0xFEE00000
    Processor #0 (Bootup-CPU)
    I/O APIC #4 Version 17 at 0xFEC00000.
    Processors: 1
    SMP: Allowing 1 CPUs, 0 hotplug CPUs
    nr_irqs_gsi: 24
    PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
    PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000
    PM: Registered nosave memory: 00000000000f0000 - 0000000000100000
    Allocating PCI resources starting at 3c000000 (gap: 3c000000:c2c00000)
    Booting paravirtualized kernel on bare hardware
    NR_CPUS:32 nr_cpumask_bits:32 nr_cpu_ids:1 nr_node_ids:1
    PERCPU: Embedded 14 pages/cpu @c2000000 s34488 r0 d22856 u4194304
    pcpu-alloc: s34488 r0 d22856 u4194304 alloc=1*4194304
    pcpu-alloc: [0] 0
    ---------------------------------------------
    Enabling fast FPU save and restore... done.
    Enabling unmasked SIMD FPU exception support... done.
    Initializing CPU#0
    allocated 4914880 bytes of page_cgroup
    please try 'cgroup_disable=memory' option if you don't want memory cgroups
    Initializing HighMem for node 0 (000373fe:0003bff0)
    Memory: 949720k/982976k available (3677k kernel code, 32564k reserved, 2312k data, 548k init, 77768k highmem)
    virtual kernel memory layout:
    fixmap : 0xffad5000 - 0xfffff000 (5288 kB)
    pkmap : 0xff400000 - 0xff800000 (4096 kB)
    vmalloc : 0xf7bfe000 - 0xff3fe000 ( 120 MB)
    lowmem : 0xc0000000 - 0xf73fe000 ( 883 MB)
    .init : 0xc09da000 - 0xc0a63000 ( 548 kB)
    .data : 0xc079740c - 0xc09d9710 (2312 kB)
    .text : 0xc0400000 - 0xc079740c (3677 kB)
    Checking if this processor honours the WP bit even in supervisor mode...Ok.
    SLUB: Genslabs=13, HWalign=128, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
    Hierarchical RCU implementation.
    --------------------------------------------
    Fast TSC calibration using PIT
    Detected 3066.522 MHz processor.
    Calibrating delay loop (skipped), value calculated using timer frequency.. 6133.04 BogoMIPS (lpj=3066522)
    Security Framework initialized
    SELinux: Initializing.
    SELinux: Starting in permissive mode
    Mount-cache hash table entries: 512
    Initializing cgroup subsys ns
    Initializing cgroup subsys cpuacct
    Initializing cgroup subsys memory
    Initializing cgroup subsys devices
    Initializing cgroup subsys freezer
    Initializing cgroup subsys net_cls
    CPU: Trace cache: 12K uops, L1 D cache: 16K
    CPU: L2 cache: 1024K
    CPU: Unsupported number of siblings 2
    mce: CPU supports 4 MCE banks
    ----------------------------------------------
    Enabling APIC mode: Flat. Using 1 I/O APICs
    SMP disabled
    Brought up 1 CPUs
    Total of 1 processors activated (6133.04 BogoMIPS).
    sizeof(vma)=84 bytes
    sizeof(page)=32 bytes
    sizeof(inode)=352 bytes
    sizeof(dentry)=132 bytes
    sizeof(ext3inode)=508 bytes
    sizeof(buffer_head)=56 bytes
    sizeof(skbuff)=184 bytes
    sizeof(task_struct)=3256 bytes
    CPU0 attaching NULL sched-domain.
    devtmpfs: initialized
    regulator: core version 0.5
    Time: 21:55:25 Date: 04/23/10

    I hope this helps.
    I have also attached the full dmesg output as a .txt file.
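
    The lines that matter here can be pulled out of a long dmesg log with a filter. A minimal sketch, assuming a POSIX shell; smp_lines is a hypothetical helper name, and the patterns are simply taken from the messages quoted in this post:

```shell
# Keep only the boot messages related to CPU/SMP/APIC detection.
# smp_lines is a hypothetical helper; it reads a log on standard input.
smp_lines() {
    grep -E 'SMP|APIC|Processors:|siblings|Brought up|processors activated'
}

# Example: dmesg | smp_lines
```

    In the excerpt above this would keep, among others, "SMP disabled", "CPU: Unsupported number of siblings 2" and "Brought up 1 CPUs".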
     

  3. K_meleonu

    K_meleonu Member

    Hi all,
    It's me again.
    I see no one has an answer for this problem.
    I ran some tests on the server and tried the older kernel versions.
    In my /boot/grub/menu.lst I had:

    Code:
    default=0
    timeout=0
    splashimage=(hd0,0)/grub/splash.xpm.gz
    hiddenmenu
    
    I have modified that to:
    Code:
    default=0
    timeout=10
    splashimage=(hd0,0)/grub/splash.xpm.gz
    #hiddenmenu
    
    And now I can choose which kernel to boot from a list.
    I have tested all 3 versions I have (2.6.32.11-99.fc12.i686, 2.6.32.9-70.fc12.i686 and 2.6.31.12-174.2.19.fc12.i686).

    After booting each one I ran mpstat -P ALL, and with all 3 kernels the result showed only one CPU instead of two.

    I still don't want to think that this could be a hardware problem but... you never know.

    So I am asking you all: if anyone has had or has this problem, please leave a message here. Maybe we can find an answer.

    Thank you all in advance
     
  4. K_meleonu

    K_meleonu Member

    Hello again to all,

    I have run some tests on the server: I booted it from a Live CD and checked the CPUs.
    With the Live CD I can see and use both CPUs. After I booted from the HDD the problem appeared again, and the system "sees" and uses only one CPU.

    So this is clearly a software problem.

    Please tell me which programs / .conf files are responsible for CPU detection.

    (P.S. I had this problem some years ago with Windows XP on a dual-core PC. Windows was seeing only one CPU, and I solved the problem by replacing it with another version.)

    Thank you all in advance
     
  5. enteridfe

    enteridfe New Member

    best cpu

    Processors: 1
    SMP: Allowing 1 CPUs, 0 hotplug CPUs
    nr_irqs_gsi: 24
    PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
    PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000
    PM: Registered nosave memory: 00000000000f0000 - 0000000000100000
    Allocating PCI resources starting at 3c000000 (gap: 3c000000:c2c00000)
    Booting paravirtualized kernel on bare hardware
    NR_CPUS:32 nr_cpumask_bits:32 nr_cpu_ids:1 nr_node_ids:1
    PERCPU: Embedded 14 pages/cpu @c2000000 s34488 r0 d22856 u4194304
    pcpu-alloc: s34488 r0 d22856 u4194304 alloc=1*4194304
    pcpu-alloc: [0] 0
     
  6. K_meleonu

    K_meleonu Member

    I have seen that, but how can I fix it?

    I did an ls -a /sys/devices/system/cpu and I get this:
    . .. cpu0 cpuidle kernel_max offline online perf_counters possible present sched_smt_power_savings
    It should show cpu0 and cpu1.

    And I have THIS:
    cat offline
    1-31
    cat online
    0
    cat possible
    0
    cat present
    0

    INSTEAD of:
    cat offline
    2-31
    cat online
    0-1
    cat possible
    0-1
    cat present
    0-1
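
    These sysfs files hold CPU lists such as "0", "0-1" or "2-31". A minimal sketch, assuming a POSIX shell, that turns such a list into a CPU count; cpulist_count is a hypothetical helper name, and it assumes the list contains only comma-separated numbers and ranges:

```shell
# Count the CPUs named in a kernel CPU list such as "0-1" or "0-1,4,6-7".
# cpulist_count is a hypothetical helper, not a standard command.
cpulist_count() {
    list=$1
    count=0
    old_ifs=$IFS
    IFS=,                      # split the list on commas
    for item in $list; do
        case $item in
            # A range "A-B" contributes B - A + 1 CPUs.
            *-*) count=$((count + ${item#*-} - ${item%-*} + 1)) ;;
            # A single number contributes one CPU.
            ?*)  count=$((count + 1)) ;;
        esac
    done
    IFS=$old_ifs
    echo "$count"
}

# Example: cpulist_count "$(cat /sys/devices/system/cpu/online)"
```

    For the values shown above, an online list of "0" counts as one CPU, while the expected "0-1" counts as two.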
     
    Last edited: Apr 28, 2010
  7. K_meleonu

    K_meleonu Member

    I have solved the problem.

    I edited /boot/grub/menu.lst and REMOVED noapic nolapic acpi=off from the kernel boot options line.
    I restarted the machine, and now it "sees" and uses both CPUs.
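
    To see whether a running system was booted with any of these three options, the kernel command line can be inspected. A minimal sketch, assuming a POSIX shell; find_apic_flags is a hypothetical helper name, and it only looks for the three options named in this post, since the full kernel line is not shown in the thread:

```shell
# Print each of noapic, nolapic and acpi=off that appears as a separate
# word in the given kernel command line.
# find_apic_flags is a hypothetical helper, not a standard command.
find_apic_flags() {
    for word in $1; do         # split the command line on whitespace
        case $word in
            noapic|nolapic|acpi=off) echo "$word" ;;
        esac
    done
}

# Example: find_apic_flags "$(cat /proc/cmdline)"
```

    No output means none of the three options is present. A likely explanation for the symptom, offered as an assumption rather than something confirmed in the thread, is that with ACPI off the kernel fell back to the legacy MP table, which listed only one processor ("Processors: 1" in the dmesg output).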
     
