System can't boot!

Discussion in 'ISPConfig 3 Priority Support' started by craig baker, Dec 9, 2020.

  1. craig baker

    craig baker Member HowtoForge Supporter

    I started having odd mail problems, so I rebooted the CentOS 8 server, and now I can do nothing!
    I see the CentOS 8 boot choice in the menu as usual, but then

    I see errors in the boot log:
    timed out waiting for device dev-mapper-cl\x2dswap.device
    then
    dependency failed for resume from h.. using device /dev/mapper/cl-swap
    then warning /dev/cl/root does not exist and /dev/cl/swap does not exist
    and I'm in emergency mode, at the dracut:/# prompt!
    Any idea wtf is going on?
    Help!

    ADDITIONAL INFORMATION: my CentOS 8 boot menu gave me:
    CentOS Linux (4.18.0-240.1.1.el8_3.x86_64) 8 <--- (default choice crashes)
    CentOS Linux (4.18.0-193.19.1.el8_2.x86_64) 8 (core) <--- booted OK from what I can tell

    Having picked the second one, what do I need to check for? Am I actually functional? And do I need to fix the top choice? I DID do a dnf update recently; I think that's what installed the newer kernel!

    So, cautiously, I seem to be functional!
    cdb.
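
For anyone in the same spot, here is a rough sketch of what could be checked once the older entry has booted. It assumes grubby is installed (it normally is on CentOS 8) and that kernel images live under /boot/vmlinuz-VERSION. The function names show_kernels and pin_default_kernel are made up for illustration; only uname -r, grubby --default-kernel and grubby --set-default are real commands.

```shell
# Sketch only; run as root on the affected server.

# Print the kernel that is running now and the one GRUB boots by default.
show_kernels() {
    echo "running: $(uname -r)"
    echo "default: $(grubby --default-kernel)"
}

# Make an already-installed kernel the default boot entry.
# $1 = kernel version string, e.g. 4.18.0-193.19.1.el8_2.x86_64
pin_default_kernel() {
    image="/boot/vmlinuz-$1"
    if [ ! -e "$image" ]; then
        echo "no such kernel image: $image" >&2
        return 1
    fi
    grubby --set-default="$image"
}
```

Calling show_kernels first and then pin_default_kernel 4.18.0-193.19.1.el8_2.x86_64 should make an unattended reboot come back up on the kernel that is known to work, without removing the newer one.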
     
    Last edited: Dec 9, 2020
  2. Th0m

    Th0m ISPConfig Developer Staff Member ISPConfig Developer

  3. craig baker

    craig baker Member HowtoForge Supporter

    Yes, twice. The top menu choice can't boot; the second choice (core, with the older kernel) seems to boot properly!
     
  4. craig baker

    craig baker Member HowtoForge Supporter

    From /var/log/messages:
    Dec 7 18:58:59 ns10 systemd[1]: /usr/lib/systemd/system/systemd-resolved.service:33: Unknown lvalue 'ProtectSystems' in section 'Service'
    Dec 7 18:59:35 ns10 systemd[1]: /usr/lib/systemd/system/systemd-resolved.service:33: Unknown lvalue 'ProtectSystems' in section 'Service'
    Dec 7 18:59:36 ns10 systemd[1]: /usr/lib/systemd/system/systemd-resolved.service:33: Unknown lvalue 'ProtectSystems' in section 'Service'
    Dec 7 19:00:04 ns10 systemd-coredump[4070816]: Process 4070814 (doveadm) of user 0 dumped core.#012#012Stack trace of thread 4070814:#012#0 0x00007f824f9377ff raise (libc.so.6)#012#1 0x00007f824f921c35 abort (libc.so.6)#012#2 0x00007f825058de53 fatal_handler_real.cold.16 (libdovecot.so.0)#012#3 0x00007f8250630997 default_fatal_handler (libdovecot.so.0)#012#4 0x00007f825058db05 i_panic (libdovecot.so.0)#012#5 0x00007f825058b6a3 auth_master_unset_io.cold.18 (libdovecot.so.0)#012#6 0x00007f825061cad9 auth_master_run_cmd_post (libdovecot.so.0)#012#7 0x00007f825061e67d auth_master_user_list_deinit (libdovecot.so.0)#012#8 0x00007f825095a9c2 mail_storage_service_all_next (libdovecot-storage.so.0)#012#9 0x0000562aea2f4692 doveadm_mail_cmd_exec (doveadm)#012#10 0x0000562aea2f5322 doveadm_cmd_ver2_to_mail_cmd_wrapper (doveadm)#012#11 0x0000562aea305f45 doveadm_cmd_run_ver2 (doveadm)#012#12 0x0000562aea305f9b doveadm_cmd_try_run_ver2 (doveadm)#012#13 0x0000562aea2e4625 main (doveadm)#012#14 0x00007f824f9237b3 __libc_start_main (libc.so.6)#012#15 0x0000562aea2e4abe _start (doveadm)
    Dec 7 19:00:20 ns10 systemd[1]: /usr/lib/systemd/system/systemd-resolved.service:33: Unknown lvalue 'ProtectSystems' in section 'Service'
    Dec 7 19:00:38 ns10 systemd[1]: /usr/lib/systemd/system/systemd-resolved.service:33: Unknown lvalue 'ProtectSystems' in section 'Service'
    Dec 7 19:01:21 ns10 named[4077446]: /var/named/pri.mtdiablolandscaping.com:15: TTL set to prior TTL (3600)
    Dec 7 19:01:21 ns10 named[4077446]: /var/named/pri.mtdiablolandscaping.com:16: TTL set to prior TTL (3600)
    Dec 7 19:01:23 ns10 smartd[4077561]: Device: /dev/bus/0 [megaraid_disk_00] [SAT], no ATA CHECK POWER STATUS support, ignoring -n Directive
    Dec 7 19:01:24 ns10 smartd[4077561]: Device: /dev/bus/0 [megaraid_disk_01] [SAT], no ATA CHECK POWER STATUS support, ignoring -n Directive
    Dec 7 19:01:24 ns10 smartd[4077561]: Device: /dev/bus/0 [megaraid_disk_02] [SAT], no ATA CHECK POWER STATUS support, ignoring -n Directive
    Dec 7 19:01:24 ns10 smartd[4077561]: Device: /dev/bus/0 [megaraid_disk_03] [SAT], no ATA CHECK POWER STATUS support, ignoring -n Directive
    Dec 7 19:01:24 ns10 smartd[4077561]: Device: /dev/bus/0 [megaraid_disk_04] [SAT], no ATA CHECK POWER STATUS support, ignoring -n Directive
    Dec 7 19:01:24 ns10 smartd[4077561]: Device: /dev/bus/0 [megaraid_disk_05] [SAT], no ATA CHECK POWER STATUS support, ignoring -n Directive
    Dec 7 19:01:34 ns10 systemd[1]: /usr/lib/systemd/system/systemd-resolved.service:33: Unknown lvalue 'ProtectSystems' in section 'Service'
    Dec 7 19:02:03 ns10 systemd[1]: kdump.service: Failed with result 'exit-code'.
    Dec 7 19:02:03 ns10 systemd[1]: Failed to start Crash recovery kernel arming.
    Dec 7 19:02:12 ns10 systemd[1]: /usr/lib/systemd/system/systemd-resolved.service:33: Unknown lvalue 'ProtectSystems' in section 'Service'
    Dec 7 19:04:10 ns10 journal: After installation of a new version of microcode_ctl package,
    Dec 7 19:04:10 ns10 journal: initramfs hasn't been re-generated for all the installed kernel packages.
    Dec 7 19:04:10 ns10 journal: The following kernel packages have been skipped: kernel-core-4.18.0-193.el8.x86_64.
    Dec 7 19:04:10 ns10 journal: Please re-generate initramfs manually for these kernel packages with the
    Dec 7 19:04:10 ns10 journal: "dracut -f --kver KERNEL_VERSION" command in order to get the latest
    Dec 7 19:04:10 ns10 journal: Intel CPU microcode included into early initramfs image for it, if needed.
    Dec 7 19:04:10 ns10 systemd[1]: cgroup compatibility translation between legacy and unified hierarchy settings activated. See cgroup-compat debug messages for details.
    Dec 7 19:04:12 ns10 systemd[1]: /usr/lib/systemd/system/systemd-resolved.service:33: Unknown lvalue 'ProtectSystems' in section 'Service'

    Dec 7 19:00 is when I ran the remote sync;reboot command. The kernel package that can boot is 4.18.0-193.19.1.el8_2.x86_64 (core);
    the one that does not boot (the default) is 4.18.0-240.1.1.el8_3.x86_64.
    It wants an initramfs rebuild? So dracut -f --kver 4.18.0-240.1.1.el8_3.x86_64 would be the command?
    And what's the difference between these 2 kernels?
    And (MOST IMPORTANT) can I possibly screw anything ELSE up doing this??? I'm on a ragged edge, ready to put up disturbing thread titles!
    :)
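
On the command itself: the journal message quoted above spells the option with two dashes, dracut -f --kver KERNEL_VERSION, and it names kernel-core-4.18.0-193.el8.x86_64 as the skipped package. A cautious way to run it might look like the sketch below. It assumes the default image location /boot/initramfs-VERSION.img, and the function name rebuild_initramfs is hypothetical. Since dracut with --kver writes only the image for the version it is given, the initramfs of the kernel that still boots should not be touched.

```shell
# Sketch only; run as root.
# Rebuild the initramfs for one kernel version, keeping a copy of the old image.
# $1 = kernel version string, e.g. 4.18.0-240.1.1.el8_3.x86_64
rebuild_initramfs() {
    kver="$1"
    img="/boot/initramfs-${kver}.img"
    # Keep the previous image so a bad rebuild can be undone by copying it back.
    if [ -e "$img" ]; then
        cp -p "$img" "${img}.bak" || return 1
    fi
    # -f overwrites the existing image; --kver selects the kernel version.
    dracut -f --kver "$kver"
}
```

For example, rebuild_initramfs 4.18.0-240.1.1.el8_3.x86_64 would rebuild only the image for the kernel that fails to boot.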
     
  5. craig baker

    craig baker Member HowtoForge Supporter

    ENLIGHTENMENT! CentOS 8 and LSI RAID support for Dell servers with PERC controllers!!!

    It turns out the problem is that CentOS 8 ELIMINATED support for a bunch of still-used LSI SAS controllers, including the PERC 6, H700, H800, etc.!
    And when I booted that shiny new kernel, presto: NO HARD DRIVES DETECTED LOL
    I had installed CentOS 8.2 and determined at the time that I had to add the driver manually (via a USB stick), and thought we were all good! Yes, the install worked fine.
    BUT there is a BUG in the dracut package: when you do a kernel upgrade, the slipstreamed (sorry, old XP reference) driver is NOT included, so the newly installed kernel lacks the driver! Hence no boot. And alas, the only fixes involve
    1) using an ELRepo kernel in place of the CentOS kernel (they kept the drivers!)
    or
    2) manually fixing dracut to eliminate the bug, or waiting until CentOS releases a fixed dracut!
    or
    3) my choice: KEEP USING THE OLDER KERNEL and add exclude=kernel* to the dnf.conf file until they get it fixed.
    For reference, the betas of CentOS 8.4 (the Stream previews) bring back support for SOME of the LSI RAID controllers.
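
Option 3 could be scripted roughly as follows. This is a sketch under two assumptions: that the config file is the stock /etc/dnf/dnf.conf with [main] as its only section, and that it ends with a newline, so appending a line lands in [main]. The function name hold_kernel_updates is made up for illustration; exclude is the dnf.conf option (an alias of excludepkgs).

```shell
# Sketch only; run as root.
# Add an exclude line for kernel packages to a dnf config file, once.
# $1 = path to the config file (normally /etc/dnf/dnf.conf)
hold_kernel_updates() {
    conf="$1"
    if grep -q '^exclude=.*kernel\*' "$conf"; then
        echo "kernel packages already excluded in $conf"
    else
        # Appending assumes [main] is the last section in the file.
        echo 'exclude=kernel*' >> "$conf"
        echo "added exclude=kernel* to $conf"
    fi
}
```

Running hold_kernel_updates /etc/dnf/dnf.conf makes the hold permanent until the line is removed again. For a single run, dnf update --exclude='kernel*' has the same effect without editing the file.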

    SHEESH. This was a BAD MOVE, space cadet: several of the eliminated controllers were still being sold in 2019 servers!
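
Before rebooting into any newly installed kernel, it may be worth checking whether its initramfs actually contains the RAID driver. The sketch below uses lsinitrd (shipped with dracut) for that. The module name megaraid_sas is an assumption based on the megaraid_disk entries in the smartd log lines earlier in the thread, and report_driver_in_initramfs is a hypothetical helper name.

```shell
# Sketch only.
# For every initramfs image in a directory, say whether it carries a module.
# $1 = module name (e.g. megaraid_sas), $2 = directory (defaults to /boot)
report_driver_in_initramfs() {
    module="$1"
    dir="${2:-/boot}"
    for img in "$dir"/initramfs-*.img; do
        # Skip the unexpanded pattern when no image matches.
        [ -e "$img" ] || continue
        # lsinitrd lists the files inside the image; look for <module>.ko*
        if lsinitrd "$img" | grep -q "/${module}\.ko"; then
            echo "present  $img"
        else
            echo "MISSING  $img"
        fi
    done
}
```

Running report_driver_in_initramfs megaraid_sas after a kernel update would list each image in /boot; an image reported as MISSING is one that would likely come up without seeing the disks.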
     
