Any advice on where to go from here? This console was running dmesg -w to try and catch an intermittent crash… And this is what I got. I am using an el cheapo USB wifi adapter that I’m suspicious of.

Everything was working fine until I rebuilt NixOS with NVIDIA support… Now even my old generations of the OS are crashing after a few minutes (display on, no response to input, keyboard lights don’t respond, SysRq doesn’t work).

  • mozz@mbin.grits.dev
    10 months ago

    Do you still get the error without the wifi adapter connected? The stack trace shows some network-related stuff (which doesn’t necessarily mean that’s where the issue arose, but it would be a bit of a coincidence given what you said).

    That’s the first thing I’d try, and if removing the adapter fixes it (long term) I wouldn’t use the adapter anymore. Sometimes broken hardware breaks other hardware it’s connected to.

    If removing the adapter doesn’t fix it, the next thing I’d try is booting back into the known-good old OS, maybe removing the NVIDIA card. Basically, simplify everything one step at a time until it stops happening, if you can.

  • Rentlar@lemmy.ca
    10 months ago

    Comm: wpa_supplicant, which is the wifi daemon, made me suspicious of your wifi hardware as well, even before I saw the rest of your post. I’ve had the best success with PCIe-based wifi cards (if this is a desktop PC).

  • StarDreamer@lemmy.blahaj.zone
    10 months ago

    Look at the line with asm_exc_invalid_op. That looks to me like a hardware fault caused by an invalid asm instruction. Either something wrong is being interpreted as an opcode (unlikely), or maybe the driver was compiled with extensions not available on the current machine.

    OP, how old is your CPU? And how old is the NIC you are using?

    Edit: did you use a custom driver for the NIC? I’m looking at the Linux source and rt_mutex_schedule does not exist. Never mind, I was checking 4.18 instead of 6.7. Found it now. The bug is most likely inside a macro called preempt_disable(). Unfortunately most of the functions are heavily inlined and architecture-dependent, so you won’t get much out of it. But any changes you made related to preemption might also be causing the bug.
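
One rough way to test the "compiled with extensions not available on the current machine" idea is to compare the CPU's advertised feature flags against the extensions you suspect. The sketch below is only an illustration: the function names `parse_cpu_flags` and `missing_flags` are made up for this example, it assumes an x86 machine where /proc/cpuinfo has a `flags` line (other architectures use a different key), and the list of flags to check is a placeholder you would have to choose yourself.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text.

    Assumption: x86-style output, where each CPU block has a line such as
    'flags : fpu vme sse sse2 ...'. Returns an empty set if no such line exists.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, wanted):
    """Return, sorted, the flags in `wanted` that the CPU does not advertise."""
    return sorted(set(wanted) - parse_cpu_flags(cpuinfo_text))


# Example usage on a real machine (the flag names here are placeholders):
#     with open("/proc/cpuinfo") as f:
#         print(missing_flags(f.read(), ["sse4_2", "avx", "avx2"]))
```

An empty result only says the CPU advertises those flags; it does not prove the driver is fine, and a non-empty result is a hint to look at how the module was built rather than a diagnosis.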