- cross-posted to:
- sysadmin@lemmy.ml
- sysadmin@lemmy.world
All our servers and company laptops went down at pretty much the same time. Laptops have been bootlooping to blue screen of death. It’s all very exciting, personally, as someone not responsible for fixing it.
Apparently caused by a bad CrowdStrike update.
CrowdStrike: It’s Friday, let’s throw it over the wall to production. See you all on Monday!
Yeah my plans of going to sleep last night were thoroughly dashed as every single windows server across every datacenter I manage between two countries all cried out at the same time lmao
I always wondered who even used windows server given how marginal its marketshare is. Now i know from the news.
My current company does and I hate it so much. Who even got that idea in the first place? Linux always dominated server-side stuff, no?
Not too long ago, a lot of Customer Relationship Management (CRM) software ran on MS SQL Server. Businesses made significant investments in software and training, and some of them don’t have the technical, financial, or logistical resources to adapt - momentum keeps them using Windows Server.
For example, small businesses that are physically located in rural areas can’t use cloud based services because rural internet is too slow and unreliable. It’s not quite the case that there’s no amount of money you can pay for a good internet connection in rural America, but last time I looked into it, Verizon wanted to charge me $20,000 per mile to run a fiber optic cable from the nearest town to my client’s farm.
Marginal? You must be joking. A vast amount of servers run on Windows Server. Where I work alone we have several hundred and many companies have a similar setup. Statista put the Windows Server OS market share over 70% in 2019. While I find it hard to believe it would be that high, it does clearly indicate it’s most certainly not a marginal percentage.
I’m not getting an account on Statista, and I agree that its marketshare isn’t “marginal” in practice, but something is up with those figures, since overwhelmingly internet hosted services are on top of Linux. Internal servers may be a bit different, but “servers” I’d expect to count internet servers…
It’s stated in the synopsis, below where it says you need to pay for the article. Anyway, it might be true as the hosting servers themselves often host up to hundreds of Windows machines. But it really depends on what is measured and the method used, which we don’t know because who the hell has a statista account anyway.
Almost everyone, because the Windows server market share isn’t marginal at all.
It’s only marginal for running custom code. Every large organization has at least a few of them running important out-of-the-box services.
This is a crowdstrike issue specifically related to the falcon sensor. Happens to affect only windows hosts.
Well, I’ve seen some, but they usually don’t have automatic updates and generally do not have access to the Internet.
Did you feel a great disturbance in the force?
How many cups of coffee have you drunk in the last 12 hours?
There was a point where words lost all meaning and I think my heart was one continuous beat for a good hour.
I work in a data center
I lost count
What was Dracula doing in your data centre?
Because he’s Dracula. He’s twelve million years old.
THE WORMS
Surely Dracula doesn’t use windows.
I work in a datacenter, but no Windows. I slept so well.
Though a couple years back some ransomware that also impacted Linux ran through, but I got to sleep well because it only bit people with easily guessed root passwords. It bit a lot of other departments at the company though.
This time even the Windows folks were spared, because CrowdStrike wasn’t the solution they infested themselves with (they use other providers, who I fully expect to screw up the same way one day).
How’s it going, Obi-Wan?
I’m so exhausted… This is madness. As a Linux user I’ve been busy all day telling people with bricked PCs that Linux is better but there are just so many. It never ends. I think this outage is going to keep me busy all weekend.
The thought of a local computer being unable to boot because some remote server somewhere is unavailable makes me laugh and sad at the same time.
A remote server that you pay some serious money to that pushes a garbage driver that prevents yours from booting
Not only does it (possibly) prevent booting, but it will also bsod it first so you’ll have to see how lucky you get.
Goddamn I hate crowdstrike. Between this and them fucking up and letting malware back into a system, I have nothing nice to say about them.
It’s bsod on boot
And anything encrypted with bitlocker can’t even go into safe mode to fix it
It doesn’t consistently bsod on boot, about half of affected machines did in our environment, but all of them did experience a bsod while running. A good amount of ours just took the bad update, bsod’d and came back up.
yeah so you can’t get Chinese government spyware installed.
I don’t think that’s what’s happening here. As far as I know it’s an issue with a driver installed on the computers, not with anything trying to reach out to an external server. If that were the case you’d expect it to fail to boot any time you don’t have an Internet connection.
Windows is bad but it’s not that bad yet.
It’s just a fun coincidence that the azure outage was around the same time.
Yep, and it’s harder to fix Windows VMs in Azure that are affected because you can’t boot them into safe mode the same way you can with a physical machine.
Foof. Nightmare fuel.
I think we’re getting a lot of pictures for !pbsod@lemmy.ohaa.xyz
And subscribed!
I’m used to IT doing a lot of their work on the weekends as to not impact operations.
Good ol microsloth
never do updates on a Friday.
And especially now the work week has slimmed down where no one works on Friday anymore
Excuse me, what now? I didn’t get that memo.
Yeah it’s great :-) 4 10hr shifts and every weekend is a 3 day weekend
Is the 4x10 really worth the extra day off? Tbh I’m not sure it would work very well for me… I find just one 10-hour day to be kinda draining, so doing that 4 times a week every week feels like it might just cancel out any benefits of the extra day off.
I am very used to it so I don’t find it draining. I tried 5x8 once and it felt more like working an extra day than getting more time in the afternoon. If that makes sense. I also start early around 7am, so I am only staying a little later than other people
I changed jobs because the new management was all “if I can’t look at your ass you don’t work here” and I agreed.
I now work remotely 100% and it’s in the union contract with the 21vacation days and 9x9 compressed time and regular raises. The view out my home office window is partially obscured by a floofy cat and we both like it that way.
I’d work here until I die.
Actually I was not even joking. I also work in IT and have exactly the same opinion. Friday is for easy stuff!
Yep, anything done on Friday can enter the world on a Monday.
I don’t really have any plans most weekends, but I sure as shit don’t plan on spending it fixing Friday’s fuckups.
And honestly, anything that can be done Monday is probably better done on Tuesday. Why start off your week by screwing stuff up?
We have a team policy to never do externally facing updates on Fridays, and we generally avoid Mondays as well unless it’s urgent. Here’s roughly what each day is for:
- Monday - urgent patches that were ready on Friday; everyone WFH
- Tuesday - most releases; work in-office
- Wed - fixing stuff we broke on Tuesday/planning the next release; work in-office
- Thu - fixing stuff we broke on Tuesday, closing things out for the week; WFH
- Fri - documentation, reviews, etc; WFH
If things go sideways, we come in on Thu to straighten it out, but that almost never happens.
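A policy like the one above could be encoded as a small pre-deploy check. This is only a rough sketch: the function name `release_allowed` and the day sets are made up for illustration, and it assumes "release" means a new externally facing release (fixes for Tuesday's breakage on Wed/Thu are not modelled).

```python
from datetime import date

# Weekday numbers follow date.weekday(): Monday is 0, Sunday is 6.
URGENT_ONLY_DAYS = {0}  # Monday: only urgent patches that were ready on Friday
RELEASE_DAYS = {1}      # Tuesday: most releases
# Wednesday to Friday are for fixes, planning and docs, so no new releases.


def release_allowed(day: date, urgent: bool = False) -> bool:
    """Return True if a new externally facing release may ship on `day`.

    Hypothetical helper; the rules mirror the weekly schedule described above.
    """
    weekday = day.weekday()
    if weekday in RELEASE_DAYS:
        return True
    if weekday in URGENT_ONLY_DAYS:
        return urgent
    return False
```

For example, `release_allowed(date(2024, 7, 19))` is a Friday and comes back `False`, urgent or not.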
You posted this 14 hours ago, which would have made it 4:30 am in Austin, Texas where CrowdStrike is based. You may have felt the effect on Friday, but it’s extremely likely that the person who made the change did it late on a Thursday.
Never update unless something is broken.
BTW, I use Arch.
If it was Arch you’d update once every 15 minutes whether anything’s broken or not.
I use Tumbleweed, so I only get updates once/day, twice if something explodes. I used to use Arch, so my update cycle has lengthened from 1-2x/day to 1-2x/week, which is so much better.
gets two update notifications
Ah, must be explosion Wednesday
I really like the tumbleweed method, seems like the best compromise between arch and debian style updates.
This is AV, and even possible that it is part of definitions (for example some windows file deleted as false positive). You update those daily.
That’s advice so smart you’re guaranteed to have massive security holes.
This is fine as long as you politely ask everyone on the Internet to slow down and stop exploiting new vulnerabilities.
I think vulnerabilities found count as “something broken” and chap you replied to simply did not think that far ahead hahah
For real - A cyber security company should basically always be pushing out updates.
always pushing out updates
Notes: Version bump: Eric is a twat so I removed his name from the listed coder team members on the about window.
git push --force
leans back in chair productive day, productive day indeed
Exactly. You don’t know what the vulnerabilities are, but the vendors pushing out updates typically do. So stay on top of updates to limit the attack surface.
Major releases can wait, security updates should be pushed as soon as they can be proven to not break prod.
Stop running production services on M$. There is a better backend OS.
https://www.theregister.com/ has a series of articles on what’s going on technically.
Latest advice…
There is a faulty channel file, so not quite an update. There is a workaround…
- Boot Windows into Safe Mode or WRE.
- Go to C:\Windows\System32\drivers\CrowdStrike
- Locate and delete file matching “C-00000291*.sys”
- Boot normally.
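For anyone scripting the "locate and delete" step, here is a minimal sketch of the file matching. It assumes a Python interpreter is available on the affected machine, which it usually won't be in Safe Mode or WRE, so treat it as an illustration of the pattern rather than a recovery tool. The function name `remove_bad_channel_files` and the dry-run default are my own additions, not part of the advice above.

```python
from pathlib import Path


def remove_bad_channel_files(driver_dir: Path, dry_run: bool = True) -> list[Path]:
    """Find (and optionally delete) files matching C-00000291*.sys.

    Hypothetical helper: with dry_run=True it only reports what it would
    delete, so the matches can be checked before anything is removed.
    """
    matches = sorted(driver_dir.glob("C-00000291*.sys"))
    for path in matches:
        if not dry_run:
            path.unlink()  # delete the matching channel file
    return matches
```

Usage would look like `remove_bad_channel_files(Path(r"C:\Windows\System32\drivers\CrowdStrike"), dry_run=False)`, run with administrator rights, followed by a normal reboot.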
The amount of servers running Windows out there is depressing to me
Where did you think Microsoft was getting all (hyperbole) of their money from?
I dunno, but doesn’t like a quarter of the internet kinda run on Azure?
doesn’t like a quarter of the internet kinda run on Azure?
Said another way, 3/4 of the internet isn’t on Unsure cloud blah-blah.
And azure is - shhh - at least partially backed by Linux hosts. Didn’t they buy an AWS clone and forcibly inject it with money like Bobby Brown on a date in the hopes of building AWS better than AWS like they did with nokia? MS could be more protectively diverse than many of its best customers.
I guess Spotify was running on the other 40%, as many other services
so 40% of azure crashes a quarter of the internet…
The four multinational corporations I worked at were almost entirely Windows servers with the exception of vendor specific stuff running Linux. Companies REALLY want that support clause in their infrastructure agreement.
Companies REALLY want that support clause in their infrastructure agreement.
RedHat, Ubuntu, SUSE - they all exist on support contracts.
I’ve worked as an IT architect at various companies in my career and you can definitely get support contracts for engineering support of RHEL, Ubuntu, SUSE, etc. That isn’t the issue. The issue is that there are a lot of system administrators with “15 years experience in Linux” that have no real experience in Linux. They have experience googling for guides and tutorials while having cobbled together documents of doing various things without understanding what they are really doing.
I can’t tell you how many times I’ve seen an enterprise patch their Linux solutions manually (if they patched them at all, with some ridiculous rubberstamped POA&M) without deploying a repo and updating the repo, treating it as you would a WSUS. Hell, I’m pleasantly surprised if I see them joined to a Windows domain (a few times) or an LDAP (once but they didn’t have a trust with the Domain Forest or use sudoer rules…sigh).
“googling answers”, I feel personally violated.
/s
To be fair, there is no reason to memorize things that you need once or twice. Google is a tool, and good for Linux issues. Why debug some issue for a few hours, if you can Google the resolution in minutes?
I’m not against using Google, Stack Exchange, man pages, apropos, tldr, etc. but if you’re trying to advertise competence with a skillset but you can’t do the basics and frankly it is still essentially a mystery to you then you’re just being dishonest. Sure use all tools available to you though because that’s a good thing to do.
Just because someone breathed air in the same space occasionally over the years where a tool exists does not mean that they can honestly say that those are years of experience with it on a resume or whatever.
Just because someone breathed air in the same space occasionally over the years where a tool exists does not mean that they can honestly say that those are years of experience with it on a resume or whatever.
Capitalism makes them do it.
The issue is that there are a lot of system administrators with “15 years experience in Linux” that have no real experience in Linux.
Reminds me of this guy I helped a few years ago. His name was Bob, and he was a sysadmin at a predominantly Windows company. The software I was supporting, however, only ran on Linux. So since Bob had been a UNIX admin back in the 80s they picked him to install the software.
But it had been 30 years since he ever touched a CLI. Every time I got on a call with him, I’d have to give him every keystroke one by one, all while listening to him complain about how much he hated it. After three or four calls I just gave up and used the screenshare to do everything myself.
AFAIK he’s still the only Linux “sysadmin” there.
I’ve had my PC shut down for updates three times now, while using it as a Jellyfin server from another room. And I’ve only been using it for this purpose for six months or so.
I can’t imagine running anything critical on it.
Not judging, but why wouldn’t you run Linux for a server?
Because I only have one PC (that I need for work), and I can’t be arsed to cock around with dual boot just to watch movies. Especially when Windows will probably break that at some point.
Can you use Linux as main OS then? What do you need your computer to do?
I need to run windows software that makes other windows software, that will be run on our customers (who pay us quite well) PCs that also run windows.
Plus gaming. I’m not switching my primary box to Linux at any point. If I get a mini server, that will probably run Linux.
I need to run windows software that makes other windows software, that will be run on our customers (who pay us quite well) PCs that also run windows.
Mingw, but whatever. Maybe there is something mingw can’t do.
Plus gaming. I’m not switching my primary box to Linux at any point.
Unless it is Apex and some other worst offenders or you use GPU from the only company actively hostile to linux, gaming is fine.
Well with your level of expertise you should probably not be running anything, to be honest :)
Wow dude you’re so cool. I bet that made you feel so superior. Everyone on here thinks you are so badass.
I do as well!
Windows server, the OS, runs differently from desktop windows. So if you’re using desktop windows and expecting it to run like a server, well, that’s on you. However, I ran windows server 2016 and then 2019 for quite a few years just doing general homelab stuff and it is really a pain compared to Linux which I switched to on my server about a year ago. Server stuff is just way easier on Linux in my experience.
It doesn’t have to, though. Linux manages to do both just fine, with relatively minor compromises.
Expecting an OS to handle keeping software running is not a big ask.
big ask.
Off the car lot, we say ‘request’. But good on you for changing careers.
I really have no idea why you think your choice of wording would be relevant to the discussion in any way, but OK…
Yup, I use Linux to run a Jellyfin server, as well as a few others things. The only problem is that the CPU I’m using (Ryzen 1st gen) will crash every couple weeks or so (known hardware fault, I never bothered to RMA), but that’s honestly not that bad since I can just walk over and restart it. Before that, it ran happily on an old Phenom II from 2009 for something like 10 years (old PC), and I mostly replaced it because the Ryzen uses a bit less electricity (enough that I used to turn the old PC off at night; this one runs 24/7 as is way more convenient).
So aside from this hardware issue, Linux has been extremely solid. I have a VPS that tunnels traffic into my Jellyfin and other services from outside, and it pretty much never goes down (I guess the host reboots it once a year or something for hardware maintenance). I run updates when I want to (when I remember, which is about monthly), and it only goes down for like 30 sec to reboot after updates are applied.
So yeah, Linux FTW, once it’s set up, it just runs.
not that bad since I can just walk over and restart it.
You can try to use watchdog to automatically restart on crashes. Or go through RMA.
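For reference, this is roughly how the Linux watchdog device works under the hood: a process keeps writing to `/dev/watchdog`, and if the writes stop (because the machine hung) the watchdog hardware resets the box after its timeout. The sketch below assumes the board exposes a watchdog driver at that path and that the script runs as root; the function name `pet_watchdog` and its parameters are made up for illustration. In practice you'd more likely let the `watchdog` daemon or systemd handle this than run your own loop.

```python
import time
from typing import Optional


def pet_watchdog(device: str = "/dev/watchdog",
                 interval: float = 10.0,
                 beats: Optional[int] = None) -> None:
    """Send keepalives to a Linux watchdog device.

    Hypothetical helper. `beats=None` loops forever; a number stops after
    that many keepalives (handy for trying it out against a plain file).
    """
    sent = 0
    # Unbuffered binary writes so every keepalive reaches the driver at once.
    with open(device, "wb", buffering=0) as wd:
        try:
            while beats is None or sent < beats:
                wd.write(b"\0")  # any write counts as a keepalive
                sent += 1
                time.sleep(interval)
        finally:
            # "Magic close": writing 'V' before closing asks the driver to
            # disarm the timer, if the driver supports it and allows it.
            wd.write(b"V")
```

The interval needs to be comfortably shorter than the watchdog timeout, otherwise a healthy machine gets rebooted too.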
I could, but it’s a pretty rare nuisance. I’d rather just replace the CPU than go through RMA, a newer gen CPU is quite inexpensive, I could probably get by with a <$100 CPU since anything AM4 should work (I have an X370 with support for 5XXX series CPUs).
I’m personally looking at replacing it with a much lower power chip, like maybe something ARM. I just haven’t found something that would fit well since I need 2-4 SATA (PCIe card could work), 16GB+ RAM, and a relatively strong CPU. I’m hopeful that with ARM Snapdragon chips making their way to laptops and RISC-V getting more available, I’ll find something that’ll fit that niche well. Otherwise, I’ll just upgrade when my wife or I upgrade, which is what I usually do.
I just haven’t found something that would fit well since I need 2-4 SATA (PCIe card could work), 16GB+ RAM, and a relatively strong CPU.
4 SATA, 8GB RAM is easy to find. What do you need 16 gigs for? Compiling Gentoo?
Quartz64 for ARM and Star64 for RV.
crowdstrike sent a corrupt file with a software update for windows servers. this caused a blue screen of death on all the windows servers globally for crowdstrike clients, even people in my company. luckily i shut off my computer at the end of the day and missed the update. It’s not an OTA fix. they have to go into every data center and manually fix all the servers. some of these servers have encryption. I see a very big lawsuit coming…
I was quite surprised when I heard the news. I had been working for hours on my PC without any issues. It pays off not to use Windows.
Linux and Mac just got free advertisement.
A lot of people I work with were affected, I wasn’t one of them. I had assumed it was because I put my machine to sleep yesterday (and every other day this week) and just woke it up after booting it. I assumed it was an on startup thing and that’s why I didn’t have it.
Our IT provider already broke EVERYTHING earlier this month when they remote installed "Nexthink Collector" which forced a 30+ minute CHKDSK on every boot for EVERYONE, until they rolled out a fix (which they were at least able to do remotely), and I didn’t want to have to deal with that the week before I go on leave.
But it sounds like it even happened to running systems so now I don’t know why I wasn’t affected, unless it’s a windows 10 only thing?
Our IT have had some grief lately, but at least they specified Intel 12th gen on our latest CAD machines, rather than 13th or 14th, so they’ve got at least one win.