It was to talk about “team restructuring”
Companies are often insane. I’m working at one that has this one guy building a super complicated architecture because he doesn’t know AWS. So instead of just using a message queue on AWS, he is building Java programs and tons of software and containers to try and send messages in a reliable way. It costs the company huge money, but they don’t care, since he is some old timer who has been there for like 10 years and everyone lets him do what he wants.
Quick and dirty as they like to say
I personally always try to engineer away from cloud services. They cost you ridiculous amounts of money, and all you need afterwards is documentation. Then it can be easier and faster than AWS or GC.
You’re the guy 1984 was talking about…
Got to agree with @Zushii@feddit.de here, although it depends on the scope of your service or project.
Cloud services are good at getting you up and running quickly, but they are very, very expensive to scale up.
I work for a financial services company, and we are paying 7 digit monthly AWS bills for an amount of work that could realistically be done with one really big dedicated server. And now that some of our customers require us to support multiple cloud providers, we’ve spent a TON of effort trying to untangle from SQS/SNS and other AWS-specific technologies.
Cloud providers like to tell you:
- Using the cloud is cheaper than running your own server
- Using cloud services requires less manpower / labour to maintain and manage
- It’s easier to get up and running and scale up later using cloud services
The last item is true, but the first two are only true if you are running a small service. Scaling up on a cloud is not cost effective, and maintaining a complicated cloud architecture can be FAR more complicated than managing a similar centralized architecture.
You are paying AWS to not have one big server, so you get high availability and dynamic load balancing as instances come and go.
I agree it’s not cheaper than being on prem. But you get much higher quality solutions.
Today at work, they decided to upgrade from an ancient Ubuntu version to a more recent version. Since they don’t use AWS properly, they treat servers as pets. So to upgrade Ubuntu, they actually upgraded Ubuntu in place on the instance instead of creating a new one. This led to GRUB failing and now they are troubleshooting how to mount disks etc.
All of this could easily be avoided by using the cloud properly.
That could be avoided by using on prem properly, too. People are very capable of making bad infrastructure whether on prem or cloud.
I used to work on an on premise object storage system before, where we required double digits of “nines” availability. High availability is not rocket science. Most scenarios are covered by having 2 or 3 machines.
I’d also wager that using the cloud properly is a different skillset than properly managing or upgrading a Linux system, not necessarily a cheaper or better one from a company point of view.
where we required double digits of “nines” availability
Do you mean 99% or 99.99999999%? Because 99.99999999% is absurd. Even Google doesn’t go near that for internal targets. That’s about 3 milliseconds of downtime per year. If a network hiccup causes 30s of downtime, you’ve blown through roughly 9,500 years of error budget. If you’re talking durability, that’s another matter, but availability?
For ten-nines availability to make any sense, any dependent system would also have to have ten nines availability, and any calling system would have to have close to ten nines availability or it’s not worth ten nines on the called system.
If the traffic ever goes over TCP/IP, ten nines sounds like overkill. It doesn’t even have to go over the public internet; going over Ethernet wires at all is enough. Maybe if it stays within a mainframe computer, but you’d have to carefully audit that mainframe to ensure that every component involved also has approx ten nines.
If you mean 2 nines availability, that’s not high availability at all. That’s nearly 4 days of downtime a year. That’s enough that you don’t necessarily need a standby system, you just need to be able to repair the main one within a few hours if it goes down.
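For anyone who wants to check the arithmetic, here is a rough Python sketch that turns a count of nines into a yearly downtime budget and multiplies availabilities along a dependency chain. The function names are made up for illustration, a 365-day year is assumed, and the chain calculation assumes the parts fail independently, which real systems rarely do.

```python
from typing import Iterable

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: ignore leap years


def availability_from_nines(nines: int) -> float:
    """Convert a count of nines to a fraction, e.g. 2 -> 0.99, 5 -> 0.99999."""
    return 1.0 - 10.0 ** -nines


def downtime_seconds_per_year(nines: int) -> float:
    """Yearly downtime budget, in seconds, for a given number of nines."""
    return 10.0 ** -nines * SECONDS_PER_YEAR


def serial_availability(parts: Iterable[float]) -> float:
    """Availability of a chain in which every part must be up.

    Assumes independent failures, so the result is just the product.
    """
    result = 1.0
    for availability in parts:
        result *= availability
    return result


if __name__ == "__main__":
    for n in (2, 3, 5, 10):
        print(f"{n} nines: {downtime_seconds_per_year(n):.4g} seconds of downtime per year")

    # A ten-nines service sitting behind a five-nines dependency
    # ends up no better than the dependency.
    chain = [availability_from_nines(10), availability_from_nines(5)]
    print(f"chain availability: {serial_availability(chain):.12f}")
```

Two nines works out to about 3.65 days a year, and a chain is never more available than its weakest part, which is the point about dependent systems above.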
Sorry, yes, that was durability. I got it mixed up in my head. Availability had lower targets.
But I stand by the gist of my argument - you can achieve a lot with a live/live system, or a 3 node system with a master election, or…
High availability doesn’t have to equate to high cost or complexity, if you can take it into account when designing the system.
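As a back-of-the-envelope illustration of why 2 or 3 machines go a long way, here is a small Python sketch. It is only a sketch: it assumes node failures are independent and failover is instant (shared power, network, or bugs break that assumption), it assumes the election uses a simple majority quorum, and the function names are hypothetical.

```python
def redundant_availability(node_availability: float, nodes: int) -> float:
    """Chance that at least one of `nodes` identical machines is up.

    Assumes independent failures and instant failover.
    """
    return 1.0 - (1.0 - node_availability) ** nodes


def majority_quorum(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can fail while a majority can still elect a master."""
    return nodes - majority_quorum(nodes)


if __name__ == "__main__":
    for n in (1, 2, 3):
        availability = redundant_availability(0.99, n)
        print(f"{n} node(s) at 99% each: {availability:.6%} combined")
    print(f"3-node cluster survives {tolerated_failures(3)} node failure(s)")
```

Under those assumptions each extra two-nines machine adds roughly two more nines, and a 3 node cluster keeps a majority with one node down.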
Randomly got a message from one of my reports asking what this “Mandatory Team Meeting” was on his calendar. I hadn’t been invited, but it was our whole company shutting down ¯\_(ツ)_/¯
Hey, that happened to me, too!
I got scheduled for a mandatory meeting with 1 hour notice. During lunch.
I asked my boss what it was. He didn’t know either. I joked that it was us being shut down.
Sure enough, 1 hour later we were both writing LinkedIn recommendations and helping each other find jobs after it was announced that our whole studio was being shut down by corporate and my coworkers and I were all now jobless.
I at least had the cathartic experience of being told “hey we need to shut down EVERYTHING before 7pm because that’s when the email will turn off, so log into every service you know we use and delete it all.” And then I spent the next couple hours clicking every delete button I could.
K8s clusters? Delete. Prod DB? Delete. Prod DB backups? Delete. S3 buckets? Delete. Cloudflare account? Delete.
It was actually kinda fun.
This sounds therapeutic
“Team restructuring” is so much fun, you never know what you’re going to get.
Your boss’s boss now reports to a slightly different VP? Everyone is getting fired? No way to know which it’s going to be, until the end of the meeting.
But let me first say that these are difficult times, and we’re proud of this team.