Fess up. You know it was you.

  • tquid@sh.itjust.works
    7 months ago

    One time I was deleting a user from our MySQL-backed RADIUS database.

    DELETE FROM PASSWORDS;

    And yeah, if you don’t have a WHERE clause? It just deletes everything. About 60,000 records for a decent-sized ISP.

    That afternoon really, really sucked. We had only ad-hoc backups. It was not a well-run business.

    Now when I interview sysadmins (or these days devops), I always ask about their worst cock-up. It tells you a lot about a candidate.
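The missing-WHERE failure described above is the kind of thing a small guard can catch before the commit. The following is only a minimal sketch, not what the ISP in the story ran: it uses Python's built-in `sqlite3` as a stand-in for MySQL, and the `passwords` table layout, the `make_demo_db` helper and the `delete_with_guard` function are all hypothetical names invented for illustration.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE passwords (username TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO passwords VALUES (?, ?)",
        [("alice", "a1"), ("bob", "b2"), ("carol", "c3")],
    )
    conn.commit()
    return conn


def delete_with_guard(conn, sql, params=(), expected_max=1):
    """Run a DELETE, but roll back if it touched more rows than expected."""
    # sqlite3 opens a transaction implicitly before a DELETE,
    # so nothing is permanent until commit() is called.
    cur = conn.execute(sql, params)
    if cur.rowcount > expected_max:
        conn.rollback()
        raise RuntimeError(
            f"refusing to commit: {cur.rowcount} rows affected, "
            f"expected at most {expected_max}"
        )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    conn = make_demo_db()
    try:
        # Forgot the WHERE clause: every row matches, so the guard rolls back.
        delete_with_guard(conn, "DELETE FROM passwords")
    except RuntimeError as exc:
        print(exc)
    # The intended statement removes exactly one user.
    delete_with_guard(
        conn, "DELETE FROM passwords WHERE username = ?", ("alice",)
    )
```

The same idea works by hand in an interactive client: start a transaction, run the statement, check the reported row count, and only then commit.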

    • RacerX@lemm.ee (OP)
      7 months ago

      Always skeptical of people that don’t own up to mistakes. Would much rather they own it and speak to what they learned.

      • chameleon@kbin.social
        7 months ago

        It’s difficult because you have a 50/50 chance of getting a manager who doesn’t respect mistakes and will immediately try to get you fired for one (to the best of their abilities), versus one who considers such a mistake to be very expensive training.

        I simply can’t blame people for self-defense. I interned at a ‘non-profit’ where there had apparently been a revolving door of employees being fired for making entirely reasonable mistakes. Looking back at it a dozen years later, it’s no surprise that nobody was getting anything done in that environment.

        • ilinamorato@lemmy.world
          7 months ago

          Incredibly short-sighted, especially for a nonprofit. You just spent some huge amount of time and money training a person to never make that mistake again. Why would you throw that investment away?

      • Flax@feddit.uk
        7 months ago

        This is what I was told when I started work. If you make a mistake, just admit to it. They most likely won’t punish you for it if it wasn’t out of pure negligence.

    • cobysev@lemmy.world
      7 months ago

      I was a sysadmin in the US Air Force for 20 years. One of my assignments was working at the headquarters for AFCENT (Air Forces Central Command), which oversees every deployed base in the Middle East. Specifically, I worked on a tier 3 help desk, solving problems that the help desks at deployed bases couldn’t figure out.

      Normally, we got our issues in tickets forwarded to us from the individual base’s Communications Squadron (the IT squadron at a base). But one day, we got a call from the commander of a base’s Comm Sq. Apparently, every user account on the base had disappeared and he needed our help restoring accounts!

      The first thing we did was dig through server logs to determine what caused it. No sense fixing it if an automated process was the cause and would just undo our work, right?

      We found one Technical Sergeant logged in who had run a command to delete every single user account in the directory tree. We sought him out and he claimed he was trying to remove one individual, but accidentally selected the tree instead of the individual. It just so happened to be the base’s tree, not an individual office or squadron.

      As his rank implies, he’s supposed to be the technical expert in his field. But this guy was an idiot who shouldn’t have been touching user accounts in the first place. Managing user accounts is an Airman job: a simple job given to our lowest-ranking members as they’re learning how to be sysadmins. And he couldn’t even do that.

      It was a very large base. It took 3 days to recover all accounts from backup. The Technical Sergeant had his admin privileges revoked and spent the rest of his deployment sitting in a corner, doing administrative paperwork.

  • GolfNovemberUniform@lemmy.ml
    7 months ago

    Installed a Flatpak app (can’t remember which one, but it wasn’t obscure or shady) and somehow it broke the file system on one of my main machines :) (at least I think that’s what happened, because the machine started lagging, every app refused to launch, and after a reboot I got an fsck error or something like that)

  • Quazatron@lemmy.world
    7 months ago

    Did you know that “Terminate” is not an appropriate way to stop an AWS EC2 instance? I sure as hell didn’t.

        • tslnox@reddthat.com
          7 months ago

          Maybe there should be some warning message… Maybe a question requiring you to manually type “yes I want it” or something.

          • synae[he/him]@lemmy.sdf.org
            7 months ago

            Maybe an entire feature that disables it so you can’t do it accidentally, call it “termination protection” or something

      • ilinamorato@lemmy.world
        7 months ago

        “Stop” is the AWS EC2 verb for shutting down a box, but leaving the configuration and storage alone. You do it for load balancing, or when you’re done testing or developing something for the day but you’ll need to go back to it tomorrow. To undo a Stop, you just do a Start, and it’s just like power cycling a computer.

        “Terminate” is the AWS EC2 verb for shutting down a box, deleting the configuration and (usually) deleting the storage as well. It’s the “nuke it from orbit” option. You do it for temporary instances or instances with sensitive information that needs to go away. To undo a Terminate, you weep profusely and then manually rebuild everything; or, if you’re very, very lucky, you restore from backups (or an AMI).
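To make the Stop/Terminate distinction and the "termination protection" feature mentioned elsewhere in the thread concrete, here is a rough sketch using boto3-style EC2 client calls. It is an illustration under assumptions, not a recommended procedure: the two helper function names are made up, the instance ID is a placeholder, and the client is passed in so the sketch itself does not need AWS credentials.

```python
def enable_termination_protection(ec2_client, instance_id):
    """Turn on termination protection for one instance.

    With this attribute set, a Terminate request from the console or API
    is rejected until the protection is switched off again.
    """
    return ec2_client.modify_instance_attribute(
        InstanceId=instance_id,
        DisableApiTermination={"Value": True},
    )


def stop_instance(ec2_client, instance_id):
    """Stop (not terminate) an instance so it can be started again later."""
    return ec2_client.stop_instances(InstanceIds=[instance_id])


# Example usage (assumes boto3 is installed and credentials are configured;
# the instance ID below is a placeholder, not a real instance):
#
#   import boto3
#   ec2 = boto3.client("ec2")
#   enable_termination_protection(ec2, "i-0123456789abcdef0")
#   stop_instance(ec2, "i-0123456789abcdef0")
#   ec2.start_instances(InstanceIds=["i-0123456789abcdef0"])
```

Passing the client in as an argument also makes it easy to dry-run the helpers against a stub before pointing them at a real account.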

      • Quazatron@lemmy.world
        7 months ago

        Noob was told to change some parameters on an AWS EC2 instance, requiring a stop/start. Selected terminate instead, killing the instance.

        Crappy company, running production infrastructure in AWS without giving proper training and securing a suitable backup process.

  • 𝕱𝖎𝖗𝖊𝖜𝖎𝖙𝖈𝖍@lemmy.world
    7 months ago

    Accidentally deleted an entire column in a police department’s evidence database 😬

    Thankfully, it only contained filepaths that could be reconstructed via a script. But I was sweating 12+1 bullets.

  • BestBouclettes@jlai.lu
    7 months ago

    When I was still a wee IT technician, I was supposed to remove some cables from a patch panel. I pulled at least two cables that carried iSCSI traffic from the hypervisors to the storage bays. During production hours. Not my proudest memory.

  • shyguyblue@lemmy.world
    7 months ago

    Updated WordPress…

    Previous Web Dev had a whole mess of code inside the theme that was deprecated between WP versions.

    Fuck WordPress for static sites…

  • -RJ-@lemmy.worldB
    7 months ago

    Plugged a server back in after it had been repaired, because the person whose responsibility it was insisted it would be fine. They hadn’t released the FSMO roles from it and its clock was an hour out, so it changed the time EVERYWHERE and broke ALL THE THINGS. Not technically my fault, but I should have pushed harder for them to demote it before I turned it back on.

  • Futs@lemmy.world
    7 months ago

    Advertised an OS deployment to the ‘All Workstations’ collection by mistake. I only realized after 30 minutes, when people’s workstations started rebooting. Worked right through the night recovering and restoring about 200 machines.

  • TheMadIrishman@sh.itjust.works
    7 months ago

    Was troubleshooting a failed drive in a RAID array on a small business DC/File Serv/Print/Everything else box. The replacement drive still showed as failed. Moved it to another bay, thinking it was the slot and not the drive. Accidentally hit yes when asked to initialize the array. Blew the whole thing away. It was an OLD server the customer was working on replacing, so I told them it finally gave up the ghost and I was taking it back to the office to keep working on it. I had been on the job for about 4 months and thought for SURE I was fired. Turns out we were already working on moving them to the cloud, so it ended up not being a big deal.

  • Monkey With A Shell@lemmy.socdojo.com
    7 months ago

    Found out the hard way to triple-check your work when adding a new line to the proxy policy. Or, more accurately, two lines when you only planned one: the second one defaulted to ‘deny all’ and dropped all web traffic for the company…

    That made for a REAL tense meeting the next day after it got deployed and people started asking WTF happened…

  • FaceDeer@kbin.social
    7 months ago

    It wasn’t “worst” in terms of how much time it wasted, but the worst in terms of how tricky it was to figure out. I submitted a change list that worked on my machine as well as 90% of the build farm and most other dev and QA machines, but threw a baffling linker error on the remaining 10%. It turned out that the change worked fine on any machine that used to have a particular old version of Visual Studio installed on it, even though we no longer used that version and had phased it out for a newer one. The code I had written depended on a library that was no longer in current VS installs but got left behind when uninstalling the old one. So only very new computers were hitting that, mostly belonging to newer hires who were least equipped to figure out what was going on.

    • tslnox@reddthat.com
      7 months ago

      That reminds me of when some of my former colleagues and I were at a training on programming an industrial camera system that judges the quality of produced parts. I’m not really a programmer, just a guy who can troubleshoot and google stuff and occasionally hack together some simple code, with heavy help from Google too.

      The instructor was a German programmer (we are Czech, and we communicated in English) who had coded the whole thing in Omron software, but he had also written his own plugin for it. All was well when he was showing us on the big screen, but when he sent us the program file so we could experiment with it (changing parameters, adding steps to the flow…) the app would crash. I finally delved into the app logs, and with the help of Google I found it was because he had compiled his plugin with debug flags. It worked for him because he had the VS debug DLLs installed, but we didn’t.

  • slazer2au@lemmy.world
    7 months ago

    I took down an ISP for a couple of hours because I forgot the ‘add’ keyword at the end of a Cisco configuration line.

    • sloppy_diffuser@sh.itjust.works
      7 months ago

      That’s a rite of passage for anyone working on Cisco’s shit TUI. At least it’s gotten better with some of the newer stuff. IOS-XR supports commits and diffing.
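The value of diffing before a commit can be sketched without any router at all. The example below is only an illustration: the thread does not say which command was involved, so the `switchport trunk allowed vlan` lines are an assumed example of a command where omitting `add` replaces a list instead of extending it, and `config_diff` is a hypothetical helper built on Python's standard `difflib`.

```python
import difflib


def config_diff(running, candidate):
    """Return a unified diff between two configuration texts."""
    return list(
        difflib.unified_diff(
            running.splitlines(),
            candidate.splitlines(),
            fromfile="running",
            tofile="candidate",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Hypothetical configuration snippets, invented for this example.
    running = (
        "interface GigabitEthernet0/1\n"
        " switchport trunk allowed vlan 10,20\n"
    )
    # Intent was to add VLAN 30, but without 'add' the list is replaced.
    candidate = (
        "interface GigabitEthernet0/1\n"
        " switchport trunk allowed vlan 30\n"
    )
    for line in config_diff(running, candidate):
        print(line)
```

Seeing VLANs 10 and 20 on a removed line is the cue to stop before the change goes live.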