Four months ago, we asked “Are LLMs making Stack Overflow irrelevant?” The data at the time suggested that the answer was likely “yes”:

  • ripcord@lemmy.world

    This is interesting because a huge amount of AI “knowledge” comes from Stack Exchange.

    Now I’ll go read the other comments and the article to see if that’s already been mentioned :)

  • db0@lemmy.dbzer0.com

    Never again will I help provide content to a VC-backed service just so that they can rugpull us and cash out.

    • sunzu2@thebrainbin.org

      That’s why people should be posting on the fedi and never on the corporate web.

      When corporate tells you it’s a parasite, believe it.

    • tetris11@lemmy.ml

      I live in the hope that the insightful comments I left on reddit over my long tenure there will eventually be part of a FOSS corpus, once the VCs can’t extract anything of competitive value from it anymore. I’ll be long dead, but my comments will live on.

      • isaakengineer@programming.dev

        For the life of me, I can’t understand why people don’t value knowledge enough to make their own website and be proud of it. PS: I’ve built a load of CMSes and am now working on a new approach to web dev …

        • tetris11@lemmy.ml

          Because it won’t be used and won’t be seen; that’s the sad reality of it. I do host a small personal blog run off org-mode + Hugo, but it gets fewer visitors than a library at midnight.

    • vermaterc@lemmy.ml

      What exactly do you accuse Stack Overflow of? As far as I know, the service has always been free to use and the data is easily downloadable.

      • db0@lemmy.dbzer0.com

        “Free to use” on a VC-backed service just means you’re the product. I’m accusing them of the same thing I accuse every VC-backed service of: that they exploit our efforts to cash out and then sell the service to someone who will enshittify it for profit.

        Also, what do you mean by “easily downloadable”? Can anyone download the entire corpus of SO in a way that they could set up their own SO with the same content to bootstrap it?

        • vermaterc@lemmy.ml

          > Also, what do you mean by “easily downloadable”? Can anyone download the entire corpus of SO in a way that they could set up their own SO with the same content to bootstrap it?

          Have you seen https://archive.org/details/stackexchange?
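
          For what it’s worth, the dump is just XML, so a rough sketch like this (untested; it assumes the usual layout of one Posts.xml of <row> elements per site, with PostTypeId 1 marking questions) is enough to stream every question title out of it:

          ```python
          # Stream question titles out of a Stack Exchange data-dump Posts.xml
          # without loading the whole multi-GB file into memory.
          import xml.etree.ElementTree as ET

          def iter_questions(posts_xml_path):
              """Yield (id, title, score) for every question row in Posts.xml."""
              for _, row in ET.iterparse(posts_xml_path, events=("end",)):
                  if row.tag != "row":
                      continue
                  if row.attrib.get("PostTypeId") == "1":  # 1 = question, 2 = answer
                      yield (row.attrib["Id"],
                             row.attrib.get("Title", ""),
                             int(row.attrib.get("Score", "0")))
                  row.clear()  # free memory as we go

          if __name__ == "__main__":
              # Hypothetical path; extract Posts.xml from the site's 7z archive first.
              for post_id, title, score in iter_questions("Posts.xml"):
                  print(f"[{score:>5}] {title} (id={post_id})")
          ```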

          > That they exploit our efforts to cash out and then sell the service to someone who will enshittify it for profit.

          Can you give an example of this enshittification for profit?

  • TomMasz@lemmy.world

    Ever ask a question on SO? I tell my students to search there but never, ever ask a question. The unmitigated hostility is not what new developers need or deserve. ChatGPT won’t humiliate you for asking a question that someone else has already asked.

    • OhNoMoreLemmy@lemmy.ml

      If LLMs just copied stack overflow they’d respond to every question with “Closed as duplicate. Question already answered.”

    • piefood@feddit.online

      I forget where I heard the quote, but:

      Stack Overflow is a great place to find answers. Stack Overflow is a terrible place to ask questions.

      • asret@lemmy.zip

        Their moderation approach is a big part of why it’s a great place to search for answers.

    • corsicanguppy@lemmy.ca

      I’ve asked questions on SO. I’ve answered some too.

      What I’ve found works well on SO is:

      1. Researching a bit first
      2. Asking a question properly*
      3. Including that search attempt to prove you’ve done some due diligence

      I’ve found even a dick like me can get a lot of leeway by showing I’ve put in the effort and asked properly.

      *Same as Usenet

    • Quibblekrust@thelemmy.club

      That’s why I only post questions for bleeding-edge languages and code libraries. I have to answer them myself.

    • gradual@lemmings.world

      I’ve never had an issue asking a question on Stack Overflow.

      I’d wager a lot of ‘you people’ who have issues with it probably didn’t do enough research on your own.

      • JuxtaposedJaguar@lemmy.ml

        There are issues on both sides. A lot of people who ask questions are clearly just asking others to do their homework, or otherwise haven’t made any effort, but there are also a lot of people who are unnecessarily hostile.

    • Domi@lemmy.secnd.me

      > ChatGPT won’t humiliate you for asking a question that someone else has already asked.

      I don’t know, being told what a good question that was and what a good boy I am every time I ask a stupid question feels pretty humiliating.

      (Still better than SO)

      • stephen01king@lemmy.zip

        That’s a pretty recent development, isn’t it? I remember ChatGPT being a lot more matter-of-fact earlier on.

        • Domi@lemmy.secnd.me

          Yep, old ChatGPT was much more blunt and factual.

          Don’t really like the recent trend of every LLM talking to me like I’m in kindergarten.

    • PriorityMotif@lemmy.world

      The problem is that someone else asked the question 10 years ago and the answer is now irrelevant due to version changes. People with high scores are just early adopters who answered all of the easy questions. Hostile users generally can’t understand the question. The issue with LLMs answering your question is that they are stuck in the current time period; in the future, their answers will also be irrelevant due to version changes.

      • SmoothLiquidation@lemmy.world

        I mean, that is already a problem: if you ask a question, you have to be ready for the answer to be a mess of version conflicts.

        But that is OK. ChatGPT is a tool that can either help you or hurt you. I like to think of it like a power hammer: if you are doing a roofing job, it can help you get things done faster than a manual hammer, but you still need to know how to build a roof to get started.

        ChatGPT is great at helping you organize your thoughts or find an answer to some error message buried in a log file, but you still need to know what questions to ask, and you need to be ready for it to give you a stupid answer and to know how to get around that.

      • Kevin@lemmy.ca

        Earlier today I googled how to toggle full screen in dosbox-x, and the AI-generated answer said to use alt+enter. I tried it and it didn’t work, so I looked in the documentation, and it turns out they changed it to F12+f a while ago (probably to avoid interfering with actual DOS input).

        This is definitely already a problem.

        • Natanael@infosec.pub

          Every LLM is shit at dealing with version changes. They don’t understand it as a concept, despite all their training data.

    • theherk@lemmy.world

      I see this hot take often, and it isn’t entirely without merit, but it is mitigated by moderation, in some Stack communities better than others. I’ve been an active member for many years, and in my view it goes like this:

      If you post a question without reading the rules and How to Ask a Good Question, don’t provide a minimal reproducible example with code, post images of code, etc., you may get flamed out of town. That may feel bad, and it may be mean if the questioner didn’t know to read those, but they are there for you.

      If, however, you ask a thoughtful question, give examples, show what you’ve tried, etc., you can definitely get quality, courteous help.

      Doesn’t change that video killed the radio star here. The show is over.

      • TomMasz@lemmy.world

        Beginners are the least likely to ask thoughtful questions. We include slides in lectures about how to ask a question, but when there’s an assignment deadline and you’re inexperienced, it’s more likely you’re going to just blurt out “help me!” rather than provide a detailed explanation that doesn’t require repeated prompting. It takes time to learn how to work through an issue yourself before asking. Students are often facing time pressure and that can drive bad behavior. Correcting them is important, just don’t do it in a way that crushes their spirit.

        • theherk@lemmy.world

          100% understood and agreed. I don’t want to defend the bad behavior. It is out there among questioners and in the experienced community alike. Just saying it is possible to find quality help there.

      • Sl00k@programming.dev

        Even for non newcomers, having threads marked as duplicates for problems introduced by version changes that aren’t considered in the original question/answers is a major issue.

  • Endmaker@ani.social

    > Even without LLMs, it’s possible StackOverflow would have eventually faded into irrelevance – perhaps driven by moderation policy changes or something else that started in 2014.

    💯

    • lurch (he/him)@sh.itjust.works

      Actually, I was surprised it took off at all, because there are plenty of less formal alternatives, but the name is catchy with devs. Maybe that’s all it took.

      • Eager Eagle@lemmy.world

        It took off because searching for a specific issue is likely to give you a good, comprehensive answer with minimal effort, so it kept being ranked well in search engines.

        Other, less “pedantic” forums are great for discussion and they encourage new questions, but they don’t perform nearly as well for people searching for the answer or context they’re looking for: there’s too much noise in the discussion, and answers are often scattered across multiple topics.

  • ramble81@lemm.ee

    So here’s what I don’t get: LLMs were trained on data from places like SO. SO starts losing users, and thus content. Content that LLMs ingest to stay relevant.

    So where will LLMs get their content after a certain point? Especially for new things that may come out or unique situations. It’s not like it’ll scrape the answer from a web page if people are just asking LLMs.

    • db0@lemmy.dbzer0.com

      The need for the service that SO provided won’t go away. Eventually people will migrate to new places to discuss. LLM creators will either constantly scrape those as well, forcing them to implement more and more countermeasures and GenAI-poison, or the services themselves will enshittify and sell our content (i.e. the commons) to LLM-creators.

    • FaceDeer@fedia.io

      This is an area where synthetic data can be useful. For example, you could scrape the documentation and source code for a Python library and then use an existing LLM to generate questions and answers about the content to train future coding assistants on. As long as the training data is well curated for quality, it’s perfectly useful for this kind of thing; no need for an actual forum.

      AI companies have a lot of clever people working for them; they’re aware of these problems.
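
      A rough sketch of what that pipeline could look like (untested; the model name and prompts are placeholders, and it assumes the openai Python client plus whichever library you want to cover):

      ```python
      # Sketch: turn a Python library's docstrings into synthetic Q&A training pairs.
      import inspect
      import json
      import statistics  # stand-in for whatever library you want to cover

      from openai import OpenAI  # assumes the openai package and an API key in the env

      client = OpenAI()

      def docstring_chunks(module):
          """Yield (qualified_name, docstring) for the module's public functions."""
          for name, obj in inspect.getmembers(module, inspect.isfunction):
              doc = inspect.getdoc(obj)
              if doc and not name.startswith("_"):
                  yield f"{module.__name__}.{name}", doc

      def synthesize_qa(name, doc, n=3):
          """Ask an existing LLM to invent n question/answer pairs about one function."""
          prompt = (
              f"Here is the documentation for {name}:\n\n{doc}\n\n"
              f"Write {n} realistic programmer questions about it, each with a correct, "
              "concise answer. Return a JSON list of objects with 'question' and 'answer' keys."
          )
          resp = client.chat.completions.create(
              model="gpt-4o-mini",  # placeholder; any capable model would do
              messages=[{"role": "user", "content": prompt}],
          )
          # Assumes the model returns bare JSON; in practice you'd validate and
          # curate these pairs before ever training on them.
          return json.loads(resp.choices[0].message.content)

      if __name__ == "__main__":
          with open("synthetic_qa.jsonl", "w") as out:
              for name, doc in docstring_chunks(statistics):
                  for pair in synthesize_qa(name, doc):
                      out.write(json.dumps({"source": name, **pair}) + "\n")
      ```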

      • Natanael@infosec.pub

        You’ll never be able to capture every source of questions that humans might have in LLM training data.

        • FaceDeer@fedia.io

          That’s the neat thing, you don’t.

          LLM training is primarily about getting the LLM to understand concepts. When you need it to be factual, or are working with it to solve novel problems, you can put a bunch of relevant information into the LLM’s context and it can use that even if it wasn’t explicitly trained on it. It’s called RAG, retrieval-augmented generation. Most of the general-purpose LLMs on the net these days do that: when you ask Copilot or Gemini about stuff, it’ll often have footnotes in the response that point to the material it searched up in the background and used as context.

          So for a future Stack Overflow LLM replacement, I’d expect the LLM to be backed up by being able to search through relevant documentation and source code.
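
          A bare-bones illustration of the idea (untested; naive keyword scoring stands in for a real search index, the snippets are made up, and the openai client is a stand-in for whatever model you’d actually use):

          ```python
          # Toy retrieval-augmented generation: stuff the best-matching doc snippets
          # into the prompt instead of relying on what the model memorized in training.
          from openai import OpenAI

          client = OpenAI()

          # Stand-ins for real documentation / source-code chunks pulled from an index.
          DOC_CHUNKS = [
              "requests.get(url, timeout=None) sends an HTTP GET request ...",
              "Response.json() raises requests.JSONDecodeError on invalid bodies ...",
              "Session objects persist cookies and connection pooling across requests ...",
          ]

          def retrieve(question, chunks, k=2):
              """Rank chunks by naive word overlap with the question. A real system
              would use a vector index or full-text search instead."""
              q_words = set(question.lower().split())
              ranked = sorted(chunks,
                              key=lambda c: len(q_words & set(c.lower().split())),
                              reverse=True)
              return ranked[:k]

          def answer(question):
              context = "\n\n".join(retrieve(question, DOC_CHUNKS))
              resp = client.chat.completions.create(
                  model="gpt-4o-mini",  # placeholder model name
                  messages=[
                      {"role": "system",
                       "content": "Answer using only the provided documentation, "
                                  "and say which snippet you relied on."},
                      {"role": "user",
                       "content": f"Documentation:\n{context}\n\nQuestion: {question}"},
                  ],
              )
              return resp.choices[0].message.content

          if __name__ == "__main__":
              print(answer("Why does response.json() raise an error with requests?"))
          ```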

          • Natanael@infosec.pub

            Even then, the summarizer often fails or brings up the wrong thing 🤷

            You’ll still have trouble comparing changes if it needs to look at multiple versions, etc., especially parsing changelogs and comparing them against specific version numbers.

            • FaceDeer@fedia.io

              How does this play out when you hold a human contributor to the same standards? They also often fail to summarize information accurately or bring up the wrong thing. Lots of answers on Stack Overflow are just plain wrong, or focus on the wrong thing, or don’t reference the correct sources (when they reference anything at all). The most common criticism of Stack Overflow I’m seeing is how its human contributors direct people to other threads and declare that the question is “already answered” there when it isn’t really.

              LLMs can do a decent job. And right now they are as bad as they’re ever going to be.

              • Natanael@infosec.pub

                Well-trained humans are still more consistent, more predictable, and easier to teach.

                There’s no guarantee LLMs will get reliably better at everything. They still make some of the same mistakes today that they did when introduced, and nobody knows how to fix that yet.

                • FaceDeer@fedia.io

                  You’re still setting a high standard here. What counts as a “well-trained” human, and how many SO commenters count as that? Also, “easier to teach” is complicated: it takes decades for a human to become well trained, while an LLM can be trained in weeks. And an individual computer that’ll be running the LLM is “trained” in minutes; it just needs to load the model into memory. Once you have an LLM, you can run as many instances of it as you want to spend money on.

                  > There’s no guarantee LLMs will get reliably better at everything.

                  Never said they would. I said they’re as bad as they’re ever going to be, which allows for the possibility that they don’t get any better.

                  Even if they don’t, though, they’re still good enough to have killed Stack Overflow.

                  > They still make some of the same mistakes today that they did when introduced, and nobody knows how to fix that yet.

                  And humans also make mistakes. Do we know how to fix that yet?

    • fubarx@lemmy.world

      The same question applies to all the other websites out there being mined to train LLMs. Google Search’s AI Overviews remove the need for people to visit the linked sites. Traffic plummets. Ads dry up, and the sites go out of business. No new content to train on 🤷🏻‍♂️

    • vala@lemmy.world

      You are assuming that people act in logical ways.

      This is only a problem right now if you think about it.

  • dinckel@lemmy.world

    Make no mistake: LLMs aren’t killing Stack Overflow; LLMs just arrived to finish it off. What was killing it is the regular posters there, and their passive-aggressive bullshit.

      • Quibblekrust@thelemmy.club

        Question closed as off-topic.

        Removed as duplicate of #264826376: “Question closed as duplicate.”

        Sometimes my jokes need explaining...

        I’m pointing out that questions on SO too often get closed as duplicates of adjacent (but distinctly different) questions, and I did so in the most confusing, recursive way possible.

    • SnortsGarlicPowder@lemmy.zip

      Nothing passive about them; it was just regular aggressive. It made my programming coursework so much worse. Indian guys on YouTube, however, now those guys were helpful!

    • skisnow@lemmy.ca

      Yup. I once decided to spend an afternoon answering questions on a framework I was an expert in, as a kind of profile-building exercise to help with job hunting, and after around the third smug, self-satisfied comment picking me up on some piece of irrelevant bullshit I deleted my account.

      • lars@lemmy.sdf.org

        I hate how cathartic it is to watch that mountain of bullies burn to the ground 😌

  • INeedMana@lemmy.world

    I’m not convinced that the number of questions asked is the correct metric. In the end, the point is not to have a constant flow of questions, but rather a constant flow of answers found.

    There is a point of proficiency in a language/library/whatever after which it is faster to find the answer in the code/documentation/test examples than to wait until another person at an even higher level comes along and answers your question.
    Maybe we have simply filled out what needed to be asked in the beginner-bug-found-intermediate space and, apart from questions stemming from new versions etc., SO does not need more questions?

    The expectation for everything to grow constantly is unrealistic.

    • silasmariner@programming.dev

      As more and more libraries are open source on GitHub or GitLab or SourceForge or whateverthefuck, asking questions on the libraries themselves (as an issue) is often the right thing to do, too… Less centralised than SO, but then again the only people who care about how to do things in a lib are the people using the lib, so…

    • notfromhere@lemmy.ml

      Honestly, taking the existing stock of questions and generating current-version answers from the current documentation as synthetic training data is probably the way to go.

  • Maxxie@lemmy.blahaj.zone

    Like it or hate it (personally I prefer the latter; posting there, I felt like a middle schooler with a PUNCH ME sticker on my face), it was a great source of indexable data on programming.

    I wonder how this will affect future search and LLMs, now that all the similar questions are being asked in private LLM threads.

    • interdimensionalmeme@lemmy.ml

      Ah yes, the place that never answered anything.

      The sloppiest of slops before we got AI slop.

      It was the Pinterest of answering stuff.

      • irinotecan@lemmy.world

        Or if they had an answer, they paywalled it, until Google got pissed at them for including the answer in search results but blocking it once the user clicked through. Then they maliciously complied with Google’s demand not to censor by burying the answer under layers upon layers of ads and other “related” questions.

        I was so glad to see SO eat their lunch.

      • gianni@lemmy.ca

        I used it in earnest! (to write shitty VB scripts and PHP websites)

    • Bruncvik@lemmy.world

      I remember when it didn’t have a dash. Until people started making fun of the old URL…

  • wetbeardhairs@lemmy.dbzer0.com

    I had a decently awarded account on SO because I joined it in 2012. I asked and answered questions. For the first few years it was fucking awesome as a professional developer. Then its popularity in Google search results ended up making it too well known, and the comment quality dropped substantially. Then the fucking power users popped up and started flagging almost every one of my questions as duplicates while pointing to unrelated questions. The last time I really used SO was around 2017. I got too fed up to participate in the platform, because when I spent the time to write a well-formed question, it would just get shut down and my time wasted.

  • hexonxonx@lemmy.dbzer0.com

    Stack Overflow hasn’t been useful for at least 10 years, if not longer.

    The flagged “correct” answer is almost always wrong due to idiotic power users and the vast horde of idiots who upvote obviously wrong answers because they’re bootlickers. The real answer is usually buried among the posts by gatekeepers, pedants, idiots with something to prove, wannabe admins, egotistical idiots, the highly opinionated but technologically insecure, etc., ad nauseam. Reddit is just as bad for tech questions, if not worse.

    Since I started using LLMs (running on my own inference server), I haven’t used anything else for tech questions that aren’t opinion-based. Much, much more useful, and it requires you to think seriously about the problem to come up with a good prompt – which often gives you the answer before you even finish the prompt.