Wikipedia has a new initiative called WikiProject AI Cleanup. It is a task force of volunteers currently combing through Wikipedia articles, editing or removing false information that appears to have been posted by people using generative AI.

Ilyas Lebleu, a founding member of the cleanup crew, told 404 Media that the crisis began when Wikipedia editors and users started seeing passages that were unmistakably written by a chatbot of some kind.

  • e$tGyr#J2pqM8v@feddit.nl
    2 days ago

    Sabotage Wikipedia, DDoS the Internet Archive. Makes you wonder if in the future we’re going to forget our past. Will actual history be obscured in a sea of alternative histories, presented so that they can’t be told apart from the real thing? Maybe we need to keep some books lying around in archives just to be sure.

    • endofline@lemmy.ca
      2 days ago

      We still have Anna’s Archive, Sci-Hub, LibGen and old-fashioned traditional libraries (including the national ones). National libraries won’t disappear in the next few years; they may rot due to defunding, but they will still exist.

    • TachyonTele@lemm.ee
      2 days ago

      The digital dark age will be a real thing, absolutely.

      Interesting idea on a sea of alternative histories. That might be a possible threat.
      Someone else here called it “AI text apocalypse”. I like that term.

    • NateNate60@lemmy.world
      2 days ago

      “[The] main reasons that motivate editors to add AI-generated content: self-promotion, deliberate hoaxing, and being misinformed into thinking that the generated content is accurate and constructive,” Lebleu said.

  • randon31415@lemmy.world
    3 days ago

    If anyone can survive the AI text apocalypse, it is Wikipedia. They have been fending off and regulating article-writing bots since someone coded up a US town article writer from the 2000 census (not the 2010 or 2020 census, the 2000 census; this bot was writing Wikipedia articles in 2003).

  • kibiz0r@midwest.social
    3 days ago

    Unleashing generative AI on the world was basically the information equivalent of jumping headfirst into Kessler Syndrome.

    • khannie@lemmy.world
      3 days ago

      For the uninitiated like me:

      The Kessler syndrome (also called the Kessler effect,[1][2] collisional cascading, or ablation cascade), proposed by NASA scientists Donald J. Kessler and Burton G. Cour-Palais in 1978, is a scenario in which the density of objects in low Earth orbit (LEO) due to space pollution is numerous enough that collisions between objects could cause a cascade in which each collision generates space debris that increases the likelihood of further collisions.

      Wikipedia link.

  • narc0tic_bird@lemm.ee
    3 days ago

    Best case is that the model used to generate this content was originally trained on data from Wikipedia, so it “just” generates a worse, hallucinated “variant” of the original information. Goes to show how stupid this idea is.

    Imagine this in a loop: AI trained on Wikipedia that then alters content on Wikipedia, which in turn gets picked up by the next model trained. It would just get worse and worse, similar to how re-encoding the same video over and over again yields continuously worse results.

    • 8uurg@lemmy.world
      3 days ago

      A very similar situation to that analysed in this paper that was recently published. The quality of what is generated degrades significantly.

      They mostly investigate replacing all of the data with AI-generated data in each step, though, so I doubt the effect will be as pronounced in practice. Human writing will still be included, and even curation of AI-generated text by people can skew the distribution of the training data (as the cleanup process by these editors would inevitably do, since reasonable-looking text could slip through the cracks).
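
A toy sketch of that loop, for anyone who wants to poke at it. This is not the paper's setup, just a simple stand-in: each "model" is a Gaussian fitted to the previous generation's data, and every name here (`simulate`, `human_fraction`, the sample sizes) is made up for illustration. With `human_fraction=0.0` every generation trains only on the previous model's output; a non-zero value mixes some of the original "human" samples back in each round.

```python
import random
import statistics


def simulate(generations=500, sample_size=20, human_fraction=0.0, seed=0):
    """Return the spread (population std dev) of the data after many rounds
    of 'fit a Gaussian, then sample new training data from the fit'.

    Toy assumption: the original human-written data is standard normal.
    """
    rng = random.Random(seed)
    # Fixed pool of "human" data that later generations may partly reuse.
    human = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    data = list(human)

    n_human = round(human_fraction * sample_size)
    for _ in range(generations):
        # "Train" the model: estimate mean and spread from the current data.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        # "Generate" the next training set, optionally mixed with human data.
        synthetic = [rng.gauss(mu, sigma) for _ in range(sample_size - n_human)]
        data = rng.sample(human, n_human) + synthetic

    return statistics.pstdev(data)


if __name__ == "__main__":
    print("synthetic only:", simulate(human_fraction=0.0))
    print("half human    :", simulate(human_fraction=0.5))
```

In the synthetic-only run the spread shrinks toward zero, because every refit loses a little of the tails and nothing ever puts them back. Keeping half of each training set human anchors it, which is roughly the reason to expect a milder effect in practice.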

      • Blaster M@lemmy.world
        2 days ago

        AI model makers are very well aware of this, and there is a move from ingesting everything to curating datasets more aggressively. Data prep is something many upstarts have no idea is critical, but everyone is learning about it, sometimes the hard way.

    • huginn@feddit.it
      3 days ago

      See also: model collapse

      (Which is more or less just regression towards the mean with more steps)

    • Wrench@lemmy.world
      3 days ago

      Yes, this is what many of us worry the internet in general will become: AI content generated by AI trained on AI garbage.

      AI bots can trivially outpace humans.

      • kboy101222@sh.itjust.works
        3 days ago

        I was just discussing with a friend of mine how we’re rapidly approaching the dead internet. At some point, many websites will likely just be chat bots talking to other chat bots, whose output then gets used to train further chat bots. Human-made content is already becoming harder and harder to find on algorithm-heavy websites like Reddit and Facebook’s suite of sites. The bots can easily outpace any algorithmic changes the sites might make to help deter them; my FB-using family members all constantly block those weird Jesus accounts and they still show up constantly.

      • FeelzGoodMan420@eviltoast.org
        3 days ago

        I wouldn’t know. I use Pihole to block all ads on my TV OS. I’m curious though, which service/app is giving you ads on pause? Do you mean like on a Roku TV where the screensaver is ads? Many TVs let you disable that (e.g. LG WebOS). Otherwise Pihole is your friend :-)

        • sugar_in_your_tea@sh.itjust.works
          3 days ago

          My TV is old enough that it doesn’t have it, I’m just talking about the general trend toward making that a thing. I’m not going to buy a TV that forces ads on me, and the fact that I have to actively look for that on my next TV is appalling.

          • FeelzGoodMan420@eviltoast.org
            3 days ago

            I have bad news for you. Literally every TV has ads now. Every. Single. One. That’s why I keep harping on Pihole. It blocks them.

            • sugar_in_your_tea@sh.itjust.works
              3 days ago

              Not the commercial grade ones, like “hospitality” TVs. They’re more expensive, but they’re also intended to be a bit more reliable as well.

              I’m worried they’ll adapt the ads to not be blockable w/ Pihole.

  • lolola@lemmy.blahaj.zone
    3 days ago

    I hate to post because I have loved and trusted Wikipedia for years, but the fact that there are folks out there who equally trust what AI tools generate just baffles me.

    • Dragonstaff@leminal.space
      3 days ago

      The signal to noise ratio is so low these days. There’s so much information out there but everyone wants to profit from you before you can get it. Even worse, the people with good information generally can’t buy as big a megaphone as the people who profit from lying to you.

      Honestly, I think humans have been more likely to believe an easy lie than a hard truth all along, but it’s easier than ever these days.

  • Aatube@kbin.melroy.org
    3 days ago

    Don’t worry, it’s not as bad as the title suggests. The attack on Internet Archive is far, far worse. It’s obviously a bit of a problem, though.