• BringMeTheDiscoKing@lemmy.ca
    9 months ago

    If you aren’t the customer, you are the product. Congrats on being monetized and kinda sorta immortalized as a series of weights.

  • Wolpertinger@sh.itjust.works
    9 months ago

    So I need to run any comments I make to reddit by chatgpt before posting, it seems. I heard ai training ai leads to a poisoned data set.

    • Fishbone@lemmy.world
      9 months ago

      For text, AI training AI wouldn’t be all that great for giving data sets a little poison ivy rubdown, because at the end of the day, the message is still moderated by a non-bot. I think a better way would be to write more unconventionally but with heavy reliance on context, so that if specific texts are ripped out and tossed into the bot blender, they’ll make no sense without the context alongside them.

      Slang, edge case wording, and verbing non verbs would likely do a lot of heavy lifting in that department.

      • addie@feddit.uk
        9 months ago

        LLMs used for corporate communications - automatically-generated complaint responses, and the like - usually have swearing disabled, so if you want to fuck up their shit, be sure to express yourself with as many fucking swears as possible. Let’s get that shit into those cunts’ language models ASAP.

    • rar@discuss.online
      9 months ago

      It’s all federated, so it would be strange if the bots didn’t scrape anything.

      • deweydecibel@lemmy.world
        9 months ago

        And ya know what? Frankly, if AI is going to harvest all this shit, I’d rather fuckers like spez couldn’t get rich off it in the process. Granted I’m not happy the tech bros running these AI companies are getting rich with these fucking things, but I can at least take solace that there isn’t some asshole middle man making bank off the work and words of users they never paid a dime to.

        Genuinely, why do spez and Reddit deserve to make money off anything we posted? Why does any social media site? They make the site, pay for the servers, maintain the apps, sure, and they can get compensation for that, I don’t see a problem there. But why does any social media company deserve to get rich when the only thing that makes their platform valuable is the people that post to it? Reddit didn’t even have paid mods, the community did all the work on the content of that site, so why in the fuck do we tolerate these assholes making profit off it like this?

        • General_Effort@lemmy.world
          9 months ago

          This is sad to read because I agree with all of it (except the casual sexism).

          why in the fuck do we tolerate these assholes making profit off it like this?

          Look at this thread. People delete their posts on Reddit, which means that they can no longer be scraped for free, which means they are now exclusively available in Reddit’s archive. It’s not that people tolerate it. It’s that the first instinct of people who don’t tolerate it is to make it worse. What can you do?

        • 👍Maximum Derek👍@discuss.tchncs.de
          9 months ago

          If the EU (or any other government) decides that AI companies can’t legally train their models on information they don’t own or license (I don’t know how that would work legally but they talk about it), then this company that Reddit has sold access to could argue to lawmakers that they have a license for all the content on Reddit. I don’t know that it would hold up, but I suspect it’s part of the company’s perceived value in this Reddit deal.

    • OmanMkII@aussie.zone
      9 months ago

      I was curious if a robots.txt equivalent exists for AI training data, and there were some solid points here:

      If I go to your writing, I read it & learn from it. Your writing influences my future writing. We’ve been okay with this as long as it’s not a blatant forgery.

      If a computer goes to your writing, it reads it & learns from it. Your writing influences its future writing. It seems we are not okay with this, even if it isn’t blatant forgery.

      [AI at the moment is] different because the company is re-using your material to create a product they are going to sell. I’m not sure if I believe that is so different than a human employee doing the same thing.

      https://news.ycombinator.com/item?id=34324208

      I still think we should have the ability to opt out like we do with search engines and webcrawlers, but if the algorithm works ideally and learns but does not recycle content, is it truly any different from a factory of workers pumping out clones of popular series on Amazon? I honestly don’t know the answer to that.

      • deweydecibel@lemmy.world
        9 months ago

        The problem is not the technology, the problem is the businesses and the people behind them.

        These tools were made with the explicit purpose of taking content that they did not create, repurposing it, and creating a product. Throw all these conversations about intelligence and learning out the fucking window, what matters is what the thing does, and why it was created to do that thing.

        Until we reach a point where there is some sort of AI out there that has any semblance of free will, and can choose not to learn if fed certain information, and choose not to respond to input given to it without being programmed not to respond, then we are not talking about intelligence, we are talking about a tool. No matter how they dress it up.

        Stop arguing about this on their terms, because they’re gaslighting the fuck out of you.

      • Mossy Feathers (She/They)@pawb.social
        9 months ago

        This is kinda my take on it. However, the way I see it is that the AI isn’t intelligent enough yet to truly create something original. As such, right now AI is closer to being a tool than a being. Because of that, it somewhat bothers me that I’m being used to teach a tool. If I thought that companies like OpenAI were truly trying to create beings and not tools, then I’d feel differently.

        It’s kinda nuanced, but a being can voluntarily determine whether or not something is copyright infringing, understand why that might be an issue, and then decide whether or not to continue writing based on that. A tool can’t really do that. You can try and add filters to a tool to avoid writing copyrighted text, but that will have flaws and holes in it. A being who understands what it’s writing and what makes it plagiarism vs reference vs homage/inspiration/whatever is less likely to have those issues.

      • Appoxo@lemmy.dbzer0.com
        9 months ago

        Afaik the OpenAI bot may choose to ignore it? At least that’s what another user claimed it does.

        • JohnEdwa@sopuli.xyz
          9 months ago

          Robots.txt has always been ignored by some bots. It’s just a guideline, originally meant to prevent excessive bandwidth usage by search indexing bots, and is entirely voluntary.

          The Archive.org bot, for example, has completely ignored it since 2017.
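
To make the "entirely voluntary" part concrete, here's a rough sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the example.com URL and the `SomeSearchBot` name are made up for illustration; `GPTBot` is the user agent OpenAI documents for its crawler. All this does is tell a well-behaved bot what the site asked for. Nothing stops a crawler from skipping the check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask one AI crawler to stay out, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots.txt permits this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite crawler runs a check like this before each request.
# A rude one simply never calls it.
gptbot_ok = is_allowed("GPTBot", "https://example.com/post/1")
other_ok = is_allowed("SomeSearchBot", "https://example.com/post/1")
```

So the opt-out only works for bots that choose to ask the question in the first place.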

  • Morcyphr@lemmy.one
    9 months ago

    Who cares? Fuck reddit. Half the content is bots anyway. So, bots stealing content to train AI to make content, which the bots will steal and repost. Circle of death for reddit. Good luck with that IPO.

  • wise_pancake@lemmy.ca
    9 months ago

    We should have been posting factually incorrect information instead of deleting posts this whole time.

    Although I think Reddit does a good job posting factually incorrect information on its own.

  • Fake4000@lemmy.worldOP
    9 months ago

    Shit move from Reddit. Glad I jumped ship to lemmy.

    Honestly, lemmy has fewer users compared to Reddit, yet you still get more engagement.

    • DarkNightoftheSoul@mander.xyz
      9 months ago

      The only engagement you actually get is on super-niche subreddits. Other than that, the “engagement” is largely indistinguishable from bot traffic.

    • EatATaco@lemm.ee
      9 months ago

      You are glad that you jumped to where AI companies can get the information for free, but are mad at Reddit for getting paid for it.

      I can’t make any sense of this.

      • grue@lemmy.world
        9 months ago

        It’s like the difference between volunteering and being forced to do community service.

        • EatATaco@lemm.ee
          9 months ago

          In neither case are you forced to do anything so this doesn’t make any sense either.

      • TORFdot0@lemmy.world
        9 months ago

        The difference is that Lemmy admins across the fediverse aren’t making the user experience worse so they can sell the data to corporations for LLM training

        • EatATaco@lemm.ee
          9 months ago

          So it’s really that the user experience is getting worse. Feeding ai has nothing to do with it.

          • tacofox@lemm.ee
            9 months ago

            First of all, tacos are friends, not food…

            Secondly, I think what matters more is what they did to achieve this goal: locking the API behind a paywall was their way of creating value in their data. They knew then that it would be too expensive for independent developers to pay for but didn’t care. They knew the money would be coming from AI data brokers.

          • Alex@feddit.ro
            9 months ago

            I’d rather have AI companies have my data for free than reddshit getting paid for it

  • bbkpr@lemmy.world
    9 months ago

    Good, so let’s train crappy AI on posts by crappier AI, which was trained by posts from even crappier AI before it.

  • COASTER1921@lemmy.ml
    9 months ago

    If they hadn’t applied the same charges to legitimate 3rd party applications they could still do this and have avoided the massive community backlash.

    Considering their horrible track record with advertising and selling Reddit premium, this should be the single best way for them to finally monetize their platform. They didn’t need to destroy what little credibility they had left with their users to get to this point, but for whatever reason they did.

    • Fake4000@lemmy.worldOP
      9 months ago

      What I don’t understand is that they had the option of providing a free service to all third party apps provided there was no commercial use.

      They could have easily asked for a cut from any AI company using their data for training.

      • COASTER1921@lemmy.ml
        9 months ago

        Not only did they have the option, as I understand it the API was even configured as such since all requests from an app shared the same API key. They’re basically whitelisting like this now but only for the accessibility oriented 3rd party apps.

  • xantoxis@lemmy.world
    9 months ago

    Damn. I keep meaning to use one of those things that deletes all your reddit data. I doubt it’ll actually do anything (reddit has no ethical framework so they won’t think twice about indexing “deleted” data) but I still need to do that.

    • Alpha71@lemmy.world
      9 months ago

      Yeah, I deleted a banned account only to find the posts I made still up. So I went in and manually deleted EVERY. SINGLE. ONE.

      Guess what. They still show up.

    • ipkpjersi@lemmy.ml
      9 months ago

      I’d bet a year of my salary that it only deletes it from public view so people can no longer get help from Reddit’s Google search results, but a copy (or more than one copy) is still retained on their internal servers.

      • Dettweiler@lemm.ee
        9 months ago

        The trick is to turn everything into randomized garbage and then delete it later. A lot of those purge services offer that feature. It just swaps the words with others; so on the surface it looks like proper written text, but it makes absolutely no sense.

        Aside from removing your content that they’re profiting from, it also feeds AI scrapers pure garbage in the event that your content is restored.
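
For anyone wondering what the word-swap step might look like, here's a minimal sketch. It is not how any particular purge service works: `scramble_words` is a made-up helper, and shuffling the words within a single comment is just one assumed way to get text that looks normal on the surface but means nothing. Posting the result back to Reddit is left out entirely.

```python
import random
import re
from typing import Optional

# Runs of letters and apostrophes count as words; everything else stays put.
WORD_RE = re.compile(r"[A-Za-z']+")


def scramble_words(text: str, seed: Optional[int] = None) -> str:
    """Shuffle the words of a comment while keeping punctuation and spacing."""
    rng = random.Random(seed)
    words = WORD_RE.findall(text)
    rng.shuffle(words)
    replacements = iter(words)
    # Put a shuffled word back into each slot where a word used to be.
    return WORD_RE.sub(lambda match: next(replacements), text)


original = "Reddit restored it all, right after."
garbage = scramble_words(original, seed=1)
print(garbage)
```

Since only the word order changes, the output keeps the same vocabulary and punctuation as a real comment, which is the point: it reads like text at a glance and like nonsense on inspection.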

        • JeeBaiChow@lemmy.world
          9 months ago

          Me, I’d prefer to fill it in with fake news. Let them train their bots on ‘taylor swift is an alien psyop trained to infiltrate the highest levels of govt to fulfill the agenda of the radical left interstellar warmongering fearlords …’

        • Crackhappy@lemmy.world
          9 months ago

          Yep. I did that over a month to all of my posts and comments, then deleted it all a week later before deleting my account.

      • HonorIsDead@lemmy.world
        9 months ago

        Maybe I’m misremembering but weren’t they restoring stuff users deleted during the API protest?

        • philodendron@lemdro.id
          9 months ago

          They were. One user got so upset he live-streamed himself individually deleting every post and comment he’d ever made. Reddit restored it all right after.

  • mellowheat@suppo.fi
    9 months ago

    Well of course, that’s the #1 reason why everyone stopped providing free-to-use APIs last year. Because AI companies were getting all that data for free via those APIs.

  • gapbetweenus@feddit.de
    9 months ago

    If user content belongs to the service provider, one would think that they are responsible for it.

  • Postreader2814@lemm.ee
    9 months ago

    That post reminded me that lemmee exists. Accounts didn’t work that great when I first got here but I made one today and got verified. Logged out of Reddit for the last time and replaced my comments. Eff that place right in its a-hole. Good riddance.