A new report from plagiarism detector Copyleaks found that 60% of OpenAI’s GPT-3.5 outputs contained some form of plagiarism.

Why it matters: Content creators from authors and songwriters to The New York Times are arguing in court that generative AI trained on copyrighted material ends up spitting out exact copies.

  • kromem@lemmy.world
    8 months ago

    “Plagiarism detection company claims LLM output contains plagiarism according to their detector.”

    I wonder how many student-written essays also contain ‘plagiarism’ according to their tool.

    • CheeseNoodle@lemmy.world
      8 months ago

      100% iirc, there are only so many ways to write about how the blue curtains indicate the character is feeling depressed or something.

    • Anamnesis@lemmy.world
      8 months ago

      Probably very few. The bias for these companies is in false negatives, not false positives, since false positives create controversy when students appeal a ruling.

      • General_Effort@lemmy.world
        8 months ago

        The bias here was certainly toward coming up with a lot of false positives for advertising, kinda like anti-virus companies do.