Jaden Norman@lemmy.world to Technology@lemmy.world · English · 2 days ago

AI agents wrong ~70% of time: Carnegie Mellon study

www.theregister.com

253 comments
  • cross-posted to:
  • technology@beehaw.org
  • technology@lemmy.ml
Analysis: More fiction than science
  • Log in | Sign up@lemmy.world
    1 day ago

    What’s 0.7^10?

    • Knock_Knock_Lemmy_In@lemmy.world
      1 day ago

      About 0.028

      • Log in | Sign up@lemmy.world
        22 hours ago

        So the chances of it being right ten times in a row are 2%.

        • Knock_Knock_Lemmy_In@lemmy.world
          22 hours ago

          No, the chances of being wrong 10 times in a row are about 3% (0.7^10 ≈ 0.028). So the chances of being right at least once are about 97%.
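The distinction drawn above can be checked with a few lines of Python. This is only a sketch: it assumes each attempt is independent and fails with probability 0.7 (the headline figure), and the names `prob_all`, `P_WRONG` and `N` are made up for illustration.

```python
def prob_all(p: float, n: int) -> float:
    """Probability that n independent events, each with probability p, all occur."""
    return p ** n


P_WRONG = 0.7  # assumed per-attempt failure rate, taken from the headline
N = 10         # number of attempts discussed in the thread

# Chance that every one of the N attempts is wrong.
all_wrong = prob_all(P_WRONG, N)

# Complement: chance that at least one of the N attempts is right.
at_least_one_right = 1.0 - all_wrong

print(f"all {N} wrong:       {all_wrong:.4f}")
print(f"at least one right: {at_least_one_right:.4f}")
```

Note that 0.7^10 is closer to 0.028 than to 0.02, so the two percentages come out nearer 3% and 97%.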

          • Log in | Sign up@lemmy.world
            21 hours ago

            Ah, my bad, you’re right. For being consistently correct, I should have done 0.3^10 = 0.0000059049,

            so the chances of it being right ten times in a row are less than one thousandth of a percent.

            No wonder I couldn’t get it to summarise my list of data right and it was always lying by the 7th row.
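To see how quickly a streak of correct answers becomes unlikely, here is a rough sketch that tabulates the chance of getting the first k rows all right. It assumes each row is independent and correct with probability 0.3, which is the thread's simplification and not something the study states per row; `P_RIGHT` and `streak` are illustrative names.

```python
P_RIGHT = 0.3  # assumed per-row success rate (1 minus the 0.7 failure rate)
ROWS = 10

# streak[k - 1] is the probability that rows 1..k are all correct.
streak = [P_RIGHT ** k for k in range(1, ROWS + 1)]

for k, p in enumerate(streak, start=1):
    print(f"rows 1-{k:>2} all right: {p:.10f} ({p * 100:.7f}%)")
```

Under these assumptions the chance of reaching the 7th row without an error is already about 0.02%, and all ten rows comes out at roughly 0.0006%.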

            • Knock_Knock_Lemmy_In@lemmy.world
              12 hours ago

              That looks better. Even with a fair coin, 10 heads in a row is very unlikely (about 1 in 1,000).

              And if you are feeding the output back into a new instance of a model then the quality is highly likely to degrade.
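For the fair-coin comparison, a small sketch using exact fractions puts the two streak probabilities side by side. The 3/10 success rate is the thread's assumption, and the variable names are illustrative.

```python
from fractions import Fraction

FLIPS = 10

# Exact probability of 10 heads in a row with a fair coin.
ten_heads = Fraction(1, 2) ** FLIPS

# Exact probability of 10 correct answers in a row at an assumed 3/10 success rate.
ten_right = Fraction(3, 10) ** FLIPS

print(f"10 heads in a row:  {ten_heads} = {float(ten_heads):.7f}")
print(f"10 right in a row:  {float(ten_right):.10f}")
print(f"ratio (coin / agent): {float(ten_heads / ten_right):.1f}")
```

The coin streak works out to exactly 1 in 1,024, which is rare but far more likely than the ten-correct streak under the 30% assumption.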

              • Log in | Sign up@lemmy.world
                2 hours ago

                Whereas if you ask a human to do the same thing ten times, the probability that they get all ten right is astronomically higher than 0.0000059049.

