• glimse@lemmy.world
      9 months ago

      Perhaps the best way would be through an analogy:

      “Are there any thermonuclear bombs made specifically for lighting candles?”

    • davel@lemmy.ml
      9 months ago

      Large Language Models are for natural language processing, not for converting between text document file formats.

  • tetris11@lemmy.ml
    9 months ago

    I have to admit, PDF parsing being such a hot and profitable topic in computer science was really something I never saw coming.

    PDFs? The things you can select text from? And when not, there’s decent OCR? And when not, you just ask the person to send you an email or a Word doc?

    It sounds like LLMs are looking for a new unpolluted source of historical data that they can learn from, and this source exists in the form of old scanned-in paper documents. That’s the only reason I can fathom as to why this is such a big thing now.

    • MonkderVierte@lemmy.ml
      9 months ago

      Selecting text doesn’t work in most multi-column PDFs, and good OCR costs money. And if the original source is lost and you want an exact copy in Word, the OCR tools need to be really good at guessing the whitespace-to-line ratio, because PDF is only an output format and not a processing format.
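
      To illustrate why multi-column selection tends to go wrong, here is a minimal sketch. It assumes a page is just a bag of positioned words (the coordinates, the `words` list, and the `column_split` value are all made up for illustration), and compares a naive row-by-row reading with one that knows where the column boundary is:

```python
# Each tuple is (x, y, text): hypothetical word positions on a two-column page,
# with y growing downward. These values are invented for illustration only.
words = [
    (50, 100, "Left-1"), (300, 100, "Right-1"),
    (50, 120, "Left-2"), (300, 120, "Right-2"),
]


def naive_order(words):
    """Read row by row across the full page width (interleaves the columns)."""
    return [text for _, _, text in sorted(words, key=lambda w: (w[1], w[0]))]


def column_aware_order(words, column_split=200):
    """Read the left column top to bottom, then the right column.

    column_split is an assumed, already-known boundary; a real tool has to guess it.
    """
    left = sorted((w for w in words if w[0] < column_split), key=lambda w: w[1])
    right = sorted((w for w in words if w[0] >= column_split), key=lambda w: w[1])
    return [text for _, _, text in left + right]


print(naive_order(words))
print(column_aware_order(words))
```

      The hard part in practice is that nothing in the file tells you where `column_split` is, which is roughly the guessing problem described above.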

      For most other conversion needs, there’s pandoc, ImageMagick and FFmpeg.

    • chicken@lemmy.dbzer0.com
      9 months ago

      Every time I try to convert a PDF to epub or something, or OCR one that doesn’t actually have selectable text, it turns out shit. I assume the real reason people would want to get LLMs involved is that there is actually a lot of ambiguity in what a correct conversion would be, and there are a lot of PDFs out there.

      • JustAnotherKay@lemmy.world
        9 months ago

        I self-host Stirling-PDF and I haven’t had an issue with file conversion in… When did I set this thing up?

        To be truthful, the machine I had it running on has been sent to the grave (I sold it), so I don’t actually have this service running right now.

  • NicolaHaskell@lemmy.ml
    9 months ago

    This is that special blend of Tablet Kid “I don’t need to know things I can google them” and Rich Kid “I don’t need to do things I can crowdsource them” that makes for that Distinctively VP “I don’t know what I’m doing and nobody can tell 👈😎👉”

  • HStone32@lemmy.world
    9 months ago

    The secret to success in software engineering:

    1. Lie and say that there is
    2. Write or use a conversion algorithm
    3. Boss won’t know the difference
    4. Collect bonus at performance evaluation
    5. Put “AI engineer” on resume
  • mesamunefire@lemmy.world
    9 months ago

    Imagine getting a job like this and now half the nation knows your name… that’s terrifying. Being an intern may mean you have no idea of the true scope of what they are asking you to do.

    • GrumpyDuckling@sh.itjust.works
      9 months ago

      We know that his dad is an engineering professor at the University of Nebraska too. Really calls into question his credentials. I checked the other day and they had already removed his contact info from their website.

    • Pieisawesome@lemmy.dbzer0.com
      9 months ago

      They are public employees who are changing things at the core of our government. Why wouldn’t we know their names?

      Government employees’ names aren’t secret (aside from a few exceptions), nor is their pay.

    • crusa187@lemmy.ml
      9 months ago

      Yeah, seems that’s the point. Old enough to competently perform what they’re told, but too young to realize the gravity of the situation and how wrong it is to partake in it.

        • ChapulinColorado@lemmy.world
          9 months ago

          It’s OK, with the experience gained from being forced to grow up, some will come home and use their savings to buy a Dodge Ram on a 7-year loan at 18% APR.

    • some_guy@lemmy.sdf.org
      9 months ago

      Yeah, I don’t really get this one. The class clown is the kid who recognizes a function of a tool, correctly at that. Unlike a dipshit lawyer who let it hallucinate bogus case law. Hilarious.

      • masterofn001@lemmy.ca
        9 months ago

        There are programs that exist that explicitly do these sorts of things.

        They have been around since long before LLMs.

        This is a lazy and uneducated question.

        He demonstrates he has done zero research and goes straight to the buzzword because he knows nothing.

    • sevenapples@lemmygrad.ml
      9 months ago

      Not at all. The only similarity is that LLMs work with text, and the document formats can also represent text.

      Each format (e.g. PDF, JSON, Excel) has a defined standard, so all you have to do to convert between them is map one format’s fields to the other’s. You don’t need an LLM to produce the new format from scratch, and you won’t get good results if you try.
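
      As a rough sketch of what that kind of field mapping can look like, here is JSON to CSV using only the Python standard library. The `FIELD_MAP` keys, the column names, and the sample records are all hypothetical, picked just for illustration:

```python
import csv
import io
import json

# Hypothetical mapping from JSON keys to CSV column headers (illustrative only).
FIELD_MAP = {"name": "Name", "email": "Email", "age": "Age"}


def json_to_csv(json_text: str, field_map: dict[str, str] = FIELD_MAP) -> str:
    """Convert a JSON array of flat objects to CSV text via a fixed field mapping."""
    records = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(field_map.values()), lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        # Missing keys become empty cells instead of being guessed.
        writer.writerow(
            {column: record.get(key, "") for key, column in field_map.items()}
        )
    return buffer.getvalue()


sample = (
    '[{"name": "Ada", "email": "ada@example.com", "age": 36},'
    ' {"name": "Bob", "email": "bob@example.com"}]'
)
print(json_to_csv(sample))
```

      Same input, same output, every time, and a missing field stays empty rather than getting filled in with something plausible-looking.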

      What he’s asking is the equivalent of asking if there’s an LLM made specifically for solving arithmetic problems. Why would you try to solve addition using an LLM?

    • Lka1988@sh.itjust.works
      9 months ago

      That was my thought. Young kids fresh out of school are really easy to manipulate into delusions of grandeur, especially when said delusions are offered by the richest person in the world. He’s gonna leave them out for the wolves.