There is a huge corporate incentive that everyone is missing here. With screen recording + OCR, it becomes possible to use this data to replace some labor-intensive but simple tasks of operating a business. If you can create RPA+ML+LLM that can rerun repetitive tasks, you have the holy grail on your hands. I think this is one of the big reasons why M$ is pushing this.
I expect to be downvoted to oblivion, but I do business automation and integration for a living, and I am scared and excited at the same time.
Automation suites exist and they are very much tuned to the individual apps. It seems giving ML an OCR readout of a page is not enough for it to know what it should do (accurately). We have had a training set for “booking flights on a browser” for about 6 years now and no one has figured out how to have it disrupt automated testing: https://miniwob.farama.org/
Lmao do you have any idea how quickly that’s going to go off the rails? They’re going to get into a hallucination feedback loop, which will destroy the integrity of their systems and processes, and they’ll richly deserve it.
At any rate, most highly-effective technical teams have already automated the shit out of all their rote operations without using ML.
Absolutely. Corporations - at least, shitty ones (most of them) - are absolutely salivating at using this. They want to be able to see and easily summarize eeeeeeverything you’re doing.
Some are absolutely already using a form of this. It’s not a hypothetical - this is currently happening and many want way way more.
I was thinking about this, but I don’t know what the plan is for annotating new flows with descriptions of the actions. There’s no point in learning how to send an email or open a webpage; that’s already easy. The value is in a database of uncommon interactions, but it’s only valuable if there is a description to train on.