MAROKO133 Update ai: Upwork study shows AI agents excel with human partners but fail indep

📌 MAROKO133 Breaking ai: Upwork study shows AI agents excel with human partners bu

Artificial intelligence agents powered by the world's most advanced language models routinely fail to complete even straightforward professional tasks on their own, according to groundbreaking research released Thursday by Upwork, the largest online work marketplace.

But the same study reveals a more promising path forward: When AI agents collaborate with human experts, project completion rates surge by up to 70%, suggesting the future of work may not pit humans against machines but rather pair them together in powerful new ways.

The findings, drawn from more than 300 real client projects posted to Upwork's platform, mark the first systematic evaluation of how human expertise amplifies AI agent performance in actual professional work rather than in synthetic tests or academic simulations. The research challenges both the hype around fully autonomous AI agents and fears that such technology will imminently replace knowledge workers.

"AI agents aren't that agentic, meaning they aren't that good," Andrew Rabinovich, Upwork's chief technology officer and head of AI and machine learning, said in an exclusive interview with VentureBeat. "However, when paired with expert human professionals, project completion rates improve dramatically, supporting our firm belief that the future of work will be defined by humans and AI collaborating to get more work done, with human intuition and domain expertise playing a critical role."

How AI agents performed on 300+ real freelance jobs, and why they struggled

Upwork's Human+Agent Productivity Index (HAPI) evaluated how three leading AI systems (Gemini 2.5 Pro, OpenAI's GPT-5, and Claude Sonnet 4) performed on actual jobs posted by paying clients across categories including writing, data science, web development, engineering, sales, and translation.

Critically, Upwork deliberately selected simple, well-defined projects where AI agents stood a reasonable chance of success. These jobs, priced under $500, represent less than 6% of Upwork's total gross services volume, a tiny fraction of the platform's overall business and an acknowledgment of current AI limitations.

"The reality is that although we study AI, and I've been doing this for 25 years, and we see significant breakthroughs, the reality is that these agents aren't that agentic," Rabinovich told VentureBeat. "So if we go up the value chain, the problems become so much more difficult, then we don't think they can solve them at all, even to scratch the surface. So we specifically chose simpler tasks that would give an agent some kind of traction."

Even on these deliberately simplified tasks, AI agents working independently struggled. But when expert freelancers provided feedback, spending an average of just 20 minutes per review cycle, the agents' performance improved substantially with each iteration.

20 minutes of human feedback boosted AI completion rates by up to 70%

The research reveals stark differences in how AI agents perform with and without human guidance across different types of work. For data science and analytics projects, Claude Sonnet 4 achieved a 64% completion rate working alone but jumped to 93% after receiving feedback from a human expert. In sales and marketing work, Gemini 2.5 Pro's completion rate rose from 17% independently to 31% with human input. OpenAI's GPT-5 showed similarly dramatic improvements in engineering and architecture tasks, climbing from 30% to 50% completion.
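The article reports gains both as percentages and as percentage points without saying which measure the "up to 70%" figure uses. As a rough illustration only, the sketch below computes both measures for the three examples quoted above. The function name `gains` and the data layout are assumptions made for this example and do not come from the Upwork study.

```python
# Completion rates quoted in the article:
# (category, model, percent completed alone, percent completed with human feedback)
results = [
    ("Data science and analytics", "Claude Sonnet 4", 64, 93),
    ("Sales and marketing", "Gemini 2.5 Pro", 17, 31),
    ("Engineering and architecture", "GPT-5", 30, 50),
]


def gains(alone, with_feedback):
    """Return (percentage-point gain, relative gain in percent).

    Hypothetical helper for illustration; not part of the study.
    """
    point_gain = with_feedback - alone
    relative_gain = 100 * point_gain / alone
    return point_gain, relative_gain


for category, model, alone, with_feedback in results:
    points, relative = gains(alone, with_feedback)
    print(f"{model} ({category}): +{points} points, {relative:.0f}% relative")
```

The two measures can diverge sharply when the starting rate is low: a 14-point gain from a 17% baseline is a much larger relative change than a 29-point gain from a 64% baseline.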

The pattern held across virtually all categories, with agents responding particularly well to human feedback on qualitative, creative work requiring editorial judgment (areas like writing, translation, and marketing), where completion rates increased by up to 17 percentage points per feedback cycle.

The finding challenges a fundamental assumption in the AI industry: that agent benchmarks conducted in isolation accurately predict real-world performance.

"While we show that in the tasks that we have selected for agents to perform in isolation, they perform similarly to the previous results that we've seen published openly, what we've shown is that in collaboration with humans, the performance of these agents improves surprisingly well," Rabinovich said. "It's not just a one-turn back and forth, but the more feedback the human provides, the better the agent gets at performing."

Why ChatGPT can ace the SAT but can't count the R's in 'strawberry'

The research arrives as the AI industry grapples with a measurement crisis. Traditional benchmarks (standardized tests that AI models can master, sometimes scoring perfectly on SAT exams or mathematics olympiads) have proven poor predictors of real-world capability.

"With advances of large language models, what we're now seeing is that these static, academic datasets are completely saturated," Rabinovich said. "So you could get a perfect score in the SAT test or LSAT or any of the math olympiads, and then you would ask ChatGPT how many R's there are in the word strawberry, and it would get it wrong."

This phenomenon, where AI systems ace formal tests but stumble on trivial real-world questions, has led to growing skepticism about AI capabilities, even as companies race to deploy autonomous agents. Several recent benchmarks from other firms have tested AI agents on Upwork jobs, but those evaluations measured only isolated performance, not the collaborative potential that Upwork's research reveals.

"We wanted to evaluate the quality of these agents on actual real work with economic value associated with it, and not only see how well these agents do, but also see how these agents do in collaboration with humans, because we sort of knew already that in isolation, they're not that advanced," Rabinovich explained.

For Upwork, which connects roughly 800,000 active clients posting more than 3 million jobs annually to a global pool of freelancers, the research serves a strategic business purpose: establishing quality standards for AI agents before allowing them to compete or collaborate with human workers on its platform.

The economics of human-AI teamwork: Why paying for expert feedback still saves money

Despite requiring multiple rounds of human feedback, each lasting about 20 minutes, the time investment remains "orders of magnitude different between a human doing the work alone, versus a human doing the work with an AI agent," Rabinovich said. Where a project might take a freelancer days to complete independently, the agent-plus-human approach can deliver results in hours through iterative cycles of automated work and expert refinement.

The economic implications extend beyond simple t…

Content automatically shortened.

🔗 Source: venturebeat.com


📌 MAROKO133 Update ai: Deaf Man Sues Tesla for Firing Him When His Hearing Aids Ma

A former Tesla employee claims he was fired after raising complaints about how the extreme heat of his work environment was causing his hearing aids to malfunction.

The man, Hans Kohls, made the allegations in a lawsuit against the Elon Musk-owned automaker, which was filed Monday and obtained by The Independent.

Kohls, who is deaf, previously worked at Tesla’s enormous Gigafactory in Austin, Texas, where he was given a job melting aluminum at 1,220 degrees Fahrenheit, according to the reporting. The “extreme heat and moisture” of the casting department, which “far exceed standard industrial heat levels,” the lawsuit said, caused his hearing aids to fail, something that was not only a massive inconvenience but also dangerous: without them, Kohls would be unable to hear alarms and other safety alerts.

But when Kohls asked to be reassigned, the suit alleges, Tesla failed to comply with the Americans With Disabilities Act and instead fired him.

“The facts of this case are stark and troubling,” Kohls’ attorney Andrew Rozynski told The Independent. “Tesla had a highly qualified employee who requested the most basic accommodation under the ADA, reassignment to a vacant position where he’d already demonstrated success. Instead of complying with the law, they fired him within nine days and told him he was being ‘medically separated.’”

The lawsuit adds to longstanding scrutiny over Tesla’s grueling workplace conditions. Several deaths have occurred at its factories, and the company has been accused of underreporting hundreds of workplace injuries.

Much attention has also been paid to Musk’s personal treatment of employees. He has a history of impulsively firing employees on the spot during his frequent fits of rage, and allegedly threatened to deport an employee who raised a critical safety issue.

In February 2024, Kohls applied for an internship through Tesla’s START program, which trains candidates for technical jobs at the company, the lawsuit said. He had previously worked in an industrial environment at another job, so when an interviewer noticed his hearing aid and asked whether he would still be able to work in a hot environment, he answered yes.

But “neither the application nor the interview disclosed that the Casting Department’s extreme heat and humidity conditions would far exceed standard industrial heat levels,” the lawsuit stated. Adding to the feeling of deception, Kohls was first sent to work in other departments at the Texas Gigafactory where conditions were much cooler. In other words, nothing in his training at Tesla prepared him for the extreme heat he would face when he was assigned to the casting department.

In June 2024, Kohls filed a transfer request with Tesla HR, asking to be moved to another position where his hearing aids could still function, a form of disability accommodation that the suit argues is required by the ADA. Instead, Tesla insisted that no other jobs were available, which the suit says is untrue, and that the START program “prohibited transfers.” Nine days later, Kohls was fired and told he was being “medically separated.”

“By characterizing the termination as ‘medical separation,’ Tesla revealed it was terminating Mr. Kohls because he had a disability that required accommodation, not for any legitimate, non-discriminatory reason,” reads the suit.

Kohls is demanding that he be reinstated to a new position at Tesla and is asking a judge to declare the automaker in violation of the ADA and the Texas Commission on Human Rights Act, according to The Independent.


The post Deaf Man Sues Tesla for Firing Him When His Hearing Aids Malfunctioned appeared first on Futurism.

🔗 Source: futurism.com


🤖 MAROKO133 Note

This article is an automated summary compiled from several trusted sources. We pick trending topics so you always stay up to date.

✅ Next update in 30 minutes: a random topic awaits!

Author: timuna