Amazon is still seen as a bit of a laggard in the race to develop advanced artificial intelligence, but it has quietly created a lab that is now setting records when it comes to AI performance. Amazon’s AGI SF Lab, which is located in San Francisco and dedicated to building artificial general intelligence, or AI that surpasses the capabilities of humans, revealed the first fruits of its work today: a new AI model capable of powering some of the most advanced AI agents available anywhere.
The new model, called Amazon Nova Act, outperforms ones from OpenAI and Anthropic on several benchmarks designed to gauge the quality and aptitude of AI agents, Amazon says. On the benchmarks GroundUI Web and ScreenSpot, Amazon Nova Act performs better than Claude 3.7 Sonnet and OpenAI Computer Use Agent. A big part of Amazon’s plan to compete in the AI market is to focus on building agents, and the new model’s abilities reflect its efforts to build a generation of tools that can measure up to the very best available.
“I believe that the basic atomic unit of computing in the future is going to be a call to a giant [AI] agent,” says David Luan, who leads Amazon’s AGI SF Lab. He was previously a vice president of engineering at OpenAI and later cofounded Adept, a startup that pioneered work on AI agents, before joining Amazon in 2024 when the ecommerce giant took a stake in the company.
Most of the leading AI labs are now focused on building increasingly capable AI agents. Getting AI to master independent actions, as well as conversation, promises to make the technology more useful and valuable. The shift from chat to action is still very much a work in progress, however.
In the past six months, OpenAI, Anthropic, Google, and others have demonstrated web-browsing agents that take actions in response to a prompt. But for the most part, these agents are still unreliable, and they can easily be tripped up by open-ended requests.
Luan says that Amazon’s goal is building AI agents that are dependable rather than flashy. The thing holding agents back is not the need for “more cool demos of interesting capabilities that work 60 percent of the time, it’s the Waymo problem,” he says, referring to how self-driving cars needed to be trained to deal with different edge cases before they could take to the streets unsupervised.
Many so-called agents are built by combining large language models with multiple human-written rules that are designed to prevent them from veering off course but that also make their behavior brittle. Amazon Nova Act is a version of the company's most powerful homegrown model, Amazon Nova, that has received further training to help it make decisions about what actions to take and at what time. In general, Luan says, AI models struggle to decide when they should intervene in a task.
To improve Nova’s agentic abilities, Amazon is using reinforcement learning, a method that has helped other AI models better simulate reasoning.