Amazon announces Nova, a new family of multimodal AI models

by admin December 3, 2024

At its re:Invent 2024 conference, Amazon Web Services (AWS), Amazon’s cloud computing division, announced Nova, a new family of multimodal generative AI models.

There are four text-focused models in total: Micro, Lite, Pro, and Premier. Micro, Lite, and Pro are available today for AWS customers, while Premier will launch in Q1 2025, Amazon CEO Andy Jassy said onstage.

In addition to those, there’s an image generation model, Nova Canvas, and a video-generating model, Nova Reel. Both are publicly available today.

“We’ve continued to work on our own frontier models,” Jassy said, “and those frontier models have made a tremendous amount of progress over the last four to five months. And we figured, if we were finding value out of them, you would probably find value out of them.”

Micro, Lite, Pro, and Premier

The text-focused Nova models differ mainly in their capabilities and sizes.

Micro, which can only take in text and output text, delivers the lowest latency of the bunch — processing text and generating answers the fastest. Lite can process image, video, and text inputs reasonably quickly. Pro offers the “best combination of accuracy, speed, and cost” for a range of tasks, Amazon says. And Premier is the most capable, designed for complex workloads.

Pro and Premier, like Lite, can analyze text, images, or video.

Jassy claims the Nova models are among the fastest in their class, and the cheapest to run. They’re available in Amazon Bedrock, AWS’ AI development platform, where they can be fine-tuned and distilled for improved speed and lower costs.

“We’ve optimized these models to work with proprietary systems and APIs, so that you can do multiple orchestrated automatic steps — agent behavior — much more easily with these models,” Jassy added. “So I think these are very compelling.”

Canvas and Reel

Canvas and Reel are Amazon’s strongest play yet for generative media.

Canvas lets users generate and edit images using prompts, and provides controls for the generated image’s color scheme and layout. Reel, the more ambitious of the two models, creates videos up to six seconds in length from prompts. Using Reel, users can adjust the camera motion to generate videos with pans, 360-degree rotations, and zoom.

Reel’s currently limited to six-second videos, but a version that can generate two-minute-long videos is “coming soon,” according to Amazon.

Jassy stressed that both Canvas and Reel have “built-in” controls for responsible use, including watermarking and content moderation. “[We’re trying] to limit the generation of harmful content,” he said.

So, what’s next for Nova? Jassy said that Amazon’s working on a speech-to-speech model for Q1 2025 and an “any-to-any” model that should arrive around mid-2025.

“You’ll be able to input text, speech, images, or video and output text, speech, images, and video,” Jassy said of the any-to-any model. “This is the future of how frontier models are going to be built and consumed.”
