HomeGen AI News Talk

Gen AI News Talk

Elevate customer experience by using the Amazon Q Business custom plugin for New Relic AI

Digital experience interruptions can harm customer satisfaction and business performance across industries. Application failures, slow load times, and service unavailability can lead to user frustration, decreased engagement, and revenue loss. The risk and impact of outages increase during peak usage periods, which vary by...

Query structured data from Amazon Q Business using Amazon QuickSight integration

Amazon Q Business is a generative AI-powered assistant that can answer questions, provide summaries, generate content, and securely complete tasks based on data and information in your enterprise systems. Although generative AI is fueling transformative innovations, enterprises may still experience sharply divided data silos...

Introducing Fast Model Loader in SageMaker Inference: Accelerate autoscaling for your Large Language Models (LLMs) – Part 2

In Part 1 of this series, we introduced Amazon SageMaker Fast Model Loader, a new capability in Amazon SageMaker that significantly reduces the time required to deploy and scale large language models (LLMs) for inference. We discussed how this innovation addresses one of the major...

Introducing Fast Model Loader in SageMaker Inference: Accelerate autoscaling for your Large Language Models (LLMs) – part 1

The generative AI landscape has been rapidly evolving, with large language models (LLMs) at the forefront of this transformation. These models have grown exponentially in size and complexity, with some now containing hundreds of billions of parameters and requiring hundreds of gigabytes of memory....

Unlock cost savings with the new scale down to zero feature in SageMaker Inference

Today at AWS re:Invent 2024, we are excited to announce a new feature for Amazon SageMaker inference endpoints: the ability to scale SageMaker inference endpoints to zero instances. This long-awaited capability is a game changer for our customers using the power of AI and...

Supercharge your auto scaling for generative AI inference – Introducing Container Caching in SageMaker Inference

Today at AWS re:Invent 2024, we are excited to announce the new Container Caching capability in Amazon SageMaker, which significantly reduces the time required to scale generative AI  models for inference. This innovation allows you to scale your models faster, observing up to 56%...