HomeGen AI News Talk

Gen AI News Talk

Accelerating LLM inference with post-training weight and activation using AWQ and GPTQ on Amazon SageMaker AI

Foundation models (FMs) and large language models (LLMs) have been rapidly scaling, often doubling in parameter count within months, leading to significant improvements in language understanding and generative capabilities. This rapid growth comes with steep costs: inference now requires enormous memory capacity, high-performance GPUs,...

Sentiment Analysis with Text and Audio Using AWS Generative AI Services: Approaches, Challenges, and Solutions

This post is co-written by Instituto de Ciência e Tecnologia Itaú (ICTi) and AWS. Sentiment analysis has grown increasingly important in modern enterprises, providing insights into customer opinions, satisfaction levels, and potential frustrations. As interactions occur largely through text (such as social media, chat applications,...

Architecting TrueLook’s AI-powered construction safety system on Amazon SageMaker AI

This post is co-written by TrueLook and AWS. TrueLook is a construction camera and jobsite intelligence company that provides real-time visibility into construction projects. Its platform combines high-resolution time-lapse cameras, live video streaming, and AI-powered insights to help teams monitor progress, improve accountability, and reduce...

How Beekeeper optimized user personalization with Amazon Bedrock

This post is cowritten by Mike Koźmiński from Beekeeper. Large Language Models (LLMs) are evolving rapidly, making it difficult for organizations to select the best model for each specific use case, optimize prompts for quality and cost, adapt to changing model capabilities, and personalize responses...

Scaling medical content review at Flo Health using Amazon Bedrock (Part 1)

This blog post is based on work co-developed with Flo Health. Healthcare science is rapidly advancing. Maintaining accurate and up-to-date medical content directly impacts people’s lives, health decisions, and well-being. When someone searches for health information, they are often at their most vulnerable, making accuracy...

Speed meets scale: Load testing SageMakerAI endpoints with Observe.AI’s testing tool

This post is cowritten with Aashraya Sachdeva from Observe.ai. You can use Amazon SageMaker to build, train and deploy machine learning (ML) models, including large language models (LLMs) and other foundation models (FMs). This helps you significantly reduce the time required for a range of...