This is Part II of a two-part series from the AWS Generative AI Innovation Center. If you missed Part I, refer to Operationalizing Agentic AI Part 1: A Stakeholder’s Guide.
The biggest barrier to agentic AI isn’t the technology—it’s the...
We thank Greg Pereira and Robert Shaw from the llm-d team for their support in bringing llm-d to AWS.
In the agentic and reasoning era, large language models (LLMs) generate 10x more tokens, and consume correspondingly more compute, through complex reasoning chains compared...
Building and managing machine learning (ML) features at scale is one of the most critical and complex challenges in modern data science workflows. Organizations often struggle with fragmented feature pipelines, inconsistent data definitions, and redundant engineering efforts across teams....
This post is cowritten with Ilija Subanovic and Michael Rice from Workhuman.
Workhuman’s customer service and analytics team was drowning in one-time reporting requests from seven million users worldwide—a common challenge with legacy reporting tools at scale. Business intelligence (BI)...
EAGLE is the state-of-the-art method for speculative decoding in large language model (LLM) inference, but its autoregressive drafting creates a hidden bottleneck: the more tokens you speculate, the more sequential forward passes the drafter needs. Eventually those overhead...
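The bottleneck described above can be illustrated with a minimal sketch (all names here are hypothetical stand-ins, not EAGLE's actual implementation): because each drafted token conditions on the previously drafted ones, drafting k tokens requires k sequential drafter forward passes.

```python
# Minimal sketch of autoregressive speculative drafting.
# `toy_drafter_forward` is a hypothetical stand-in for a small draft
# model's forward pass; the point is only that drafting k tokens
# requires k sequential passes, so drafter overhead grows with k.

def toy_drafter_forward(context):
    """Stand-in for one draft-model forward pass: returns the next token."""
    return (sum(context) + 1) % 100  # deterministic toy logic

def draft_tokens(context, k):
    """Autoregressively draft k tokens, counting sequential forward passes."""
    drafted, passes = [], 0
    ctx = list(context)
    for _ in range(k):
        passes += 1                  # one forward pass per speculated token
        tok = toy_drafter_forward(ctx)
        drafted.append(tok)
        ctx.append(tok)              # each new draft conditions on prior drafts
    return drafted, passes

drafted, passes = draft_tokens([3, 7], k=4)
print(passes)  # 4 sequential drafter passes for 4 speculated tokens
```

The loop cannot be parallelized across the k positions, which is why increasing the speculation length eventually erodes the speedup that speculative decoding provides.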
Deploying AI agents safely in regulated industries is challenging. Without proper boundaries, agents that access sensitive data or execute transactions can pose significant security risks. Unlike traditional software, an AI agent chooses actions to achieve a goal by invoking...