In December 2025, we announced the availability of reinforcement fine-tuning (RFT) on Amazon Bedrock, starting with support for Nova models. This was followed by extended support for open-weight models such as OpenAI GPT OSS 20B and Qwen 3 32B in February 2026. RFT...
Nemotron 3 Super is now available as a fully managed, serverless model on Amazon Bedrock, joining the Nemotron Nano models already available on the platform.
With NVIDIA Nemotron open models on Amazon Bedrock, you can accelerate innovation and deliver tangible...
Despite a wide array of Nova customization offerings, customizing models and transitioning between platforms has traditionally been intricate, requiring technical expertise, infrastructure setup, and considerable time investment. This gap between potential and practical application is precisely what we aimed to address. Nova...
We thank Greg Pereira and Robert Shaw from the llm-d team for their support in bringing llm-d to AWS.
In the agentic and reasoning era, large language models (LLMs) generate 10x more tokens, and consume correspondingly more compute, through complex reasoning chains compared to single-shot replies. Agentic AI...
EAGLE is the state-of-the-art method for speculative decoding in large language model (LLM) inference, but its autoregressive drafting creates a hidden bottleneck: the more tokens you speculate, the more sequential forward passes the drafter needs. Eventually that overhead eats into your gains. P-EAGLE...
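The trade-off described above can be sketched with a simple cost model (this is an illustrative model of speculative decoding in general, not P-EAGLE's actual method; the acceptance rate `a` and relative drafter cost `c` are assumed parameters, not measured values):

```python
# Toy model: speedup of speculative decoding versus draft length k,
# assuming each drafter forward pass costs a fraction c of one
# target-model pass and each drafted token is accepted independently
# with probability a.

def expected_accepted(a: float, k: int) -> float:
    # Expected tokens produced per verification cycle (including the
    # target model's bonus token): (1 - a^(k+1)) / (1 - a).
    return (1 - a ** (k + 1)) / (1 - a)

def speedup(a: float, c: float, k: int) -> float:
    # Cost per cycle: k sequential drafter passes plus one target pass.
    return expected_accepted(a, k) / (k * c + 1)

if __name__ == "__main__":
    a, c = 0.8, 0.05  # hypothetical values for illustration
    for k in (1, 2, 4, 8, 16, 32):
        print(f"k={k:2d}  speedup={speedup(a, c, k):.2f}")
```

Under these assumed parameters the speedup rises with the draft length, peaks, and then declines, since expected accepted tokens saturate while the sequential drafting cost keeps growing linearly in k.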
This post is a collaboration between AWS, NVIDIA, and Heidi.
Automatic speech recognition (ASR), often called speech-to-text (STT), is becoming increasingly critical across industries like healthcare, customer service, and media production. While pre-trained models offer strong capabilities for general speech, fine-tuning for specific domains and...