AI agents have reached a critical inflection point where their ability to generate sophisticated code exceeds the capacity to execute it safely in production environments. Organizations deploying agentic AI face a fundamental dilemma: although large language models (LLMs) can produce complex code scripts, mathematical analyses, and data visualizations, executing this AI-generated code introduces significant security vulnerabilities and operational complexity.
In this post, we introduce the Amazon Bedrock AgentCore Code Interpreter, a fully managed service that enables AI agents to securely execute code in isolated sandbox environments. We discuss how the AgentCore Code Interpreter helps solve challenges around security, scalability, and infrastructure management when deploying AI agents that need computational capabilities. We walk through the service’s key features, demonstrate how it works with practical examples, and show you how to get started with building your own agents using popular frameworks like Strands, LangChain, and LangGraph.
Security and scalability challenges with AI-generated code
Consider an example where an AI agent needs to perform analysis on multi-year sales projection data for a product to understand anomalies, trends, and seasonality. The analysis should be grounded in logic, repeatable, secure in how it handles data, and scalable across large datasets and multiple iterations if needed. Although LLMs excel at understanding and explaining concepts, they lack the ability to directly manipulate data or perform consistent mathematical operations at scale. LLMs alone are often inadequate for complex data analysis tasks like these, due to their inherent limitations in processing large datasets, performing precise calculations, and generating visualizations. This is where code interpretation and execution tools become essential, providing the capability to execute precise calculations, handle large datasets efficiently, and create reproducible analyses through programming languages and specialized libraries. However, implementing code interpretation capabilities comes with significant considerations. Organizations must maintain secure sandbox environments to help prevent malicious code execution, manage resource allocation, and maintain data privacy. The infrastructure requires regular updates, robust monitoring, and careful scaling strategies to handle increasing demand.
Traditional approaches to code execution in AI systems suffer from several limitations:
- Security vulnerabilities – Executing untrusted AI-generated code in production environments exposes organizations to code injection threats, unauthorized system access, and potential data breaches. Without proper sandboxing, malicious or poorly constructed code can compromise entire infrastructure stacks.
- Infrastructure overhead – Building secure execution environments requires extensive DevOps expertise, including container orchestration, network isolation, resource monitoring, and security hardening. Many organizations lack the specialized knowledge to implement these systems correctly.
- Scalability bottlenecks – Traditional code execution environments struggle with the dynamic, unpredictable workloads generated by AI agents. Peak demand can overwhelm static infrastructure, and idle periods waste computational resources.
- Integration complexity – Connecting secure code execution capabilities with existing AI frameworks often requires custom development, creating maintenance overhead and limiting adoption across development teams.
- Compliance challenges – Enterprise environments demand comprehensive audit trails, access controls, and compliance certifications that are difficult to implement and maintain in custom solutions.
These barriers have prevented organizations from fully using the computational capabilities of AI agents, limiting their applications to simple, deterministic tasks rather than the complex, code-dependent workflows that could maximize business value.
Introducing the Amazon Bedrock AgentCore Code Interpreter
With the AgentCore Code Interpreter, AI agents can write and execute code securely in sandbox environments, enhancing their accuracy and expanding their ability to solve complex end-to-end tasks. This purpose-built service minimizes the security, scalability, and integration challenges that have hindered AI agent deployment by providing a fully managed, enterprise-grade code execution system specifically designed for agentic AI workloads. The AgentCore Code Interpreter is designed and built from the ground up for AI-generated code, with built-in safeguards, dynamic resource allocation, and seamless integration with popular AI frameworks. It also offers advanced configuration support, so developers can build powerful agents for complex workflows and data analysis while meeting enterprise security requirements.
Transforming AI agent capabilities
The AgentCore Code Interpreter powers advanced use cases by addressing several critical enterprise requirements:
- Enhanced security posture – Configurable network access options range from fully isolated environments, which provide enhanced security by helping prevent AI-generated code from accessing external systems, to controlled network connectivity that provides flexibility for specific development needs and use cases.
- Zero infrastructure management – The fully managed service minimizes the need for specialized DevOps resources, reducing time-to-market from months to days while maintaining enterprise-grade reliability and security.
- Dynamic scalability – Automatic resource allocation handles varying AI agent workloads without manual intervention, providing low-latency session start-up times during peak demand while optimizing costs during idle periods.
- Framework-agnostic integration – The service integrates with Amazon Bedrock AgentCore Runtime, with native support for popular AI frameworks including Strands, LangChain, LangGraph, and CrewAI, so teams can use existing investments while maintaining development velocity.
- Enterprise compliance – Built-in access controls and comprehensive audit trails facilitate regulatory compliance without additional development overhead.
Purpose-built for AI agent code execution
The AgentCore Code Interpreter represents a shift in how AI agents interact with computational resources. It takes the agent-generated code, runs it in a secure environment, and returns the execution results, including output, errors, and generated visualizations. The service operates as a secure, isolated execution environment where AI agents can run code (Python, JavaScript, and TypeScript), perform complex data analysis, generate visualizations, and execute mathematical computations without compromising system security. Each execution occurs within a dedicated sandbox environment that provides complete isolation from other workloads and the broader AWS infrastructure. What distinguishes the AgentCore Code Interpreter from traditional execution environments is its optimization for AI-generated workloads. The service handles the unpredictable nature of AI-generated code through intelligent resource management, automatic error handling, and built-in security safeguards specifically designed for untrusted code execution.
Key features and capabilities of AgentCore Code Interpreter include:
- Secure sandbox architecture:
  - Low-latency session start-up time and compute-based session isolation facilitating complete workload separation
  - Configurable network access policies supporting both isolated sandbox and controlled public network modes
  - Resource constraints that set maximum limits on memory and CPU usage per session, helping to prevent excessive consumption (see AgentCore Code Interpreter Service Quotas)
- Advanced session management:
  - Persistent session state allowing multi-step code execution workflows
  - Session-based file storage for complex data processing pipelines
  - Automatic session and resource cleanup
  - Support for long-running computational tasks with configurable timeouts
- Comprehensive Python runtime environment:
  - Pre-installed data science libraries, including pandas, numpy, matplotlib, scikit-learn, and scipy
  - Support for popular visualization libraries, including seaborn and bokeh
  - Mathematical computing capabilities with sympy and statsmodels
  - Custom package installation within sandbox boundaries for specialized requirements
- File operations and data management:
  - Upload data files, process them with code, and retrieve the results
  - Secure file transfer mechanisms with automatic encryption
  - Support for upload and download of files directly within the sandbox from Amazon Simple Storage Service (Amazon S3)
  - Support for multiple file formats, including CSV, JSON, Excel, and images
  - Temporary storage with automatic cleanup for enhanced security
  - Support for running AWS Command Line Interface (AWS CLI) commands directly within the sandbox, using the Amazon Bedrock AgentCore SDK and API
- Enterprise integration features:
  - AWS Identity and Access Management (IAM) based access controls with fine-grained permission management
  - AWS CloudTrail integration providing audit trails for compliance
How the AgentCore Code Interpreter works
To understand the functionality of the AgentCore Code Interpreter, let’s examine the orchestrated flow of a typical data analysis request from an AI agent, as illustrated in the following diagram.
The workflow consists of the following key components:
- Deployment and invocation – An agent is built and deployed (for instance, on the AgentCore Runtime) using a framework like Strands, LangChain, LangGraph, or CrewAI. When a user sends a prompt (for example, “Analyze this sales data and show me the trend by sales region”), the AgentCore Runtime initiates a secure, isolated session.
- Reasoning and tool selection – The agent’s underlying LLM analyzes the prompt and determines that it needs to perform a computation. It then selects the AgentCore Code Interpreter as the appropriate tool.
- Secure code execution – The agent generates a code snippet, for instance using the pandas library, to read a data file and matplotlib to create a plot. This code is passed to the AgentCore Code Interpreter, which executes it within its dedicated, sandboxed session. The agent can read from and write files to the session-specific file system.
- Observation and iteration – The AgentCore Code Interpreter returns the result of the execution—such as a calculated value, a dataset, an image file of a graph, or an error message—to the agent. This feedback loop allows the agent to engage in iterative problem-solving by debugging its own code and refining its approach.
- Context and memory – The agent maintains context for subsequent turns in the conversation for the duration of the session. Alternatively, the entire interaction can be persisted in Amazon Bedrock AgentCore Memory for long-term storage and retrieval.
- Monitoring and observability – Throughout this process, a detailed trace of the agent’s execution, providing visibility into agent behavior, performance metrics, and logs, is available for debugging and auditing purposes.
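The observation and iteration step is easier to see in code. The following is a minimal local sketch of that feedback loop, not the service's implementation: `run_snippet` is a hypothetical stand-in for the sandboxed execution call (plain `exec` provides no isolation), and `hypothetical_model` is a toy replacement for the LLM whose first draft contains a bug that it fixes after seeing the error.

```python
import contextlib
import io
from typing import Optional


def run_snippet(code: str, namespace: dict) -> dict:
    """Hypothetical stand-in for sandboxed execution: run code, capture stdout or the error.

    NOTE: exec() is NOT a sandbox. The real service isolates execution remotely.
    """
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
        return {"ok": True, "output": buffer.getvalue()}
    except Exception as exc:
        return {"ok": False, "output": f"{type(exc).__name__}: {exc}"}


def hypothetical_model(prompt: str, last_error: Optional[str]) -> str:
    """Toy replacement for the LLM: the first draft has a typo, the retry fixes it."""
    if last_error is None:
        return "print(sum(sales) / len(sale))"  # 'sale' is undefined
    return "print(sum(sales) / len(sales))"


def agent_loop(prompt: str, namespace: dict, max_attempts: int = 3) -> dict:
    """Generate code, execute it, and feed any error back until it succeeds."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        code = hypothetical_model(prompt, last_error)
        result = run_snippet(code, namespace)
        if result["ok"]:
            return {"attempts": attempt, "output": result["output"].strip()}
        last_error = result["output"]  # the observation the agent reasons over
    return {"attempts": max_attempts, "output": last_error}


outcome = agent_loop("Average the sales figures", {"sales": [120, 135, 150]})
print(outcome)  # {'attempts': 2, 'output': '135.0'}
```

The first attempt fails with a `NameError`, the error text is returned to the model as an observation, and the second attempt succeeds. In a real agent, the framework and the LLM drive this loop, and the execution call goes to the AgentCore Code Interpreter.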
Practical real-world applications and use cases
The AgentCore Code Interpreter can be applied to real-world business problems that are difficult to solve with LLMs alone.
Use case 1: Automated financial analysis
An agent can be tasked with performing on-demand analysis of financial data. For this example, a user provides a CSV file of billing data within the following prompt and asks for analysis and visualization: “Using the billing data provided below, create a bar graph that shows the total spend by product category… After generating the graph, provide a brief interpretation of the results…”
The agent takes the following actions:
- The agent receives the prompt and the data file containing the raw data.
- It invokes the AgentCore Code Interpreter, generating Python code with the pandas library to parse the data into a DataFrame. The agent then generates another code block to group the data by category and sum the costs, and asks the AgentCore Code Interpreter to execute it.
- The agent uses matplotlib to generate a bar chart and the AgentCore Code Interpreter saves it as an image file.
- The agent returns both a textual summary of the findings and the generated PNG image of the graph.
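To make the grouping step concrete, the following is a dependency-free sketch of the aggregation such an agent might generate. It is illustrative only: the billing rows and the `category` and `cost` column names are made up, and it uses the standard library in place of pandas so that it runs anywhere.

```python
import csv
import io
from collections import defaultdict

# Made-up billing rows standing in for the user's CSV file.
BILLING_CSV = """category,cost
Compute,120.50
Storage,30.25
Compute,79.50
Networking,15.00
Storage,19.75
"""


def total_spend_by_category(csv_text: str) -> dict:
    """Group rows by category and sum their cost, largest spend first."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["category"]] += float(row["cost"])
    # Sorting by spend gives a natural left-to-right order for a bar chart.
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


totals = total_spend_by_category(BILLING_CSV)
for category, spend in totals.items():
    print(f"{category:<12}{spend:>10.2f}")
```

With pandas available in the sandbox, the same aggregation is a one-liner such as `df.groupby("category")["cost"].sum()`, and the resulting series can be passed to matplotlib to draw the bar chart.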
Use case 2: Interactive data science assistant
The AgentCore Code Interpreter’s stateful session supports a conversational and iterative workflow for data analysis. For this example, a data scientist uses an agent for exploratory data analysis. The workflow is as follows:
- The user provides a prompt: “Load dataset.csv and provide descriptive statistics.”
- The agent generates and executes pandas.read_csv('dataset.csv') followed by .describe() and returns the statistics table.
- The user prompts, “Plot a scatter plot of column A versus column B.”
- The agent, using the dataset already loaded in its session, generates code with matplotlib.pyplot.scatter() and returns the plot.
- The user prompts, “Run a simple linear regression and provide the R^2 value.”
- The agent generates code using the scikit-learn library to fit a model and calculate the R^2 metric.
This demonstrates iterative code execution capabilities, which allow agents to work through complex data science problems in a turn-by-turn manner with the user.
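The property that makes this turn-by-turn workflow possible is that variables defined in one execution remain available in the next. The following toy sketch illustrates that idea under stated assumptions: `SessionSketch` is a hypothetical local class (a shared namespace around `exec`, with no isolation), the two inline lists stand in for columns of dataset.csv, and the regression is computed by hand instead of with scikit-learn so that the example needs no extra packages.

```python
import contextlib
import io


class SessionSketch:
    """Toy model of a stateful session: every execute() call shares one namespace."""

    def __init__(self):
        self._namespace = {}

    def execute(self, code: str) -> str:
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self._namespace)  # NOT a sandbox, illustration only
        return buffer.getvalue().strip()


session = SessionSketch()

# Turn 1: "load" the data (inline lists stand in for columns A and B).
session.execute("a = [1.0, 2.0, 3.0, 4.0]\nb = [1.0, 3.0, 2.0, 4.0]")

# Turn 2: a later request reuses a and b without reloading them.
regression_code = """
mean_a = sum(a) / len(a)
mean_b = sum(b) / len(b)
slope = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / sum((x - mean_a) ** 2 for x in a)
intercept = mean_b - slope * mean_a
ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(a, b))
ss_tot = sum((y - mean_b) ** 2 for y in b)
r_squared = 1 - ss_res / ss_tot
print(f"R^2 = {r_squared:.2f}")
"""
r2_report = session.execute(regression_code)
print(r2_report)  # R^2 = 0.64
```

Here R^2 is computed as 1 minus the ratio of the residual sum of squares to the total sum of squares, which is the standard definition for a least-squares fit with an intercept.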
Solution overview
To get started with the AgentCore Code Interpreter, clone the GitHub repo:
In the following sections, we show how to create a question answering agent that validates answers through code and reasoning. We build it using the Strands SDK, but you can use a framework of your choice.
Prerequisites
Make sure you have the following prerequisites:
- An AWS account with AgentCore Code Interpreter access
- The necessary IAM permissions to create and manage AgentCore Code Interpreter resources and invoke models on Amazon Bedrock
- The required Python packages installed (including boto3, bedrock-agentcore, and strands)
- Access to Anthropic’s Claude 4 Sonnet model in the us-west-2 AWS Region (Anthropic’s Claude 4 is the default model for the Strands SDK, but you can override and use your preferred model as described in the Strands SDK documentation)
Configure your IAM role
Your IAM role should have appropriate permissions to use the AgentCore Code Interpreter:
Set up and configure the AgentCore Code Interpreter
Complete the following setup and configuration steps:
- Install the bedrock-agentcore Python SDK:
- Import the AgentCore Code Interpreter and other libraries:
- Define the system prompt:
- Define the code execution tool for the agent. Within the tool definition, we use the invoke method to execute the Python code generated by the LLM-powered agent. It automatically starts a serverless AgentCore Code Interpreter session if one doesn’t exist.
- Configure the agent:
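The following sketch shows one way the code execution tool described in these steps could be shaped. It is an approximation written from memory, not the repository's code. The `"executeCode"` method name, the parameter keys, and the `response["stream"]` event shape are assumptions to verify against the bedrock-agentcore SDK documentation. `FakeCodeInterpreterClient` and `make_execute_python` are hypothetical helpers introduced so that the example runs locally without AWS credentials.

```python
import json


class FakeCodeInterpreterClient:
    """Hypothetical local stand-in for the SDK client.

    It mimics the response shape this sketch ASSUMES the SDK returns:
    a dict with a "stream" iterable whose events carry a "result" payload.
    """

    def invoke(self, method: str, params: dict) -> dict:
        result = {"method": method, "echo": params["code"], "isError": False}
        return {"stream": [{"result": result}]}


def make_execute_python(client):
    """Build the tool function around any client that exposes invoke(method, params)."""

    def execute_python(code: str, description: str = "") -> str:
        """Run LLM-generated Python in the sandbox and return the result as JSON."""
        if description:
            code = f"# {description}\n{code}"
        response = client.invoke(
            "executeCode",  # assumed method name
            {"code": code, "language": "python", "clearContext": False},
        )
        # Return the first result event from the stream.
        for event in response["stream"]:
            return json.dumps(event["result"])
        return json.dumps({"isError": True, "message": "no result returned"})

    return execute_python


execute_python = make_execute_python(FakeCodeInterpreterClient())
print(execute_python("print(2 + 2)", description="sanity check"))

# With the real packages installed, the wiring might look roughly like this
# (assumed names, check the SDK and Strands documentation before relying on them):
#   from bedrock_agentcore.tools.code_interpreter_client import CodeInterpreter
#   from strands import Agent, tool
#   client = CodeInterpreter("us-west-2")
#   agent = Agent(tools=[tool(make_execute_python(client))], system_prompt=SYSTEM_PROMPT)
```

Injecting the client keeps the tool logic testable offline. Swapping the fake for the real client should be the only change needed, provided the assumed response shape holds.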
Invoke the agent
Test the AgentCore Code Interpreter powered agent with a simple prompt:
We get the following result:
Pricing and availability
Amazon Bedrock AgentCore is available in multiple Regions and uses a consumption-based pricing model with no upfront commitments or minimum fees. Billing for the AgentCore Code Interpreter is calculated per second and is based on the highest watermark of CPU and memory resources consumed during that second, with a 1-second minimum charge.
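As a rough illustration of per-second, peak-based billing, the following sketch sums the charge for each billed second of a session. The rates are placeholders chosen for easy arithmetic, not published prices, and the assumptions that CPU and memory are metered separately and priced per hour are ours. Refer to the Amazon Bedrock AgentCore pricing page for actual figures.

```python
def estimate_session_cost(per_second_peaks, vcpu_hour_rate, gb_hour_rate):
    """Estimate the cost of one session.

    per_second_peaks: list of (peak_vcpu, peak_memory_gb) tuples, one per billed
    second. Each tuple holds the highest usage observed during that second.
    """
    if not per_second_peaks:
        # A session is billed for at least one second, so supply at least one sample.
        raise ValueError("per_second_peaks must contain at least one sample")
    cost = 0.0
    for peak_vcpu, peak_gb in per_second_peaks:
        # Convert hourly rates to a per-second charge for this second's peaks.
        cost += (peak_vcpu * vcpu_hour_rate + peak_gb * gb_hour_rate) / 3600
    return cost


# Hypothetical three-second session and placeholder rates (NOT real prices).
samples = [(1.0, 2.0), (2.0, 2.0), (0.5, 1.0)]
cost = estimate_session_cost(samples, vcpu_hour_rate=0.36, gb_hour_rate=0.036)
print(f"Estimated cost: ${cost:.6f}")
```

With these placeholder rates, the session accumulates 3.5 vCPU-seconds and 5 GB-seconds, so the estimate is 3.5 × 0.0001 + 5 × 0.00001 = 0.0004.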
Conclusion
The AgentCore Code Interpreter transforms the landscape of AI agent development by solving the critical challenge of secure, scalable code execution in production environments. This purpose-built service minimizes the complex infrastructure requirements, security vulnerabilities, and operational overhead that have historically prevented organizations from deploying sophisticated AI agents capable of complex computational tasks. The service’s architecture—featuring isolated sandbox environments, enterprise-grade security controls, and seamless framework integration—helps development teams focus on agent logic and business value rather than infrastructure complexity.
To learn more, refer to the following resources:
- Introducing Amazon Bedrock AgentCore: Securely deploy and operate AI agents at any scale (preview)
- Execute code and analyze data using Amazon Bedrock AgentCore Code Interpreter
- Code Interpreter API Reference Examples
- Amazon Bedrock AgentCore Code Interpreter GitHub repo
Try it out today or reach out to your AWS account team for a demo!
About the authors
Veda Raman is a Senior Specialist Solutions Architect for generative AI and machine learning at AWS. Veda works with customers to help them architect efficient, secure, and scalable machine learning applications. Veda specializes in generative AI services like Amazon Bedrock and Amazon SageMaker.
Rahul Sharma is a Senior Specialist Solutions Architect at AWS, helping AWS customers build and deploy scalable agentic AI solutions. Prior to joining AWS, Rahul spent more than a decade in technical consulting, engineering, and architecture, helping companies build digital products powered by data and machine learning. In his free time, Rahul enjoys exploring cuisines, traveling, reading books (biographies and humor), and binging on investigative documentaries, in no particular order.
Kishor Aher is a Principal Product Manager at AWS, leading the Agentic AI team responsible for developing first-party tools such as Browser Tool and Code Interpreter. As a founding member of Amazon Bedrock, he spearheaded the vision and successful launch of the service, driving key features including Converse API, Managed Model Customization, and Model Evaluation capabilities. Kishor regularly shares his expertise through speaking engagements at AWS events, including re:Invent and AWS Summits. Outside of work, he pursues his passion for aviation as a general aviation pilot and enjoys playing volleyball.