Streamline grant proposal reviews using Amazon Bedrock

Government and non-profit organizations evaluating grant proposals face a significant challenge: sifting through hundreds of detailed submissions, each with unique merits, to identify the most promising initiatives. This arduous, time-consuming review is typically the first step in the grant management process, which is critical to driving meaningful social impact.

The AWS Social Responsibility & Impact (SRI) team recognized an opportunity to augment this function using generative AI. The team developed an innovative solution to streamline grant proposal review and evaluation by using the natural language processing (NLP) capabilities of Amazon Bedrock. Amazon Bedrock is a fully managed service that lets you use your choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, along with a broad set of capabilities that you need to build generative AI applications with security, privacy, and responsible AI.

Historically, AWS Health Equity Initiative applications were reviewed manually by a review committee. It took 14 or more days each cycle for all applications to be fully reviewed. On average, the program received 90 applications per cycle. The June 2024 AWS Health Equity Initiative application cycle received 139 applications, the program’s largest influx to date. It would have taken an estimated 21 days for the review committee to manually process this many applications. The Amazon Bedrock-centered approach reduced the review time to 2 days (a 90% reduction).

The goal was to enhance the efficiency and consistency of the review process, empowering customers to build impactful solutions faster. By combining the advanced NLP capabilities of Amazon Bedrock with thoughtful prompt engineering, the team created a dynamic, data-driven, and equitable solution demonstrating the transformative potential of large language models (LLMs) in the social impact domain.

In this post, we explore the technical implementation details and key learnings from the team’s Amazon Bedrock-powered grant proposal review solution, providing a blueprint for organizations seeking to optimize their grants management processes.

Building an effective prompt for reviewing grant proposals using generative AI

Prompt engineering is the art of crafting effective prompts to instruct and guide generative AI models, such as LLMs, to produce the desired outputs. By thoughtfully designing prompts, practitioners can unlock the full potential of generative AI systems and apply them to a wide range of real-world scenarios.

When building a prompt for our Amazon Bedrock model to review grant proposals, we used multiple prompt engineering techniques to make sure the model’s responses were tailored, structured, and actionable. This included assigning the model a specific persona, providing step-by-step instructions, and specifying the desired output format.

First, we assigned the model the persona of an expert in public health, with a focus on improving healthcare outcomes for underserved populations. This context helps prime the model to evaluate the proposal from the perspective of a subject matter expert (SME) who thinks holistically about global challenges and community-level impact. By clearly defining the persona, we make sure the model’s responses are tailored to the desired evaluation lens.

Your task is to review a proposal document from the perspective of a given persona, and assess it based on dimensions defined in a rubric. Here are the steps to follow:

1. Review the provided proposal document: {PROPOSAL}

2. Adopt the perspective of the given persona: {PERSONA}

Multiple personas can be assigned against the same rubric to account for various perspectives. For example, when the persona “Public Health Subject Matter Expert” was assigned, the model provided keen insights on the project’s impact potential and evidence basis. When the persona “Venture Capitalist” was assigned, the model provided more robust feedback on the organization’s articulated milestones and post-funding sustainability plan. Similarly, when the persona “Software Development Engineer” was assigned, the model relayed subject matter expertise on the proposed use of AWS technology.

Next, we broke down the review process into a structured set of instructions for the model to follow. This includes reviewing the proposal, assessing it across specific dimensions (impact potential, innovation, feasibility, sustainability), and then providing an overall summary and score. Outlining these step-by-step directives gives the model clear guidance on the required task elements and helps produce a comprehensive and consistent assessment.

3. Assess the proposal based on each dimension in the provided rubric: {RUBRIC}

For each dimension, follow this structure:
<Dimension Name>
 <Summary> Provide a brief summary (2-3 sentences) of your assessment of how well the proposal meets the criteria for this dimension from the perspective of the given persona. </Summary>
 <Score> Provide a score from 0 to 100 for this dimension. Start with a default score of 0 and increase it based on the information in the proposal. </Score>
 <Recommendations> Provide 2-3 specific recommendations for how the author could improve the proposal in this dimension. </Recommendations>
</Dimension Name>

4. After assessing each dimension, provide an <Overall Summary> section with:
 - An overall assessment summary (3-4 sentences) of the proposal's strengths and weaknesses across all dimensions from the persona's perspective
 - Any additional feedback beyond the rubric dimensions
 - Identification of any potential risks or biases in the proposal or your assessment

5. Finally, calculate the <Overall Weighted Score> by applying the weightings specified in the rubric to your scores for each dimension.

Finally, we specified the desired output format as JSON, with distinct sections for the dimensional assessments, overall summary, and overall score. Prescribing this structured response format makes sure that the model’s output can be ingested, stored, and analyzed by our grant review team, rather than being delivered in free-form text. This level of control over the output helps streamline the downstream use of the model’s evaluations.

6. Return your assessment in JSON format with the following structure:

{{ "dimensions": [ {{ "name": "<Dimension Name>", "summary": "<Summary>", "score": <Score>, "recommendations": [ "<Recommendation 1>", "<Recommendation 2>", ... ] }}, ... ], "overall_summary": "<Overall Summary>","overall_score": <Overall Weighted Score> }}

Do not include any other commentary beyond following the specified structure. Focus solely on providing the assessment based on the given inputs.

By combining these prompt engineering techniques—role assignment, step-by-step instructions, and output formatting—we were able to craft a prompt that elicits thorough, objective, and actionable grant proposal assessments from our generative AI model. This structured approach enables us to effectively use the model’s capabilities to support our grant review process in a scalable and efficient manner.
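
Because the output is pinned to this JSON structure, downstream tooling can validate and load each assessment directly. The following is a minimal sketch, not part of the team’s solution; parse_assessment is a hypothetical helper that assumes the model honored the requested schema.

import json


def parse_assessment(raw_response: str) -> dict:
    """Hypothetical helper: load and validate the model's JSON assessment."""
    assessment = json.loads(raw_response)

    # Check the top-level fields requested in the prompt
    missing_top_level = {"dimensions", "overall_summary", "overall_score"} - assessment.keys()
    if missing_top_level:
        raise ValueError(f"Assessment is missing fields: {missing_top_level}")

    # Check each dimension entry for the fields the prompt prescribes
    for dimension in assessment["dimensions"]:
        missing = {"name", "summary", "score", "recommendations"} - dimension.keys()
        if missing:
            raise ValueError(f"Dimension entry is missing fields: {missing}")

    return assessment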

Building a dynamic proposal review application with Streamlit and generative AI

To demonstrate and test the capabilities of a dynamic proposal review solution, we built a rapid prototype implementation using Streamlit, Amazon Bedrock, and Amazon DynamoDB. It’s important to note that this implementation isn’t intended for production use, but rather serves as a proof of concept and a starting point for further development. The application allows users to define and save various personas and evaluation rubrics, which can then be dynamically applied when reviewing proposal submissions. This approach enables a tailored and relevant assessment of each proposal, based on the specified criteria.

The application’s architecture consists of several key components, which we discuss in this section.

The team used Amazon DynamoDB, a NoSQL database, to store the personas, rubrics, and submitted proposals. The Streamlit web application retrieved this data, added the selected persona and rubric to the prompt, and sent the prompt to Amazon Bedrock.
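
The post doesn’t show the prototype’s table design, but as a rough sketch (the table names, key names, and helper functions below are assumptions), personas and rubrics could be persisted and fetched with boto3 along these lines:

import boto3

dynamodb = boto3.resource("dynamodb", region_name="us-east-1")

# Hypothetical table names; the actual prototype's schema is not shown in this post
personas_table = dynamodb.Table("grant-review-personas")
rubrics_table = dynamodb.Table("grant-review-rubrics")


def save_persona(persona_id: str, description: str) -> None:
    # Store a reviewer persona so it can be reused across review cycles
    personas_table.put_item(Item={"persona_id": persona_id, "description": description})


def get_rubric(rubric_id: str) -> dict:
    # Fetch a saved rubric, including the name, description, and weight of each dimension
    return rubrics_table.get_item(Key={"rubric_id": rubric_id})["Item"]

With personas and rubrics stored, the prototype assembles the prompt from the selected persona, rubric, and submission: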

import boto3
import json

from api.personas import Persona
from api.rubrics import Rubric
from api.submissions import Submission

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def _construct_prompt(persona: Persona, rubric: Rubric, submission: Submission):
    # Flatten each rubric dimension into a pipe-delimited row
    rubric_dimensions = [
        f"{dimension['name']}|{dimension['description']}|{dimension['weight']}"
        for dimension in rubric.dimensions
    ]

    # Add the table header the prompt template expects to the front of the dimensions list
    rubric_dimensions[:0] = ["dimension_name|dimension_description|dimension_weight"]
    rubric_string = "\n".join(rubric_dimensions)

    # Populate the prompt template with the proposal, persona, and rubric
    with open("prompt/prompt_template.txt", "r") as prompt_file:
        prompt_template = prompt_file.read()

    return prompt_template.format(
        PROPOSAL=submission.content,
        PERSONA=persona.description,
        RUBRIC=rubric_string,
    )

The team used Anthropic’s Claude 3 Sonnet FM on Amazon Bedrock to evaluate each submitted proposal against the constructed prompt. The prompts were dynamically generated based on the selected persona and rubric, and Amazon Bedrock returned the evaluation results to Streamlit for the team’s review.

def get_assessment(submission: Submission, persona: Persona, rubric: Rubric):
    # Build the dynamic prompt from the selected persona, rubric, and submission
    prompt = _construct_prompt(persona, rubric, submission)

    # Request body for the Anthropic Messages API on Amazon Bedrock
    body = json.dumps(
        {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 2000,
            "temperature": 0.5,
            "top_p": 1,
            "messages": [{"role": "user", "content": prompt}],
        }
    )
    response = bedrock.invoke_model(
        body=body, modelId="anthropic.claude-3-sonnet-20240229-v1:0"
    )
    # Parse the response stream and return the model's text output
    response_body = json.loads(response.get("body").read())
    return response_body.get("content")[0].get("text")

The following diagram illustrates the solution workflow.

The workflow consists of the following steps:

  1. Users can create and manage personas and rubrics through the Streamlit application. These are stored in the DynamoDB database.
  2. When a user submits a proposal for review, they choose the desired persona and rubric from the available options.
  3. The Streamlit application generates a dynamic prompt for the Amazon Bedrock model, incorporating the selected persona and rubric details.
  4. The Amazon Bedrock model evaluates the proposal based on the dynamic prompt and returns the assessment results.
  5. The evaluation results are stored in the DynamoDB database and presented to the user through the Streamlit application (a minimal sketch of this wiring follows the list).
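
The following minimal Streamlit sketch illustrates this wiring. It is illustrative rather than the team’s actual application: the loader and saver helpers (load_personas, load_rubrics, save_assessment), the Submission constructor, the .name attributes, and the review module name are assumptions about the prototype’s API layer.

import json

import streamlit as st

# Assumed helper modules; names are illustrative, not from the post
from api.personas import load_personas
from api.rubrics import load_rubrics
from api.submissions import Submission
from api.assessments import save_assessment
from review import get_assessment

st.title("Grant proposal review")

# Step 2: the reviewer chooses a persona and rubric for this proposal
personas = {p.name: p for p in load_personas()}
rubrics = {r.name: r for r in load_rubrics()}
persona = personas[st.selectbox("Persona", list(personas))]
rubric = rubrics[st.selectbox("Rubric", list(rubrics))]
proposal_text = st.text_area("Paste the proposal text")

if st.button("Review proposal") and proposal_text:
    submission = Submission(content=proposal_text)
    # Steps 3 and 4: build the dynamic prompt and invoke Amazon Bedrock
    assessment = json.loads(get_assessment(submission, persona, rubric))
    # Step 5: persist the structured result and present it to the reviewer
    save_assessment(submission, assessment)
    st.metric("Overall weighted score", assessment["overall_score"])
    st.write(assessment["overall_summary"])
    for dimension in assessment["dimensions"]:
        with st.expander(f"{dimension['name']}: {dimension['score']}"):
            st.write(dimension["summary"])
            st.write(dimension["recommendations"])

Running streamlit run app.py serves this page locally so reviewers can select a persona and rubric and see the structured assessment inline.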

Impact

This rapid prototype demonstrates the potential for a scalable and flexible proposal review process, allowing organizations to:

  • Reduce application processing time by up to 90%
  • Streamline the review process by automating the evaluation tasks
  • Capture structured data on the proposals and assessments for further analysis
  • Incorporate diverse perspectives by enabling the use of multiple personas and rubrics

Throughout the implementation, the AWS SRI team focused on creating an interactive and user-friendly experience. By working hands-on with the Streamlit application and observing the impact of dynamic persona and rubric selection, users can gain practical experience in building AI-powered applications that address real-world challenges.

Considerations for a production-grade implementation

Although the rapid prototype demonstrates the potential of this solution, a production-grade implementation requires additional considerations and the use of additional AWS services. Some key considerations include:

  • Scalability and performance – For handling large volumes of proposals and concurrent users, a serverless architecture using AWS Lambda, Amazon API Gateway, DynamoDB, and Amazon Simple Storage Service (Amazon S3) would provide improved scalability, availability, and reliability (see the sketch following this list).
  • Security and compliance – Depending on the sensitivity of the data involved, additional security measures such as encryption, authentication and access control, and auditing are necessary. Services like AWS Key Management Service (KMS), Amazon Cognito, AWS Identity and Access Management (IAM), and AWS CloudTrail can help meet these requirements.
  • Monitoring and logging – Implementing robust monitoring and logging mechanisms using services like Amazon CloudWatch and AWS X-Ray enables tracking performance, identifying issues, and maintaining compliance.
  • Automated testing and deployment – Implementing automated testing and deployment pipelines using services like AWS CodePipeline, AWS CodeBuild, and AWS CodeDeploy helps provide consistent and reliable deployments, reducing the risk of errors and downtime.
  • Cost optimization – Implementing cost optimization strategies, such as using AWS Cost Explorer and AWS Budgets, can help manage costs and maintain efficient resource utilization.
  • Responsible AI considerations – Implementing safeguards—such as Amazon Bedrock Guardrails—and monitoring mechanisms can help enforce the responsible and ethical use of the generative AI model, including bias detection, content moderation, and human oversight. Although the AWS Health Equity Initiative application form collected customer information such as name, email address, and country of operation, this was systematically omitted when sent to the Amazon Bedrock enabled tool to avoid bias in the model and protect customer data.
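
To make the serverless option concrete, the following is a hedged sketch of the assessment call running in an AWS Lambda function behind Amazon API Gateway; the table name, request fields, and handler shape are assumptions, not the team’s production design.

import json

import boto3

bedrock = boto3.client("bedrock-runtime")
# Hypothetical table for persisted assessments
assessments_table = boto3.resource("dynamodb").Table("grant-review-assessments")


def handler(event, context):
    # API Gateway proxy integration delivers the request body as a JSON string
    request = json.loads(event["body"])

    body = json.dumps(
        {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 2000,
            "messages": [{"role": "user", "content": request["prompt"]}],
        }
    )
    response = bedrock.invoke_model(
        body=body, modelId="anthropic.claude-3-sonnet-20240229-v1:0"
    )
    assessment = json.loads(response["body"].read())["content"][0]["text"]

    # Persist the structured assessment for later analysis
    assessments_table.put_item(
        Item={"submission_id": request["submission_id"], "assessment": assessment}
    )

    return {"statusCode": 200, "body": assessment}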

By using the full suite of AWS services and following best practices for security, scalability, and responsible AI, organizations can build a production-ready solution that meets their specific requirements while achieving compliance, reliability, and cost-effectiveness.

Conclusion

Amazon Bedrock—coupled with effective prompt engineering—enabled AWS SRI to review grant proposals and deliver awards to customers in days instead of weeks. The skills developed in this project—such as building web applications with Streamlit, integrating with NoSQL databases like DynamoDB, and customizing generative AI prompts—are highly transferable and applicable to a wide range of industries and use cases.


About the authors

Carolyn Vigil is a Global Lead for AWS Social Responsibility & Impact Customer Engagement. She drives strategic initiatives that leverage cloud computing for social impact worldwide. A passionate advocate for underserved communities, she has co-founded two non-profit organizations serving individuals with developmental disabilities and their families. Carolyn enjoys mountain adventures with her family and friends in her free time.

Lauren Hollis is a Program Manager for AWS Social Responsibility and Impact. She leverages her background in economics, healthcare research, and technology to help mission-driven organizations deliver social impact using AWS cloud technology. In her free time, Lauren enjoys reading and playing the piano and cello.

Ben West is a hands-on builder with experience in machine learning, big data analytics, and full-stack software development. As a technical program manager on the AWS Social Responsibility & Impact team, Ben leverages a wide variety of cloud, edge, and Internet of Things (IoT) technologies to develop innovative prototypes and help public sector organizations make a positive impact in the world. Ben is an Army veteran who enjoys cooking and being outdoors.

Mike Haggerty is a Senior Systems Development Engineer (Sr. SysDE) at Amazon Web Services (AWS), working within the PACE-EDGE team. In this role, he contributes to AWS’s edge computing initiatives as part of the Worldwide Public Sector (WWPS) organization’s PACE (Prototyping and Customer Engineering) team. Beyond his professional duties, Mike is a pet therapy volunteer who, together with his dog Gnocchi, provides support services at local community facilities.