Dynamic Switching Between Amazon Bedrock Batch Inference and On-Demand Inference for Document Processing Pipelines

Amazon Bedrock’s new document processing solution uses a dynamic pipeline design that balances cost efficiency against processing speed. The architecture automatically selects between on-demand inference and batch inference based on time constraints and cost optimization requirements.

For companies that process large volumes of documents, the choice of processing method is a crucial technical decision. The new solution is flexible enough to handle both large backlogs, such as hundreds of millions of land lease documents, and the new documents that arrive daily.

(Reference: Extract Data with On-demand and Batch Pipelines Dynamically)

Architecture Details and Processing Flow

The system consists of two parallel pipelines. The on-demand pipeline processes individual documents within seconds and is triggered by an Amazon SQS FIFO queue. The batch inference pipeline processes many document requests asynchronously in a single Amazon Bedrock batch inference job.
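The choice between the two pipelines can be sketched as a small routing function. This is a minimal illustration, not the solution's actual selection logic: the `Workload` type, the `choose_pipeline` function, and both thresholds are hypothetical and would need tuning for a real workload.

```python
from dataclasses import dataclass
from enum import Enum


class Pipeline(Enum):
    ON_DEMAND = "on-demand"
    BATCH = "batch"


@dataclass(frozen=True)
class Workload:
    document_count: int      # number of documents waiting to be processed
    deadline_seconds: float  # how soon the results are needed


# Hypothetical thresholds (assumptions, not values from the source).
MIN_BATCH_DOCUMENTS = 100
BATCH_TURNAROUND_SECONDS = 24 * 60 * 60


def choose_pipeline(workload: Workload) -> Pipeline:
    """Pick batch only when the backlog is large and the deadline can absorb the wait."""
    can_wait = workload.deadline_seconds >= BATCH_TURNAROUND_SECONDS
    large_enough = workload.document_count >= MIN_BATCH_DOCUMENTS
    if can_wait and large_enough:
        return Pipeline.BATCH
    return Pipeline.ON_DEMAND


# A single urgent document goes on-demand; a large backlog with a loose deadline goes to batch.
urgent = choose_pipeline(Workload(document_count=1, deadline_seconds=30))
backlog = choose_pipeline(Workload(document_count=50_000, deadline_seconds=7 * 86_400))
```

Keeping the decision in one pure function makes the time and cost trade-off easy to test and to adjust without touching either pipeline.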

In on-demand processing, each queue message contains the document ID, the LLM model ID, the prompt ID and version, and the system prompt ID and version. When the AWS Lambda function is triggered, it retrieves the PDF document from the specified Amazon S3 bucket, converts the PDF pages to PNG images, fetches the referenced prompts from Amazon Bedrock Prompt Management to build the message sent to the LLM, and saves the results to an Amazon DynamoDB table.
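On the Lambda side, the first step is turning the SQS event into typed requests. The sketch below assumes the message body is a JSON object; the field names (`document_id`, `model_id`, and so on) are hypothetical, since the source lists only what the message contains, not how its keys are spelled. The `Records` and `body` keys follow the standard SQS event shape that Lambda receives.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ExtractionRequest:
    # Hypothetical field names mirroring the message contents described above.
    document_id: str
    model_id: str
    prompt_id: str
    prompt_version: str
    system_prompt_id: str
    system_prompt_version: str


FIELD_NAMES = [f.name for f in fields(ExtractionRequest)]


def parse_sqs_event(event: dict) -> list[ExtractionRequest]:
    """Turn each SQS record body (a JSON string) into an ExtractionRequest."""
    requests = []
    for record in event.get("Records", []):
        body = json.loads(record["body"])
        # Raises KeyError if a required field is missing from the message.
        requests.append(ExtractionRequest(**{name: str(body[name]) for name in FIELD_NAMES}))
    return requests
```

The remaining steps (S3 download, PDF-to-PNG conversion, the model call, and the DynamoDB write) would follow inside the handler and are omitted here.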

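A queue message can be created with the AWS CLI. The command is shown abbreviated here; a complete call also passes --message-body and, for a FIFO queue, --message-group-id: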

aws sqs send-message --queue-url https://sqs.us-east-1.amazonaws.com/1111111111/ondemand-data-pipe
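The same request can be sent from Python. The sketch below builds the parameters for the SQS `send_message` call; FIFO queues require a message group ID, and a deduplication ID is needed unless content-based deduplication is enabled on the queue. The payload keys, the `"documents"` group ID, and the helper names are assumptions for illustration only.

```python
import hashlib
import json


def build_send_message_params(queue_url: str, request: dict, group_id: str = "documents") -> dict:
    """Build keyword arguments for the SQS send_message call on a FIFO queue."""
    body = json.dumps(request, sort_keys=True)
    return {
        "QueueUrl": queue_url,
        "MessageBody": body,
        # Required for FIFO queues; messages in one group are delivered in order.
        "MessageGroupId": group_id,
        # Hash of the body, so resending an identical request is deduplicated.
        "MessageDeduplicationId": hashlib.sha256(body.encode("utf-8")).hexdigest(),
    }


def send_request(queue_url: str, request: dict) -> dict:
    """Send the request; boto3 is imported lazily so the builder has no AWS dependency."""
    import boto3

    sqs = boto3.client("sqs")
    return sqs.send_message(**build_send_message_params(queue_url, request))
```

Separating the parameter builder from the network call keeps the message format testable without AWS credentials.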

Both pipelines allow specifying prompt ID and version in the request, and the corresponding prompt text is retrieved from Amazon Bedrock Prompt Management.
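Retrieving a prompt by ID and version might look like the following sketch. It uses the `get_prompt` operation of the boto3 `bedrock-agent` client and assumes a text-type prompt template, taking the first variant; a chat-type template or a prompt with several variants would need different handling. The function names are hypothetical.

```python
def extract_prompt_text(get_prompt_response: dict) -> str:
    """Pull the template text out of a GetPrompt response (text templates only)."""
    variants = get_prompt_response.get("variants", [])
    if not variants:
        raise ValueError("prompt has no variants")
    # Assumption: the first variant is the one to use.
    return variants[0]["templateConfiguration"]["text"]["text"]


def fetch_prompt_text(prompt_id: str, prompt_version: str) -> str:
    """Fetch a specific prompt version from Amazon Bedrock Prompt Management."""
    import boto3  # imported lazily so extract_prompt_text works without AWS dependencies

    client = boto3.client("bedrock-agent")
    response = client.get_prompt(promptIdentifier=prompt_id, promptVersion=prompt_version)
    return extract_prompt_text(response)
```

Pinning the version in each request means a prompt can be revised in Prompt Management without silently changing the behavior of requests already in flight.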

(Reference: Extract Data with On-demand and Batch Pipelines Dynamically)

Integration of AI Agent Evaluation with Agent-EvalKit

When evaluating AI agents, it is increasingly important to assess execution paths, which traditional output-level testing cannot capture. Agent-EvalKit is an open-source toolkit released under the Apache 2.0 license that integrates with AI coding assistants such as Claude Code, Kiro CLI, and Kilo Code, providing a complete evaluation workflow within the development environment.
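To make the difference between output-level and path-level testing concrete, the sketch below checks the sequence of tool calls an agent made rather than its final answer. It is a generic illustration and does not use or represent Agent-EvalKit's API; `ToolCall`, `evaluate_path`, and the tool names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    name: str
    arguments: dict = field(default_factory=dict)


def evaluate_path(trace: list[ToolCall], expected_tools: list[str]) -> dict:
    """Compare the tools an agent actually called against the expected sequence."""
    actual = [call.name for call in trace]
    return {
        "passed": actual == expected_tools,
        "missing": [tool for tool in expected_tools if tool not in actual],
        "unexpected": [tool for tool in actual if tool not in expected_tools],
    }


# A travel agent that answered plausibly but never called the hotel tool.
trace = [ToolCall("search_flights", {"to": "LIS"}), ToolCall("get_weather", {"city": "Lisbon"})]
report = evaluate_path(trace, expected_tools=["search_flights", "search_hotels"])
```

A final-answer check could pass for this trace, while the path check reports that `search_hotels` was skipped.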

This toolkit operates through six evaluation phases, in which it reads the agent’s source code, generates targeted test cases, runs the evaluations, and produces reports whose improvement recommendations point to specific locations in the codebase. A travel inquiry agent built with the Strands Agents SDK and Amazon Bedrock serves as the example for explaining each phase.

Evaluations are driven through slash commands such as /evalkit.plan and /evalkit.data, accompanied by natural language guidance that conveys quality requirements to the assistant.

(Reference: Evaluate AI agents systematically with Agent-EvalKit)

Utilizing Documentation Features in API Gateway

Amazon API Gateway provides a documentation feature that allows adding and updating help content for individual API entities as part of the API development process. This feature enables saving source content and archiving different versions of documentation.

Documentation versions can be associated with API stages, and stage-specific documentation snapshots can be exported to external OpenAPI files for distribution as published documentation. When creating documentation parts for API entities using the API Gateway console, the following properties map is used:

{
  "info": {
    "description": "Your first API Gateway API.",
    "contact": {
      "name": "John Doe",
      "email": "john.doe@api.com"
    }
  }
}
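The same documentation part can also be created programmatically. The sketch below prepares the arguments for the boto3 `apigateway` client's `create_documentation_part` operation, which takes the properties map as a JSON string and a location whose type is `API` for the API entity itself. The helper names and the REST API ID are placeholders, not values from the source.

```python
import json


def build_documentation_part_params(rest_api_id: str, description: str,
                                    contact_name: str, contact_email: str) -> dict:
    """Build keyword arguments for create_documentation_part on the API entity."""
    properties = {
        "info": {
            "description": description,
            "contact": {"name": contact_name, "email": contact_email},
        }
    }
    return {
        "restApiId": rest_api_id,
        "location": {"type": "API"},
        # The properties map is passed as a JSON-encoded string.
        "properties": json.dumps(properties),
    }


def create_api_documentation(rest_api_id: str) -> dict:
    """Create the documentation part; boto3 is imported lazily."""
    import boto3

    client = boto3.client("apigateway")
    params = build_documentation_part_params(
        rest_api_id, "Your first API Gateway API.", "John Doe", "john.doe@api.com"
    )
    return client.create_documentation_part(**params)
```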

APIs can be documented using the API Gateway REST API, an AWS SDK, the AWS CLI, or the API Gateway console, and documentation parts defined in external OpenAPI files can be imported and exported.

(Reference: Document an API using the API Gateway console)

Summary

  • Amazon Bedrock’s dynamic pipeline design handles time-sensitive documents with on-demand inference, which returns results within seconds, and handles large, cost-sensitive volumes efficiently with batch inference.
  • Combining Agent-EvalKit with Claude Code or Kiro CLI allows for complete execution path evaluation, including tool calls and intermediate states, within the development environment, providing specific code-level improvement suggestions.
  • API Gateway’s documentation feature and Amazon Bedrock Prompt Management both bring version control to the pipeline: documentation parts can be written for individual API entities and versioned per API stage, while prompts are retrieved by ID and version at request time.