Update 19-Dec-2023: I have a new post that compares the performance of multiple models in AWS Bedrock. That post compares 8 models instead of 4 that were available when this post was written.
AWS Bedrock makes it easy to build and experiment with several generative AI models. AWS started with just a handful of foundation models but has been steadily adding more, including the recently added Llama2–13B model.
This brief post will compare the performance (speed/response time) of different Text Generation models currently available in AWS Bedrock. The four models being compared are:
- Cohere Command
- Jurassic-2 Mid
- Jurassic-2 Ultra
- Llama2 13B
Cohere Command Light is not part of this comparison because it did not handle the few-shot example used in this test correctly. Amazon's Titan models are also excluded because I don't yet have access to them. This is not an exhaustive comparison, but it should serve as a starting point for further testing.
Test conditions and workload:
The box plots below show the distribution of response times for 50 requests sent sequentially to each model.
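A measurement loop like the one described above can be sketched as follows. This is a minimal, hedged illustration: the `invoke_model` function here is a stand-in (the real call would go through boto3's `bedrock-runtime` client, with model IDs and request bodies that vary per provider), and the prompt text is a placeholder, not the actual few-shot example used in the test.

```python
import time
import statistics

def invoke_model(prompt: str) -> str:
    """Stand-in for a Bedrock call. With boto3 this would be roughly:
        client = boto3.client("bedrock-runtime")
        client.invoke_model(modelId=model_id, body=json.dumps({...}))
    (model IDs and body formats differ per provider; check the Bedrock docs).
    """
    time.sleep(0.01)  # simulate network + inference latency
    return "response"

def benchmark(n_requests: int = 50) -> list[float]:
    """Send n_requests sequentially and record each response time in seconds."""
    latencies = []
    for _ in range(n_requests):
        start = time.perf_counter()
        invoke_model("<few-shot prompt goes here>")
        latencies.append(time.perf_counter() - start)
    return latencies

latencies = benchmark(50)
print(f"n={len(latencies)} "
      f"median={statistics.median(latencies):.3f}s "
      f"max={max(latencies):.3f}s")
```

Running a loop like this once per model produces the per-model latency samples that the box plots summarize; sending requests sequentially (rather than concurrently) keeps each measurement from being skewed by client-side contention.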