Comparing the Speed of LLMs in AWS Bedrock

Bilal
5 min read · Nov 17, 2023
AWS Bedrock makes it easy to use several Foundational models

Update 19-Dec-2023: I have a new post that compares the performance of multiple models in AWS Bedrock. That post compares 8 models instead of the 4 that were available when this post was written.

AWS Bedrock makes it easy to build and experiment with several generative AI models. AWS started with just a handful of foundational models but has been steadily adding more models, including the recently added Llama2–13B model.

This brief post will compare the performance (speed/response time) of different Text Generation models currently available in AWS Bedrock. The four models being compared are:

  1. Cohere Command
  2. Jurassic Mid
  3. Jurassic Ultra
  4. Llama2 13B
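Each of these models is addressed in Bedrock by a model ID, and each provider expects a slightly different JSON request schema. As a minimal sketch of how the four models would be invoked, the snippet below maps the model names above to their Bedrock model IDs (the IDs and request schemas reflect Bedrock as of late 2023 and should be verified against the current documentation):

```python
import json

# Bedrock model IDs for the four models compared in this post
# (assumed IDs as of late 2023; verify in the Bedrock console):
MODEL_IDS = {
    "Cohere Command": "cohere.command-text-v14",
    "Jurassic Mid": "ai21.j2-mid-v1",
    "Jurassic Ultra": "ai21.j2-ultra-v1",
    "Llama2 13B": "meta.llama2-13b-chat-v1",
}

def build_request_body(model_id: str, prompt: str, max_tokens: int = 256) -> str:
    """Build the provider-specific JSON body for a Bedrock invoke_model call.

    Each model family names its parameters differently, so the body is
    assembled per provider prefix.
    """
    if model_id.startswith("cohere."):
        body = {"prompt": prompt, "max_tokens": max_tokens}
    elif model_id.startswith("ai21."):
        body = {"prompt": prompt, "maxTokens": max_tokens}
    elif model_id.startswith("meta."):
        body = {"prompt": prompt, "max_gen_len": max_tokens}
    else:
        raise ValueError(f"unsupported model id: {model_id}")
    return json.dumps(body)
```

With boto3, a body built this way would be passed to something like `client.invoke_model(modelId=MODEL_IDS["Llama2 13B"], body=build_request_body(...))`, where `client` is a `bedrock-runtime` client.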

Cohere Command Light is not part of this comparison because it did not respond correctly to the few-shot prompt used in this test. The comparison also excludes Amazon's Titan models because I don't have access to them yet. This is not a thorough comparison, but it should serve as a starting point for further testing.

Test conditions and workload:

The box plots below show the response times of 50 requests sent sequentially to each model.
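A harness for collecting those timings can be sketched as follows. This is not the author's actual benchmark code; it is a minimal, hypothetical version that times a zero-argument `invoke` callable (which would wrap the boto3 `invoke_model` call) and computes the five-number summary that a box plot visualizes:

```python
import time
import statistics

def time_requests(invoke, n_requests: int = 50) -> list[float]:
    """Send n_requests sequentially via `invoke` (a zero-argument callable
    that performs one model call) and return per-request latencies in seconds."""
    latencies = []
    for _ in range(n_requests):
        start = time.perf_counter()
        invoke()
        latencies.append(time.perf_counter() - start)
    return latencies

def summarize(latencies: list[float]) -> dict:
    """Five-number summary (the statistics a box plot draws)."""
    q1, median, q3 = statistics.quantiles(latencies, n=4)
    return {
        "min": min(latencies),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(latencies),
    }
```

In the real test, `invoke` would be something like `lambda: client.invoke_model(modelId=model_id, body=body)`, run once per model, and the resulting latency lists fed to a plotting library's box-plot function.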
