Update 19-Dec-2023: I have a new post that compares the performance of multiple models in AWS Bedrock. That post compares 8 models instead of 4 that were available when this post was written.
AWS Bedrock makes it easy to build and experiment with several generative AI models. AWS started with just a handful of foundation models but has been steadily adding more, including the recently added Llama2–13B model.
This brief post will compare the performance (speed/response time) of different Text Generation models currently available in AWS Bedrock. The four models being compared are:
- Cohere Command
- Jurassic-2 Mid
- Jurassic-2 Ultra
- Llama2 13B
Cohere Command Light is not part of this comparison because it did not handle the few-shot example used in this test correctly. Amazon's Titan models are also excluded because I don't yet have access to them. This is not an exhaustive comparison, but it should serve as a starting point for further testing.
Test conditions and workload:
The box plots below show the distribution of response times for 50 requests sent sequentially to each model.
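A measurement loop like the one described above can be sketched as follows. This is a minimal, hedged illustration: the `invoke_model` function here is a stand-in (the real call would go through boto3's `bedrock-runtime` client, with model IDs and request bodies that vary per provider), and the prompt text is a placeholder, not the actual few-shot example used in the test.

```python
import time
import statistics

def invoke_model(prompt: str) -> str:
    """Stand-in for a Bedrock call. With boto3 this would be roughly:
        client = boto3.client("bedrock-runtime")
        client.invoke_model(modelId=model_id, body=json.dumps({...}))
    (model IDs and body formats differ per provider; check the Bedrock docs).
    """
    time.sleep(0.01)  # simulate network + inference latency
    return "response"

def benchmark(n_requests: int = 50) -> list[float]:
    """Send n_requests sequentially and record each response time in seconds."""
    latencies = []
    for _ in range(n_requests):
        start = time.perf_counter()
        invoke_model("<few-shot prompt goes here>")
        latencies.append(time.perf_counter() - start)
    return latencies

latencies = benchmark(50)
print(f"n={len(latencies)} "
      f"median={statistics.median(latencies):.3f}s "
      f"max={max(latencies):.3f}s")
```

Running a loop like this once per model produces the per-model latency samples that the box plots summarize; sending requests sequentially (rather than concurrently) keeps each measurement from being skewed by client-side contention.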