AWS API Gateway - Burst Throttling

To prevent your API from being overwhelmed by too many requests, Amazon API Gateway throttles requests to your API using the token bucket algorithm, where a token counts for a request.

What is the Burst?

In API Gateway, the burst limit corresponds to the maximum number of concurrent request submissions that API Gateway can fulfill at any moment without returning 429 Too Many Requests error responses.

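The account-level burst quota is determined by the API Gateway service team based on the overall RPS quota for the account in the Region. It is not a quota that a customer can control or request changes to. You can, however, set lower rate and burst limits for individual stages, methods, or usage plans.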

What are the API Gateway rate limits?

The rate limits are calculated in Requests Per Second, or RPS.

If a client sends too many requests, rate limiting throttles the excess requests instead of cutting the client off entirely. Clients can keep using your service while your API stays protected.

How do the rate and burst limits work together?

When request submissions exceed the steady-state request rate and burst limits, API Gateway fails the limit-exceeding requests and returns 429 Too Many Requests error responses to the client. When a client receives these error responses, the client can resubmit the failed requests in a way that limits the rate, while complying with the API Gateway throttling limits.
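One common way for a client to resubmit throttled requests at a reduced rate is exponential backoff with jitter. The following is a minimal sketch, not an official client: `ThrottledError`, `call_with_backoff`, and the delay values are hypothetical names and numbers chosen for illustration, and `send_request` stands for whatever function your client uses to call the API and raise on a 429 response.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical exception standing in for a 429 Too Many Requests response."""


def call_with_backoff(send_request, max_attempts=5, base_delay=0.1,
                      max_delay=5.0, sleep=time.sleep):
    """Call send_request(), retrying with exponential backoff when throttled.

    The delay before each retry is a random value between 0 and the capped
    exponential delay ("full jitter"), so many clients do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return send_request()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

The `sleep` parameter is injectable only so the function can be exercised without real waiting. In practice you would leave it at its default.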

To help understand these throttling limits, here is an example. Let's assume you set the throttling to Rate = 1000 (requests per second) and Burst = 500 (requests):

If a caller submits 1000 requests in a one-second period evenly (for example, 1 request every millisecond), API Gateway processes all requests without dropping any.
If the caller submits 1000 requests in the first millisecond, API Gateway serves 500 of those requests due to the burst setting and throttles the rest in the one-second period (the remaining 500 requests would get a 429 Too Many Requests response).
If the caller submits 500 requests in the first millisecond and then evenly spreads another 500 requests through the remaining 999 milliseconds (for example, about 1 request every 2 milliseconds), API Gateway processes all 1000 requests in the one-second period without returning 429 Too Many Requests error responses.
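The three scenarios above can be reproduced with a small token bucket simulation. This is a simplified model under stated assumptions (the bucket starts full, refills continuously at the configured rate, and each request costs one token). It is not API Gateway's actual implementation, and the `TokenBucket` class and `count_served` helper are hypothetical names used only for illustration.

```python
class TokenBucket:
    """Simplified token bucket: `rate` tokens are added per second, up to `burst`."""

    def __init__(self, rate, burst):
        self.rate = rate              # steady-state refill rate (tokens per second)
        self.burst = burst            # bucket capacity
        self.tokens = float(burst)    # assumption: the bucket starts full
        self.last_ms = 0              # time of the previous request, in milliseconds

    def allow(self, now_ms):
        """Return True if a request arriving at now_ms is served, False for a 429."""
        elapsed_ms = now_ms - self.last_ms
        self.tokens = min(self.burst, self.tokens + elapsed_ms * self.rate / 1000)
        self.last_ms = now_ms
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def count_served(arrival_times_ms, rate=1000, burst=500):
    """Count how many requests, arriving at the given times, are served."""
    bucket = TokenBucket(rate, burst)
    return sum(bucket.allow(t) for t in arrival_times_ms)


evenly_spread = range(1000)                                   # 1 request per millisecond
all_at_once = [0] * 1000                                      # 1000 requests in the first millisecond
burst_then_even = [0] * 500 + [1 + 2 * i for i in range(500)] # 500 at once, then 1 every 2 ms

print(count_served(evenly_spread))    # 1000
print(count_served(all_at_once))      # 500
print(count_served(burst_then_even))  # 1000
```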

Function concurrency limits and throttling in AWS Lambda

In AWS Lambda, a concurrency limit determines how many function invocations can run simultaneously in one Region. Each Region in your AWS account has its own Lambda concurrency limit. The limit is shared by all functions in that Region and is set to 1000 by default.

If you exceed a concurrency limit, Lambda starts throttling the offending functions by rejecting requests.

How do you increase your AWS Lambda concurrency limits?

Open a support ticket with AWS to request an increase in your account-level concurrency limit.

Create a new support case.
Set the Regarding value to Service Limit Increase.
Choose Lambda as the Limit Type.
Fill out the body of the form.
Wait for AWS to respond to your request.

API Gateway invoking Lambda

The default burst limit in API Gateway (5000) is much higher than the default maximum concurrency of Lambda functions (1000). If you are using API Gateway with Lambda, make sure the Burst throttle setting you choose makes sense for your Lambda concurrency limit.
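As a rough sketch of how such a setting might be applied to a REST API stage with boto3, the helper below builds the patch operations that set the default rate and burst limits for all methods in a stage and passes them to `update_stage`. The function names, the API ID, the stage name, and the numbers in the usage note are placeholders, not recommendations. A suitable burst value depends on how long your functions run as well as on your concurrency limit.

```python
def throttling_patch_operations(rate_limit, burst_limit):
    """Build patch operations that set default throttling for every method in a stage."""
    return [
        {"op": "replace", "path": "/*/*/throttling/rateLimit", "value": str(rate_limit)},
        {"op": "replace", "path": "/*/*/throttling/burstLimit", "value": str(burst_limit)},
    ]


def apply_stage_throttling(rest_api_id, stage_name, rate_limit, burst_limit):
    """Apply the throttling settings to one stage of a REST API."""
    import boto3  # imported here so the builder above works without boto3 installed

    client = boto3.client("apigateway")
    return client.update_stage(
        restApiId=rest_api_id,
        stageName=stage_name,
        patchOperations=throttling_patch_operations(rate_limit, burst_limit),
    )
```

For example, `apply_stage_throttling("abc123", "prod", 1000, 500)` would request Rate = 1000 and Burst = 500 for the hypothetical API `abc123`, assuming AWS credentials and a Region are configured.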


