Easy and Robust Rate Limiting in Elixir - Alex Koutmos

Intro

In this blog post, we’ll be talking about what rate limiters are, when they are applicable, and how we can write 2 styles of rate limiters leveraging GenServer, Erlang’s queue data structure, and Task.Supervisor. Finally, we’ll write a sample application that leverages a rate limiter and we’ll make our application modular in the sense that we can swap our rate limiters via configuration to achieve the operational characteristics that we desire. Without further ado, let’s dive right into things!

What is a rate limiter?

A rate limiter, as its name implies, specifies the maximum allowable events that can occur within a given time frame. In other words, if you have a rate limit of 60 requests per minute, you cannot exceed 60 requests over a given 1 minute window. There are several algorithms out there that describe how someone could solve this problem. In today’s post, we’ll be covering 2 specific algorithms: leaky bucket [1] and token bucket [2]. There are other rate limiting algorithms including fixed window counters, sliding window logs, and sliding window counters [3] but I leave that to you to investigate :).
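To make the "60 requests per minute" definition concrete, here is a minimal sketch of the most direct check: remember when recent requests were accepted and refuse a new one while the window is full. Note that this is a sliding window log, not either of the bucket algorithms covered in this post, and the module and function names (WindowSketch, allow?/4) are hypothetical.

```elixir
defmodule WindowSketch do
  # Hypothetical helper, not part of the sample application.
  # `timestamps` holds the millisecond timestamps of previously accepted requests.
  def allow?(timestamps, now_ms, limit, window_ms) do
    recent = Enum.count(timestamps, fn ts -> now_ms - ts < window_ms end)
    recent < limit
  end
end

# 60 requests were accepted within the last minute, so the 61st is refused
full_window = Enum.to_list(1..60)
WindowSketch.allow?(full_window, 1_000, 60, 60_000)
# => false

# A minute later the old entries have aged out and requests are allowed again
WindowSketch.allow?(full_window, 61_000, 60, 60_000)
# => true
```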

When should I use a rate limiter?

Rate limiters are useful any time that you need to control the flow of data/requests coming into and going out of your application. For example, when you need to control the rate of requests coming into your Phoenix API, you could use a rate limiter to ensure that users only make X number of requests per second (this can also be useful to mitigate some small scale DoS attacks). Rate limiting can also be very important if you rely on 3rd party APIs and the vendor you are using has some limitations as to how many requests you make before they respond to your requests with 429s [4]. Getting blocked by a vendor can have serious consequences as it can often take a long time to unblock your API key, IP address, or user account. As such, it is best to do your due diligence up front and ensure that you are not flooding your external APIs with requests.

Enjoying the blog post so far? Follow me on Twitter for more content: @akoutmos

Show me the code!

In order to understand how rate limiters work internally and how we can write them, we will create a sample application that mocks some external API calls to simulate consuming an external API. Both of our rate limiter implementations (leaky bucket and token bucket) will leverage GenServer, Erlang’s queue, and Task.Supervisor. We will use Erlang’s queue so that we can buffer up API calls when we cannot service them (i.e., we’ve hit our rate limit). We’ll be leveraging Task.Supervisor to spin off a Task for each of our in-flight requests so as not to block our rate limiting GenServer. Remember that a GenServer works through its mailbox one message at a time, so only 1 handle_* callback can execute at any moment; this is what gives us atomic access to its internal state. With that all being said, let’s jump right into it!
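If Erlang’s queue module is new to you, here is a minimal sketch of the three calls used throughout this post: :queue.new/0, :queue.in/2, and :queue.out/1. The QueueSketch module name is hypothetical and exists only so the example can be run on its own.

```elixir
defmodule QueueSketch do
  # Hypothetical demo module showing FIFO behaviour of Erlang's :queue
  def demo do
    queue = :queue.new()
    queue = :queue.in(:first_request, queue)
    queue = :queue.in(:second_request, queue)

    # :queue.out/1 returns the oldest item together with the remaining queue
    {{:value, oldest}, rest} = :queue.out(queue)
    {{:value, next}, empty_queue} = :queue.out(rest)

    # Popping an empty queue returns :empty rather than raising
    {result_when_empty, _queue} = :queue.out(empty_queue)

    {oldest, next, result_when_empty}
  end
end

QueueSketch.demo()
# => {:first_request, :second_request, :empty}
```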

Step 1: Create a new Elixir project and mock API module - commit

To begin, we’ll start off by creating a vanilla Elixir project with a supervisor. To create your application run the following in your terminal:

$ mix new payments_client --sup

After that is done, create the file lib/mock_api.ex with the following contents (this module will contain all of the mock API calls along with some randomized latency to make things interesting ;)):

defmodule PaymentsClient.MockAPI do
  def create_payment(user_id, new_payment_info) do
    Process.sleep(random_latency())

    %{
      status: 201,
      user_id: user_id,
      payment: new_payment_info
    }
  end

  # Do something with the resp here
  def handle_create_payment(_resp), do: nil

  def delete_payment(user_id, payment_id) do
    Process.sleep(random_latency())

    %{
      status: 204,
      user_id: user_id,
      id: payment_id
    }
  end

  # Do something with the resp here
  def handle_delete_payment(_resp), do: nil

  def charge_payment(user_id, payment_id, amount) do
    Process.sleep(random_latency())

    %{
      status: 200,
      user_id: user_id,
      amount: amount,
      id: payment_id,
      payment_processed: Enum.random(~w(success failed))
    }
  end

  # Do something with the resp here
  def handle_charge_payment(_resp), do: nil

  defp random_latency do
    Enum.random(100..400)
  end
end

Our mock API client has a few functions that simulate operations that you would perform against a 3rd party payments processor. We won’t bother with actually integrating with a payments processor as that is beyond the scope of this tutorial, but I think this makes the problem we are trying to solve a bit more real world [5]. With that in place, let’s get to work on defining what our rate limiters will look like and how we will define/configure them.

Step 2: Defining our rate limiter interface - commit

In order to ensure that we can swap rate limiting implementations with minimal hassle, we will leverage Elixir behaviours to define our interface and also leverage dependency injection so that we can configure what rate limiter we are using in our sample application. To learn more about this technique (dependency injection in Elixir), I highly suggest reading José Valim’s blog post on “Mocks and Explicit Contracts” [6].

Before getting to the behaviour module, we’ll want to create a config/config.exs file so that we can tweak our settings at application build time. If you want to see how you can configure these types of settings at run time with Mix releases, check out my post “Multi-stage Docker Builds and Elixir 1.9 Releases”.

import Config

config :payments_client, RateLimiter,
  rate_limiter: PaymentsClient.RateLimiters.LeakyBucket,
  timeframe_max_requests: 60,
  timeframe_units: :seconds,
  timeframe: 60

With our configuration in place, let’s go ahead and define our rate limiter behaviour module. Create the file lib/rate_limiter.ex with the following contents:

defmodule PaymentsClient.RateLimiter do
  @callback make_request(request_handler :: tuple(), response_handler :: tuple()) :: :ok

  def make_request(request_handler, response_handler) do
    get_rate_limiter().make_request(request_handler, response_handler)
  end

  def get_rate_limiter, do: get_rate_limiter_config(:rate_limiter)
  def get_requests_per_timeframe, do: get_rate_limiter_config(:timeframe_max_requests)
  def get_timeframe_unit, do: get_rate_limiter_config(:timeframe_units)
  def get_timeframe, do: get_rate_limiter_config(:timeframe)

  def calculate_refresh_rate(num_requests, time, timeframe_units) do
    floor(convert_time_to_milliseconds(timeframe_units, time) / num_requests)
  end

  def convert_time_to_milliseconds(:hours, time), do: :timer.hours(time)
  def convert_time_to_milliseconds(:minutes, time), do: :timer.minutes(time)
  def convert_time_to_milliseconds(:seconds, time), do: :timer.seconds(time)
  def convert_time_to_milliseconds(:milliseconds, milliseconds), do: milliseconds

  defp get_rate_limiter_config(config) do
    :payments_client
    |> Application.get_env(RateLimiter)
    |> Keyword.get(config)
  end
end

Let’s break this down top-to-bottom so that it is a bit more clear. Our RateLimiter module contains a single callback function that needs to be present in modules implementing the behaviour. That function is make_request/2 and it takes as arguments a request handler (which is a 3 element tuple {Module, function, ["args", "list"]}) and a response handler (which is a 2 element tuple {Module, function}). The idea here is that the rate limiter implementation dispatches calls to the request handler when it is able to do so (i.e., when it is not blocked by the rate limit). Once the request handler returns, the resulting value is passed to the response handler. In other words, our rate limiter is completely agnostic of any business logic, and merely dispatches calls to separate functions when it is free to do so.
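The handler tuples can be seen in isolation in the following minimal sketch, which uses apply/3 to call a request function and then pass its result to a response function. HandlerSketch and its functions are hypothetical stand-ins, not part of the sample application.

```elixir
defmodule HandlerSketch do
  # Hypothetical stand-ins for a request function and its response handler
  def fetch(user_id), do: %{status: 200, user_id: user_id}
  def handle_fetch(%{status: status}), do: {:handled, status}

  # Call the request {module, function, args} tuple, then hand the result
  # to the response {module, function} tuple as its single argument
  def dispatch({req_module, req_function, req_args}, {resp_module, resp_function}) do
    response = apply(req_module, req_function, req_args)
    apply(resp_module, resp_function, [response])
  end
end

HandlerSketch.dispatch({HandlerSketch, :fetch, [123]}, {HandlerSketch, :handle_fetch})
# => {:handled, 200}
```

Because the dispatcher only ever sees tuples, it needs no knowledge of what the request or response functions actually do.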

After our callback definition, we have our make_request/2 function which merely proxies the function call to whichever rate limiter we currently have configured (for example, in the config.exs file earlier we specified PaymentsClient.RateLimiters.LeakyBucket as the rate limiter of choice).
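The same proxy-through-configuration idea, reduced to its essentials, might look like the sketch below. All names here (SwapSketch, the :swap_sketch application key, the Loud and Quiet modules) are hypothetical, the behaviour definition is left out for brevity, and Application.put_env/3 stands in for config/config.exs so the example runs on its own.

```elixir
defmodule SwapSketch.Loud do
  def greet(name), do: String.upcase("hello " <> name)
end

defmodule SwapSketch.Quiet do
  def greet(name), do: "hello " <> name
end

defmodule SwapSketch do
  # Looks up the configured implementation at call time and delegates to it
  def greet(name), do: impl().greet(name)

  defp impl do
    :swap_sketch
    |> Application.get_env(__MODULE__, [])
    |> Keyword.get(:impl, SwapSketch.Quiet)
  end
end

# Swapping the implementation is purely a configuration change
Application.put_env(:swap_sketch, SwapSketch, impl: SwapSketch.Loud)
SwapSketch.greet("world")
# => "HELLO WORLD"
```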

The rest of the module is then comprised of functions for fetching settings out of our configuration and some utilities that we will need in all of our rate limiter implementations. With all that in place, we’ll also want to update our lib/payments_client/application.ex file to start up our rate limiter GenServer along with the necessary configs (you’ll also notice that we are starting up another supervisor Task.Supervisor…but more on that later):

...

alias PaymentsClient.RateLimiter

def start(_type, _args) do
  children = [
    {Task.Supervisor, name: RateLimiter.TaskSupervisor},
    {RateLimiter.get_rate_limiter(),
     %{
       timeframe_max_requests: RateLimiter.get_requests_per_timeframe(),
       timeframe_units: RateLimiter.get_timeframe_unit(),
       timeframe: RateLimiter.get_timeframe()
     }}
  ]

  ...
end

With all that in place, let’s get going on writing our first rate limiter…the leaky bucket!

Step 3: Implementing a leaky bucket rate limiter - commit

As described in [1], a leaky bucket rate limiter is one where the input into it may be variable and can fluctuate, but the output will be at a consistent rate and within the configured limits. In order to provide this type of functionality, we will leverage Erlang’s built in queue data type (https://erlang.org/doc/man/queue.html) to buffer up requests. We will then pop items off the queue at the rate we specify in our configuration. Let’s look at the code for the leaky bucket implementation to see this in action. Create a file lib/rate_limiters/leaky_bucket.ex and add some of our boilerplate GenServer stuff and some initial functionality:

defmodule PaymentsClient.RateLimiters.LeakyBucket do
  use GenServer

  require Logger

  alias PaymentsClient.RateLimiter

  @behaviour RateLimiter

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(opts) do
    state = %{
      request_queue: :queue.new(),
      request_queue_size: 0,
      request_queue_poll_rate:
        RateLimiter.calculate_refresh_rate(opts.timeframe_max_requests, opts.timeframe, opts.timeframe_units),
      send_after_ref: nil
    }

    {:ok, state, {:continue, :initial_timer}}
  end

  # ---------------- Client facing function ----------------

  @impl RateLimiter
  def make_request(request_handler, response_handler) do
    GenServer.cast(__MODULE__, {:enqueue_request, request_handler, response_handler})
  end

  # ---------------- Server Callbacks ----------------

  @impl true
  def handle_continue(:initial_timer, state) do
    {:noreply, %{state | send_after_ref: schedule_timer(state.request_queue_poll_rate)}}
  end

  @impl true
  def handle_cast({:enqueue_request, request_handler, response_handler}, state) do
    updated_queue = :queue.in({request_handler, response_handler}, state.request_queue)
    new_queue_size = state.request_queue_size + 1

    {:noreply, %{state | request_queue: updated_queue, request_queue_size: new_queue_size}}
  end
end

Here we have added our start_link/1 and init/1 functions that we normally find in our GenServers. Our init/1 function takes in the configuration that we passed in our application.ex file and creates the initial state for the GenServer (including the empty queue). It is important to note that we leverage handle_continue/2 to start our queue polling timer. You could also do this in your init/1 function but personally, I prefer to delegate such things to handle_continue/2 and keep the init function free of message sending (as schedule_timer/1 uses Process.send_after/3).
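The hand-off from init/1 to handle_continue/2 and on to a self-rescheduling timer can be seen in isolation in this minimal sketch. TickerSketch is a hypothetical module that does nothing except count its own timer ticks.

```elixir
defmodule TickerSketch do
  use GenServer

  # Hypothetical module: counts timer ticks to illustrate the init/continue/timer flow
  def start_link(interval_ms), do: GenServer.start_link(__MODULE__, interval_ms)
  def ticks(pid), do: GenServer.call(pid, :ticks)

  @impl true
  def init(interval_ms) do
    # init/1 only builds state; the first timer is scheduled in handle_continue/2
    {:ok, %{interval_ms: interval_ms, ticks: 0}, {:continue, :initial_timer}}
  end

  @impl true
  def handle_continue(:initial_timer, state) do
    schedule(state.interval_ms)
    {:noreply, state}
  end

  @impl true
  def handle_call(:ticks, _from, state), do: {:reply, state.ticks, state}

  @impl true
  def handle_info(:tick, state) do
    # Re-arm the timer on every tick so the loop keeps going
    schedule(state.interval_ms)
    {:noreply, %{state | ticks: state.ticks + 1}}
  end

  defp schedule(interval_ms), do: Process.send_after(self(), :tick, interval_ms)
end

{:ok, pid} = TickerSketch.start_link(10)
Process.sleep(100)
TickerSketch.ticks(pid)
# Several ticks will have been counted; the exact number depends on scheduling
```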

Further down the code snippet you’ll see that we also implement our make_request/2 callback from the behaviour we wrote earlier and also have a handle_cast/2 callback. handle_cast/2 simply adds incoming requests to our state’s queue for processing during the next timer tick. With all that in place, let’s get to the meat of our rate limiter and look at our handle_info/2 callback implementations. Add the following to lib/rate_limiters/leaky_bucket.ex:

defmodule PaymentsClient.RateLimiters.LeakyBucket do
  ...

  @impl true
  def handle_info(:pop_from_request_queue, %{request_queue_size: 0} = state) do
    # No work to do as the queue size is zero...schedule the next timer
    {:noreply, %{state | send_after_ref: schedule_timer(state.request_queue_poll_rate)}}
  end

  def handle_info(:pop_from_request_queue, state) do
    {{:value, {request_handler, response_handler}}, new_request_queue} = :queue.out(state.request_queue)
    start_message = "Request started #{NaiveDateTime.utc_now()}"

    Task.Supervisor.async_nolink(RateLimiter.TaskSupervisor, fn ->
      {req_module, req_function, req_args} = request_handler
      {resp_module, resp_function} = response_handler

      response = apply(req_module, req_function, req_args)
      apply(resp_module, resp_function, [response])

      Logger.info("#{start_message}\nRequest completed #{NaiveDateTime.utc_now()}")
    end)

    {:noreply,
     %{
       state
       | request_queue: new_request_queue,
         send_after_ref: schedule_timer(state.request_queue_poll_rate),
         request_queue_size: state.request_queue_size - 1
     }}
  end

  def handle_info({ref, _result}, state) do
    Process.demonitor(ref, [:flush])

    {:noreply, state}
  end

  def handle_info({:DOWN, _ref, :process, _pid, _reason}, state) do
    {:noreply, state}
  end

  defp schedule_timer(queue_poll_rate) do
    Process.send_after(self(), :pop_from_request_queue, queue_poll_rate)
  end
end

Let’s break this down so we can appreciate the elegance that is GenServer and the BEAM :). Our first handle_info/2 function definition matches on a queue size of zero and is effectively a noop. We merely schedule the next queue pop and return. We keep track of the queue size ourselves for 2 reasons:

Erlang’s queue implementation does not keep track of the queue size, and calling :queue.len(my_queue) is an O(N) operation (https://erlang.org/doc/man/queue.html#len-1).
Having the queue size as part of the GenServer’s state makes it very easy to pattern match in the function head.
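As a minimal sketch of both points, the hypothetical SizedQueueSketch module below pairs a queue with a manually tracked size and pattern matches on that size in the function head.

```elixir
defmodule SizedQueueSketch do
  # Hypothetical wrapper that pairs a :queue with a manually tracked size
  def new, do: %{queue: :queue.new(), size: 0}

  def push(%{queue: queue, size: size}, item) do
    %{queue: :queue.in(item, queue), size: size + 1}
  end

  # Matching on size: 0 in the function head avoids touching the queue at all
  def pop(%{size: 0} = state), do: {:empty, state}

  def pop(%{queue: queue, size: size}) do
    {{:value, item}, rest} = :queue.out(queue)
    {{:ok, item}, %{queue: rest, size: size - 1}}
  end
end

state = SizedQueueSketch.new() |> SizedQueueSketch.push(:a) |> SizedQueueSketch.push(:b)

# The tracked size agrees with the O(N) count from :queue.len/1
state.size == :queue.len(state.queue)
# => true

{{:ok, :a}, state} = SizedQueueSketch.pop(state)
state.size
# => 1
```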

The only downside is that we need to keep track of the queue size ourselves, and if we forget to increment or decrement the size, bad things will happen. But we like to live dangerously, so here we are! Our next handle_info/2 clause is where the real magic happens. Here, we pop our next request off the queue (which contains the request and response handlers) and make a call to Task.Supervisor.async_nolink/2 in order to offload the task execution from the GenServer and attach it to the supervisor that we declared earlier in lib/payments_client/application.ex. This does several things for us:

Our GenServer will not be blocked by actually making the HTTP request to the external service.
A failure in the task will not bring down the rate limiting GenServer.
By minimizing the amount of work our GenServer has to do, it should be able to accurately rate limit our requests.
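The isolation and messaging behaviour can be observed outside of a GenServer with the minimal sketch below. NolinkSketch is a hypothetical module that starts its own unnamed Task.Supervisor rather than the one from application.ex; the second task exits abnormally on purpose, so expect an error report in the log when running it.

```elixir
defmodule NolinkSketch do
  # Hypothetical demo: the caller survives a failing task and is told about it
  def run do
    {:ok, sup} = Task.Supervisor.start_link()

    # A task that succeeds sends {ref, result} back to the caller
    ok_task = Task.Supervisor.async_nolink(sup, fn -> :done end)
    ok_ref = ok_task.ref

    success =
      receive do
        {^ok_ref, result} ->
          # Drop the monitor and flush the :DOWN message that would follow
          Process.demonitor(ok_ref, [:flush])
          result
      after
        1_000 -> :timeout
      end

    # A task that fails only produces a :DOWN message; the caller keeps running
    bad_task = Task.Supervisor.async_nolink(sup, fn -> exit(:boom) end)
    bad_ref = bad_task.ref

    failure =
      receive do
        {:DOWN, ^bad_ref, :process, _pid, reason} -> reason
      after
        1_000 -> :timeout
      end

    {success, failure}
  end
end

NolinkSketch.run()
# => {:done, :boom}
```

These two message shapes are exactly what the {ref, result} and {:DOWN, ...} handle_info/2 clauses in the rate limiter are there to receive.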

Inside of the anonymous function that we pass to Task.Supervisor.async_nolink/2, we leverage apply/3 (found within the Kernel module) to call the handlers that were provided to the rate limiter. First we call the request handler and then pass the result of that function to the response handler. This is what yields our dynamic dispatching behaviour that decouples our rate limiter from any business logic. Further down you see a couple of handle_info/2 clauses that we need in order to receive messages from the spawned task. Although we do not do it here, we could handle task errors and perform retries, slow down the rate limiter, perform some logging, etc. You can find more information in the docs for Task.Supervisor.async_nolink (https://hexdocs.pm/elixir/Task.Supervisor.html#async_nolink/3). Our last function, schedule_timer/1, is a utility function that we call in a few places to send the rate limiter GenServer a message that it is time to pop a request off the queue.

With our leaky bucket implementation complete, it is time to tackle our token bucket implementation. Let’s get to it!

Step 4: Implementing a token bucket rate limiter - commit

As described in [2], a token bucket rate limiter leverages a token counter to ensure that a request is serviceable given your rate limit. At a time interval consistent with your rate limit, a new token is added to the counter. Whenever a request is made, the counter is decremented by one token. The token counter cannot exceed the maximum allowed number of requests for the given time frame.

Let’s walk through this with a simple example before jumping into the code. If we have a rate limit of 60 requests/min, our token bucket rate limiter will add 1 token to our counter every second until it hits a maximum of 60. At that point, no additional tokens will be added to the counter. The reason for this is that if we get a burst of requests, we don’t want to service more requests than are desired. Once some tokens are used for the purposes of making requests, additional tokens will be refreshed at our configured time interval.
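The "1 token every second" figure comes straight from the refresh rate arithmetic in the RateLimiter module. The sketch below restates that calculation in a standalone, hypothetical RefreshRateSketch module (covering only seconds and minutes) so the numbers can be checked directly.

```elixir
defmodule RefreshRateSketch do
  # Same arithmetic as calculate_refresh_rate/3, restated standalone:
  # milliseconds in the time frame divided by the allowed request count
  def refresh_rate_ms(max_requests, timeframe, :seconds) do
    floor(:timer.seconds(timeframe) / max_requests)
  end

  def refresh_rate_ms(max_requests, timeframe, :minutes) do
    floor(:timer.minutes(timeframe) / max_requests)
  end
end

# 60 requests per 60 seconds: 60_000 ms / 60 = one token every 1000 ms
RefreshRateSketch.refresh_rate_ms(60, 60, :seconds)
# => 1000

# The same limit expressed in minutes gives the same interval
RefreshRateSketch.refresh_rate_ms(60, 1, :minutes)
# => 1000

# 100 requests per second: one token every 10 ms
RefreshRateSketch.refresh_rate_ms(100, 1, :seconds)
# => 10
```

One thing to keep in mind is that floor/1 rounds the interval down. When the limit does not divide the time frame evenly (7 requests per second gives 142 ms, for example), the effective rate ends up marginally above the configured one.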

The token bucket rate limiter differs from the leaky bucket in that we only leverage a queue to buffer up requests that we cannot presently service given an insufficient number of tokens. This allows us to service bursts of traffic without having to shape the traffic into a uniform frequency.
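Before bringing processes and timers into it, the counter rules described above can be modelled with plain data. This is a minimal sketch using a hypothetical TokenSketch module; the :run_now and :queue_request atoms are illustrative names only.

```elixir
defmodule TokenSketch do
  # Hypothetical pure-data model of the token counter; no processes or timers
  def new(max_tokens), do: %{tokens: max_tokens, max_tokens: max_tokens}

  # Called once per refresh interval: add a token unless the bucket is full
  def refill(%{tokens: tokens, max_tokens: max_tokens} = bucket) when tokens < max_tokens do
    %{bucket | tokens: tokens + 1}
  end

  def refill(bucket), do: bucket

  # Called once per request: spend a token if one is available
  def take(%{tokens: 0} = bucket), do: {:queue_request, bucket}
  def take(%{tokens: tokens} = bucket), do: {:run_now, %{bucket | tokens: tokens - 1}}
end

bucket = TokenSketch.new(2)

# A burst of three requests: two run immediately, the third has to wait
{:run_now, bucket} = TokenSketch.take(bucket)
{:run_now, bucket} = TokenSketch.take(bucket)
{:queue_request, bucket} = TokenSketch.take(bucket)

# One refresh interval later a single token is back
TokenSketch.refill(bucket).tokens
# => 1

# A full bucket never grows past its maximum
TokenSketch.refill(TokenSketch.new(2)).tokens
# => 2
```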

With all that being said, let’s jump into the implementation! Create a file lib/rate_limiters/token_bucket.ex and add the following contents (we’ll break up the rate limiter into two parts so it is easy to discuss the work that we did):

defmodule PaymentsClient.RateLimiters.TokenBucket do
  use GenServer

  require Logger

  alias PaymentsClient.RateLimiter

  @behaviour RateLimiter

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(opts) do
    state = %{
      requests_per_timeframe: opts.timeframe_max_requests,
      available_tokens: opts.timeframe_max_requests,
      token_refresh_rate:
        RateLimiter.calculate_refresh_rate(opts.timeframe_max_requests, opts.timeframe, opts.timeframe_units),
      request_queue: :queue.new(),
      request_queue_size: 0,
      send_after_ref: nil
    }

    {:ok, state, {:continue, :initial_timer}}
  end

  # ---------------- Client facing function ----------------

  @impl RateLimiter
  def make_request(request_handler, response_handler) do
    GenServer.cast(__MODULE__, {:enqueue_request, request_handler, response_handler})
  end

  # ---------------- Server Callbacks ----------------

  @impl true
  def handle_continue(:initial_timer, state) do
    {:noreply, %{state | send_after_ref: schedule_timer(state.token_refresh_rate)}}
  end
end

As with the leaky bucket, we start the GenServer the same way and delegate timer scheduling to our schedule_timer/1 private function. We also add two additional values to our GenServer state: requests_per_timeframe defines how many tokens are allowed within a given time frame, and available_tokens holds our token counter. Given that a GenServer handles one message at a time, we can be guaranteed that the counter increases/decreases will be performed without any additional synchronization logic. With that small tweak, everything should be familiar from the leaky bucket. With the boilerplate out of the way, on to the interesting bits. Add the following to your token bucket module:

defmodule PaymentsClient.RateLimiters.TokenBucket do
  ...

  @impl true
  # No tokens available...enqueue the request
  def handle_cast({:enqueue_request, request_handler, response_handler}, %{available_tokens: 0} = state) do
    updated_queue = :queue.in({request_handler, response_handler}, state.request_queue)
    new_queue_size = state.request_queue_size + 1

    {:noreply, %{state | request_queue: updated_queue, request_queue_size: new_queue_size}}
  end

  # Tokens available...use one of the tokens and perform the operation immediately
  def handle_cast({:enqueue_request, request_handler, response_handler}, state) do
    async_task_request(request_handler, response_handler)

    {:noreply, %{state | available_tokens: state.available_tokens - 1}}
  end

  @impl true
  def handle_info(:token_refresh, %{request_queue_size: 0} = state) do
    # No work to do as the queue size is zero...schedule the next timer and increase the token count
    token_count =
      if state.available_tokens < state.requests_per_timeframe do
        state.available_tokens + 1
      else
        state.available_tokens
      end

    {:noreply,
     %{
       state
       | send_after_ref: schedule_timer(state.token_refresh_rate),
         available_tokens: token_count
     }}
  end

  def handle_info(:token_refresh, state) do
    {{:value, {request_handler, response_handler}}, new_request_queue} = :queue.out(state.request_queue)

    async_task_request(request_handler, response_handler)

    {:noreply,
     %{
       state
       | request_queue: new_request_queue,
         send_after_ref: schedule_timer(state.token_refresh_rate),
         request_queue_size: state.request_queue_size - 1
     }}
  end

  def handle_info({ref, _result}, state) do
    Process.demonitor(ref, [:flush])

    {:noreply, state}
  end

  def handle_info({:DOWN, _ref, :process, _pid, _reason}, state) do
    {:noreply, state}
  end

  defp async_task_request(request_handler, response_handler) do
    start_message = "Request started #{NaiveDateTime.utc_now()}"

    Task.Supervisor.async_nolink(RateLimiter.TaskSupervisor, fn ->
      {req_module, req_function, req_args} = request_handler
      {resp_module, resp_function} = response_handler

      response = apply(req_module, req_function, req_args)
      apply(resp_module, resp_function, [response])

      Logger.info("#{start_message}\nRequest completed #{NaiveDateTime.utc_now()}")
    end)
  end

  defp schedule_timer(token_refresh_rate) do
    Process.send_after(self(), :token_refresh, token_refresh_rate)
  end
end

Our handle_cast/2 functions that match on :enqueue_request do 2 different things. The first clause matches when available_tokens is 0: with no tokens available we cannot service the request, so we queue it up for future processing. The second clause handles the case where tokens are available: it services the request immediately (bypassing the queue) and decrements available_tokens.

Our handle_info/2 functions that match on :token_refresh operate in a similar pattern. The first clause pattern matches on the size of the request queue, and when it is zero (i.e., no requests are buffered up) it adds an additional token to the available tokens counter if we have not hit our maximum. The next clause pops an item off the queue and services it. We do this because it is effectively the same as adding a token to the counter, servicing the next request, and then decrementing the counter. After that, everything else should be similar to what we had in the leaky bucket implementation.

Step 5: Creating some test data and comparing the rate limiters - commit

With both of our rate limiters set up and ready to go, it is time to create some dummy requests and see how they behave. Create the file lib/load_generator.ex with the following contents:

defmodule PaymentsClient.LoadGenerator do
  alias PaymentsClient.{MockAPI, RateLimiter}

  def create_requests(num_requests) do
    1..num_requests
    |> Enum.each(fn _ ->
      {request_handler, response_handler} = generate_random_request()

      RateLimiter.make_request(request_handler, response_handler)
    end)
  end

  defp generate_random_request do
    case Enum.random(1..3) do
      1 ->
        {
          {MockAPI, :create_payment, [123, %{cc_number: 1_234_567_890, exp_date: "01/28"}]},
          {MockAPI, :handle_create_payment}
        }

      2 ->
        {
          {MockAPI, :delete_payment, [123, 456]},
          {MockAPI, :handle_delete_payment}
        }

      3 ->
        {
          {MockAPI, :charge_payment, [123, 456, 10.00]},
          {MockAPI, :handle_charge_payment}
        }
    end
  end
end

This simple request generator will create N number of requests and send them off to the configured GenServer (using whatever is set in config/config.exs). All the values are hard coded as we don’t need anything elegant here and the focus is more on the rate limiters that we wrote. With that being said, let’s move forward with the following configuration file and open up IEx with iex -S mix:

import Config

config :payments_client, RateLimiter,
  rate_limiter: PaymentsClient.RateLimiters.LeakyBucket,
  timeframe_max_requests: 60,
  timeframe_units: :seconds,
  timeframe: 60

Inside of IEx you should be able to run the following and get similar results:

iex(1) ▶ PaymentsClient.LoadGenerator.create_requests(5)
:ok
iex(2) ▶
13:48:04.846 [info]  Request started 2019-12-29 18:48:04.459617
Request completed 2019-12-29 18:48:04.843232

13:48:05.840 [info]  Request started 2019-12-29 18:48:05.475291
Request completed 2019-12-29 18:48:05.840202

13:48:06.735 [info]  Request started 2019-12-29 18:48:06.476217
Request completed 2019-12-29 18:48:06.735223

13:48:07.875 [info]  Request started 2019-12-29 18:48:07.477133
Request completed 2019-12-29 18:48:07.875277

13:48:08.669 [info]  Request started 2019-12-29 18:48:08.478202
Request completed 2019-12-29 18:48:08.669178

As we can see from our results, our leaky bucket rate limiter is only allowing 1 request per second (60 reqs/min) to execute. If we take the differences between consecutive start times (1.0157, 1.0009, 1.0009, 1.0011) we’ll see that our rate limiter is fairly accurate with a μ (mean) of 1.00465 seconds and a σ (standard deviation) of 0.00638 seconds. Granted, at 1 request per second our rate limiter ticks fairly slowly, so the GenServer isn’t very busy. I ran a similar test where the rate limit was 100 reqs/sec and calculated a μ of 11.572 milliseconds and a σ of 1.61106. As we can see, the busier the GenServer gets (i.e., more requests in a shorter time frame), the larger the drift and spread become relative to the target interval: the measured mean overshoots the 10 millisecond target by roughly 16%, compared with under 1% at 1 request per second.
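For anyone who wants to reproduce the arithmetic, here is a minimal sketch that derives the intervals, mean, and standard deviation from the request start times in the log output above. IntervalStats is a hypothetical module, and it assumes the quoted σ is a population standard deviation (dividing by N rather than N - 1), which appears to be what matches the quoted figure.

```elixir
defmodule IntervalStats do
  # Request start times copied from the leaky bucket log output above
  def starts do
    [
      ~N[2019-12-29 18:48:04.459617],
      ~N[2019-12-29 18:48:05.475291],
      ~N[2019-12-29 18:48:06.476217],
      ~N[2019-12-29 18:48:07.477133],
      ~N[2019-12-29 18:48:08.478202]
    ]
  end

  # Seconds elapsed between each pair of consecutive start times
  def intervals do
    starts()
    |> Enum.chunk_every(2, 1, :discard)
    |> Enum.map(fn [earlier, later] ->
      NaiveDateTime.diff(later, earlier, :microsecond) / 1_000_000
    end)
  end

  def mean(values), do: Enum.sum(values) / length(values)

  # Population standard deviation (assumption: divides by N, not N - 1)
  def std_dev(values) do
    mu = mean(values)
    squared_diffs = Enum.map(values, fn value -> (value - mu) * (value - mu) end)
    :math.sqrt(Enum.sum(squared_diffs) / length(values))
  end
end

intervals = IntervalStats.intervals()
IntervalStats.mean(intervals)
IntervalStats.std_dev(intervals)
```

Working from the unrounded timestamps, the results should land very close to the figures quoted above; small differences in the last digit are down to rounding of the intervals.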

With our leaky bucket rate limiter tested, let’s switch over to the token bucket and throw some load at it. In your config/config.exs file, change the rate limiter implementation to PaymentsClient.RateLimiters.TokenBucket and open up IEx:

iex(1) ▶ :sys.get_state(PaymentsClient.RateLimiters.TokenBucket)
%{
  available_tokens: 60,
  request_queue: {[], []},
  request_queue_size: 0,
  requests_per_timeframe: 60,
  send_after_ref: #Reference<0.2711890870.3707764742.155126>,
  token_refresh_rate: 1000
}
iex(2) ▶ PaymentsClient.LoadGenerator.create_requests(5)
:ok
iex(3) ▶
14:54:52.148 [info]  Request started 2019-12-29 19:54:51.950992
Request completed 2019-12-29 19:54:52.145529

14:54:52.237 [info]  Request started 2019-12-29 19:54:51.950854
Request completed 2019-12-29 19:54:52.237430

14:54:52.240 [info]  Request started 2019-12-29 19:54:51.950912
Request completed 2019-12-29 19:54:52.240022

14:54:52.247 [info]  Request started 2019-12-29 19:54:51.951032
Request completed 2019-12-29 19:54:52.247437

14:54:52.250 [info]  Request started 2019-12-29 19:54:51.945182
Request completed 2019-12-29 19:54:52.250426
iex(3) ▶ :sys.get_state(PaymentsClient.RateLimiters.TokenBucket)
%{
  available_tokens: 57,
  request_queue: {[], []},
  request_queue_size: 0,
  requests_per_timeframe: 60,
  send_after_ref: #Reference<0.2711890870.3707764742.155204>,
  token_refresh_rate: 1000
}

As we can see from this output, all of the request start times are within milliseconds of one another, and running :sys.get_state/1 shows us that the available token count has decreased. This shows us the crucial difference between the token bucket implementation and the leaky bucket implementation. The token bucket allows for bursts in traffic and will not delay if tokens are available, whilst the leaky bucket implementation will throttle all incoming requests to conform to the configured rate limit.

Closing thoughts

Well done and thanks for sticking with me to the end! We covered quite a lot of ground and hopefully you picked up a couple of cool tips and tricks along the way. To recap, we leveraged GenServer, Erlang’s queue, and Task.Supervisor to create two fairly straightforward and simple rate limiters. We covered the differences between the two implementations and determined that the token bucket implementation is suitable when we expect bursty traffic. We also took a look at how accurate our rate limiter was and how the standard deviation rose as we increased the traffic through the rate limiter. All things considered, the consistency and accuracy of the rate limiters was within expectations given the tools that we were using.

Feel free to leave comments or feedback or even what you would like to see in the next tutorial. Till next time!


Additional Resources

Below are some additional resources if you would like to deep dive into any of the topics covered in the post.