The State of Elixir HTTP Clients

Posted by Alex Koutmos on Monday, August 10, 2020

In today’s post, we’ll learn about the Elixir HTTP client libraries Mint and Finch. We’ll discuss how Finch is built on top of Mint and what the benefits of this abstraction layer are. We’ll also talk about some of the existing HTTP client libraries in the ecosystem and discuss what makes Mint and Finch different. Finally, we’ll put together a quick project that makes use of Finch to put all of our learning into action. Let’s jump right in!

What tools are currently available in the Elixir ecosystem?

While an HTTP client may not be the most interesting part of your application, more than likely you are using an HTTP client at some point to interface with 3rd party resources or even internal HTTP microservices. Having an HTTP client with usable ergonomics and a friendly API can help ensure that you deliver application features fast.

If you are looking for something that ships with the BEAM, you can leverage the :httpc module (documentation for httpc). While :httpc works great for simple requests, it can be limiting at times: it has no built-in support for connection pools, SSL verification requires a bit of ceremony to set up, and the API is not the most intuitive to work with (see this article for more information).

HTTPoison is a 3rd party library that provides some nice abstractions on top of the Erlang library Hackney, giving you a pleasant developer experience from within Elixir. It also supports features like connection pools and the ability to create HTTP client modules via use HTTPoison.Base. Unfortunately, in the past I have experienced issues with Hackney in high traffic applications (so have other users). If I suspect that I will be making a large number of concurrent requests to an HTTP API, I keep that in mind and plan for possible failure scenarios.

Tesla provides additional abstraction layers: an HTTP client can be configured using middleware (similar to Plug), responses can be mocked for testing, and different adapters can be used to perform the HTTP requests. In fact, you can even use Mint as an adapter in Tesla. If I am making a very involved HTTP client with many different interactions and behaviors, I’ll usually reach for Tesla, as it makes it easy to package these pieces of functionality together and the testing utilities keep your ExUnit tests very clean.

How are Mint and Finch different?

The previously mentioned tools all rely on a process to keep track of the ongoing HTTP connection. Mint, on the other hand, provides a low-level, process-less API for interacting with TCP/SSL sockets. For example, every time you interact with Mint you are given back a new Mint.HTTP1 or Mint.HTTP2 struct that represents the connection. In addition, any data coming into the socket is sent as a message to the process that initiated the connection. This data can then be captured via a simple receive block and handled accordingly. While this may seem limiting, it is by design. The architecture of Mint lends itself to being extensible and enables other library authors to wrap Mint however they see fit.
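
To make the "data arrives as a message" idea concrete, here is a minimal sketch that uses only :gen_tcp from OTP rather than Mint itself. It opens a loopback listener, connects to it in active mode, and shows the payload landing in the calling process's mailbox. The module name ActiveSocketSketch and the demo/1 function are hypothetical, and the sketch assumes a payload small enough to arrive in a single TCP message. With Mint, a message like this would be handed to Mint.HTTP.stream/2 instead of being matched by hand.

```elixir
defmodule ActiveSocketSketch do
  # Hypothetical helper: returns the bytes that the connecting process
  # received as an ordinary Erlang message, or :timeout.
  def demo(payload \\ "hello from the server") do
    loopback = {127, 0, 0, 1}

    # Port 0 asks the OS to pick any free port for the listener.
    {:ok, listener} =
      :gen_tcp.listen(0, [:binary, active: false, reuseaddr: true, ip: loopback])

    {:ok, port} = :inet.port(listener)

    # active: true means incoming data is delivered to this process's mailbox.
    {:ok, client} = :gen_tcp.connect(loopback, port, [:binary, active: true])
    {:ok, server_side} = :gen_tcp.accept(listener, 1_000)
    :ok = :gen_tcp.send(server_side, payload)

    received =
      receive do
        {:tcp, ^client, data} -> data
      after
        1_000 -> :timeout
      end

    Enum.each([client, server_side, listener], &:gen_tcp.close/1)
    received
  end
end
```

No extra process sits between the socket and the caller here, which is the property that lets libraries like Finch decide for themselves how connections should be owned and pooled.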

This is where Finch comes in. Finch is a library that wraps Mint and provides many of the HTTP client features that you would expect from a fully fledged HTTP client. For example, with Finch you get both connection pooling and request Telemetry out of the box. While that feature set may not be as thorough as that of HTTPoison or Tesla, Finch is very much focused on being lightweight and performant.

Hands on project

Now that we have discussed some of the design decisions that went into Mint and Finch, it is time to dive into a sample project. Our sample application will count mentions of functional programming languages in a Hacker News thread. It will take a Hacker News article ID and look up the IDs of all the child posts of that parent post. It will then leverage a connection pool to the Hacker News Firebase API to fetch those child posts concurrently. Once all of the child posts have been fetched, their text bodies will be extracted and analyzed for the names of certain functional programming languages. Finally, we’ll print out our results, along with some metrics that were collected via Telemetry events. With all that being said, let’s jump right to it (for reference, all the code can be found at https://github.com/akoutmos/functional_langs)!

Let’s start off by creating a new Elixir project with a supervision tree:

$ mix new functional_langs --sup

With that in place, let’s open up our mix.exs file and add our required dependencies:

defp deps do
  [
    {:finch, "~> 0.3.0"},
    {:jason, "~> 1.2"}
  ]
end

Once your mix.exs file has been updated, switch over to the terminal and run mix deps.get to fetch your dependencies. Next we’ll want to create a file lib/functional_langs/hacker_news_client.ex that will encompass all of our Finch related code. This file will include all of the calls necessary to interact with the Hacker News Firebase API and will also include some utility functions to extract the desired data out from the JSON payloads. Let’s start off by adding some of the foundation of our lib/functional_langs/hacker_news_client.ex file:

defmodule FunctionalLangs.HackerNewsClient do
  alias Finch.Response

  def child_spec do
    {Finch,
     name: __MODULE__,
     pools: %{
       "https://hacker-news.firebaseio.com" => [size: pool_size()]
     }}
  end

  def pool_size, do: 25

  def get_item(item_id) do
    :get
    |> Finch.build("https://hacker-news.firebaseio.com/v0/item/#{item_id}.json")
    |> Finch.request(__MODULE__)
  end
end

Our child_spec/0 function defines the child spec for the Finch connection pool. We will be leveraging this function in our lib/functional_langs/application.ex file so that we can start up our Finch connection pool within our application supervision tree. Our pool_size/0 function defines how big our connection pool to the https://hacker-news.firebaseio.com address will be. For testing purposes, 25 concurrent connections should be more than enough to traverse even the largest Hacker News posts. Lastly, the get_item/1 function is what makes the actual GET call to the Hacker News Firebase API. We first define the HTTP verb as an atom, then build the request, and finally make the request while providing the name of the module (you will notice that this lines up with the name given to Finch in child_spec/0). With that in place, we are able to make calls to the Hacker News Firebase API and leverage our connection pool, all using Finch.

With that in place, let’s wrap up our FunctionalLangs.HackerNewsClient module:

defmodule FunctionalLangs.HackerNewsClient do
  ...

  def get_child_ids(parent_item_id) do
    parent_item_id
    |> get_item()
    |> handle_parent_response()
  end

  defp handle_parent_response({:ok, %Response{body: body}}) do
    # Default to an empty list in case the item has no "kids" key
    child_ids =
      body
      |> Jason.decode!()
      |> Map.get("kids", [])

    {:ok, child_ids}
  end

  def get_child_item(child_id) do
    child_id
    |> get_item()
    |> get_child_item_text()
  end

  defp get_child_item_text({:ok, %Response{body: body}}) do
    body
    |> Jason.decode!()
    |> case do
      %{"text" => text} -> String.downcase(text)
      _ -> ""
    end
  end
end

While the majority of this code snippet is related to unpacking and massaging the data, one thing worth touching on is the pattern match on the Response struct. If you’ll recall, in the previous code snippet we aliased Finch.Response, and we use that alias here. By pattern matching on the struct, we are able to easily extract the response body and work with the returned data. For more details on the Finch.Response struct, feel free to check out the documentation.
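
The same pattern matching idea can be applied one level deeper, on the decoded JSON itself. The sketch below works on maps that have already been decoded, so it needs no dependencies. The module name ItemShapeSketch is hypothetical, and it assumes that an item without comments simply lacks the "kids" key and that some items come back without a "text" field.

```elixir
defmodule ItemShapeSketch do
  # Returns the child IDs of a decoded item, or [] when there are none.
  def child_ids(%{"kids" => kids}) when is_list(kids), do: kids
  def child_ids(_item_without_kids), do: []

  # Returns the lowercased comment body, or "" when the item has no text.
  def comment_text(%{"text" => text}) when is_binary(text), do: String.downcase(text)
  def comment_text(_item_without_text), do: ""
end
```

Calling ItemShapeSketch.child_ids(%{"kids" => [1, 2]}) returns [1, 2], while an item with no "kids" key (or a nil item) yields an empty list instead of crashing further down the pipeline.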

With our client module wrapped up, let’s quickly open up lib/functional_langs/application.ex and add the following line to our supervision tree so that our connection pool can be started with our application:

def start(_type, _args) do
  children = [
    FunctionalLangs.HackerNewsClient.child_spec()
  ]

  ...
end

Our application will use the Hacker News client that we just wrote to fetch all of the child posts of a Hacker News item and then look for occurrences of functional programming language names in each of those child posts. Once we have everything tallied up, we’ll present a nice ASCII chart and some overarching metrics. With that said, let’s open up lib/functional_langs.ex and start by adding the following:

defmodule FunctionalLangs do
  require Logger

  alias FunctionalLangs.HackerNewsClient

  @langs_of_interest ~w(elixir erlang haskell clojure scala f# idris ocaml)
  @label_padding 7
  @telemetry_event_id "finch-timings"

  def generate_report(parent_item_id) do
    # Setup Telemetry events and Agent to store timings
    {:ok, http_timings_agent} = Agent.start_link(fn -> [] end)
    start_time = System.monotonic_time()
    attach_telemetry_event(http_timings_agent)

    # Get all of the child IDs associated with a parent item
    {:ok, child_ids} = HackerNewsClient.get_child_ids(parent_item_id)

    # Concurrently process all of the child IDs and aggregate the results into a graph
    child_ids
    |> Task.async_stream(&HackerNewsClient.get_child_item/1, max_concurrency: HackerNewsClient.pool_size())
    |> Enum.reduce([], fn {:ok, text}, acc ->
      [text | acc]
    end)
    |> Enum.map(&count_lang_occurences/1)
    |> Enum.reduce(%{}, &sum_lang_occurences/2)
    |> print_table_results()

    # Calculate average API request time and total time
    average_time = calc_average_req_time(http_timings_agent)
    total_time = System.convert_time_unit(System.monotonic_time() - start_time, :native, :millisecond)

    # Clean up side-effecty resources
    :ok = Agent.stop(http_timings_agent)
    :ok = :telemetry.detach(@telemetry_event_id)

    IO.puts("Average request time to Hacker News Firebase API: #{average_time}ms")
    IO.puts("Total time to fetch all #{length(child_ids)} child posts: #{total_time}ms")
  end
end

The first few lines of generate_report/1 are responsible for starting an Agent that will be used to collect Telemetry metrics from Finch. The Agent PID is passed to the attach_telemetry_event/1 function so that the handler it creates knows which Agent to update. The implementation of the attach_telemetry_event/1 function is as follows:

defmodule FunctionalLangs do
  ...

  defp attach_telemetry_event(http_timings_agent) do
    :telemetry.attach(
      @telemetry_event_id,
      [:finch, :response, :stop],
      fn _event, %{duration: duration}, _metadata, _config ->
        Agent.update(http_timings_agent, fn timings -> [duration | timings] end)
      end,
      nil
    )
  end
end

This function simply attaches to the Finch [:finch, :response, :stop] event and prepends each reported duration to the Agent’s list of timings. Back in our generate_report/1 function, our next step is to get all of the child item IDs from the provided parent_item_id and put them into our processing pipeline. Our processing pipeline leverages Task.async_stream/3 in order to concurrently process all of the child item IDs. One important thing to note is that our options to Task.async_stream/3 include max_concurrency: HackerNewsClient.pool_size(). The reason is that we only want to run as many concurrent tasks as there are connections in the Finch connection pool.
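
If you want to convince yourself that max_concurrency really caps the number of in-flight tasks, here is a small self-contained sketch. It replaces the HTTP call with a short sleep and uses an Agent to record the highest number of tasks that were running at the same moment. The module name BoundedConcurrencySketch, the fake_request/1 helper, and the 20ms sleep are all made up for illustration.

```elixir
defmodule BoundedConcurrencySketch do
  # Runs `count` fake requests and returns the peak number that ran at once.
  def peak_concurrency(count, limit) do
    {:ok, tracker} = Agent.start_link(fn -> %{running: 0, peak: 0} end)

    1..count
    |> Task.async_stream(fn _id -> fake_request(tracker) end, max_concurrency: limit)
    |> Stream.run()

    peak = Agent.get(tracker, fn state -> state.peak end)
    :ok = Agent.stop(tracker)
    peak
  end

  # Stands in for an HTTP call: mark the task as running, wait, then unmark it.
  defp fake_request(tracker) do
    Agent.update(tracker, fn %{running: running, peak: peak} ->
      %{running: running + 1, peak: max(peak, running + 1)}
    end)

    Process.sleep(20)

    Agent.update(tracker, fn state -> %{state | running: state.running - 1} end)
  end
end
```

Calling BoundedConcurrencySketch.peak_concurrency(20, 4) should never report more than 4, which is why matching max_concurrency to the pool size keeps tasks from piling up while they wait for a free connection.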

Our next piece of the pipeline is to reduce on the results from Task.async_stream/3 and to extract all the text blocks that were fetched. Afterwards, we perform an Enum.map/2 and an Enum.reduce/3 on those results in order to tally up our results. The implementations of the functions used in Enum.map/2 and Enum.reduce/3 are:

defmodule FunctionalLangs do
  ...

  defp count_lang_occurences(text) do
    Map.new(@langs_of_interest, fn string_of_interest ->
      count = if String.contains?(text, string_of_interest), do: 1, else: 0

      {string_of_interest, count}
    end)
  end

  defp sum_lang_occurences(counts, acc) do
    Map.merge(acc, counts, fn _lang, count_1, count_2 ->
      count_1 + count_2
    end)
  end
end
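
To see the map-then-merge tally in isolation, here is a self-contained sketch that runs it over two hand-written comments. The module name TallySketch and the shortened language list are hypothetical. It also includes count_tokens/1, an alternative I am adding purely for comparison: because String.contains?/2 matches substrings, a word like "scalable" counts as a mention of scala, whereas matching whole tokens does not. Both counters assume the text has already been lowercased.

```elixir
defmodule TallySketch do
  @langs ~w(elixir scala f#)

  # Same shape as the counting step above: 1 or 0 per language, per comment.
  def count_substrings(text) do
    Map.new(@langs, fn lang ->
      {lang, if(String.contains?(text, lang), do: 1, else: 0)}
    end)
  end

  # Hypothetical alternative: only count a language when it appears as a whole token.
  def count_tokens(text) do
    tokens =
      text
      |> String.split(~r/[^a-z0-9#]+/, trim: true)
      |> MapSet.new()

    Map.new(@langs, fn lang ->
      {lang, if(MapSet.member?(tokens, lang), do: 1, else: 0)}
    end)
  end

  # Sums the per-comment maps into one map of totals.
  def tally(texts, counter) do
    texts
    |> Enum.map(counter)
    |> Enum.reduce(%{}, fn counts, acc ->
      Map.merge(acc, counts, fn _lang, count_1, count_2 -> count_1 + count_2 end)
    end)
  end
end
```

Given the comments "we build scalable systems in elixir" and "scala and f# shop", the substring counter tallies scala twice while the token counter tallies it once, so it is worth keeping this in mind when reading the final chart.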

The last step in our pipeline is to take the results and pretty print a sorted ASCII chart so that we can see our results from greatest to least. Below is the implementation of print_table_results/1:

defmodule FunctionalLangs do
  ...

  defp print_table_results(results) do
    results
    |> Enum.sort(fn {_lang_1, count_1}, {_lang_2, count_2} ->
      count_1 > count_2
    end)
    |> Enum.each(fn {language, count} ->
      label = String.pad_trailing(language, @label_padding)
      bars = String.duplicate("█", count)

      IO.puts("#{label} |#{bars}")
    end)
  end
end
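
If you would rather test the chart without capturing IO, one option is to build the lines first and print them afterwards. The sketch below does that, assuming Elixir 1.10 or later for the :desc option to Enum.sort_by/3. The module name ChartSketch and the lines/2 function are hypothetical.

```elixir
defmodule ChartSketch do
  # Returns the chart as a list of strings, highest count first.
  def lines(results, label_padding \\ 7) do
    results
    |> Enum.sort_by(fn {_language, count} -> count end, :desc)
    |> Enum.map(fn {language, count} ->
      label = String.pad_trailing(language, label_padding)
      bars = String.duplicate("█", count)

      "#{label} |#{bars}"
    end)
  end
end
```

From there, `ChartSketch.lines(results) |> Enum.each(&IO.puts/1)` prints the same kind of chart, and the pure function can be asserted on directly in ExUnit.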

With that complete, all that is left is to aggregate all of our captured Telemetry metrics and compute the average request time, clean up our side-effecty resources, and print out our results. The implementation of calc_average_req_time/1 looks like so:

defmodule FunctionalLangs do
  ...

  defp calc_average_req_time(http_timings_agent) do
    http_timings_agent
    |> Agent.get(fn timings -> timings end)
    |> Enum.reduce({0, 0}, fn timing, {sum, count} ->
      {sum + timing, count + 1}
    end)
    |> case do
      {_, 0} ->
        "0"

      {sum, count} ->
        sum
        |> System.convert_time_unit(:native, :millisecond)
        |> Kernel./(count)
        |> :erlang.float_to_binary(decimals: 2)
    end
  end
end
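
The unit handling is the easiest part of this function to get wrong, so here is the same calculation as a standalone sketch that takes a plain list instead of an Agent. Telemetry durations are reported in native time units, so the sum is converted to milliseconds before dividing. The module name AverageTimingSketch and the average_ms/1 function are hypothetical.

```elixir
defmodule AverageTimingSketch do
  # Averages durations given in native time units and returns milliseconds
  # as a string with two decimal places.
  def average_ms([]), do: "0"

  def average_ms(native_durations) do
    total_ms =
      native_durations
      |> Enum.sum()
      |> System.convert_time_unit(:native, :millisecond)

    :erlang.float_to_binary(total_ms / length(native_durations), decimals: 2)
  end
end
```

For example, two requests that took 10ms and 25ms can be built with System.convert_time_unit(10, :millisecond, :native) and System.convert_time_unit(25, :millisecond, :native), and averaging them yields "17.50". Since System.convert_time_unit/3 returns an integer, converting the sum rather than each individual timing keeps the rounding loss to under a millisecond in total.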

With all that in place, we are ready to use our application! In your terminal, run iex -S mix in order to launch an IEx session with all of our modules loaded. Once your IEx session is up and running, you can call your FunctionalLangs.generate_report/1 function to generate the language occurrence report (the “23702122” is from the “July 2020 Who is hiring” post):

iex(1) ▶ FunctionalLangs.generate_report("23702122")
scala   |█████████████████████████████████████████████
elixir  |██████████████
clojure |██████████
haskell |█████
f#      |███
erlang  |██
ocaml   |█
idris   |█
Average request time to Hacker News Firebase API: 58.17ms
Total time to fetch all 564 child posts: 2185ms

Conclusion

Thanks for sticking with me to the end, and hopefully you learned a thing or two about Mint and Finch! In this tutorial we learned how to create connection pools with Finch, how to attach Telemetry handlers to Finch events, and how to make a large number of concurrent requests to an API. While Finch is relatively new compared to other HTTP clients, it is built upon tried and tested libraries like Mint and NimblePool. If you would like to learn more about Mint and Finch, I suggest looking at the following resources:

