Our application is driven by calls to action delivered to project backers via email. We send ~2 million emails a month with 98%+ deliverability.

Given the important role email plays in our business, we wanted better technical and behavioral insight into what happens to the emails we send. We wanted to understand our email at the project and backer level, as well as how our emails affect conversion rates. Here are some of the steps we took to get a better picture:

Webhooks vs on-the-fly querying

We use Mailgun for sending email. Their platform provides two main ways to understand what your email is doing:

  1. HTTP APIs with rolled up statistics
  2. Event-based webhooks

We weren’t sure of our use cases for the data yet, so we decided that consuming the event webhooks and storing the data in our own database would be the most flexible path forward.

Sending headers via ActionMailer callbacks

We instrumented the mailers of interest to include an additional SMTP header carrying the backer_id, project_id, and subject. ActionMailer callbacks gave us a convenient hook into the email-sending lifecycle:

class BackerMailer < ActionMailer::Base
  after_action :add_metadata_headers
  ...

  def sample_email(backer_id, project_id)
    @backer = Backer.find(backer_id)
    @project = Project.find(project_id)
    ...
  end

  private

  # Runs after the mailer action, once @backer, @project, and the subject are set.
  def add_metadata_headers
    headers['X-Mailgun-Variables'] = {
      backer_id: @backer.id,
      project_id: @project.id,
      subject: mail.subject
    }.to_json
  end
end
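
To make the header's contents concrete, here is a minimal sketch of the JSON payload the callback above builds, written outside of Rails. The helper name `mailgun_variables` and the sample values are illustrative assumptions, not part of our codebase.

```ruby
require "json"

# Hypothetical helper mirroring add_metadata_headers: builds the JSON string
# that goes into the X-Mailgun-Variables header.
def mailgun_variables(backer_id:, project_id:, subject:)
  { backer_id: backer_id, project_id: project_id, subject: subject }.to_json
end

# Sample values, made up for illustration.
header_value = mailgun_variables(
  backer_id: 42,
  project_id: 7,
  subject: "Your reward has shipped"
)

# Parsing the same string back yields a hash with string keys.
variables = JSON.parse(header_value)
```

Because the header value is plain JSON, any extra metadata added later only needs a new key in the hash.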

When processing the event webhooks, we read these values back out and persisted the following data:

  • backer_id
  • project_id
  • email subject
  • message_id (unique across Mailgun)
  • event status (delivered, dropped, complained, bounced)

We set up an endpoint in our app to receive webhooks from Mailgun:

class Emails::SentEmailsController < ApplicationController
  before_action :verify_signature

  def create
    if missing_required_params?
      # tell Mailgun not to try again
      head :ok
      return
    end

    process_event!
    head :ok
  end

  private

  def missing_required_params?
    # assert the required headers are present so we process
    # only the emails for which we've passed in our custom headers
    ...
  end

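  # backer_id, project_id, event_status and subject are helpers (not shown)
  # that read the corresponding values from the webhook params.
  def process_event!
    sent_email = SentEmail.find_or_initialize_by(
      message_id: params["Message-Id"]
    )

    sent_email.backer_id = backer_id
    sent_email.project_id = project_id
    sent_email.status = event_status
    sent_email.subject = subject.squish

    sent_email.save!
  end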

  def verify_signature
    # verify the signature of the request
    ...
  end
end
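
The body of verify_signature is elided above. As a rough sketch of what such a check can look like: Mailgun signs each webhook with an HMAC-SHA256 hex digest computed over the timestamp concatenated with the token, keyed with the account's webhook signing key. The function name, argument names, and sample values below are assumptions for illustration, and where the three fields sit in the payload depends on the webhook format in use.

```ruby
require "openssl"

# Hypothetical helper; not the controller's actual verify_signature.
def valid_mailgun_signature?(signing_key:, timestamp:, token:, signature:)
  expected = OpenSSL::HMAC.hexdigest(
    OpenSSL::Digest.new("SHA256"), signing_key, "#{timestamp}#{token}"
  )
  provided = signature.to_s
  return false unless expected.bytesize == provided.bytesize

  # Compare every byte so timing does not reveal where a mismatch occurs.
  diff = 0
  expected.bytes.zip(provided.bytes) { |a, b| diff |= a ^ b }
  diff.zero?
end

# Made-up values standing in for a real webhook payload.
signing_key = "key-example"
timestamp = "1500000000"
token = "abc123"
signature = OpenSSL::HMAC.hexdigest(
  OpenSSL::Digest.new("SHA256"), signing_key, timestamp + token
)

valid_mailgun_signature?(
  signing_key: signing_key, timestamp: timestamp, token: token, signature: signature
)
```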

Storing and updating events

The underlying model and table looked like this:

class SentEmail < ApplicationRecord
  belongs_to :backer
  belongs_to :project
end

# == Schema Information
#
# Table name: sent_emails
#
#  id         :integer          not null, primary key
#  message_id :string           not null
#  status     :string           not null
#  backer_id  :integer          not null
#  project_id :integer          not null
#  subject    :string           not null
#  created_at :datetime         not null
#  updated_at :datetime         not null
#
# Indexes
#
#  index_sent_emails_on_backer_id   (backer_id)
#  index_sent_emails_on_message_id  (message_id) UNIQUE
#  index_sent_emails_on_project_id  (project_id)
#

With this setup, we can query for emails sent to a particular backer or roll up statistics for a given project. It has given us much better insight into the emails we send to each user and the higher-level effect our emails are having on each project.

Questions we can now answer

With the above setup we can interrogate our system about email deliverability. Here are some specific queries we’re running:

  • Backer-level deliverability
    • backer.sent_emails.group(:status).count
  • Project-level deliverability
    • project.sent_emails.group(:status).count
  • Deliverability by project and email type
    • project.sent_emails.group(:subject, :status).count
  • Email deliverability across our system
    • SentEmail.group(:status).count
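
For reference, `group(:status).count` returns a hash mapping each status to its row count, and grouping by two columns yields array keys such as `[subject, status]`. The sketch below uses hand-written sample counts (not real data) and a hypothetical `delivery_rate` helper to show how such a hash could be reduced to a single percentage.

```ruby
# Hypothetical helper: share of events with status "delivered", as a percentage.
def delivery_rate(counts_by_status)
  total = counts_by_status.values.sum
  return 0.0 if total.zero?

  (counts_by_status.fetch("delivered", 0) * 100.0 / total).round(1)
end

# Same shape as the result of SentEmail.group(:status).count; numbers are made up.
counts = { "delivered" => 980, "bounced" => 12, "dropped" => 5, "complained" => 3 }

rate = delivery_rate(counts) # => 98.0
```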

Having this data in our system means we can now surface relevant statistics to our project creators about what our application is doing and how emails affect their business.

A couple of optimization improvements

With the above implementation we are seeing response times averaging around 30ms, with reasonable P99 outlier response times. Although there are a handful of improvements we could make (see below), we are wary of optimizing prematurely and have been happy with this setup.

We are continuing to monitor things. In the event of performance issues, here are a couple of ideas we are considering for the future:

  • Enqueueing a background job from the webhook endpoint for later asynchronous processing instead of querying Postgres in-band. We would be trading a Postgres query for a Redis query here.
  • Extracting the controller action into middleware and reducing the amount of the Rails stack required to process the action.
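
The two ideas can be combined. Below is a minimal sketch, not something we run in production: a Rack-style middleware that intercepts the webhook path, pushes the parsed params onto a queue, and responds immediately. The path `/emails/sent_emails`, the class name, and the in-memory `Queue` (standing in for a Redis-backed job queue) are all assumptions. The sketch also assumes a form-encoded request body and omits signature verification for brevity.

```ruby
require "uri"
require "stringio"

# Hypothetical Rack-style middleware; any object responding to #call(env) works.
class SentEmailWebhook
  WEBHOOK_PATH = "/emails/sent_emails" # assumed route

  def initialize(app, queue:)
    @app = app
    @queue = queue
  end

  def call(env)
    unless env["REQUEST_METHOD"] == "POST" && env["PATH_INFO"] == WEBHOOK_PATH
      return @app.call(env) # everything else goes through the full stack
    end

    # Parse the form-encoded body without routing to a controller.
    params = URI.decode_www_form(env["rack.input"].read).to_h
    @queue << params # a worker would later perform the database write
    [200, { "content-type" => "text/plain" }, ["ok"]]
  end
end

# Usage with a stand-in downstream app and an in-memory queue.
downstream = ->(_env) { [404, {}, ["not found"]] }
queue = Queue.new
middleware = SentEmailWebhook.new(downstream, queue: queue)

env = {
  "REQUEST_METHOD" => "POST",
  "PATH_INFO" => SentEmailWebhook::WEBHOOK_PATH,
  "rack.input" => StringIO.new("Message-Id=abc%40example.com&event=delivered")
}
status, _headers, _body = middleware.call(env)
```

The trade-off is that failures now surface in the worker rather than in the webhook response, so the worker would need its own retry and error reporting.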
