January 27, 2020

Export to CSV using service objects

Discover how to use service objects to export ActiveRecord models to CSV easily.

Oftentimes, there comes a point where data needs to be exported from your application. Whether this is to allow users to process records, or simply just for archival reasons, exporting data can be done quickly and easily with the help of service objects.

Service objects are Plain Old Ruby Objects (POROs) that can be added to your Ruby on Rails application to help with processing logic; they are essentially used for any processing that needs to happen away from your models and controllers - it's important to try to keep them as lean as possible.

For this tutorial, we're going to assume the role of working in a surgery with several doctors who have several patients. These patients also have appointments. Our task is to export all the patients with the doctor's name. We may also need to export all the appointments a particular patient has had too, so we're also going to do that.

Before we begin

It's important that we use a proper RDBMS for this as the data needs to correctly relate. Likewise, we're going to be using Faker to help us generate some fake data.

Let's create our application. We're going to skip tests just for the sake of this tutorial:

rails new whats_up_doc \
          -d postgresql \
          --skip-test

When the application has finished creating, open the folder in your favourite code editor (I'm Visual Studio Code).

Setting up our database

You will need to configure config/database.yml to match your databsae credentials. You should have these set up for local development.

Once you've configured that, prepare the database:

bundle exec rails db:prepare

Hopefully you should see something like:

Created database 'whats_up_doc_development'
Created database 'whats_up_doc_test'

This means you're now ready to carry on generating some models and outputting some CSVs!

Setting up our models

It's already been decided that we're within a surgey with several doctors, who have several patients. Each patient belongs to one doctor; therefore, one doctor has many patients. Each patient also has many appointments.

As a result, we can assume our Doctor model to have:

  • first_name:string
  • last_string:string
  • gender:string

This is only a very basic structure, but it's enough for our purposes; we store gender in the case of a gender-specific doctor being required.

Building on from our Doctor model, our Patient model will have:

  • doctor:belongs_to
  • nhs_number:string
  • first_name:string
  • last_name:string
  • gender:string
  • address_line_1:string
  • address_line_2:string
  • city:string
  • postcode:string
  • preferred_contact_number:string

Note that we are storing our NHS number and preferred contact number as strings in instances where we want this to include a space.

Now, we need to to crete our Appointment model:

  • patient:belongs_to
  • booking:datetime

This all looks great. All we need to do now is generate our models:

rails g model Doctor first_name last_name gender
rails g model Patient doctor:belongs_to nhs_number first_name last_name gender address_line_1 address_line_2 city postcode preferred_contact_number
rails g model Appointment patient:belongs_to booking:datetime

Setting up our database-level constraints

To enhance our validation and ensure that our data remains valid, we need to configure the restrictions within the migrations.

Firstly, open db/migration/TIMESTAMP_create_patients.rb, where TIMESTAMP will be the time you created the model.

class CreatePatients < ActiveRecord::Migration[6.0]
  def change
    create_table :patients do |t|
      t.belongs_to :doctor, null: false, foreign_key: true
      t.string :nhs_number, null: false
      t.string :first_name, null: false
      t.string :last_name, null: false
      t.string :gender, null: false
      t.string :address_line_1, null: false
      t.string :address_line_2
      t.string :city, null: false
      t.string :postcode, null: false
      t.string :preferred_contact_number, null: false

      t.timestamps
    end
  end
end

Let's also do the same to ensure that the appointments 'booking' field cannot be null.

class CreateAppointments < ActiveRecord::Migration[6.0]
  def change
    create_table :appointments do |t|
      t.belongs_to :patient, null: false, foreign_key: true
      t.datetime :booking, null: false

      t.timestamps
    end
  end
end

Finally, for extra validation, let's assume all the fields on our Doctor model are required:

class CreateDoctors < ActiveRecord::Migration[6.0]
  def change
    create_table :doctors do |t|
      t.string :first_name, null: false
      t.string :last_name, null: false
      t.string :gender, null: false

      t.timestamps
    end
  end
end

Let's now migrate our database.

bundle exec rails db:migrate

Adding some data

Install Faker

Before we can add any data to our application, we must first add the Faker gem. This is only going to be used in development, so let's add it to our development group only.

group :development do
  # Access an interactive console on exception pages or by calling 'console' anywhere in the code.
  gem 'web-console', '>= 3.3.0'
  # ...
  # Add faker here
  gem 'faker'
end

Then run:

bundle install

Configure our seeds

Let's head over to our db/seeds.rb file. This is where we're going to set up our test data.

For our example, we are going to have four doctors that can have anywhere between 10 and 20 patients, who may have anywhere from 1 to 5 appointments. The surgery is new but all patients have an initial appointment before they are added to the system.

Open db/seeds.rb. Then, let's create our 4 doctors.

Faker::Config.locale = 'en-US'

4.times do
  doctor_gender = Faker::Gender.binary_type
  doctor = Doctor.create!(
    first_name: (doctor_gender == 'Male' ? Faker::Name.male_first_name : Faker::Name.female_first_name),
    last_name: Faker::Name.last_name,
    gender: doctor_gender
  )
end

Here, you can see the gender is assigned to a variable, which is then used to determine the gender of the first name field. Faker has also been given a default locale to help generate data relating to the specific country code.

This will create four doctors. Though, these doctors currently don't have any patients. Using 'rand', we're going to generate a random number of patients for the doctor. This will be done within our doctor creation block, as follows:

Faker::Config.locale = 'en-GB'

4.times do
  doctor_gender = Faker::Gender.binary_type
  doctor = Doctor.create!(
    first_name: (doctor_gender == 'Male' ? Faker::Name.male_first_name : Faker::Name.female_first_name),
    last_name: Faker::Name.last_name,
    gender: doctor_gender
  )

  rand(10..20).times do
    patient_gender = Faker::Gender.binary_type
    patient = Patient.create!(
      doctor: doctor,
      first_name: (patient_gender == 'Male' ? Faker::Name.male_first_name : Faker::Name.female_first_name),
      last_name: Faker::Name.last_name,
      gender: patient_gender,
      nhs_number: Faker::NationalHealthService.british_number,
      address_line_1: Faker::Address.street_address,
      address_line_2: Faker::Address.secondary_address,
      city: Faker::Address.city,
      postcode: Faker::Address.postcode,
      preferred_contact_number: Faker::PhoneNumber.phone_number
    )
  end
end

All that is left to do in our seeds file now is randomly create appointments.

You may have already guessed where this is going to go - it sits within our patient block.

This means the contents of db/seeds.rb should look like:

# This file should contain all the record creation needed to seed the database with its default values.
# The data can then be loaded with the rails db:seed command (or created alongside the database with db:setup).
#
# Examples:
#
#   movies = Movie.create([{ name: 'Star Wars' }, { name: 'Lord of the Rings' }])
#   Character.create(name: 'Luke', movie: movies.first)

Faker::Config.locale = 'en-GB'

4.times do
  doctor_gender = Faker::Gender.binary_type
  doctor = Doctor.create!(
    first_name: (doctor_gender == 'Male' ? Faker::Name.male_first_name : Faker::Name.female_first_name),
    last_name: Faker::Name.last_name,
    gender: doctor_gender
  )

  rand(10..20).times do
    patient_gender = Faker::Gender.binary_type
    patient = Patient.create!(
      doctor: doctor,
      first_name: (patient_gender == 'Male' ? Faker::Name.male_first_name : Faker::Name.female_first_name),
      last_name: Faker::Name.last_name,
      gender: patient_gender,
      nhs_number: Faker::NationalHealthService.british_number,
      address_line_1: Faker::Address.street_address,
      address_line_2: Faker::Address.secondary_address,
      city: Faker::Address.city,
      postcode: Faker::Address.postcode,
      preferred_contact_number: Faker::PhoneNumber.phone_number
    )

    rand(1..5).times do
      Appointment.create!(
        patient: patient,
        booking: Faker::Time.forward(days: 90)
      )
    end
  end
end

Add the data to your database with:

bundle exec rails db:seed

Outputting the data

So far, we have lots of random data set up so that we can export CSVs. The next step is to create a way of accessing this through our application.

Generate the routes

Let's generate a controller to help us do this.

rails g controller Patients index show --no-helper --skip-routes

Then, pop in to config/routes.rb and ensure you have the patient routes, as follows:

# frozen_string_literal: true

Rails.application.routes.draw do
  resources :patients, only: %i[index show]
end

Add the service object

Next, we need to create our service object. We do this while our server isn't running to save the need to restart it later; we're going to create a 'services' folder in our 'app' directory.

Then, within that folder, create a new folder named 'csv_exports' and a file named 'base_service.rb': you should now have app/serivces/csv_exports/base_service.rb.

Within that file, we're going to add our base service code:

# frozen_string_literal: true

module CsvExports
  #
  # Base CSV export service
  #
  class BaseService
    require 'csv'

    #
    # Initialise CSV export
    #
    # @param [collection] collection
    # @param [array] csv_columns
    #
    def initialize(collection, csv_columns = nil)
      @collection = collection
      @csv_columns = csv_columns
    end

    #
    # Generate CSV
    #
    # @return [csv]
    #
    def to_csv
      CSV.generate(headers: true) do |csv|
        csv << column_names
        if iterable?(collection)
          collection.each do |item|
            csv << data(item)
          end
        else
          csv << data(collection)
        end
      end.encode('cp1252').force_encoding('UTF-8')
    end

    private

    attr_reader :collection, :csv_columns

    #
    # Check if collection is provided
    #
    # @param [object] object
    #
    # @return [boolean]
    #
    def iterable?(object)
      object.respond_to?(:each)
    end

    #
    # Column names used for CSV headers
    #
    # @return [array]
    #
    def column_names
      if csv_columns.blank? && collection.present?
        collection.first.class.column_names
      elsif csv_columns.present?
        csv_columns
      end
    end

    #
    # Format the CSV row
    #
    # @param [object] item
    #
    # @return [array]
    #
    def data(item)
      data = []
      column_names.each do |column|
        data << item.public_send(column)
      end
      data
    end
  end
end

This is the service object that is going to do a lot of the work for us, using Ruby's built in CSV class. It checks to see if the CSV columns are provided, and uses those if so, but if they are not, then it falls back to the column names found in the model.

Updating our controller

Now we have our structure in place, we must head over to our controller and ensure that it responds to the CSV format.

Navigate to app/controllers/patients_controller.rb and update your controller to match:

# frozen_string_literal: true

class PatientsController < ApplicationController
  def index
    respond_to do |format|
      format.csv { process_csv_file(all_patients) }
    end
  end

  def show; end

  private

  def all_patients
    CsvExports::BaseService.new(Patient.all).to_csv
  end

  def process_csv_file(csv)
    send_data csv,
              type: 'text/csv; charset=iso-8859-1; header=present',
              disposition: 'attachment',
              filename: 'download.csv'
  end
end

You will see here that the index action responds to the CSV format, while there are also two additional methods; one for the base service object and one for sending the CSV data.

You can test to make sure all of this is working by running your server:

bundle exec rails server

Then, navigate over to: http://localhost:3000/patients.csv where you should be prompted to download a CSV containing lots of test data.

You can open this in Visual Studio Code, or your current code editor, so see how it's formatted.

As an example:

id,doctor_id,nhs_number,first_name,last_name,gender,address_line_1,address_line_2,city,postcode,preferred_contact_number,created_at,updated_at
1,1,431 949 7414,Dallas,Johnston,Male,670 Wally Fork,Apt. 323,Lake Alfredton,W0 9HB,0141 248 4682,2019-12-28 12:52:35 UTC,2019-12-28 12:52:35 UTC

It should all look good except for one caveat; 'doctor_id' isn't very useful, similarly as we have a dummy NHS number, we don't need the 'id' field.

Customising the output

Let's customise the output. For tidiness, we're going to create a new service object for our Patient model. Create app/services/csv_exports/patients_export_service.rb.

Our empty service object should look something like this, as we're going to inherit from the base service:

# frozen_string_literal: true

module CsvExports
  class PatientsExportService < BaseService
  end
end

While we remember, let's head back to our controller and update our 'all_patients' method:

def all_patients
  CsvExports::PatientsExportService.new(Patient.all, 'index').to_csv
end

In addition, we have passed a string to the csv columns to allow us to identify which columns to use. We could have passed an array of colum names here, but to keep things reusable, we're going to use a case statement to provide these column names.

Now, back to our service object.

We've decided that we want to remove the 'id' field and we also need to update 'doctor_id' to use the Doctor's full name.

Before we can do that, let's head over to app/models/doctor.rb and create a method that outputs the full name of the doctor.

While we're there, it's also important to set up the relationship to patients within the model. Our Doctor model should now look like this:

# frozen_string_literal: true

class Doctor < ApplicationRecord
  has_many :patients, dependent: :restrict_with_error

  def full_name
    [first_name, last_name].join(' ')
  end
end

The next step is to add a delegation from our Patient model to the full name method we have just created. Let's do that now in app/models/patient.rb:

While we're there, we should also set up the appointments relationship.

# frozen_string_literal: true

class Patient < ApplicationRecord
  belongs_to :doctor

  has_many :appointments, dependent: :destroy

  delegate :full_name, to: :doctor, prefix: true
end

All is set, now to simply make this work. Our new patients service object app/services/csv_exports/patients_export_service.rb is awaiting the configuration of our case statement.

# frozen_string_literal: true

module CsvExports
  class PatientsExportService < BaseService
    private

    #
    # Override the column names method from the base service, used for CSV headers
    #
    # @return [array]
    #
    def column_names
      case csv_columns
      when 'index'
        index_column_names
      else
        collection.first.class.column_names if collection.present?
      end
    end

    #
    # Create a flattened array of fields to use for the CSV
    #
    # @return [array]
    #
    def index_column_names
      [
        [collection.first.class.column_names.reject { |column| index_rejected_columns.include?(column) }],
        %w[doctor_full_name]
      ].flatten
    end

    #
    # Field names to reject
    #
    # @return [array]
    #
    def index_rejected_columns
      %w[id doctor_id]
    end
  end
end

The service object also allows an empty string and will fall-back to the column names of the collection, like the base service does. You could store the CSV column names as an array within your model and pass these to the base service object, but I prefer to use this setup to have a specific file dedicated to the column names; it's useful should I want to define different field outputs in the future, such as outputting a list of patient addresses without the doctor details.

Likewise, using this method of returning the class column names means that if a new field is added in the future, it's automatically added to the CSV.

Now, head back over to your web browser and re-download the CSV file you created. The 'id' and 'doctor_id' fields should now be removed and the 'doctor_full_name' field should appear at the end of the list.

This is all great. You can customise the order of the fields within the 'index_column_names' method, for example, you could have the doctor's name appear first:

def index_column_names
  [
    %w[doctor_full_name],
    [collection.first.class.column_names.reject { |column| index_rejected_columns.include?(column) }]
  ].flatten
end

As an example:

doctor_full_name,nhs_number,first_name,last_name,gender,address_line_1,address_line_2,city,postcode,preferred_contact_number,created_at,updated_at
Chadwick Barrows,431 949 7414,Dallas,Johnston,Male,670 Wally Fork,Apt. 323,Lake Alfredton,W0 9HB,0141 248 4682,2019-12-28 12:52:35 UTC,2019-12-28 12:52:35 UTC

Alternatively, you can define the column names as an array within the method to just have those specific fields:

def index_column_names
  %w[
    doctor_full_name
    nhs_number
    first_name
    last_name
  ]
end

This would also mean you could remove the 'index_rejected_columns' as it's no longer used.

Outputting appointments

At the beginning of the tutorial, we set out to create a list of patients as a CSV and a list of appointments that patient has had.

We will need to ensure that our appointments CSV contains only the following fields:

  • doctor_full_name
  • patient_first_name
  • patient_last_name
  • booking

Have a go yourself and see if you can try, then come back to the tutorial to compare.

Have you had a go? If you have, you may have realised that we need to add a responder to our show action in app/controllers/patients_controller.rb.

def show
  respond_to do |format|
    format.csv { process_csv_file(patient_appointments) }
  end
end

Then, create a new private method in named 'patient'.

def patient
  Patient.find(params[:id])
end

After that, create a new private method in named 'patient_appointments'.

def patient_appointments
  CsvExports::AppointmentsExportService.new(patient.appointments).to_csv
end

You may note that I have not passed the CSV columns to the new appointments export service. This is because I have decided there will be no other way to output appointments, so the service should simply have the columns stored as a array.

To verify, the controller now looks like this:

# frozen_string_literal: true

class PatientsController < ApplicationController
  def index
    respond_to do |format|
      format.csv { process_csv_file(all_patients) }
    end
  end

  def show
    respond_to do |format|
      format.csv { process_csv_file(patient_appointments) }
    end
  end

  private

  def patient
    Patient.find(params[:id])
  end

  def patient_appointments
    CsvExports::AppointmentsExportService.new(patient.appointments).to_csv
  end

  def all_patients
    CsvExports::PatientsExportService.new(Patient.all, 'index').to_csv
  end

  def process_csv_file(csv)
    send_data csv,
              type: 'text/csv; charset=iso-8859-1; header=present',
              disposition: 'attachment',
              filename: 'download.csv'
  end
end

Likewise, the new app/services/csv_exports/appointments_export_service.rb contains:

# frozen_string_literal: true

module CsvExports
  class AppointmentsExportService < BaseService
    private

    #
    # Override the column names method from the base service, used for CSV headers
    #
    # @return [array]
    #
    def column_names
      %w[
        doctor_full_name
        patient_first_name
        patient_last_name
        booking
      ]
    end
  end
end

An updated app/models/appointment.rb file should contain:

# frozen_string_literal: true

class Appointment < ApplicationRecord
  belongs_to :patient

  delegate :doctor_full_name, to: :patient
  delegate :first_name, :last_name, to: :patient, prefix: true
end

You can now test this using: http://localhost:3000/patients/1.csv - this uses patient ID one, but you can change this to whatever number you prefer and see what you get.

As an example, patient ID 1 has the following appointments:

doctor_full_name,patient_first_name,patient_last_name,booking
Chadwick Barrows,Dallas,Johnston,2020-01-30 19:50:49 UTC
Chadwick Barrows,Dallas,Johnston,2020-01-24 22:04:09 UTC
Chadwick Barrows,Dallas,Johnston,2020-01-03 01:44:13 UTC
Chadwick Barrows,Dallas,Johnston,2020-01-07 01:11:47 UTC
Chadwick Barrows,Dallas,Johnston,2020-03-18 05:51:17 UTC

What do you think?

Has this helped you output data and been a good introduction to using service objects? I hope it has. If you did, or didn't, enjoy this tutorial, please contact me either way and let me know.