Capybara and Selenium for Testing and Scraping

Leigh Halliday, July 14th, 2016 (Last Updated: July 13th, 2016)

Capybara, aside from being the largest rodent in the world, is also a fantastic tool to aid you in interacting with browser functionality in your code, either for testing or just to interact with or scrape data from a website.

Capybara isn’t what actually interacts with the website — rather, it’s a layer that sits between you and the actual web driver. This could be Selenium, PhantomJS, or any of the other drivers that Capybara supports. It provides a common interface and a large number of helper methods for extracting information, inputting data, testing, or clicking around.

Just like any abstraction, sometimes you need to go deeper, and Capybara won’t stop you from doing that. You can easily bypass it to get at the underlying drivers if you need more fine-tuned functionality.

Testing with Capybara

Capybara integrates really nicely with all of the common test frameworks used with Rails. It has extensions for RSpec, Cucumber, Test::Unit, and Minitest. It’s used mostly with integration (or feature) tests, which test not so much a single piece of functionality but rather an entire user flow.

You can use Capybara to test whether certain content exists on the page or to input data into a form and then submit it. This is where you verify that the key flows your users will take (such as registration or checkout) not only work in isolation but also flow smoothly from one step to the next.

With RSpec, we first need to ensure that our rails_helper.rb file includes the line require 'capybara/rails'. Next, let’s create a new folder called features inside spec, where we’ll put all of the tests that use Capybara.

Imagine that we have an application for managing coffee farms. In this application, creating a coffee farm is one of the most important functions you can perform, and therefore should be tested thoroughly.

# spec/features/creating_farm_spec.rb
require 'rails_helper'

RSpec.describe 'creating a farm', type: :feature do
  it 'successfully creates farm' do
    visit '/farms'
    click_link 'New Farm'

    within '#new_farm' do
      fill_in 'Acres', with: 10
      fill_in 'Name', with: 'Castillo Andino'
      fill_in 'Owner', with: 'Albita'
      fill_in 'Address', with: 'Andes, Colombia'
      fill_in 'Varieties', with: 'Colombia, Geisha, Bourbon'
      fill_in 'Rating', with: 10
    end
    click_button 'Create Farm'

    expect(page).to have_content 'Farm was successfully created.'
    expect(page).to have_content 'Castillo Andino'
  end
end

There are a few things to note with Capybara. The first is that it provides a ton of great helpers such as click_link, fill_in, click_button, etc. Many of these helpers provide a variety of ways to actually find the HTML element that you’re looking for.

In the example above, we see CSS selectors used with the within method. We also see selecting and filling in an input field by using the text in its label.

There’s also a third way, not shown here, which allows you to select elements using XPath. While XPath is the most powerful way to select elements, it’s also the least readable. For the sake of your own sanity, you should probably aim to include an ID or class attribute in your HTML to ensure that selecting is straightforward.
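To get a feel for what an XPath expression actually matches, here is a minimal sketch that uses Ruby's REXML library rather than Capybara or a browser, so it runs on its own. The markup is a made-up stand-in for a login form, and its IDs are assumptions chosen for illustration. In Capybara, you'd hand the same expression to find(:xpath, ...).

```ruby
require 'rexml/document'

# Made-up markup standing in for a login form (assumption, for illustration).
markup = <<~XML
  <form id="login">
    <input type="email" id="sign-in-email"/>
    <input type="password" id="sign-in-password"/>
  </form>
XML

doc = REXML::Document.new(markup)

# Select every <input> element, anywhere in the document, whose type
# attribute equals "password".
matches = REXML::XPath.match(doc, '//input[@type="password"]')

puts matches.map { |element| element.attributes['id'] }.inspect
```

The CSS equivalent, input[type="password"], is usually easier to read, which is why XPath tends to be the fallback rather than the first choice.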

Scraping with Capybara

Capybara isn’t only for testing. It can also be used in web scraping. I’ll admit that it isn’t the fastest method, and if all you are doing is visiting a page to extract information without too much interaction with the DOM in terms of data input or clicking, it may not be the best approach. For that, you may want to investigate something like mechanize or even nokogiri if all you are doing is reading HTML to extract information from it.

But for the situation where you maybe have to first log in as a user, click on a tab, and then extract some information, this is the sweet spot for Capybara.

I’ve recently had to rent a car, and I ended up using Hotwire for this. Let’s use Capybara to log in and retrieve my confirmation number. In this case, it would be more difficult to use a different scraping tool because it is an Angular SPA, so Capybara works perfectly.

I’ll create a Rake task which will log in to my account, loop through all of the confirmation codes, and print them to the screen. I’ve used an XPath selector here to show that even if there isn’t an easy CSS selector to use, you can still find the element that you’re looking for. This also demonstrates how to use Capybara outside of your testing environment.

require 'capybara'

namespace :automate do
  desc 'Grab hotwire confirmation code'
  task hotwire: :environment do
    # Drive a real browser through Selenium, outside of any test framework.
    session = Capybara::Session.new(:selenium)
    session.visit 'https://www.hotwire.com/checkout/#!/account/login'

    # Credentials come from the environment so they stay out of the source.
    session.find('#sign-in-email').set(ENV.fetch('EMAIL'))
    session.find(:xpath, '//input[@type="password"]').set(ENV.fetch('PASSWORD'))
    session.find('.hw-btn-primary').click

    # Print the text of every confirmation code found on the page.
    session.all('.confirmation-code').each do |code|
      puts code.text
    end
  end
end

Running the task prints Car confirmation 31233321CA3 to the screen (not my real confirmation number, of course).

Any time we use the find method, we are given an instance of the Capybara::Node::Element object; the all method returns a collection of them. This object allows us to click it, extract its text, ask for its DOM path, and interact with it in a variety of other ways.

One other interesting method is native, which returns the underlying driver’s object. In this case, it’s an instance of Selenium::WebDriver::Element because we are using Selenium. As useful an abstraction as Capybara is, there will always be times when you need to gain access to the underlying layer.

As you can see, this could be an easy way to automate a task that has no other alternative than to use the “Screen Scraping” approach. Keep in mind that this is quite brittle, as a slight change to one of their classes or IDs means that the whole thing will stop working.
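One way to soften that brittleness is to fail loudly as soon as a selector stops matching, rather than silently printing nothing. The sketch below is a hypothetical helper (confirmation_codes is my own name, not part of Capybara). It only assumes that session responds to all and that each result responds to text, as is the case with a Capybara::Session.

```ruby
# Hypothetical helper: collect the text of every element matching `selector`,
# raising a descriptive error when nothing matches (a hint that the site's
# markup has probably changed).
def confirmation_codes(session, selector: '.confirmation-code')
  codes = session.all(selector).map(&:text)
  if codes.empty?
    raise "No elements matched #{selector.inspect}; has the page markup changed?"
  end
  codes
end
```

In the Rake task above, the loop could then become confirmation_codes(session).each { |code| puts code }, and a changed class name would surface as an error instead of an empty run.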

Interacting with JavaScript

One of the things that Capybara gives you is the ability to interact with your webpages using JavaScript. You aren’t limited to only using Ruby to find and interact with the DOM nodes on your page.

Capybara gives you two methods to invoke some JavaScript:

execute_script (does not return values)
evaluate_script (does return values)
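As a rough sketch of the difference, here are two small hypothetical helpers (the method names are mine). They assume session is a Capybara::Session backed by a JavaScript-capable driver such as Selenium.

```ruby
# Fire-and-forget: run JavaScript purely for its side effect.
def scroll_to_bottom(session)
  session.execute_script('window.scrollTo(0, document.body.scrollHeight);')
end

# Evaluate a JavaScript expression and hand its result back to Ruby.
# Note that evaluate_script takes an expression, so no return keyword is used.
def page_title(session)
  session.evaluate_script('document.title')
end
```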

These work great, but as usual you can bypass Capybara if needed and use the underlying driver to execute JavaScript. This allows us to pass arguments to our JavaScript code:

# example of returning values from javascript
classes = session.driver.browser.execute_script(
  "return document.getElementById(arguments[0]).className;",
  'sign-in-password'
)
puts classes
# => ng-pristine ng-untouched ng-invalid ng-invalid-required

Unlike Capybara’s execute_script, the underlying Selenium driver’s execute_script method returns the value of the JavaScript it runs. You can imagine that your code is being invoked like this:

var result =
  (function(arguments) {
    return document.getElementById(arguments[0]).className;
  })(['sign-in-password']);

You’ll see that the arguments you pass in the second and higher parameter positions get placed into an array and passed to an anonymous function. The anonymous function contains, as its body, the code which was in the first parameter. This is why you must explicitly include a return statement if you want to use the value it returns.

Selenium takes care of how to convert what would end up being a JavaScript return value into something you can use in Ruby.
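To picture that conversion, it helps to know that WebDriver exchanges values with the browser as JSON. The snippet below is only an analogy built on Ruby's standard JSON library, not Selenium's actual code path, and the payload is a made-up example. It shows the kind of Ruby objects you can expect for plain JavaScript strings, numbers, booleans, null, arrays, and objects.

```ruby
require 'json'

# Made-up payload, shaped like a JSON-encoded JavaScript return value.
wire_value = <<~JSON
  {
    "className": "ng-pristine ng-untouched",
    "count": 3,
    "ratio": 0.5,
    "visible": true,
    "missing": null,
    "tags": ["a", "b"]
  }
JSON

ruby_value = JSON.parse(wire_value)

# Print each key alongside the Ruby value and class it was converted to.
ruby_value.each do |key, value|
  puts "#{key}: #{value.inspect} (#{value.class})"
end
```

DOM nodes are the notable exception: Selenium hands those back as Selenium::WebDriver::Element objects rather than plain data.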

Configuring Capybara

As I mentioned in the introduction, Capybara works by allowing you to work with a number of different web drivers. These could be lightweight headless drivers such as PhantomJS (via poltergeist) or RackTest, but it could also be Selenium either running locally or connecting to a Selenium grid server remotely.

Here is an example of how you might configure Capybara to work with a remote Selenium server.

Capybara.register_driver(:firefox) do |app|
  Capybara::Selenium::Driver.new(
    app,
    browser: :remote,
    url: ENV.fetch('SELENIUM_URL'),
    desired_capabilities: Selenium::WebDriver::Remote::Capabilities.firefox
  )
end

This allows you to select the new driver with the code Capybara.default_driver = :firefox.

Debugging Capybara

With a headless driver, it is sometimes hard to see what the page looks like at the time you are interacting with it. Just because it is headless doesn’t mean you need to be blind. You are able to request a screenshot of how the page looks and also extract the source code of the page.

# Save the current HTML source of the page.
File.write('/tmp/source.html', session.driver.browser.page_source)

# Save a screenshot; binwrite keeps the PNG bytes intact.
File.binwrite('/tmp/screenshot.png', session.driver.browser.screenshot_as(:png))
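Capybara sessions also offer save_page and save_screenshot, which do much the same thing without reaching into the driver. Here is a minimal sketch of a hypothetical helper (dump_debug_artifacts and its default directory are my own inventions) that bundles the two calls. It assumes session is a Capybara::Session whose driver supports screenshots.

```ruby
require 'fileutils'

# Hypothetical helper: write the current HTML and a screenshot into `dir`
# and return the paths that were written.
def dump_debug_artifacts(session, dir: '/tmp/capybara-debug', label: 'debug')
  FileUtils.mkdir_p(dir)
  html_path = File.join(dir, "#{label}.html")
  png_path  = File.join(dir, "#{label}.png")

  session.save_page(html_path)        # HTML source of the current page
  session.save_screenshot(png_path)   # what the browser is rendering

  { html: html_path, screenshot: png_path }
end
```

Calling dump_debug_artifacts(session, label: 'after-login') right after a step that misbehaves gives you both files side by side to inspect.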

Another way to help with debugging is by placing a binding.pry call (from the pry gem) in your code, which will pause the script and give you a console where you can run commands that interact with the web page. You can even open up the Firefox/Chrome developer console and play with the JavaScript of the page in its current state.

Conclusion

The truth is that I don’t generally use Capybara when testing my Rails applications. It slows the tests down and potentially binds your tests quite closely to the DOM.

However, it does have its place, especially when you need to guarantee that certain important user flows work exactly as expected. It also finds the odd use in scraping when tools such as mechanize or nokogiri aren’t enough.

Capybara is a great tool to have in your Ruby toolbelt. Give it a try the next time you want to make sure that a certain key user flow works or you need to automate a task via scraping.

Reference:

Capybara and Selenium for Testing and Scraping from our WCG partner Leigh Halliday at the Codeship blog.