Understanding require and Friends in Ruby
Author: Eric Mathison
Date: September 14, 2013
Last edited: April 5, 2020
Ruby's require method is a tool for referencing and executing code that is
not actually contained in the current file. My initial attempts at using
require left me a bit confused. Not until later did I realize that Ruby's
require method sits atop something more organized and elegant than I had
realized. It turns out that require's main use case is generally not one-off,
relative references to other random Ruby files as I had expected. Instead,
require is generally used to look in certain pre-determined directory
locations for Ruby libraries. I had previously sort of assumed that a require
in Ruby would be something like the relative references to an image in CSS or a
reference to a JavaScript file in HTML. Not so. In fact, Ruby's require seems
to have drawn several concepts from the UNIX environment.
So How Does Ruby's require Work Anyway?
The best way to think of require is in relation to the UNIX $PATH variable.
Just by way of a refresher, the $PATH variable in UNIX contains a list of
directories where executables can be found. So when you type the name of a
program on any UNIX terminal, your computer is looking through the executable
files in the directories specified in your $PATH variable. require does
something very similar. When, for example, you write require 'set' at the top
of your Ruby file, you are telling Ruby to look through a bunch of directories
for a library called set.rb (Ruby's set library).
So where does Ruby look for set.rb? Well, once again, Ruby has something very
similar to UNIX's $PATH variable. It is the global variable $LOAD_PATH, also
sometimes known by its ugly and undescriptive alias $: (which I don't suggest
using, by the way, short though it may be). It is an array of directory names
where Ruby looks when it comes across a require.
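To make the $PATH comparison concrete, here is a rough sketch of the kind of lookup require performs. The helper first_load_path_hit is a name I made up for illustration, and it only considers plain .rb files; the real require also handles native extensions and keeps track of what it has already loaded.

```ruby
# Simplified sketch of require's lookup: walk $LOAD_PATH in order and return
# the first "<feature>.rb" that exists. first_load_path_hit is a hypothetical
# helper, not part of Ruby; the real require does more than this.
def first_load_path_hit(feature)
  $LOAD_PATH.each do |dir|
    candidate = File.join(dir.to_s, "#{feature}.rb")
    return candidate if File.file?(candidate)
  end
  nil
end

# $: and $LOAD_PATH are two names for the very same Array object.
puts $LOAD_PATH.equal?($:)

# A few of the directories Ruby will search, in order.
puts $LOAD_PATH.first(3)

# A name that (presumably) no directory contains yields nil.
p first_load_path_hit('no_such_library_xyz')
```

Just like $PATH, order matters: the first directory that contains a matching file wins.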
Here's where the analogy falls short, however. The UNIX $PATH generally
contains a list of bin directories while the Ruby $LOAD_PATH contains an array
of lib directories. Well, what on earth are lib and bin directories and what
difference does it make? The lib and bin directories are another Ruby concept
which originally comes from UNIX. A bin directory is a place for standalone
executables. The name bin is a reference to binary files (as opposed to plain
text files) since before the days of ubiquitous scripting languages, executables
needed to be compiled from source into a binary file. Ruby, of course, is an
interpreted language and doesn't need to be compiled but the concept of
standalone executables still applies. A lib directory on the other hand is where
library code goes. This is the code you reference in other code. Referencing
library code is the whole reason we need requires in the first place.
When you require a library, you are telling Ruby to execute that library's code
whenever the file containing the require is executed. The required file is
executed only once. Even if the require is repeated in the same file (why anyone
would want to do this I don't know), it will be executed only once. This is
because Ruby checks whether the $LOADED_FEATURES variable, a.k.a. $" (not to be
confused with $LOAD_PATH), already contains the path of the required file before
executing it.
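Here is a small sketch of the run-once behaviour. Everything named here (counting_lib.rb, $counting_lib_runs, require_twice_demo) is invented for the demonstration; the throwaway library lives in a temporary directory that I put on the load path by hand only so the example is self-contained.

```ruby
require 'tmpdir'

# Requires a throwaway library twice and reports what happened.
# Returns [first_result, second_result, times_executed, recorded_in_loaded_features].
def require_twice_demo
  Dir.mktmpdir do |dir|
    # Hypothetical library whose only job is to count how often it runs.
    File.write(File.join(dir, 'counting_lib.rb'), "$counting_lib_runs += 1\n")
    $counting_lib_runs = 0
    $LOAD_PATH.unshift(dir)
    begin
      first  = require 'counting_lib'   # loads and executes the file
      second = require 'counting_lib'   # already loaded, so nothing is executed
      recorded = $LOADED_FEATURES.any? { |path| path.end_with?('counting_lib.rb') }
      [first, second, $counting_lib_runs, recorded]
    ensure
      $LOAD_PATH.delete(dir)
    end
  end
end

p require_twice_demo
```

The return value of require tells you which case you hit: true when the file was actually loaded, false when Ruby found it in $LOADED_FEATURES and skipped it.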
The really nice thing about the way require works in combination with the load
path is that as long as a library is in the load path, there is no need to
specify the full path or even to know where exactly it is located. You don't
even need to put the '.rb' on the end since that is implicit. All you need to
put is the extensionless name of the file. Personally, I think this makes for
some very clean looking code.
It is important to also realize that in your application, there is generally no
need to add directories to the load path manually as this is the environment's
job. I mentioned above that the $LOAD_PATH contains an array of lib directory
paths. RubyGems will, for example, automatically add the lib directory of gems
installed on your computer to the load path. Well, that's actually only mostly
true. If you actually run a Ruby script that just outputs $LOAD_PATH without
doing anything else, you won't see every single installed gem's load path
listed. This is because RubyGems adds gems' lib directories lazily. So if you
want to see a gem's lib directory listed in the load path, you have to
actually require it first. But things work essentially the same as if every
single gem's lib directory were on the load path.
Slightly unrelated to the topic of this article but perhaps also interesting to
note is that RubyGems installs a gem's executables into a bin directory that is
normally on the UNIX $PATH so that they can be run from the terminal.
Understanding load
The load method is very similar to require. The main difference is that it
will run the referenced code every time load is called. As I mentioned
above, require only runs the first time.
load must be given the '.rb' file extension because it won't be inferred like
it is with require.
All in all, I would say that load is of fairly limited use.
Probably its main use case is re-running a file from a Ruby REPL (like pry or
irb) after making some edits. It's also very helpful when you need to load Ruby
files that do not have a .rb extension since require will automatically look
for files ending in .rb. This situation can occur when trying to load Ruby
script files or executables which tend not to have an extension.
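A quick sketch of those differences, using made-up file names (counting_script.rb, extensionless_tool) and a made-up global counter in a temporary directory:

```ruby
require 'tmpdir'

# Returns [count_after_two_loads, final_count, load_without_extension_failed].
def load_demo
  Dir.mktmpdir do |dir|
    script = File.join(dir, 'counting_script.rb')
    File.write(script, "$load_demo_runs += 1\n")

    # A hypothetical executable-style script with no extension at all.
    tool = File.join(dir, 'extensionless_tool')
    File.write(tool, "$load_demo_runs += 10\n")

    $load_demo_runs = 0
    load script                  # runs the file
    load script                  # runs it again, unlike require
    after_two_loads = $load_demo_runs

    load tool                    # load does not care about the missing extension

    # load does not append '.rb', so leaving it off fails.
    failed = begin
      load File.join(dir, 'counting_script')
      false
    rescue LoadError
      true
    end

    [after_two_loads, $load_demo_runs, failed]
  end
end

p load_demo
```

I pass full paths to load here only to keep the example independent of the load path and the current working directory.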
Understanding require_relative
require_relative is essentially what I had originally thought require would
be used for. If you want to reference a file relative to another file, not
treat it like a library, and bypass the load path, this is the tool for you.
In order to understand the need for require_relative, we must understand a bit
more about require. So far, the only use I've mentioned for require is
executing code on the load path (code that would be considered a library).
That's for good reason. That's what it is good for and you probably shouldn't be
using require for anything else.
In the past (Ruby 1.8 and earlier), if you wanted to do a one-off require
without managing the load path, the best way would have been to generate the
full path to that file and pass it as an argument to require. It would look
something like this:
require File.expand_path('my_file', File.dirname(__FILE__))
File.expand_path gives the full path to the file specified in the first
argument. Specifying the second argument essentially says, look for my_file.rb
relative to such and such directory. File.dirname(__FILE__) returns the
directory path for the current file (that this line was written in). So put
together, require is getting passed the full path to my_file.rb which is the
file sitting in the same directory in which this line was written. Another way
of writing the same thing looks like this:
require File.expand_path('../my_file', __FILE__)
The difference here is that the "directory" we are passing in the second
parameter isn't actually a directory. It is just the current file. Fortunately,
File.expand_path doesn't care since theoretically, we could make a directory
called 'current_file.rb', strange as that would be. Assuming I'm running this
line from my home directory, without the '../' before the file name in the
first parameter, the string we are passing to require would look something
like this: '/home/eric/current_file.rb/my_file'. From File.expand_path's
perspective, adding the '../' brings us up a directory but from our
perspective, it simply causes current_file.rb to be removed from the path. So,
with '../' back in place, the path would expand to '/home/eric/my_file', and
require supplies the '.rb' itself.
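You can try the path arithmetic on its own, without requiring anything. In this sketch the two helper names are mine, and I pass in a hard-coded path where __FILE__ would normally go, so the results assume a UNIX-style file system.

```ruby
# Both helpers take the path of the "current file" explicitly so the
# expansion can be tried without real files. The names are hypothetical.

# Mirrors File.expand_path('my_file', File.dirname(__FILE__))
def sibling_via_dirname(name, current_file)
  File.expand_path(name, File.dirname(current_file))
end

# Mirrors File.expand_path('../my_file', __FILE__)
def sibling_via_dot_dot(name, current_file)
  File.expand_path("../#{name}", current_file)
end

current_file = '/home/eric/current_file.rb'

puts sibling_via_dirname('my_file', current_file)
puts sibling_via_dot_dot('my_file', current_file)

# Without the '../', the file name is treated as if it were a directory.
puts File.expand_path('my_file', current_file)
```

On a UNIX-like system the first two lines both print /home/eric/my_file, while the last one prints /home/eric/current_file.rb/my_file.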
Let me back up and explain why passing require the full path to a file would
even be necessary. When I write something like require '../my_file.rb', that
path is relative to the current working directory of the process executing this
line, NOT necessarily the file it was written in. That's important and it means
that depending on what directory you are in when you run ruby current_file.rb,
your code may or may not be able to find your relative reference to my_file.rb.
Yikes!
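To see the problem in action, here is a sketch that tries the same require from two different working directories. The directory and file names (app, elsewhere, cwd_demo_helper.rb) and the helpers are all made up, and everything happens in a temporary directory.

```ruby
require 'tmpdir'

# Hypothetical helper: reports whether a require succeeded or raised LoadError.
def try_require(path)
  require path
  :loaded
rescue LoadError
  :load_error
end

# Returns [result_from_elsewhere, result_from_app].
def cwd_relative_require_demo
  Dir.mktmpdir do |dir|
    app       = File.join(dir, 'app')
    elsewhere = File.join(dir, 'elsewhere')
    Dir.mkdir(app)
    Dir.mkdir(elsewhere)
    File.write(File.join(app, 'cwd_demo_helper.rb'), "# nothing to do\n")

    # Same require string, different working directories.
    from_elsewhere = Dir.chdir(elsewhere) { try_require('./cwd_demo_helper') }
    from_app       = Dir.chdir(app)       { try_require('./cwd_demo_helper') }
    [from_elsewhere, from_app]
  end
end

p cwd_relative_require_demo
```

The require only succeeds when the process happens to be sitting in the directory that contains the helper.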
Enter require_relative stage left. require_relative does essentially what
the above two examples using require do except with the added benefit of a
clean syntax. So instead of needing that massive monstrosity, all you need to
write is require_relative 'my_file' and my_file.rb will be referenced relative
to the file containing that line (current_file.rb in our example). These
techniques are not 100% equivalent though. Ruby's implementation
of require_relative actually references the call stack (not __FILE__) in
order to determine which file the references are relative to. So in a config.ru
file which is used to start a Rack application, it is not currently possible to
use require_relative as the contents of this file are run through the eval
method instead of being run directly. That means that Ruby can't determine which
file to be relative to.
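Here is a sketch showing that require_relative follows the file, not the working directory. The names (app, relative_demo_helper.rb, $relative_demo_loaded) are invented, and I use load to run a generated current_file.rb only so that the whole thing fits in one self-contained script.

```ruby
require 'tmpdir'

# Returns true if the helper was found via require_relative even though the
# process was not in the helper's directory at the time.
def require_relative_demo
  Dir.mktmpdir do |dir|
    app = File.join(dir, 'app')
    Dir.mkdir(app)
    File.write(File.join(app, 'relative_demo_helper.rb'), "$relative_demo_loaded = true\n")
    File.write(File.join(app, 'current_file.rb'), "require_relative 'relative_demo_helper'\n")

    $relative_demo_loaded = false
    # The working directory is the parent of app, not app itself.
    Dir.chdir(dir) do
      load File.join(app, 'current_file.rb')
    end
    $relative_demo_loaded
  end
end

p require_relative_demo
```

The helper is resolved against the location of current_file.rb, so it does not matter where the process was started.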
Also note that even if you are running current_file.rb from its containing
directory (so that the current working directory of the process is the same as
the location of current_file.rb), it is not possible to do something like
require 'my_file' although require './my_file' would work. This is because
since Ruby 1.9.2, the current working directory is no longer part of the load
path. In the UNIX world, this is considered a good security practice since it
avoids the issue of accidentally running an unexpected script by simply
executing a program from a directory that just happens to have a file with the
same name. Since require_relative is requiring relative to the current file
instead of the current process, it doesn't have this security issue. That's one
more reason to use require_relative for relative requires.
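A small sketch of the difference between the bare name and the './' form. It assumes the current directory has not been put back on the load path (for example with ruby -I. or the RUBYLIB environment variable); the file and helper names are made up.

```ruby
require 'tmpdir'

# Hypothetical helper: reports whether a require succeeded or raised LoadError.
def try_require(path)
  require path
  :loaded
rescue LoadError
  :load_error
end

# Returns [result_for_bare_name, result_for_dot_slash], both tried from the
# directory that actually contains the file.
def bare_name_vs_dot_slash_demo
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, 'dot_slash_demo_lib.rb'), "# nothing to do\n")
    Dir.chdir(dir) do
      bare   = try_require('dot_slash_demo_lib')     # searched for on the load path only
      dotted = try_require('./dot_slash_demo_lib')   # resolved against the working directory
      [bare, dotted]
    end
  end
end

p bare_name_vs_dot_slash_demo
```

Under that assumption, the bare name fails even though the file is sitting right there in the working directory, while the './' form finds it.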
One last note about require_relative: Have you ever noticed that using
require_relative doesn't work on a Ruby REPL like pry? Think about it. If
require_relative is requiring relative to a file, which file would that be in
a REPL? Well, there isn't really a file that fits that bill. Your alternatives
in this situation are to use load or require. If you use require though,
make sure to remember the leading './' or '../', and don't do this in
production code! And if you ever want to determine what the current working
directory of your Ruby process is, you can always output Dir.pwd.
Conclusion
So, we have seen that require, require_relative and load all fulfil
different use cases when used as intended. require is generally for
referencing libraries, require_relative is for making one-off local references
within an application (typically deployed applications, not within libraries)
and about all load is good for is re-loading a script in a REPL. Well, that's
how I see it anyway :).