An environment is a computer system where an application runs. We develop the web applications in our machines, called the development environment, and are commonly deployed to a publicly accessible server machine, called the production environment.
Typically, applications will not behave similarly in both environments because they will likely have different hardware, software, and configurations. For example, let's say we have the following production server setup:
Name | Type | Version |
---|---|---|
Ubuntu | Operating System | 20.04 |
MariaDB | Database | 10.1 |
PHP (php-fpm) | Runtime environment (webapp) | 7.2 |
Node.js | Runtime environment (tools) | 12 |
Nginx | Web Server | 1.18 |
A web app designed to work with the above setup will work OK on another machine with an exact or nearly identical configuration. However, if it has a different version of any of the required services, libraries, or even OS, you'll probably encounter some issues.
You can directly set up your computer to mirror the production server, which is usual for starters. But what if several colleagues want to join in the development? What if your hard drive suddenly got corrupted or a Windows system update fucked up your setup?
Re-configuring your machine now and then is impractical, especially when developing multiple projects with different frameworks, databases, and languages.
This is where virtual machines come in handy.
Overview
A virtual machine (VM) is an image of an actual computer. With the help of virtualization, a physical computer (host) can emulate one or more computer systems (guest) that have different operating systems and hardware platforms. A software or hardware monitor (hypervisor) creates and controls these VMs, which uses native execution to share and manage resources while maintaining isolation of multiple environments.
Because virtual machines are isolated, you can safely do anything in the guest machine without affecting the host machine, such as running complex and OS-dependent applications, trying out new operating systems, software, network configurations, untrusted applications, etc.
Hypervisors
A hypervisor helps you create, manage, and monitor virtual machines. Here are some known hosted hypervisors that you can use:
- VirtualBox - a free and open-source hypervisor by Oracle.
- VMWare Workstation - a proprietary hypervisor by VMWare.
- Hyper-V - included in certain versions of Windows operating systems.
System Requirements
Since a VM is an actual computer, the host machine should have relatively capable hardware, with its CPU and OS also supporting hardware virtualization for optimal performance. Generally, you need:
- Reasonably powerful CPU. Any recent Intel or AMD processor should suffice.
- Memory that's enough to support the minimum requirements of the host operating system, plus a minimum of 512MB for the guest operating system.
- A free hard disk space of at least 10GB, depending on the operating system and software that you need to install on the guest, as well as the number of projects you need to support. Dynamically-sized disk image files tend to grow considerably large.
For software development
Application developers can make use of virtual machines to run, test, and even compile their projects. Hypervisors allow control of the allocation of most resources to a virtual machine, giving developers the freedom to create an environment that closely matches the specifications of the production machine.
With an environment that is similar to the production environment, bugs can be discovered and replicated much faster, thus improving workflow and quality assurance.
Moreover, hypervisors allow exporting and importing of virtual machines and their data as a disk image, making it possible to share and run an environment that behaves consistently across different development units.
Setting up a virtual machine (called provisioning), however, requires some technical knowledge, such as service and network configuration, and package management, to name a few. Not all project members will have the ability to provision a VM or have the time to do or learn it. And since VM disk images usually consume gigabytes of storage, one will need a high bandwidth network to transfer it within a short period.
Luckily, there's a tool called Vagrant that helps make VM environment management easier.
Vagrant up!
Vagrant is a free, open-source, and cross-platform tool for creating and maintaining portable virtual environments for software development. Its authors claim it is the easiest and fastest way to create a virtual environment.
It works with providers such as VirtualBox, VMWare, Docker, and AWS. Industry-standard provisioning tools like shell scripts, Chef, and Puppet can automate the installation and configuration of software on the virtual machine.
VirtualBox is the default provider because it is free, most accessible, and provides the lowest friction for users to get started with Vagrant.
Getting Started
For brevity, this guide will use Vagrant with VirtualBox. Download and install the following:
You can verify the installation by running vboxmanage
and vagrant
commands in a new console session:
$ vboxmanage -v5.2.34r133893$ vagrant -vVagrant 2.2.6
Vagrantfile
Vagrantfile
(actual filename) is the specific file loaded by Vagrant that describes the type of the machine required by a project, and how to configure and provision these machines. It must be at the root directory of the project.
The configurations are defined using Ruby, but the knowledge of the programming language is not required because it is mostly simple variable assignments.
To initialize a new Vagrant environment, run vagrant init
on a terminal:
$ mkdir my-project$ cd my-project$ vagrant init
You can also run the vagrant init
in the directory root of an existing project.
Below is a basic Vagrantfile:
Vagrant.configure("2") do |config| config.vm.box = "base"end
The Vagrantfile should be committed to version control with your project files if you use one. That way, collaborators can benefit from Vagrant with less effort.
Boxes
Instead of creating a virtual machine from scratch, Vagrant uses base images, called boxes, that you can quickly clone to create a virtual machine. It is always the first step to specify the box to use in the Vagrantfile.
You can always browse HashiCorp's online catalog for the box suitable for your project.
A box may take around hundreds to thousands of MBs of disk space. Free up some hard disk space and make to have a good internet connection when pulling a new box.
Boxes are stored globally for the current user. These are never modified because projects clones a box then uses the copy. So when you have two projects using the ubuntu/bionic64
box, files added in one machine won't affect the other.
To pull a box from the catalog, run vagrant box add <box name>
. Box names have a <username>/<boxname>
format. These are downloaded once and re-used by multiple Vagrant environments.
$ vagrant box add ubuntu/bionic64
If the box for a Vagrant environment is not yet available locally, Vagrant pulls it automatically after running vagrant up
.
Using a Box
To use a box in a project, specify it in the Vagrantfile. For example, if we want to use Ubuntu 18.04:
Vagrant.configure("2") do |config| config.vm.box = "ubuntu/bionic64"end
You can also specify the version if you need a previous one:
Vagrant.configure("2") do |config| config.vm.box = "ubuntu/bionic64" config.vm.box_version = "20200629.0.0"end
Running the virtual machine and SSH access
From a terminal, run the following command at the root directory to start the vagrant environment:
$ vagrant up
The terminal will print the status while the machine is booting up. Once done, the virtual machine will run in the background without any UI (aka headless mode).
One way to interact with the virtual machine is through a Secure Shell (SSH) session, which you can do by running:
$ vagrant ssh
To end the SSH session, press CTRL + D
or run exit
or logout
:
vagrant@bionic64:~$ exitlogoutConnection to 127.0.0.1 closed.
Remember that a running virtual machine consumes resources of the host machine. After you're done using it or don't need it for a while, run vagrant halt
to shut it down.
If you won't use the virtual machine anymore, it's best to run vagrant destroy
on the host machine to free up the consumed disk space.
Synced Folders
Vagrant allows linking of your project directory to the virtual machine, called synced folders. It gives the tools or applications inside the virtual machine access to your project files while you edit them in the host at the same time. These are set up automatically during vagrant up
and vagrant reload
.
Modifying the files in the synced folder from either side (guest or host) will write, read, or delete the same data. So, if you remove a file from inside the virtual machine, it will also disappear from the host machine.
By default, Vagrant shares the project directory to the /vagrant
directory in the guest machine.
To change the shared directory and mount point in the virtual machine, add the following to your Vagrantfile
:
Vagrant.configure("2") do |config| # other config config.vm.synced_folder "project/dir", "/path/in/vm"end
If the first parameter is a relative path, it is relative to the project root. The second parameter must be an absolute path of the project within the guest machine. Before mounting, it creates the folder in the VM if it doesn't exist.
You can also disable synced folders by setting the disabled
option:
Vagrant.configure("2") do |config| # other config config.vm.synced_folder "project/dir", "/path/in/vm", disabled: trueend
Synced folder disk performance
The disk I/O of the default synced folders in VirtualBox is slow compared to the host machine, which is negligible with general use. However, if disk writes frequently occur within the shared folder, you'll see a significant impact.
How slow? It will take a simple page of a Symfony web app in debug mode more than 1s to load in the default sync type and about 300ms in NFS, while only about 80ms when served locally in the host. The framework uses caching, so it frequently writes data in its local var/cache
directory.
Activities like package installation, compiling, and caching that do extensive writes on the disk are affected by this bottleneck.
Optimize by either disabling syncing of directories where heavy write occurs (e.g., node_modules, vendor, log, cache) or use an alternative type of syncing like NFS (macOS/Linux) or SMB (Windows/macOS).
Provisioning
Vagrant has built-in support for automated provisioning, enabling the creation of ready-to-use virtual machines easy and repeatable.
This is useful for creating environments like application servers that require setting up and running multiple services such as web servers, proxy servers, DBMS, etc. so other people that will use it won't need to go through the hassle of setting them up.
Now, let's try setting up an Apache webserver. Create a subdirectory html
in the project root, then add an HTML file named index.html
:
<!DOCTYPE html><html> <body> <h1>Hello world, from Vagrant!</h1> </body></html>
Then make a simple provisioning shell script named bootstrap.sh
and save it in the same directory as your Vagrantfile:
#!/user/bin/envapt-get updateapt-get install -y apache2if ! [ -L /var/www ]; then rm -rf /var/www ln -fs /vagrant /var/wwwfi
A shell script contains a sequence of commands that you usually run in a terminal prompt. The above script is a set of Linux commands that installs an Apache web server and symlinks /vagrant
to /var/www
, the base directory for webserver files.
Next, we update the Vagrantfile to use this script for provisioning:
Vagrant.configure("2") do |config| config.vm.box = "ubuntu/bionic64" config.vm.provision :shell, path: "bootstrap.sh"end
The provision
setting tells Vagrant to use the shell
provisioner to setup the machine with the bootstrap.sh
file. The file path is relative to the project root.
After that, run vagrant up
to boot your machine and Vagrant will automatically provision it. If the machine is already running, run vagrant reload --provision
to reboot and run the provisioner.
If the machine is created but stopped, run vagrant up --provision
instead to boot and provision it since Vagrant does the provisioning only on the first vagrant up
.
Networking
Networking configurations in Vagrant allows your virtual machines to communicate with other virtual machines (given they have the same provider) or with the host machine.
There are three ways to achieve this:
The items above links to the Vagrant docs for the complete details. For this guide, let's try port forwarding.
Port forwarding allows you to access a specific port on the guest machine via a port on the host machine. So when you access that port on the host machine, all network traffic is forwarded to the specified port on the guest machine.
To access the webserver (port 80) on the guest machine from the host machine, we can forward all network traffic from port 8080:
Vagrant.configure("2") do |config| # other config... config.vm.network :forwarded_port, guest: 80, host: 8080end
Run vagrant reload
or vagrant up
to load the changes. Once the machine has loaded, open http://localhost:8080 in your browser to see the webpage hosted from the virtual machine.
Conclusion
Explore more on what you can do with Vagrant by reading the online documentation.
For provisioning, you need to have a working knowledge of manipulating and configuring the platform of your choice. I recommend checking out the ArchWiki if you want to learn linux configuration.
It may feel unnecessary or intimidating at first, but learning to use Vagrant is worth it in the long run because it is efficient and saves a lot of everyone's time and effort.