What is version control?
Version control systems, abbreviated as VCS, enable teams to track changes to code while keeping communication between developers and teams fast and clear.
They are sometimes also referred to as source code management, revision control, or source control.
Version control systems track changes to code, files, and directories within code repositories to create a historical record of how they evolved over time. Changes to code are logged in a database, either on a server or local machine, providing developers with a clean, transparent, and accessible system of record for their code repositories.
In simpler terms, version control systems create a collection of code “snapshots.” At any point during their work, developers can take a snapshot of their code and add it to the historical record.
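To make the snapshot idea concrete, here is a minimal Python sketch of a history that stores a full copy of the tracked files each time a snapshot is taken. The `SnapshotStore` class and its method names are hypothetical, invented for this illustration; real version control systems store history far more efficiently.

```python
import copy


class SnapshotStore:
    """Toy history: each snapshot is a full copy of the tracked files."""

    def __init__(self):
        self.history = []  # list of (message, files) tuples, oldest first

    def take_snapshot(self, files, message):
        """Record a copy of `files` (a dict of path -> text) and return its index."""
        self.history.append((message, copy.deepcopy(files)))
        return len(self.history) - 1

    def restore(self, index):
        """Return a copy of the files as they were at snapshot `index`."""
        _, files = self.history[index]
        return copy.deepcopy(files)


store = SnapshotStore()
working = {"app.py": "print('hello')\n"}
first = store.take_snapshot(working, "Initial version")

working["app.py"] = "print('hello, world')\n"
second = store.take_snapshot(working, "Friendlier greeting")

# Later edits to the working copy do not alter earlier snapshots.
old_files = store.restore(first)
```

Because each snapshot is copied when it is taken, restoring the first one returns the original greeting even though the working copy has since changed.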
How version control works
Version control systems track code, which creates a record of every change, and compute deltas, which highlight the differences between file versions. With the full history available, teams can also merge, which combines new changes with existing code, and revert, which takes them back to a previous version.
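As a rough illustration of a delta, Python's standard `difflib` module can compute the line-level differences between two versions of a file. The file name and contents below are made up for the example, and this is only a sketch of the idea, not how any particular version control system computes its diffs.

```python
import difflib

old_version = ["def greet():\n", "    print('hello')\n"]
new_version = ["def greet():\n", "    print('hello, world')\n"]

# unified_diff yields the lines of a delta between two versions of a file.
delta = list(
    difflib.unified_diff(
        old_version, new_version, fromfile="greet.py@v1", tofile="greet.py@v2"
    )
)
print("".join(delta))
```

Lines prefixed with `-` were removed and lines prefixed with `+` were added; unchanged lines appear as context.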
When teams use version control, they can better organize and coordinate the changes everyone makes to the codebase. Every change is visible to the entire team, and the version control system becomes the shared place to track and exchange work.
Most teams today use distributed version control, which gives every developer a local copy of the repository and its history, and usually keeps a shared copy on a server where changes are exchanged. Whenever a developer needs to see the latest changes to the codebase, she can pull them from the server into her local repository. If she makes changes to the codebase on her local machine, she can also push those changes to the server, making them accessible to everyone else on her team.
For example, most engineering teams use Git, a version control system, and GitHub, a repository hosting service. Git keeps a historical record of all changes made to the codebase, while GitHub provides an easy way to sync and share those changes.
Why use version control?
Software development can be a complex process, especially when large teams are involved. Using a version control system is important for collaboration and allows for distributed and asynchronous communication.
According to DORA, Google Cloud's DevOps Research and Assessment team, using version control is a key technical capability of high-performing engineering teams. It enables higher-quality code, improved visibility, and faster delivery.
Version control systems can make it safer for teams to make changes to their codebase and collaborate with team members.
Version control serves as a single source of truth, helping teams avoid duplicating code changes or accidentally using outdated versions. Every developer works from the same versioning system, compiling all their edits into a single, team-wide record of changes.
Tracking changes with a version control system also provides engineering teams with the ability to resolve code conflicts. They can easily compare changes and identify any conflicts when working concurrently, minimizing the likelihood of breaking changes or unexpected behavior from incompatible changes.
If a code change causes an outage or production failure, teams can quickly revert and identify problematic changes. When debugging code issues, they can also use their version control system to easily and accurately recreate previous versions of their codebase. This allows them to pinpoint specific code changes causing bugs or outages.
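One way to pinpoint a problematic change is to binary-search the history, testing a recreated version at each step. The sketch below assumes an ordered list of commits where the first is known to be good and the last is known to be bad; the commit IDs and the `is_bad` check are hypothetical stand-ins for checking out a version and running a test. Git's `bisect` command applies the same idea.

```python
def first_bad_commit(commits, is_bad):
    """Binary-search an ordered history for the first commit where is_bad is True.

    Assumes commits[0] is good, commits[-1] is bad, and that once a commit
    is bad every later commit is bad too.
    """
    low, high = 0, len(commits) - 1  # low is known good, high is known bad
    while high - low > 1:
        mid = (low + high) // 2
        if is_bad(commits[mid]):
            high = mid
        else:
            low = mid
    return commits[high]


# Hypothetical history: commit IDs are illustrative, oldest first.
history = ["a1", "b2", "c3", "d4", "e5", "f6", "g7"]
broken_since = "e5"
checked = []


def is_bad(commit):
    """Stand-in for 'recreate this version and run the failing test'."""
    checked.append(commit)
    return history.index(commit) >= history.index(broken_since)


culprit = first_bad_commit(history, is_bad)
```

With seven commits, the search only needs to test two versions to find the culprit, which is why this approach scales well to long histories.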
Version control systems also improve knowledge transfer across the organization and collaboration among team members.
By using code reviews and approvals, team members can read each other’s code before it’s deployed. This provides every team member with deeper insight into how the codebase is changing over time.
Version control is especially useful for allowing teams to work in distributed and remote workplaces. Engineers can share their changes by requesting code reviews, making their branches visible with a code hosting tool, and staging their work for others to test.
Version control systems enable rapid collaboration and iteration. Developers can more easily create branches and merge their changes to the main branch. They can make changes to the codebase without blocking other team members and can resolve merge conflicts with fewer steps and less risk.
As a result, version control systems also enable faster and safer experimentation for development teams. They can ‘sandbox’ their changes, confining experiments to a new branch without affecting the main branch. When teams can create as many branches as they want, they unlock the freedom to experiment without fear of causing irreparable damage to their codebase.
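The sandboxing effect of a branch can be sketched with a toy model in which every branch has its own list of snapshots. The `branches` dictionary and the helper functions are hypothetical and exist only for this illustration; real systems such as Git implement branches as lightweight pointers rather than full copies.

```python
import copy

# Toy model: each branch name maps to its own list of snapshots.
branches = {"main": [{"config.txt": "timeout=30\n"}]}


def create_branch(name, source="main"):
    """Start a new branch from a copy of the source branch's history."""
    branches[name] = copy.deepcopy(branches[source])


def commit(branch, files):
    """Append a new snapshot to one branch only."""
    branches[branch].append(copy.deepcopy(files))


create_branch("experiment")
commit("experiment", {"config.txt": "timeout=5\n"})  # risky change

main_tip = branches["main"][-1]
experiment_tip = branches["experiment"][-1]

# Abandoning the experiment only requires deleting the branch.
del branches["experiment"]
```

The main branch still holds its original configuration after the experiment, and discarding the experiment leaves no trace on it.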
Most importantly, version control systems enable faster adoption of key DevOps capabilities and practices. Code repositories contain information about tools and processes throughout the software development life cycle—from infrastructure and configuration, to deployment and documentation.
Version control systems help teams keep track of changes to their DevOps workflows, including development environments, continuous integration and deployment pipelines, code dependencies, and more. By integrating DevOps tools with their version control system, engineering teams can improve their overall DevOps performance.
Features to look for in version control systems
Most version control systems share a set of common features, such as concurrent development, automation, team collaboration, tracked changes, and disaster recovery. There are a few key differences between the most popular version control systems, including syncing, licensing, and integrations.
Distributed vs. centralized version control
One of the most significant choices teams face is between centralized version control systems (CVCS) and distributed version control systems (DVCS).
Distributed version control tracks changes on a local machine, where each developer has their own copy of the repository. They can then push or pull changes from a code hosting service, like GitHub, to distribute changes to everyone on a team. Every repository clone is a complete backup, containing all the information needed to recreate the repository and its entire history if another copy is lost.
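The push, pull, and "every clone is a backup" behavior can be sketched with a toy model. The `Repo` class below is hypothetical, and it makes a strong simplifying assumption: history is a plain ordered list and neither side has diverged, so no merging is needed.

```python
class Repo:
    """Toy repository: history is an ordered list of commit messages."""

    def __init__(self, commits=None):
        self.commits = list(commits or [])

    def clone(self):
        """A clone copies the entire history, not just the latest state."""
        return Repo(self.commits)

    def commit(self, message):
        self.commits.append(message)

    def push(self, remote):
        """Send commits the remote lacks (assumes the remote has not diverged)."""
        if remote.commits != self.commits[: len(remote.commits)]:
            raise RuntimeError("remote has diverged; pull first")
        remote.commits = list(self.commits)

    def pull(self, remote):
        """Fetch commits from the remote (assumes no local divergence)."""
        if self.commits != remote.commits[: len(self.commits)]:
            raise RuntimeError("local history has diverged; merge needed")
        self.commits = list(remote.commits)


hosted = Repo(["Initial commit"])
alice = hosted.clone()
bob = hosted.clone()

alice.commit("Add login page")  # works offline; no server needed
alice.push(hosted)
bob.pull(hosted)

# If the hosted copy were lost, any clone could recreate it.
recovered = bob.clone()
```

After the push and pull, all three copies hold the same history, and a fresh clone of any of them is a full replacement for the hosted repository.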
Distributed version control is the most popular approach among engineering teams. Developers can get their work done faster because they don’t need to connect to a central server at all times.
Centralized version control requires all developers to work against a single shared repository on a remote server. Although it can streamline collaboration, the central server is a single point of failure during an outage, and developers need a server connection to commit changes or access history.
Open source vs. proprietary license
Many popular version control systems are released and maintained as open source software. For example, Git is licensed under the GNU General Public License version 2.0, which guarantees the freedom to share and change free software. It is free to use, share, or modify.
Open source software is often less expensive to use because it is maintained by the community. Open source version control is often less risky because it does not depend on a specific vendor to update, maintain, or patch it.
Some version control systems, such as Perforce Helix Core, use a closed source license, which limits how you can use or modify their software. Proprietary version control systems make teams dependent on specific vendors, but they can often guarantee a certain level of support and hands-on assistance, which is particularly helpful for large enterprises or companies in certain industries.
Linear vs. non-linear development
Non-linear development enables teams to work on many parallel branches, even thousands, across different systems. They can create and merge branches while navigating backward and forward through history to move changes around within their version control system. Most popular version control systems allow teams to adopt non-linear development workflows.
In version control systems with linear history, all commits come one after another, and there are no merge commits joining independent lines of history. This keeps a clean history of changes, but can limit collaboration.
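The difference between the two shapes of history can be sketched by modeling commits as a graph in which each commit lists its parents. The commit IDs and the `is_linear` helper are hypothetical, chosen only to illustrate the distinction.

```python
# Each commit lists its parent commit IDs. IDs are illustrative.
linear_history = {
    "c1": [],
    "c2": ["c1"],
    "c3": ["c2"],
}

branching_history = {
    "c1": [],
    "c2": ["c1"],        # work on main
    "f1": ["c1"],        # work on a feature branch, in parallel with c2
    "m1": ["c2", "f1"],  # merge commit joining both lines of work
}


def is_linear(history):
    """True if no commit has several parents and no commit has several children."""
    child_counts = {}
    for parents in history.values():
        if len(parents) > 1:
            return False
        for parent in parents:
            child_counts[parent] = child_counts.get(parent, 0) + 1
    return all(count <= 1 for count in child_counts.values())
```

In the second history, `c2` and `f1` were developed independently from the same starting point and later joined by a merge commit, which is exactly what a strictly linear history rules out.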
Atomic vs. non-atomic operations
When a version control system uses atomic operations, all the changes in a commit are applied as a single operation. Each commit either records all of its changes if it succeeds or nothing at all if it fails.
Version control systems without atomic operations can execute partial operations. For example, when using CVS, it’s possible for a commit to record changes to a few files, but fail on others.
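The contrast can be sketched in a few lines of Python. Both functions below are hypothetical toy models operating on a dictionary that stands in for a repository, and a `None` value stands in for any per-file failure; this is not how any specific system implements commits.

```python
class CommitError(Exception):
    pass


def atomic_commit(repo, changes):
    """Apply all changes or none: work on a copy, then swap it in."""
    staged = dict(repo)
    for path, content in changes.items():
        if content is None:  # stand-in for any per-file failure
            raise CommitError(f"cannot commit {path}")
        staged[path] = content
    repo.clear()
    repo.update(staged)


def non_atomic_commit(repo, changes):
    """Write files one by one; a failure leaves earlier writes in place."""
    for path, content in changes.items():
        if content is None:
            raise CommitError(f"cannot commit {path}")
        repo[path] = content


changes = {"a.txt": "new a", "b.txt": None, "c.txt": "new c"}

safe_repo = {"a.txt": "old a"}
try:
    atomic_commit(safe_repo, changes)
except CommitError:
    pass

partial_repo = {"a.txt": "old a"}
try:
    non_atomic_commit(partial_repo, changes)
except CommitError:
    pass
```

After the failed commit, the atomic repository is untouched, while the non-atomic one has recorded the change to `a.txt` but not the change to `c.txt`.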
Baselines, labels and tags
Engineering teams should consider how each version control system creates baselines, labels, and tags: distinct snapshots of the codebase that help teams navigate different versions. Some version control systems, such as SVN, require additional workflows to create labels and tags, while others, such as Git, provide rich functionality out of the box.
Concurrent development
Most version control systems allow developers to work concurrently. Multiple developers can edit the same file at the same time. When they merge their changes, they will need to fix any conflicts.
Some version control systems lock files so only one developer can work on them at a time. Although this can avoid merge conflicts, it can also decrease team velocity.
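A simplified three-way merge shows how concurrent edits are combined and where conflicts come from. The function below is a sketch under a strong assumption: all three versions of the file have the same number of lines, so they can be compared position by position. Real merge algorithms handle inserted and deleted lines as well. The file contents and names are made up for the example.

```python
def three_way_merge(base, ours, theirs):
    """Merge two edits of `base` line by line.

    Simplifying assumption: all three versions have the same number of
    lines, so lines can be compared position by position.
    Returns (merged_lines, conflicting_line_numbers).
    """
    merged, conflicts = [], []
    for number, (b, o, t) in enumerate(zip(base, ours, theirs), start=1):
        if o == t:      # both sides agree (or neither changed the line)
            merged.append(o)
        elif o == b:    # only "theirs" changed the line
            merged.append(t)
        elif t == b:    # only "ours" changed the line
            merged.append(o)
        else:           # both changed the same line differently
            merged.append(b)
            conflicts.append(number)
    return merged, conflicts


base = ["title = Draft", "author = TBD", "status = open"]
alice = ["title = Final", "author = TBD", "status = open"]
bob = ["title = Draft", "author = Bob", "status = open"]

# Alice and Bob edited different lines, so the merge is clean.
merged, conflicts = three_way_merge(base, alice, bob)

# Carol edited the same line as Alice, which produces a conflict.
carol = ["title = Review", "author = TBD", "status = open"]
_, clash = three_way_merge(base, alice, carol)
```

File locking avoids the conflicting case entirely by never letting two people edit the same file at once, at the cost of making the clean case wait as well.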
Speed and performance
The most popular version control systems are fast and lightweight enough for most development teams.
At extreme scales, teams may notice differences in speed and performance between version control systems. Facebook optimized Mercurial to work with their massive codebase after noticing Git did not scale well for their team. Game developers, who often work with large file sizes and assets, may also require specialized version control systems, such as Perforce Helix Core.
Integrations
One of the most important features to consider when comparing version control systems is their support for hosting tools and integrations.
Code hosting tools provide a better experience if teams use a supported version control system. It’s easier to use Git, or even SVN, with GitHub than it is to use Mercurial.
Moreover, version control systems often integrate with other tools to streamline development workflows. For example, continuous integration and deployment tools often connect into version control systems or repository hosting services to automatically build, test, and deploy changes.
Top version control systems
The most popular version control systems are Git, Subversion, CVS, Mercurial, and Perforce Helix Core.
Git
Git is a free and open source distributed version control system. It is fast, efficient, and cross-platform, supported by a strong ecosystem of GUIs and CLIs to extend its functionality. Git supports non-linear development with commands for creating branches and rewriting history.
Unlike some VCSs that only track deltas between files, Git records a snapshot of the whole project at each commit. Files that have not changed are not stored again; the new snapshot simply points to the content that is already stored. When using Git, every clone of a code repository is a complete backup; it contains all the necessary information to see how the repository has changed over time from the moment it was created. As a result, Git is a resilient and robust version control system.
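In practice, Git identifies file contents by a hash of the content, so identical content is stored only once no matter how many snapshots include it. The sketch below reproduces the object ID Git assigns to a file's content (a "blob") in repositories that use the default SHA-1 object format. The `objects` dictionary and the `snapshot` function are hypothetical stand-ins for Git's object store, not its actual implementation.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """SHA-1 object ID of a blob: hash of a 'blob <size>' header, a NUL byte, and the content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Toy object store keyed by blob ID (the dict itself is illustrative).
objects = {}


def snapshot(files):
    """Store each file's content once and return a path -> blob ID mapping."""
    tree = {}
    for path, content in files.items():
        blob_id = git_blob_id(content)
        objects.setdefault(blob_id, content)
        tree[path] = blob_id
    return tree


v1 = snapshot({"README": b"test content\n", "main.py": b"print(1)\n"})
v2 = snapshot({"README": b"test content\n", "main.py": b"print(2)\n"})
```

Both snapshots list every file, but the unchanged `README` maps to the same ID in each, so the store holds three objects rather than four.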
Git is the most widely used version control system in the world; according to the Stack Overflow Developer Survey, more than 93% of developers use Git. It is popular across industries and company sizes, including open source projects, startups, and large enterprises.
Subversion
Apache Subversion, known as SVN, is a free and open source centralized version control system developed by the Apache Software Foundation and released under the Apache License.
Created in the early 2000s, SVN was the successor to the widely used Concurrent Versions System (CVS) that dominated version control for many years. The most common complaints about SVN are its complicated merging model and its tedious branching process, which creates branches as directories inside a repository.
Subversion uses a single codeline with no dedicated branching command, which some teams consider better suited to massive projects. Subversion is still used in parts of the open source community, but it lacks widespread adoption compared to Git.
CVS
Concurrent Versions System, abbreviated as CVS, is a free and open source centralized version control system released during the 1990s. Similar to SVN, CVS belongs to the old guard of version control systems and is not widely used today, with just a few risk-averse organizations using it, such as medical device software companies.
Mercurial
Similar to Git, Mercurial is a free and open source distributed version control system. Unlike Git, Mercurial treats history as immutable by default: most commands that rewrite history are provided by extensions that must be enabled explicitly.
Mercurial is known for its performance, scalability, and powerful CLI. Facebook adopted Mercurial in 2013, and Google has used it as a client for their proprietary Piper monorepo.
Overall, Mercurial is far less popular than Git. As a result, it lacks support on many repository hosting services and has a limited ecosystem of tools, add-ons, and integrations. In 2020, Bitbucket removed Mercurial support from its hosting service, having noted that less than 1% of new projects used it.
Perforce Helix Core
Perforce Helix Core is a version control system that supports both client-server and distributed workflows. Unlike the other version control systems listed here, it is proprietary rather than open source. Helix Core is also not free for larger teams: the free tier includes only up to 5 users and 20 workspaces.
However, Helix Core is scalable and fast, and it is used mostly for large-scale development projects. It is common at game development companies, including AAA game makers, that manage large projects.