What is Version Control?

If you're not familiar with version control (also often called revision control or source control), it's time you became friends. Version control is one of those tools that software developers (including game developers) rely on because it makes them far more productive.

You should learn how to use version control (in one flavor or another) and use it for everything you do.

In this first tutorial, we'll look at what version control has to offer to you, outline some basic version control concepts, and we'll take a peek at the version control options that are out there. The rest of these tutorials will teach you how to use a particular type of version control called Mercurial, which is one of many options. I'll explain why I chose to work with Mercurial here as well.

The Goals of Version Control

I'm not exaggerating when I tell you that version control will make you more productive many times over. In software development, few tools can make this kind of claim, but this is one of them. It's really one of those things that you need to learn and use everywhere. If you're not using it yet at your software job, or for your own projects at home (homework projects, hobby projects, or that game you're making on your evenings and weekends), it's time to start.

The goal of version control is actually really simple. Here's what version control seeks to do for you:

1. Keep a record of all of the changes you make to your code (a "revision history").
2. Make it easy to have multiple people edit the code simultaneously.

To see the advantages, think about how you manage these two tasks without it…

Revision History

Let's start with that revision history thing. Whenever you get to a point in time where you've got something you might want to go back to, you make a backup by copying the entire source code directory and slapping a name on it like AwesomeGame-2Jan2013. A week later, you're doing it again: AwesomeGame-9Jan2013. Then your friend wants to take a look at your game. So you zip up a copy of your source (AwesomeGame-ForDave, then a few weeks later AwesomeGame2-ForDave). You've got all of these copies floating around. Nine copies on your desktop, plus two on a flash drive, plus one that you emailed to yourself for good measure in case your hard drive crashes.

And of course, even that's not enough. On January 14th, you realize you've been heading in the wrong direction for the last week. You want to throw it all away and go back to where you were on the 7th, but you've only got your copies from the 2nd and the 9th. You still have to patch things together to get something that resembles what you had on the 7th. It's a mess.

Collaborating

Let's look at that second goal: sharing with others. Without version control, how do you do this? Let's say you've been working on a game for a couple of weeks, and Dave (the same guy as in the previous example) decides he wants to help you build it. What do you do? Well, perhaps you find a shared server somewhere, where you can dump files. Or perhaps you just email him the latest version of your code. (Either way, things aren't going to be much different.)

He says he'll add in a loading screen and create a few new 3D models. You plan on fixing some bugs in the collision detection code. You both go your own ways and create epic amounts of beautiful code.

The next day, having completed your respective missions, you go to put your copies together so that you both have all of the awesomeness for the next round of changes. You take his new 3D models and dump them into the appropriate content directory. You take his new LoadingScreen.cs file and dump it into your code in the appropriate place. So far so good.

But now things start to get a little sticky. To make the loading screen appear, Dave also tweaked the state machine code that keeps track of what screen should be drawn and when. This isn't a new file. It's changing an old file. You do a quick sanity check; you decide that you didn't make any changes to that file, so you take Dave's version and just overwrite your copy. Close call, but that wasn't the end of the world.

"Oh, wait," says Dave. "I also added my name to the credits." This time, though, the sanity check fails. You say, "Hmm… I added a bunch of comments into that file to explain some of the ugly code I had put in there."

You've encountered a wild CONFLICT! (Cue Pokemon battle music.)

(Ah crap. Now I've got Pokemon battle music and visions of Super Smash Bros. stuck in my head….)

You're now faced with a choice: throw away either your comments or Dave's name in the credits and redo that work a second time, or painstakingly open both versions of the file and manually merge the changes together.

Either way, you come away from that saying, "Before either of us makes changes to a file, let's tell the other person so that we don't have to manually merge things together like this." And you build your game using those rules, and you both create your own versions of helper code so that you can tweak it without having to step on the other person's toes.

The result is strong separation between "Dave's Code" and "My Code", and a whole lot of cautious movement forward, worried that you'll somehow mess each other up. This isn't the situation that you want to be in.
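The "sanity check" you and Dave ran by hand in the story above boils down to a small decision rule, and it's the kind of thing a version control system automates for you. Here's a rough Python sketch of that rule for a single file. The function name merge_file and the sample strings are made up for illustration, and this compares whole files only; real tools are smarter and can often combine changes to different parts of the same file.

```python
def merge_file(base, mine, theirs):
    """Decide what to do with one file, given three versions of its contents.

    base   -- the version both people started from
    mine   -- my edited version
    theirs -- the other person's edited version
    Returns (result, status); result is None when there is a conflict.
    """
    if mine == theirs:
        return mine, "identical"       # same edits (or no edits) on both sides
    if mine == base:
        return theirs, "took theirs"   # only the other person changed it
    if theirs == base:
        return mine, "kept mine"       # only I changed it
    return None, "CONFLICT"            # both changed it, in different ways


# The state machine file: only Dave touched it, so his version wins.
print(merge_file("states v1", "states v1", "states v1 + loading screen"))
# -> ('states v1 + loading screen', 'took theirs')

# The credits file: both of you touched it, so somebody has to merge by hand.
print(merge_file("credits v1", "credits v1 + comments", "credits v1 + Dave"))
# -> (None, 'CONFLICT')
```

Notice that the rule only works because there's a known common starting point (base). Keeping track of that starting point for every file is exactly the bookkeeping you don't want to do in your head.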

Either Reason is Good Enough

Hopefully you can see from these examples that either reason should be good enough on its own. You don't have to be in a situation that requires both keeping track of a version history and working with others to justify version control.

It's worth it if you're working alone. It's worth it if you don't think you need version history. (Oh, by the way, you always eventually need version history.)

There are Other Reasons Too

At any rate, I hope I've convinced you now that version control will be worth it to you. These two reasons should be enough to convince almost anyone. But with it come a lot of subtle secondary reasons that provide their own benefits. Here are a few examples:

  • Being able to quickly locate the state the code was in when you released it to production.
  • Off-site (or at least, off-box) backups.
  • Figuring out who wrote a specific block of code. (Warning: finding out who to "blame" for a broken line of code is not always productive.)
  • Allowing you to make bug fixes in one place, while new features are being added elsewhere, then merging the bug fixes back into your new development line without needing to write it a second time.

The Basics of Version Control

Version control is a fairly simple concept to understand. In this section, I'll outline the basics that are shared across all of the various incarnations of version control systems (there are lots of them).

To start with, you have data in the form of files. (Technically, this could be something besides files, but with respect to what we're trying to accomplish, we'll assume we're working with files.)

You make changes to these files over time. This may include things like edits to a file, as well as adding and removing files. A group of changes across multiple files is logically lumped together as a change set.

A history of these change sets is kept in a special place called a repository.
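If it helps to see those two ideas as data, here's a toy Python model. To be clear, this is only a sketch for building intuition, not how Mercurial (or any real system) stores things. The class names, the field names, and the Game.cs file are all invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeSet:
    """One logical group of changes across one or more files."""
    message: str
    edited: dict = field(default_factory=dict)   # path -> new contents
    added: dict = field(default_factory=dict)    # path -> contents of a new file
    removed: set = field(default_factory=set)    # paths that were deleted


@dataclass
class Repository:
    """An ordered history of change sets, oldest first."""
    history: list = field(default_factory=list)

    def files_at(self, revision):
        """Rebuild the files as they stood after the first `revision` change sets."""
        files = {}
        for change in self.history[:revision]:
            files.update(change.added)
            files.update(change.edited)
            for path in change.removed:
                files.pop(path, None)
        return files


repo = Repository()
repo.history.append(ChangeSet("Start the game", added={"Game.cs": "v1"}))
repo.history.append(ChangeSet(
    "Add a loading screen",
    added={"LoadingScreen.cs": "v1"},
    edited={"Game.cs": "v2"},      # one change set can touch several files
))

print(repo.files_at(1))  # -> {'Game.cs': 'v1'}
print(repo.files_at(2))  # -> {'Game.cs': 'v2', 'LoadingScreen.cs': 'v1'}
```

The point to take away is that because the repository keeps every change set in order, it can reconstruct the files as they were at any point in the history, not just the latest version.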

We'll talk about more of this later, when we start actually doing this ourselves, but there are a lot of things you can do with a repository.

For instance, you can check out a repository, which will pull down all of the latest versions of the files there to somewhere locally. This set of files outside of the repository that you can edit is called a working copy. (You can also check out a version of the files from a point in time earlier than the latest, though that is rarer.)

You can then make some changes within your working copy. When you've got something worth keeping, small or large, you can commit the change to the repository. Committing changes is like telling the version control system to take a snapshot of the current code and save it.

I said "small or large", but as a general rule, you should prefer smaller to larger. You don't necessarily want to commit code that is broken or unstable, but when you've got things in a reasonable state, you'll want to get a snapshot of it. For sure, you want to be committing multiple times a day, not every few weeks.

Of course, if you do something dumb and decide the whole thing was a waste, you can push the revert button and go back to your last snapshot. There will be no need to manually undo all of your changes.

And that actually summarizes the whole process in a nutshell: Make some changes, commit them if you like them, revert if you don't, and repeat!
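To make that loop concrete, here's a minimal Python sketch of the check out, commit, and revert cycle. It's a toy that stores a full copy of every snapshot in memory, which real version control systems don't need to do, and the ToyRepository class and Player.cs file are made up for this example.

```python
import copy


class ToyRepository:
    """Keeps full snapshots; real systems store history far more efficiently."""

    def __init__(self):
        self.snapshots = []          # list of (message, files) pairs, oldest first

    def commit(self, working_copy, message):
        # Deep-copy so later edits to the working copy can't alter history.
        self.snapshots.append((message, copy.deepcopy(working_copy)))

    def checkout(self, revision=-1):
        # Default is the latest snapshot. Returns a fresh, editable working copy.
        return copy.deepcopy(self.snapshots[revision][1])

    def revert(self):
        # Throw away uncommitted edits by going back to the last snapshot.
        return self.checkout()


repo = ToyRepository()

work = {"Player.cs": "walk"}
repo.commit(work, "Player can walk")          # make a change, like it, commit it

work["Player.cs"] = "walk, jump"
repo.commit(work, "Player can jump")          # and again

work["Player.cs"] = "walk, jump, half-finished flying"   # this one was a mistake
work = repo.revert()                          # so go back to the last snapshot

print(work)              # -> {'Player.cs': 'walk, jump'}
print(repo.checkout(0))  # -> {'Player.cs': 'walk'}
```

The last line is the "earlier point in time" checkout mentioned above: the first snapshot is still there, untouched, long after you've moved on.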

Flavors of Version Control

There are lots and lots of version control systems that are out there, and new ones coming out all the time. Some are general purpose, and some are limited and specialized. (Even Wikipedia has its own built-in version control system.)

I can almost guarantee you that if you change jobs from one company to another, you'll end up using a different version control system than you were using before. In the three jobs I've had, I've used three different version control systems, plus a couple others on the side in my spare time.

Centralized vs. Distributed Version Control

The biggest distinguishing feature of modern version control systems is whether they are centralized or distributed. In the past, everything was centralized. You had exactly one central repository that everyone worked out of.

Not long ago, a new form of version control was invented: distributed version control. Under a distributed design, there can be lots of repositories that all have equal weight (from the version control system's perspective). In practice, there's usually still one of these that is considered the "main" or "central" repository. But you're not boxed into just one repository.

This, by the way, is the same shift in paradigms that happens when you switch from a client-server architecture (everybody talks to the same centralized server) to a peer-to-peer architecture (everybody is equal, and each peer can talk to any of the other peers at any time).
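Here's a very rough Python sketch of what "equal weight" means in practice. Each repository is modeled as nothing more than a set of change set names, and pulling copies over whatever the other side has that you don't. The Repo class and the change set names are invented for illustration; real distributed systems track much more than this (ordering and merges, for a start).

```python
class Repo:
    """A toy repository: just a name and the set of change sets it holds."""

    def __init__(self, name):
        self.name = name
        self.changesets = set()

    def commit(self, changeset_id):
        self.changesets.add(changeset_id)

    def pull(self, other):
        """Copy over any change sets the other repository has that we lack."""
        new = other.changesets - self.changesets
        self.changesets |= new
        return new


mine = Repo("mine")
daves = Repo("daves")
main = Repo("main")        # "main" only by convention; it's the same class

mine.commit("fix-collision")
daves.commit("loading-screen")

# Dave and I can swap work directly, without going through "main" at all.
print(sorted(mine.pull(daves)))   # -> ['loading-screen']
print(sorted(daves.pull(mine)))   # -> ['fix-collision']

# Whenever we like, the agreed-upon main repository catches up from either of us.
print(sorted(main.pull(mine)))    # -> ['fix-collision', 'loading-screen']
```

In a centralized setup, only one of these objects would hold history and everyone else would have to talk to it. Here, nothing in the code makes main special; it's "central" only because the people involved agree to treat it that way.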

Nearly everybody out there will tell you that distributed version control is a massive improvement, but there are still lots of companies that are using centralized systems and it may be a while before they switch.

We'll get into distributed version control in more detail in a few tutorials.

SVN, Mercurial, and Git

I'm going to quickly point out three different popular version control systems that are worthy of consideration. I'm sure there are others out there that are worth considering, but these three are all quite popular and quite easy to set up, and you should pick one of them as a starting point.

SVN is the only one of these three that is centralized. It's very powerful. If you're committed to a centralized version control system, this is the one I'd recommend. I'd suggest starting with VisualSVN Server and TortoiseSVN for tools (both free). But after this paragraph, I'm not going to say much about SVN. My personal opinion is that there are too many advantages to distributed version control to go back to an older centralized system.

Mercurial and Git are both distributed version control systems. I suspect you've heard of at least one of these. In fact I suspect the one you've heard of is Git, at least through its association with GitHub.

I have a hard time telling you that one of these is better than the other. They each have different underlying philosophies. Git is set up as a collection of individual tools that you string together to accomplish what you need. It's the same kind of philosophy that Linux's command line has. Lots of tiny special purpose tools that the user puts together to make what he or she wants.

Mercurial is perhaps not quite as powerful, but what it gives up in power, it gains in sleekness and simplicity.

I read once that Git is like MacGyver, while Mercurial is like James Bond.

But the last piece of this puzzle is that Git and Mercurial have evolved towards each other, not away from each other. They're becoming more similar, not more different. So again, I'm not going to sit here and tell you one is better than the other. You should feel free to pick whichever you like best. And after learning one, the other will be easy to pick up.

But what I will tell you is that the rest of these tutorials will teach you how to use Mercurial.

I'm picking this for two reasons. First, I have a personal preference for the simplicity and sleekness route over the MacGyver route. I've never felt limited by Mercurial's features, so I don't feel like I need Git's extra power.

Second, Mercurial has a stronger association with BitBucket while Git has a stronger association with GitHub. (We'll come back to this later, I promise.) In some senses, this is kind of a historical reason, because once upon a time BitBucket didn't support Git, and I have a preference for BitBucket. (I'll explain my preference for BitBucket in a few tutorials.) BitBucket has since added Git support, although as far as I know GitHub has only ever hosted Git repositories, so it's worth checking what a hosting service currently supports before you settle on a version control system.

Like I said, this decision is somewhat arbitrary now. But it's what I'm most familiar with, so it's what I will show you in the upcoming tutorials.