Visual Studio Toolbox
Source Code Control with Git and Mercurial
Resources for using the popular distributed source code control and collaboration tools Git and Mercurial on Windows.
- By Terrence Dorsey
Do you use source control tools to manage your software development process? Source control was the very first item on Joel Spolsky's Joel Test for the quality of a development team back in 2000. Source code control is just as important today, and not just for teams; individual programmers benefit as much from quality source code control software (and good code check-in practices) as do teams or worldwide open source projects.
What's source code control? In ye olde days, you may have struggled with Visual SourceSafe: Microsoft's Source Destruction System or systems like it. There was a central code repository, you'd wait for other developers to check in their work, you'd check out files to edit them, you'd check edited files back in -- and the database would corrupt all of the week's work.
Source code control systems -- also known as version control systems (VCSes) -- have come a long way in recent years. Today, data corruption is almost unheard of. There are a variety of different systems to choose from to suit different needs and preferences. Most are effectively platform-agnostic. They're also generally lightweight, easy to install, easy to learn and incredibly flexible.
Modern source control tools such as Git and Mercurial are examples of distributed version control systems (DVCSes). They're easy to use for local development, providing simple version control for saving or rolling back changes and managing development branches, and even serving as a sort of simplified package management system for codebases. Two significant features set them apart from previous VCSes you may have encountered.
First, these systems record "atomic" changes to your codebase. That means a check-in is recorded as one all-or-nothing set of changes across the files involved, and the history shows those changes as specific line insertions and deletions rather than as a new copy of each code file. This makes the change history fast, lightweight and extremely specific. It also makes conflicts far easier to sort out.
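To see what "specific changes" looks like in practice, here's a minimal sketch that records two versions of a file in a throwaway repository and asks Git what changed between them. The file name, its contents and the placeholder identity are made up for illustration; the commands are standard Git.

```shell
set -eu
# Placeholder identity so commits work on a machine with no Git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)   # throwaway directory for the demo
cd "$repo"
git init -q

printf 'line one\nline two\nline three\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Rewrite the file with only the middle line changed, then commit again.
printf 'line one\nline 2\nline three\n' > notes.txt
git commit -q -am "Reword second line"

# Show the change between the two commits as removed and added lines.
git diff --unified=0 HEAD~1 HEAD

# Summarize the same change as added/removed line counts per file.
changes=$(git diff --numstat HEAD~1 HEAD)
echo "$changes"
```

The summary reports one line added and one line removed in notes.txt, even though the whole file was rewritten on disk.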
The "distributed" part of a DVCS is also important. Rather than thinking in terms of a central codebase and local working repositories, a DVCS treats all repositories -- local, on a team server or in the cloud -- as snapshots of a codebase at any point in time or development. For a given DVCS codebase, any repository can be synchronized with any other repository by merging the recorded changes. Depending on how well changes have been synchronized previously, you may encounter conflicts that must be rectified, but Git and Mercurial can help you pinpoint those conflicts down to individual lines of code.
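Here's a rough sketch of that idea using two local Git repositories. The names "alice" and "bob" are hypothetical stand-ins for any two copies of a codebase, and the placeholder identity is only there so the commits succeed on an unconfigured machine. Both copies change the same line, so synchronizing them produces a conflict that Git marks inside the file.

```shell
set -eu
# Placeholder identity so commits work on a machine with no Git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"

# "alice" starts the project; "bob" is a full copy of it, history included.
git init -q alice
(cd alice && printf 'Hello\n' > greeting.txt && git add greeting.txt && git commit -q -m "Add greeting")
git clone -q alice bob

# Each repository records a different change to the same line.
(cd alice && printf 'Hello, world\n' > greeting.txt && git commit -q -am "Greet the world")
(cd bob && printf 'Hello, team\n' > greeting.txt && git commit -q -am "Greet the team")

# Bob synchronizes with Alice. The merge stops and reports the conflict,
# so a non-zero exit status is expected here.
cd bob
git pull --no-rebase origin || echo "merge stopped on a conflict"

# List the files with unresolved conflicts, then show the marked-up lines.
conflicted=$(git diff --name-only --diff-filter=U)
echo "$conflicted"
cat greeting.txt
```

The conflict markers in greeting.txt bracket exactly the competing lines, which is what makes the cleanup manageable.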
Kalid Azad at BetterExplained has written a fantastic "Intro to Distributed Version Control" that walks you through the big ideas that differentiate centralized and distributed VCSes.
The highest-profile -- if not the most popular -- DVCS software today is Git, originally created by Linus Torvalds in 2005 for coordinating Linux development. Perhaps most famously, it's the software behind GitHub, the online code hosting and collaboration service.
To get started, you'll need to download and install Git, which lets you run Git commands through either cmd.exe or Git Bash. Git for Windows includes those options, plus a GUI app, making it a better option for first-time users on Windows.
Posh-Git provides better integration between Git and Windows PowerShell and is a popular and powerful CLI option for using Git on Windows (see Figure 1).
Check out Phil Haack's "Better Git with PowerShell" or Joshua Gall's "Streamline Git with Powershell" for setup instructions, tips and tricks.
In my experience, it's far easier to learn Git by using it than by reading about it, so I recommend working through a tutorial and developing some muscle memory. GitHub has an online, interactive tryGit tutorial. The Git Immersion tutorial from Neo Innovation Inc. is also very good.
Both of these focus on the command-line interface (CLI) to Git. Although there are some excellent GUI apps for Git, I think it's well worth becoming familiar with the actual Git commands so you understand how it works. I've found that the CLI provides very helpful error messages and hints for dealing with problems like merge conflicts. If you prefer using a GUI, knock yourself out, but at least you'll better understand the magic happening behind the scenes. (This advice stands for Mercurial as well. Use the GUI as a tool, not a crutch.)
For GUI Git front-ends, Windows developers actually have quite a few high-quality choices. GitHub for Windows was one of the first and remains a popular choice (see Figure 2). A few other free Git clients include Atlassian's SourceTree, CollabNet's GitEye and the open source Git Extensions. There are a number of commercial apps as well.
Git Extensions includes an extension for Visual Studio versions going back to 2005. There's also the Visual Studio Tools for Git extension created by the Team Foundation Server (TFS) Power Tools Team to work with Team Explorer, and a Git Source Control Provider extension.
Git is a wonderful tool, but it is possible to get in trouble if you commit without thinking first. The folks at fournova Software GmbH (makers of Tower, a nice Git client for Mac OS) put together a concise but extremely useful outline of eight Version Control Best Practices that everyone should read.
Diving in a bit deeper, I highly recommend Atlassian's Git Workflows and Tutorials. These guides walk you through some more complex scenarios, including strategies for avoiding merge conflicts (and how to fix them when, inevitably, they happen), forking, feature branches and more.
Seth Robertson's "Commit Often, Perfect Later, Publish Once: Git Best Practices" tackles the more tactical side of using Git, from "commit early and often" and writing useful commit messages to organizing your work, the "sausage making" aspects of working with other developers and integrating with external tools.
Finally, Vincent Driessen's A Successful Git Branching Model does a great job of explaining how to successfully use tags, branching and merging in a real-world software development workflow.
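As a small taste of that kind of workflow, here's a sketch (not Driessen's full model) that does the work on a feature branch, merges it back with an explicit merge commit and tags the result. The branch name, tag name, file contents and placeholder identity are all hypothetical.

```shell
set -eu
# Placeholder identity so commits work on a machine with no Git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q
# The default branch may be "master" or "main", depending on Git's configuration.
main_branch=$(git symbolic-ref --short HEAD)

printf '<h1>Home</h1>\n' > index.html
git add index.html
git commit -q -m "Created index.html"

# Do the new work on a feature branch.
git checkout -q -b feature/about-page
printf '<h1>About</h1>\n' > about.html
git add about.html
git commit -q -m "Add about page"

# Merge back with --no-ff so history keeps an explicit merge commit,
# then mark the result with an annotated tag.
git checkout -q "$main_branch"
git merge -q --no-ff -m "Merge feature/about-page" feature/about-page
git tag -a v1.0 -m "First release"

git log --oneline --graph
```

The graph output shows the feature branch as its own line of history joined by the merge commit, which is the property that makes a feature easy to identify later.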
Git's biggest rival is probably Mercurial -- often called "Hg" by the cool kids. In most respects it provides almost exactly the same functionality with slightly different syntax. Mercurial is written mostly in Python, as are its extensions. I'll get into the more significant reasons you might choose one over the other later.
Much like Git on Windows, once installed you can run Mercurial commands in the standard cmd.exe prompt or Windows PowerShell. Jeremy Skinner created a set of Posh-Hg scripts, inspired by Posh-Git, for better integration with Windows PowerShell.
The Mercurial site provides a good introduction to basic workflows as well as more extensive tutorials. Bryan O'Sullivan's excellent "Mercurial: The Definitive Guide" is available as a free, online Web book -- something you should definitely bookmark as a learning tool and reference.
TortoiseHg is a popular GUI front-end for Mercurial that integrates directly with Windows Explorer (see Figure 3). "A Quick Start Guide to TortoiseHg" is available, and it provides a good introduction to basic Mercurial features, as well.
Spolsky, of course, is a longtime advocate of not only source control software, but Mercurial specifically. His "Hg Init: a Mercurial Tutorial" walks you through the basics in a very Windows-centric manner.
While it isn't free, Spolsky's Fog Creek Software offers Kiln, which provides software version control and review features for both Mercurial and Git repositories. But this video is free: DVCS University: Distributed Source Control with Mercurial. It covers Mercurial basics as well as more advanced workflows.
Atlassian's SourceTree, mentioned earlier, works with Mercurial repositories. There's also a VisualHG Mercurial extension for Visual Studio 2005 and later.
Which One Is Best?
Which should you use? If you're joining an existing team or project, the choice has probably been made for you.
If you're starting something new, however, the choice is less obvious. In practice, they're very, very similar. Here's a quick illustration, starting a new project repository with Git:
mkdir myproject
cd myproject
git init
(Create and edit index.html)
git add index.html
git commit -m "Created index.html"
You created and navigated to a new directory for your project, then initialized the Git repository for that project (current folder, all its files and all subfolders and files). At that point you can code away. When finished, add the file to staging, then commit staged edits to record the changes into history.
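If the staging step feels abstract, this sketch shows how Git reports the same file at each stage. It runs in a throwaway directory with a placeholder identity (both are assumptions for the demo, not part of the workflow itself).

```shell
set -eu
# Placeholder identity so commits work on a machine with no Git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q
printf '<h1>Hello</h1>\n' > index.html

git status --short           # "?? index.html": the file exists but is untracked

git add index.html
staged=$(git status --short)
echo "$staged"               # "A  index.html": staged for the next commit

git commit -q -m "Created index.html"
git status --short           # prints nothing: no uncommitted changes remain
git log --oneline            # one line per commit: abbreviated hash plus message
```

Nothing is recorded into history until the commit; the add step only chooses what the next commit will contain.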
If you wanted to grab a remote repository, make some changes, then push those changes back to the remote repo, it works something like this in Git:
git clone http://contoso.com/myproject
cd myproject
(Repo initialized by default. Edit index.html.)
git add index.html
git commit -m "Removed blink tags"
git push origin master
Here, you start by cloning the remote repo (you can also pull and fetch as explained by Mike Pearce). Then you edit, add and commit your changes. Finally, you push these changes back (in this case from your master branch to the origin repo).
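For the curious, here's a minimal sketch of the difference between fetching and pulling. A local repository named "upstream" stands in for the remote server, and the directory names, file contents and placeholder identity are all assumptions made for the demo.

```shell
set -eu
# Placeholder identity so commits work on a machine with no Git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"

# A local repository plays the part of the remote in this sketch.
git init -q upstream
(cd upstream && printf 'v1\n' > index.html && git add index.html && git commit -q -m "Created index.html")
git clone -q upstream mine

# Someone else records a new commit upstream.
(cd upstream && printf 'v2\n' > index.html && git commit -q -am "Removed blink tags")

cd mine
branch=$(git symbolic-ref --short HEAD)

# fetch downloads the new commit but leaves your branch and files alone.
git fetch -q origin
behind=$(git rev-list --count "HEAD..origin/$branch")
echo "commits waiting to be merged: $behind"

# merge applies what was fetched; pull is a fetch followed by this step.
git merge -q --ff-only "origin/$branch"
cat index.html
```

Fetching first gives you a chance to look at incoming work before it touches your files, which some developers prefer to a plain pull.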
In Mercurial, the process is almost identical:
mkdir myproject
cd myproject
hg init
(Create and edit index.html)
hg add index.html
hg commit -m "Created index.html"
No, seriously. It really is the same. How about cloning and editing an existing project in Mercurial?
hg clone http://contoso.com/myproject
cd myproject
(Repo initialized by default. Edit index.html.)
hg add index.html
hg commit -m "Removed blink tags"
hg push
There are differences between Git and Mercurial, of course, but those differences occur deeper within the programs and their associated workflows.
Steve Losh suggests it comes down to command structure in his post, "The Real Difference Between Mercurial and Git." Git uses fewer commands and more options, while Mercurial has more explicit commands to learn.
For a slightly more humorous approach to the differences between the two DVCS rivals, take a look at Patrick Thomson's "Git vs. Mercurial: Please Relax," in which he compares Git and Mercurial to MacGyver and James Bond. It actually makes sense.
James Woodyatt's post, "Why I Like Mercurial More Than Git," suggests the important difference is in how Git and Mercurial handle branching and merging projects. I agree that branching, merging, rebasing, fast-forwarding, and dealing with conflicts appear to be more art than science, so whether you choose Git or Mercurial, it's worth reading through Woodyatt's post and the ensuing comment discussion (as well as the Driessen post mentioned earlier) to understand this weedy topic.
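To make a couple of those terms concrete on the Git side, here's a small sketch in which a topic branch and the main branch diverge, the topic branch is rebased, and the main branch then fast-forwards with no merge commit. Branch names, file names and the placeholder identity are hypothetical.

```shell
set -eu
# Placeholder identity so commits work on a machine with no Git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q
# The default branch may be "master" or "main", depending on Git's configuration.
main_branch=$(git symbolic-ref --short HEAD)

printf 'base\n' > base.txt
git add base.txt
git commit -q -m "Base commit"

# History diverges: one commit on a topic branch, one on the main branch.
git checkout -q -b topic
printf 'topic\n' > topic.txt
git add topic.txt
git commit -q -m "Topic work"

git checkout -q "$main_branch"
printf 'main\n' > main.txt
git add main.txt
git commit -q -m "Main work"

# Rebase replays the topic commit on top of the latest main commit.
git checkout -q topic
git rebase -q "$main_branch"

# The main branch can now fast-forward: no merge commit is created.
git checkout -q "$main_branch"
git merge -q --ff-only topic

git log --oneline --graph
```

The result is a straight line of three commits. Whether that tidier history is worth rewriting the topic commit is exactly the kind of judgment call the posts above debate.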
There are two major online services that support DVCS repositories: GitHub and Bitbucket.
GitHub, as you can probably guess from the name, focuses on Git-based repos. They also focus on public open source projects. As a result, their free tier of service offers unlimited public repos. Private repos are available at an extra cost.
An Hg-Git Mercurial Plug-in is available to provide interaction between Mercurial repos and Git servers, including GitHub. Hg-Git was originally created by the GitHub team, and is still under current development, though now hosted over on Bitbucket.
Speaking of which, Bitbucket (run by the folks at Atlassian) is pretty much the same idea as GitHub -- a host for remote repos -- but in this case it was originally conceived as a Mercurial-focused service. Today Bitbucket supports both Git and Mercurial. The biggest difference from GitHub is that Bitbucket provides free, private repos, as well as public repos. Paid levels of service allow you to give additional developers access to your private repos.
Git and TFS
I already mentioned some of the Visual Studio integration options for Git. These are great if you're using it as a local VCS, working with a team Git server, or pulling and pushing to services like GitHub and Bitbucket.
But what if your team employs TFS? By default, TFS uses a centralized VCS. Early last year, however, the Visual Studio team fully embraced Git as a VCS option for TFS (see Figure 4).
It's not a version control free-for-all: You have to choose which VCS you're going to use, though it's set up on a project-by-project basis. As Visual Studio ALM MVP Esteban Garcia explained on the MVP blog, "when you create a new Team Project, you are able to decide what source control repository you will use," either Git or TFS. Scott Hanselman has a good post, "Git Support for Visual Studio -- Git, TFS and VS Put into Context," and I also recommend reading Kristofer Liljeblad's "The Git Command Line 101 for Windows Users," which does a great job explaining Git basics in the context of Visual Studio and TFS.
I should probably mention that Windows Azure also supports publishing with Git, and this little Windows Azure tutorial shows you how.
A Fork in the Code
As you've probably noticed, Git and Mercurial are pretty much evenly matched choices if you need a DVCS. In most cases, unless someone has already settled on one or the other for the project to which you're contributing, you can choose based on preference. Try both on a project-by-project basis and see which one creates less friction in your development workflow.
Git certainly has an edge overall in terms of support via client software, development tool integration and cloud services. In the world of Windows-based development, direct integration with Visual Studio and TFS is a good thing.
I wouldn't call either Git or Mercurial "an elegant weapon for a more civilized age," but I can't imagine writing code (or documentation) without one of these DVCS tools today.