Setting up a Mercurial Repository Server using hgwebdir.cgi on IIS6

My exams are over now, and I will hopefully have enough time to cross some items off my personal to-do list. One thing I have always wanted to do is set up my home server to serve my Mercurial repositories via HTTP. Given the distributed nature of Mercurial, I rarely find myself needing to access my repos remotely, but it can be quite handy from time to time. In the past I have accessed them via VPN (using OpenVPN), with the repositories sitting on a Windows network share. Needless to say, performance was suboptimal, though quite acceptable when my internet connection was fast and stable. Accessing them via 3G, however, is a completely different story.

When Google Code evaluated its plans to support at least one major DVCS, the team did a detailed analysis with a special focus on supported protocols and performance. They finally decided to support Mercurial because it has a very efficient HTTP protocol implementation that was a good fit for Google’s infrastructure. Besides file shares and HTTP, Mercurial also supports SSH, which is not an option for me since my server runs a Microsoft stack. There’s a good list of supported protocols and sharing methods on the Mercurial Wiki. Since I want to share multiple repositories, I decided to go with the hgwebdir solution. Basically, it is just a smallish Python CGI application that can also serve a nice repository browser.

Setup is fairly easy. I found Jeremy Skinner’s tutorial very helpful, even though it uses IIS 7 (I’m running IIS 6). What caused me some headaches was implementing nice URL rewriting. Because all access goes through hgwebdir.cgi, the script name ends up as an ugly part of every repository URL, right after /hg/.

The idea behind URL rewriting is to let the user access repositories via a clean URL like http://myserver/hg/LDT-trunk/, which is then rewritten internally to the form that includes hgwebdir.cgi. hgwebdir.cgi supports this because you can change the way it generates links in the web interface by providing a new base URL (/hg instead of /hg/hgwebdir.cgi). This, along with URL rewriting for Apache using a .htaccess file, is described on the Mercurial Wiki.

To be honest, it’s an optional step, but being the perfectionist I am, I couldn’t resist trying it with IIS. Jeremy Skinner proposes Microsoft’s official URL Rewrite module, but that’s only compatible with IIS 7. The best tool I could find for IIS 6 is the excellently documented Ionic’s ISAPI Rewrite Filter (IIRF), which offers a powerful regular-expression-based configuration very similar to Apache’s .htaccess. Setup was very fast, but getting the configuration right was not easy and took a little debugging. The IIRF tooling (TestDriver.exe) was really helpful here. Here’s my IIRF.ini for your reference:

# Send requests for files that exist to those files.
RewriteCond %{REQUEST_FILENAME} !-f
# Send requests for directories that exist to those directories.
RewriteCond %{REQUEST_FILENAME} !-d

# prevents processing of already correct urls
RewriteRule ^/hg/hgwebdir.cgi/(.*)$ - [L,I]
# redirect every request to the hgwebdir.cgi script
RewriteRule ^/hg/(.*)$ /hg/hgwebdir.cgi/$1
# redirect root directory access to the website root
RewriteRule ^/hg /hg/hgwebdir.cgi/
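
To make the intent of the three rules easier to follow, here is a rough Python sketch that mimics them with the re module. It is not IIRF and makes several assumptions: the first matching rule wins (as if every rule carried [L]), the [I] flag is treated as case-insensitive matching, and the two RewriteCond file/directory checks are left out. The names RULES and rewrite are made up for this illustration.

```python
import re

# Hypothetical stand-ins for the three RewriteRule lines above (not IIRF itself).
# A replacement of None plays the role of "-": leave the URL unchanged.
RULES = [
    (re.compile(r"^/hg/hgwebdir.cgi/(.*)$", re.IGNORECASE), None),
    (re.compile(r"^/hg/(.*)$"), r"/hg/hgwebdir.cgi/\1"),
    (re.compile(r"^/hg"), "/hg/hgwebdir.cgi/"),
]


def rewrite(path):
    """Return the rewritten path; assumes the first matching rule wins."""
    for pattern, replacement in RULES:
        match = pattern.search(path)
        if match is None:
            continue
        if replacement is None:
            # Already points at the CGI script, so stop here.
            return path
        # The whole URL is replaced by the expanded template.
        return match.expand(replacement)
    # No rule matched: the request is passed through untouched.
    return path


if __name__ == "__main__":
    for url in ("/hg/LDT-trunk/", "/hg/hgwebdir.cgi/LDT-trunk/", "/hg"):
        print(url, "->", rewrite(url))
```

Under these assumptions a clean URL such as /hg/LDT-trunk/ maps to /hg/hgwebdir.cgi/LDT-trunk/, while a URL that already contains the script name is left alone. How a real IIRF installation chains rules without [L] may differ, so TestDriver.exe remains the tool to trust.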

Everything was working fine, except that URL rewriting failed for URLs that contained encoded umlauts (for example, Ü becomes %DC). On top of that, file contents weren’t displayed using the correct encoding (which is UTF-8). The latter was easy to fix by uncommenting the first two lines in hgwebdir.cgi, which set the HGENCODING environment variable to UTF-8 and thereby tell Mercurial which encoding to use:

import os
os.environ["HGENCODING"] = "UTF-8"

Mercurial internally makes no assumptions about the file contents or file types it tracks, so in this case encoding is simply a matter of presentation. All other internal data (changelogs, tags etc.) are stored as UTF-8 according to the wiki. There are still some minor anomalies in the web interface, but I will investigate them later.

The issue with incorrectly rewritten URLs was much harder to track down. I used Fiddler to analyse the request/response stream but couldn’t find anything obviously wrong. The IIRF log contained the following lines:

Sun Mar 28 16:31:57 -  3260 - DoRewrites: Url (decoded): '/hg/LDT-trunk/file/c2e75c0ef33c/src/Übc.xx'
Sun Mar 28 16:31:57 -  3260 - DoRewrites: Rewrite Url to: '/hg/hgwebdir.cgi/LDT-trunk/file/c2e75c0ef33c/src/Übc.xx'

That looked unsuspicious to me until I remembered that special characters in URLs need to be percent-encoded. By default, IIRF decodes the URL when a request is received and processes only the decoded URL (hence the small hint in the log file), so the rewritten URL carries the raw Ü instead of %DC. This behavior can easily be changed in the IIRF configuration using UrlDecoding Off.
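
The effect of decoding can be reproduced in a few lines of Python with urllib.parse. This is only an illustration of percent-encoding, not of IIRF internals. The encoded path is put together from the log lines above plus the %DC mentioned earlier, and treating %DC as Latin-1 is an assumption that follows from that byte value.

```python
from urllib.parse import quote, unquote

# Path as the client would send it, assuming Ü is percent-encoded as %DC (Latin-1).
encoded_path = "/hg/LDT-trunk/file/c2e75c0ef33c/src/%DCbc.xx"

# What a decoding step produces: the raw character shown in the IIRF log.
decoded_path = unquote(encoded_path, encoding="latin-1")

# Re-encoding only round-trips if the same character encoding is used again.
reencoded_latin1 = quote(decoded_path, encoding="latin-1")
reencoded_utf8 = quote(decoded_path, encoding="utf-8")

if __name__ == "__main__":
    print(decoded_path)
    print(reencoded_latin1)
    print(reencoded_utf8)
```

Once the URL has been decoded, whatever receives it next has to guess how to encode the Ü again, and the Latin-1 and UTF-8 results differ. Passing the original, still-encoded URL through the rewrite avoids that guess.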

Categories: Source Control, Tools
  1. May 23, 2011 at 22:23

    Thanks for this write up. I had to modify it slightly to get it to work for me. I had to make all three rules have “[L]” after them to get this to work. I have no idea why that was the case though!
    Tried it on IIS6/Windows Server 2003 and also IIS7/Windows 7, and I needed to do that on both machines. I also tried it with Mercurial setup at the root of a web site and also under a virtual directory… still needed [L] on all three rules.
    Any ideas why that might be?
