Mercurial SVN Integration Using Erlang

I wanted to keep a Mercurial (Hg) repository in sync with a series of Subversion (SVN) source repositories. Because of the way certain Subversion repositories are structured, it is not always convenient to have a common Mercurial and Subversion root. I am actually using Mercurial to coalesce a collection of related modules into a single checkout for an agile development team. Consequently, the Python hgsvn scripts are unworkable for my scenario. So I rolled my own hgsvn in Erlang, and it was surprisingly easy to do.

The Erlang hgsvn module follows this process:

  1. Get the current revision from svn info
  2. Get all later revisions with changes from svn log -rBASE:HEAD
  3. Parse out the Author, Date, and Comment from svn log
  4. Update to the next revision using svn up -r #
  5. Tell Mercurial about any files added or deleted by that revision using hg addremove
  6. Commit the revision to Mercurial as that Author on that Date with that Comment
  7. Repeat from (2) until no more revisions are available
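
Step 3 above can be sketched roughly as follows. This is not the actual hgsvn code: the module name svnlog_sketch and the function parse/1 are made up for illustration. It assumes the plain-text svn log layout (a line of dashes, then a header of the form "rN | author | date | N lines", a blank line, and the comment) and a reasonably modern Erlang/OTP (20 or later) for string:split and string:trim.

```erlang
-module(svnlog_sketch).
-export([parse/1]).

%% Parse plain-text `svn log` output into a list of
%% {Revision, Author, Date, Comment} tuples, in the order svn printed them.
parse(Output) ->
    parse_entries(string:split(Output, "\n", all), []).

%% An entry starts with a separator line of dashes, followed by the header
%% line and one blank line. The header's last field says how many comment
%% lines follow, so a comment that itself contains dashes is not misread.
parse_entries(["----" ++ _, Header, "" | Rest], Acc) ->
    [RevField, Author, Date, CountField] =
        [string:trim(Field) || Field <- string:split(Header, "|", all)],
    "r" ++ RevDigits = RevField,
    [CountDigits | _] = string:split(CountField, " "),
    {CommentLines, Remaining} =
        lists:split(list_to_integer(CountDigits), Rest),
    Comment = lists:flatten(lists:join("\n", CommentLines)),
    Entry = {list_to_integer(RevDigits), Author, Date, Comment},
    parse_entries(Remaining, [Entry | Acc]);
%% Anything else (the closing separator, trailing blank line) ends the log.
parse_entries(_Trailing, Acc) ->
    lists:reverse(Acc).
```

With something like this in place, the log for a working copy could be read with svnlog_sketch:parse(os:cmd("svn log -rBASE:HEAD")), and each tuple then supplies the Author, Date, and Comment for the matching Mercurial commit.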

Using Erlang for this made the code very straightforward. I previously attempted this in Groovy and, beyond failing to get it working, I found the code incredibly difficult to understand without detailed comments.

To run this code, invoke the following command in the root of a Subversion working copy inside your Mercurial repository:

erl -pa <path to hgsvn.beam parent dir> -noshell -s hgsvn -s init stop

You can also specify a set of related repositories (ex: from a common server) and update them together in revision order:

erl -pa <path to hgsvn.beam parent dir> -noshell -s hgsvn -s init stop -repo_set path/to/repo1 path/to/repo2
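
One way to picture "together in revision order" is a merge of each working copy's pending revisions into a single sorted work list. The sketch below is an assumption about the idea rather than the real implementation. The module revorder_sketch and the function interleave/1 are hypothetical, and it only makes sense when the working copies share one revision numbering (for example, when they come from the same Subversion repository).

```erlang
-module(revorder_sketch).
-export([interleave/1]).

%% Takes [{Path, [Revision]}], one entry per working copy in the set, and
%% returns [{Revision, Path}] sorted by revision number. The caller can then
%% walk the list, updating and committing one working copy at a time.
interleave(PendingByRepo) ->
    lists:sort([{Rev, Path} || {Path, Revs} <- PendingByRepo, Rev <- Revs]).
```

For example, interleave([{"path/to/repo1", [10, 12]}, {"path/to/repo2", [11]}]) yields revision 10 of repo1, then 11 of repo2, then 12 of repo1.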

You can even specify multiple sets of related repositories and update them set by set in revision order:

erl -pa <path to hgsvn.beam parent dir> -noshell -s hgsvn -s init stop -repo_set path/to/repo1 path/to/repo2 -repo_set path/to/repo3 -repo_set path/to/repo4

You can also specify a stop revision for each repository in a set:

erl -pa <path to hgsvn.beam parent dir> -noshell -s hgsvn -s init stop -repo_set path/to/repo1 path/to/repo2 -stop_set 1023 1432

If you specify a stop_set for one repo_set, you must provide one for every repo_set. It is perfectly workable to put nothing after -stop_set, as hgsvn will interpret that to mean HEAD for every repository in the corresponding repo_set.
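
The pairing rule for repository sets and stop revisions could be sketched as below. The names stopset_sketch, from_command_line/0, and pair/2 are invented for this example and are not taken from hgsvn. The only real API relied on is init:get_argument/1, which returns one list of values per occurrence of a flag. The sketch covers just the pairing. It treats an empty stop set as HEAD for every repository in the matching repository set, and it does not handle the case where no -repo_set is given at all.

```erlang
-module(stopset_sketch).
-export([from_command_line/0, pair/2]).

%% Read the -repo_set and -stop_set flags given to erl and pair them up.
from_command_line() ->
    pair(flag_values(repo_set), flag_values(stop_set)).

%% init:get_argument/1 returns {ok, [Values]} with one Values list per
%% occurrence of the flag, or `error` when the flag was never given.
flag_values(Flag) ->
    case init:get_argument(Flag) of
        {ok, Occurrences} -> Occurrences;
        error -> []
    end.

%% No stop sets at all: every repository runs to HEAD.
pair(RepoSets, []) ->
    [[{Path, head} || Path <- Set] || Set <- RepoSets];
%% Otherwise there must be exactly one stop set per repository set.
pair(RepoSets, StopSets) when length(RepoSets) =:= length(StopSets) ->
    [pair_set(Repos, Stops) || {Repos, Stops} <- lists:zip(RepoSets, StopSets)];
pair(_RepoSets, _StopSets) ->
    erlang:error(stop_set_count_mismatch).

%% An empty stop set means HEAD for each repository in that set.
pair_set(Repos, []) ->
    [{Path, head} || Path <- Repos];
pair_set(Repos, Stops) when length(Repos) =:= length(Stops) ->
    [{Path, list_to_integer(Stop)} || {Path, Stop} <- lists:zip(Repos, Stops)];
pair_set(_Repos, _Stops) ->
    erlang:error(stop_revision_count_mismatch).
```

For the command above, pair([["path/to/repo1", "path/to/repo2"]], [["1023", "1432"]]) gives [[{"path/to/repo1", 1023}, {"path/to/repo2", 1432}]].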

You can have an unlimited number of repository sets and an unlimited number of Subversion paths per repository set.

This code was developed to work with the output format from Subversion 1.4.4 (r25188) on Mac OS X 10.5 and Subversion 1.4.6 (r28521) on Ubuntu Hardy Heron. It should work with any 1.4.x Subversion though. Make sure to post any issues here as comments (please include the output of svn info).

UPDATE (2008/07/15): now handles the case where one or more SVN working copies have no new changes

UPDATE (2008/07/15): adds SVN watcher that automatically collects changes (ex: -watch 30min)

UPDATE (2008/08/01): this is now erlang_hgsvn on GitHub