Best Practices for Darcs

What repositories are useful?

Recommendations include keeping copies of each repository under ~/public_html and on a laptop, as well as in your working directory, so that it is always available; and creating a project directory, then placing a separate darcs repository under it for each area of work.

When should you branch?

In darcs, each repository is a branch.

You can create a branch anytime you want independent lines of development; on one branch you might work on a bug fix, on another you might work on a feature. Or you might implement different features on different branches, or you might even try implementing the same feature in different ways on different branches. You might want a separate branch for maintenance updates to the "1.0" series, while you work on the "2.0" series elsewhere.

Once you've created a branch, changes are merged back and forth by use of darcs pull and darcs push.
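As a rough sketch of that workflow, the functions below wrap the commands involved. The function names and the paths passed to them are hypothetical; only darcs get, darcs pull and darcs push come from this page.

```shell
# Hypothetical helpers; pass absolute paths (or URLs) so they remain
# valid after changing directory.

# Create a new branch by getting a copy of an existing repository.
new_branch() {
    darcs get "$1" "$2"    # $1 = existing repo, $2 = new branch directory
}

# Bring a branch up to date with the main repo, then send its patches back.
# By default both commands ask which patches to transfer.
sync_branch() {
    (
        cd "$2" || exit 1      # $2 = branch directory
        darcs pull "$1" &&     # $1 = main repo
        darcs push "$1"
    )
}
```

For example, new_branch /home/me/project/main /home/me/project/bugfix would start a bug-fix branch next to the main repository (both paths are made up).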

This mail about using darcs in a similar way to CVS might also be helpful.

How to create a branch?

In darcs, every repository that shares common ancestry of patches with another can be considered a branch. Darcs commands don't talk about branches specifically, but simply about repositories and patches. So the short answer is "use the same commands you would for creating a new repository and moving patches back and forth to it."

One feature of branches in other source control systems is the ability to save disk space by not totally duplicating the storage of the source between the trunk and its branches. darcs can help with this by creating hard links to branches that exist on the same file system. When creating a new repo on the same file system, darcs get will create hard links. Otherwise, you can use

darcs optimize --relink --sibling /repo/dir

after the fact. However, if both branches are pushed to two repositories on another machine, the hard links are not recreated there. See the docs for those commands for more details.

Besides this, darcs does not do any kind of branch management. Since a repository is exactly one branch, it is not possible for a single "server" repository published at a given URL to contain multiple branches. Discovery of the various branches in a published project must be done by other means.

How should you copy working directories between machines?

If you've recorded all your changes, just use darcs push/pull.

If you have unrecorded changes, you can record them, then push/pull, and then unrecord them on both machines after the synchronisation.
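A minimal sketch of that record, push, unrecord round trip follows. The function name, the throwaway patch name WIP-sync-temporary and the remote address are made up for illustration; only the darcs subcommands come from the text above.

```shell
# Hypothetical helper: ship unrecorded changes via a throwaway patch.
sync_unrecorded() {
    remote="$1"                       # e.g. user@host:path/to/repo (assumed)
    name="WIP-sync-temporary"         # throwaway patch name (assumed)
    darcs record -a -m "$name" &&     # record every pending change
    darcs push -a "$remote" &&        # send all patches the remote lacks
    darcs unrecord -a -p "$name"      # turn the patch back into unrecorded changes
}
# Afterwards, run the same unrecord command in the remote repository.
```

The patch name deliberately avoids characters that are special in regular expressions, because the -p option treats its argument as a pattern.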

Some people use unison to propagate unrecorded changes between machines. If you do that, make sure that you never copy any of the files under _darcs. You should also make sure that Darcs itself is not accessing the repositories at the same time (Unison is oblivious of the locks that Darcs uses). rsync or cp -a could also be used, with similar caveats. (See, however, the page about unison for some more hints about using Unison with Darcs.)

What is the best way to name patches?

The author has seen two approaches that make a lot of sense in this area. However, remember that this is a highly subjective matter.

The first approach is to follow an existing coding convention from another organization or project. For example, in an IRC discussion, "twb" demonstrated how he uses a variation of the GNU changelog conventions for his patch names: see http://twb.ath.cx/words/darcs-kludges.html for the full details. In summary:

  • No single patch alters more than one file.
  • A patch name can take one of two forms:

| Pattern | Typically used for... | Example |
|---|---|---|
| path: comment | file-wide changes | main.c: renamed file_io typedef to FILE_IO for consistency |
| path (function [, function [...]] ): comment | changes that affect one or more specific functions or methods | main.c (initializeScreen): Changed SDL_HWSURFACE to SDL_SWSURFACE |
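If you want to check names against the two forms mechanically, something like the following could serve as a starting point. The function name and the regular expression are assumptions made for this sketch; they are not part of twb's conventions, and the pattern does not allow spaces in the path.

```shell
# Hypothetical check: does a patch name follow either form?
#   path: comment
#   path (function[, function...]): comment
is_conventional_name() {
    printf '%s\n' "$1" |
        grep -Eq '^[^ :()]+( \([^()]+\))?: .+$'
}
```

The function returns success for a name such as "main.c (initializeScreen): Changed SDL_HWSURFACE to SDL_SWSURFACE" and failure for a name with no leading path, so it can be used in an if statement before recording.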

A second approach, which is very similar in retrospect, prefixes each patch name with a support ticket identifier (e.g., a bug ID number from Bugzilla). This is convenient because you can isolate the changes relevant to a specific bug via regex matches, for example with darcs changes -p. The author discovered this technique while perusing the darcs mailing lists. In particular, check out: http://lists.osuosl.org/pipermail/darcs-users/2005-March/006113.html

How should you handle version numbering and releasing?

Tag your repository with the version number of your release. Use whatever numbering scheme you feel comfortable with; darcs stores version numbers as plain-text strings anyway.

Run "darcs dist" to create a source tarball.

If you forget what the last tag name was, you can use the following recipe to find out: darcs show tags | head -1. If you wish to see all the tagged versions, you can leave off the head command. If you want to see the dates associated with each tag as well, you can use darcs changes|grep -B 1 tagged.

Another way to list all tags, with date and author, is: darcs changes -t . The period is a regexp that matches all tags. You could for example list all rc tags with darcs changes -t rc. The very useful command darcs changes --from-tag . shows the last tag and all "extra" patches not tagged by it.
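The release steps above can be sketched as two small functions. The function names are hypothetical, and the sketch assumes it is run from inside the repository being released.

```shell
# Hypothetical helper: tag the repository, then build a source tarball.
release() {
    darcs tag "$1" &&    # $1 = version string, e.g. 1.0
    darcs dist
}

# Hypothetical helper: print the most recent tag name.
latest_tag() {
    darcs show tags | head -n 1
}
```

Running latest_tag before release is a quick way to confirm which version number comes next.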

What do you do when you've got lots of patches and releases in a repository?

To get a repository without copying the whole history, use darcs get --lazy. The patches will then be downloaded on demand.

If you want to support older versions in a limited fashion (security updates, bug fixes), create a branch for each old version.

Avoiding Trouble

Following these suggestions may help to minimize frustrations with darcs.

Avoid very large files

darcs has trouble dealing with very large patches. If you have very large files that you absolutely want to store in the repository (corpora, etc.), it may often be useful to gzip them first. I managed to reduce an 8M text file to 1M, for example.
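A minimal sketch of the compression step, assuming a standard gzip is installed; the function name is made up for illustration.

```shell
# Hypothetical helper: write FILE.gz next to FILE, leaving FILE in place.
# -9 asks for maximum compression; -n omits the original name and timestamp,
# so compressing unchanged content again produces identical bytes.
compress_for_repo() {
    gzip -9 -n -c "$1" > "$1.gz"
}
```

The resulting .gz file would then be added to the repository in place of the uncompressed original.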

See http://bugs.darcs.net/issue80

Avoid external merges when pulling unresolved conflicts

Avoid using --external-merge when pulling patches containing unresolved conflicts from a repository since information may be lost in this scenario. The existing conflicts in the remote repository will not be passed to the external merge tool since doing this at the same time as passing conflicts with the local repository is either impossible or too complicated. Instead the version of the file from the remote repository will be passed to the external merge tool in its pre-conflict state.

Don't change patches that have left your working repo

Once a patch has left your working repo, it could cause confusion if you then unrecord or amend-record that patch. Instead, create a new patch to resolve the issue.

Setting up a secure repo on the Internet

Say you have a server on the Internet, and you want to put a darcs repo there for several people to contribute to. If they all have accounts on the machine and can SSH to it, then no problem. But what if they don't have accounts on the machine? This recipe is for you!

See: Workflows and RepoViaSSH

Recovering from conflicts

If you discover a conflict between patches, it is not immediately obvious to a beginner how to retrieve the situation. Let's say you made some local changes that you forgot to record before pulling from another repo, and that you would like to resolve the conflict by discarding your unrecorded local changes. Here is the way to do it:

  • darcs revert # removes conflict-markers in the files
  • darcs unpull # unpull enough patches till before your 'pull'
  • darcs pull # re-pull all the patches again

Recovering from darcs hanging

On some occasions darcs has been known to 'spin' and take a tremendously long time to complete a task (hours!). Although rare, it's useful to know how to recover from such a situation.

If you can't wait for darcs to finish, you should be able to safely use Control-C to cancel the operation and try another approach. Unless you are doing one of the un- commands at the time, you shouldn't be able to mess up your repo. You may need to clean up a lock file named _darcs/_lock (documented elsewhere on the wiki), and run darcs check and darcs repair to check and repair any inconsistencies that may have developed.

Automated Testing

If you have some automatic tests for your code, you can make darcs run the test suite each time a darcs record command is issued:

darcs setpref test 'make test'  # darcs setpref test 'the command to run when testing'

If the test suite fails, the patch will not be recorded. If it passes, it will.

Dump from DRoundy

Some text from the LaTeX manual didn't really fit anymore during a rewrite, and I'm reluctant to delete it outright. Therefore, I am moving it here.

darcs apply --reply

The --reply feature of apply is intended primarily for two uses. When used by itself, it is handy for when you want to apply patches sent to you by other developers so that they will know when their patch has been applied. For example, in my .muttrc (the config file for my mailer) I have:

macro pager A "<pipe-entry>darcs apply --verbose \
        --reply droundy@abridgegame.org --repodir ~/darcs"

which allows me to apply a patch to darcs directly from my mailer, with the originator of that patch being sent a confirmation when the patch is successfully applied. NOTE: In an attempt to make sure no one else can read your email, mutt seems to set the umask such that patches created with the above macro are not world-readable, so use it with care.

When used in combination with the --verify option, the --reply option allows for a nice pushable repository. When these two options are used together, any patches that don't pass the verify will be forwarded to the FROM address of the --reply option. This allows you to set up a repository so that anyone who is authorized can push to it and have it automatically applied, but if a stranger pushes to it, the patch will be forwarded to you. Please (for your own sake) be certain that the --reply FROM address is different from the one used to send patches to a pushable repository, since otherwise an unsigned patch will be forwarded to the repository in an infinite loop.

If you use darcs apply --verify PUBRING --reply to create a pushable repository by applying patches automatically as they are received by email, you will also want to use the --dont-allow-conflicts option.

Problems with old repositories or old versions of darcs

No stupid find tricks (or upgrade to darcs 2)

Watch out for operations which work recursively on your directory. For example, something like this will definitely cause inconsistencies in darcs if you're using an old-fashioned repository format:

for i in $(find . -name '*.lhs'); do
   sed -e 's/^import Foo/import Bar.Foo/' $i > $i.2; mv $i.2 $i
done

In darcs 1, the problem was that this command would also affect the files in _darcs/pristine. Darcs 2 makes this unlikely if you use either the hashed format (which is compatible with darcs 1) or the new darcs 2 format.
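Whatever the repository format, one way to keep such a recursive edit away from darcs's internal files is to tell find not to descend into _darcs at all. The sketch below is one possible approach, not an official recipe; the function name is hypothetical and file names containing newlines are not handled.

```shell
# Hypothetical helper: apply the same import rewrite to every .lhs file
# under DIR, but never descend into a _darcs directory.
rewrite_imports() {
    find "$1" -name _darcs -prune -o -type f -name '*.lhs' -print |
    while IFS= read -r file; do
        sed -e 's/^import Foo/import Bar.Foo/' "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    done
}
```

Calling rewrite_imports . from the top of the working directory changes the source files and leaves everything under _darcs untouched.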

Avoid conflicts (and upgrade to darcs 2)

In darcs 1, large or complex conflicts can cause darcs to use an exponential amount of CPU time to solve the problem, giving the appearance that darcs is "spinning" or "hanging". Darcs 2 mostly avoids the problem in practice, but to benefit you should use repositories in the new darcs 2 format. See ConflictsFAQ for details.