Performance
This is a new page to discuss the performance of Darcs. It will focus on specific cases where darcs is known to be slower than ideal, and on suggestions for improvement. See also StandardDarcsBenchmarks.
Slow performance cases
Testing Linux kernel work
Catalin reports on the mailing list: Unfortunately, I cannot use it with the Linux kernel (that's what I mainly do) because it is very slow (3 hours for a big patch merging) and uses a lot of memory (over 600MB).
The author suggests: You could either make more memory available for darcs, or compile darcs with the --enable-antimemoize option. The initial record is the command which most stresses darcs' memory, as it requires holding the entire tree, in parsed form, in memory. The Linux kernel probably requires close to a gigabyte of memory. I'm not sure how much it will take with --enable-antimemoize.
--enable-antimemoize, a possible solution to help the kernel case
This is a darcs compile time option. The author explains:
The antimemoize trick helps *all* such [patch memory usage] situations, albeit to a limited extent, and at the potential cost of extra CPU time. The trouble is that antimemoization hasn't been well tested, so it's not the default (which is why it isn't well-tested). Perhaps I should make it default on the 1.1 branch of darcs, just to get more testing.
If it had more testing, perhaps it could be the default, but it almost always slows darcs down (except, of course, when swap is hit), and so until it has proven large memory benefits, it'll have to remain optional. There is a trick that may make it possible to often have antimemoization cost nothing in cpu time, but I'm not sure whether it'll work (or help).
Kevin Ollivier reported that --enable-antimemoize helped with his performance issues.
> The antimemoize appears to have been removed between darcs 1.0.2 and darcs 1.0.5, with nary a changelog entry. I use darcs for some pretty large projects, and I was hoping this would help. What's going on? --JasonFelice
--look-for-adds slow in some cases
Using --look-for-adds was reported to be slower than it should be in one case.
darcs diff
Up to 1.0.2, darcs was unnecessarily slow when computing a diff between a limited number of files in the working directory and the most recent copy in the repo. In 1.0.3 and beyond, a patch has been added to optimize this case.
If you just want to see what you've changed in a file since your last record, this workaround may give the same result and be very fast:
diff _darcs/current/file.txt file.txt
The Emacs vc-darcs.el and the darcs.vim Vim plug-in both use this shortcut to provide better performance here. See the Front Page for details about those plug-ins.
Also, 'darcs whatsnew file.txt' will give a similar result.
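The diff shortcut above can be wrapped in a small shell function so that it works for several files at once. This is only a sketch: the name quick_diff is made up for illustration and is not a darcs command, and it assumes you run it from the top of the repository, where the _darcs/current directory mentioned above lives.

```shell
# quick_diff FILE...
# Hypothetical helper (not part of darcs): compare each working file with
# the copy darcs keeps under _darcs/current. Run from the repository root.
# Returns 0 only if every file is present under _darcs/current and unchanged.
quick_diff() {
    status=0
    for f in "$@"; do
        if [ -f "_darcs/current/$f" ]; then
            # diff exits non-zero when the two files differ
            diff -u "_darcs/current/$f" "$f" || status=1
        else
            echo "quick_diff: $f not found under _darcs/current" >&2
            status=1
        fi
    done
    return $status
}
```

For example, `quick_diff file.txt other.txt` prints a unified diff for each changed file. Files that have not been recorded yet have no copy under _darcs/current, so they are reported on standard error rather than diffed.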
darcs pull
'pull' can get slow when pulling from a remote repo with a long patch history to read (mostly because the network layer is unable to pipeline requests for remote files). You can work around this issue by checkpointing the remote repo (darcs optimize --checkpoint), and using get --partial.
Of related interest: if you're pulling over HTTP, you should know that Darcs will operate properly over a caching HTTP proxy (especially if it supports HTTP/1.1 Cache-Control directives). This might save you a lot of time if you're repeatedly pulling from the same remote repo.
I'm also having trouble with pulling large patches. Not simply long patch histories, but short patch histories (three changes) which include lots of data (565 files, 51M). I want to pull the three patches to another repo which has two patches. I've let it run for several hours and the target repository is still reading 5.6M, which was its original size. What's going on here?
The other situation is that I have a versioned SugarCRM, which is pretty heavy. I added a major point release patch to a repository where I had pulled just the vendor patches (initial import, security patch) from my repo, then I attempted to pull my other changes, and let it run overnight and it didn't get anywhere. I tried pushing instead, and I ended up making diffs and applying them, and re-recording them. (This machine doesn't have much memory. The other has tons, and processor to burn.)
Are my scenarios unusual? --JasonFelice
Comments and Questions
Has anyone considered trying the arrow interface for patch operations? I haven't stared at Patch.lhs in enough detail to see whether that idea has obvious flaws, but arrows might allow statically available properties to be cached during multiple patch operations --ShaeErisson
