Development/Benchmarks
The standard darcs benchmarks can be obtained and run as follows:
cabal update && cabal install darcs-benchmark
cd /tmp/ && mkdir bench && cd bench
darcs-benchmark get
darcs-benchmark run darcs_binary1 darcs_binary2
The current darcs repository lives on darcsden (please darcs send patches).
See also Benchmarks for the current list of published benchmarks.
Improvements needed (help wanted!)
To be migrated to bugs-everywhere tracker
Benchmarks to add:
- obliterate
Note that these are all very small jobs...
- Make it possible to toggle profiling from the config file
- Make it possible to run just the latest version of darcs (and then compare with stored performance numbers for older darcs)
- Use (Table (BetterOrWorse String)) instead of (Table String) so that we can configure the HTML renderer to mark regressions red
- Implement a timeout mechanism to kill darcs after some fixed amount of time
- Determine which graph-producing library to use (need something fairly portable, easy to install)
- Write code to produce shootout-like visualisation of results (maybe hsparklines to produce at-a-glance-overview)
- Verbose mode so that we can get at the .prof files for fine-grained performance data
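The (Table (BetterOrWorse String)) item above could be sketched roughly as follows. This is only an illustration, not the darcs-benchmark code: the BetterOrWorse constructors, the Table stand-in, the classify and renderCell helpers, and the 5% tolerance are all assumptions made for the example.

```haskell
module Main where

-- | Hypothetical tag recording how a cell compares with its baseline.
data BetterOrWorse a = Better a | Same a | Worse a
  deriving (Show, Eq)

-- | Hypothetical stand-in for the benchmark table type: rows of cells.
newtype Table a = Table [[a]]
  deriving (Show, Eq)

-- | Classify a timing (in seconds) against a baseline timing.
-- The first argument is a relative tolerance, e.g. 0.05 for 5%.
classify :: Double -> Double -> Double -> BetterOrWorse String
classify tolerance baseline t
  | t > baseline * (1 + tolerance) = Worse label
  | t < baseline * (1 - tolerance) = Better label
  | otherwise                      = Same label
  where
    label = show t ++ "s"

-- | Render one cell; regressions are marked red.
-- No HTML escaping is done, which is only safe for numeric labels.
renderCell :: BetterOrWorse String -> String
renderCell (Worse s)  = "<td style=\"color: red\">" ++ s ++ "</td>"
renderCell (Better s) = "<td style=\"color: green\">" ++ s ++ "</td>"
renderCell (Same s)   = "<td>" ++ s ++ "</td>"

-- | Render the whole table, one <tr> per row.
renderTable :: Table (BetterOrWorse String) -> String
renderTable (Table rows) =
  "<table>\n" ++ concatMap renderRow rows ++ "</table>\n"
  where
    renderRow cells = "<tr>" ++ concatMap renderCell cells ++ "</tr>\n"

main :: IO ()
main =
  -- Baseline of 2.0s: 2.6s is a regression, 1.4s an improvement,
  -- and 2.05s falls within the 5% tolerance.
  putStr . renderTable . Table $
    [ map (classify 0.05 2.0) [2.6, 1.4, 2.05] ]
```

Keeping the comparison in the cell type, rather than in the rendered string, means a plain-text renderer can ignore the tag while the HTML renderer colours on it.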
General tasks
- RND: what are blktrace and seekwatcher, and how can they help us?
- RND: http://bugs.darcs.net/issue1631 could criterion be useful?
Test repositories
We are looking for repositories that have particularly interesting characteristics or behaviours.
- Petr's pathological case for http://bugs.darcs.net/issue973
- much slower on darcs 2
- http://ftp.frugalware.org/pub/archive/other/darcs/frugalware-current/
- http://ftp.frugalware.org/pub/archive/other/darcs/frugalware-current2.tar
Problematic repositories
- http://code.haskell.org/eclipsefp/cohatoe has a number of huge binary patches
- GHC
- One of the side effects of issue1022 was the discovery that it is impossible to pull from a local copy of GHC HEAD into http://darcs.haskell.org/ghc-STABLE-2007-09-11-ghc-corelibs-testsuite.tar.bz2
- darcs add -r on the Linux kernel sources
- darcs annotate in GHC
Characteristics
- many files in one patch
- many files
- huge files
- many contributors
- many patches
- ...on a branch (conflicts?)
In the future
- benchmarking against other revision control systems (issue1538)
- use of http://code.google.com/p/maybench to get darcs scalability benchmarks (showing how darcs performance degrades as N increases)
Ideas for tracking usage
- Use a memory test suite to get mmap statistics. We can bake this into darcs and use a reporting technique like the one GHC uses: print "<<mmap: bytes allocated :mmap>>" to stderr when --track-mmap is passed to darcs.
- Working with the special-case repositories above will give us a faster turnaround on having a useful test suite. Measuring scalability is less important than measuring regressions on known hard cases.
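The mmap reporting idea above could look something like the following sketch. It assumes that "bytes allocated" in the marker stands for a single decimal byte count. The --track-mmap flag does not exist in darcs yet, and the names formatMarker, reportMmap, parseMarker and mmapCounts are hypothetical helpers invented for the example.

```haskell
module Main where

import Data.Char (isDigit)
import Data.List (stripPrefix)
import Data.Maybe (mapMaybe)
import System.IO (hPutStrLn, stderr)

-- Delimiters of the proposed marker line (assumed exact format).
markerOpen, markerClose :: String
markerOpen  = "<<mmap: "
markerClose = " :mmap>>"

-- | Format the marker line for a given (non-negative) byte count.
formatMarker :: Integer -> String
formatMarker bytes = markerOpen ++ show bytes ++ markerClose

-- | What darcs might do when tracking is enabled: report on stderr,
-- so that ordinary output on stdout is left untouched.
reportMmap :: Integer -> IO ()
reportMmap = hPutStrLn stderr . formatMarker

-- | Recover the byte count from one line of captured stderr.
-- Returns Nothing for any line that is not a well-formed marker.
parseMarker :: String -> Maybe Integer
parseMarker line = do
  rest <- stripPrefix markerOpen line
  let (digits, close) = span isDigit rest
  if not (null digits) && close == markerClose
    then Just (read digits)
    else Nothing

-- | Pull every marker value out of a captured stderr log.
mmapCounts :: String -> [Integer]
mmapCounts = mapMaybe parseMarker . lines

main :: IO ()
main = do
  reportMmap 4096
  -- Non-marker lines are skipped; this prints [4096,123].
  print (mmapCounts "some warning\n<<mmap: 4096 :mmap>>\n<<mmap: 123 :mmap>>\n")
```

A benchmark harness would capture darcs's stderr and feed it to something like mmapCounts, so the numbers can be stored alongside the timings.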
