Ideas/ShortSecureId

Context

Scenarios and use cases

Fetching a repository by identifier (Zooko)

Alice has a repository. She generates an identifier that is less than 40 characters long and sends it to Bob. Bob has read access to a different repo, owned by Charles. He then tells darcs: “Using Charles’s repo, get the version identified by this identifier.” Darcs then (very, very quickly) does one of two things: it tells him it can’t, because Charles’s repo doesn’t have all the patches it would need, or it does so, resulting in a local repo (on Bob’s computer) that has exactly the set of patches Alice intended when she generated the identifier.

The “secure” part simply means that if darcs takes the second branch (fetches the patches) rather than the first (says that it can’t), then Bob doesn’t have to consider Charles when reasoning about which set of patches he got. Bob knows that he got the set of patches Alice intended, regardless of Charles’s intention. (Charles could of course force darcs into the “I can’t do that” branch, for example by deleting his repository or turning off his network connection.)
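The two branches described above can be sketched as a check on Bob's side: recompute the identifier from whatever Charles's repo offers, and refuse if it does not match. This is only an illustration under stated assumptions. `toy_identifier` is a hypothetical placeholder scheme (a hash over the sorted hashes of full patch texts), not one of the proposals on this page, and `fetch_checked` and `CannotFetch` are made-up names rather than darcs functions. How darcs would select the right patches from a larger repository is not modelled.

```python
import hashlib


class CannotFetch(Exception):
    """Raised when the remote repo cannot supply the identified patch set."""


def toy_identifier(patches):
    """Placeholder identifier (assumption): hash over the sorted hashes of full patch texts."""
    digests = sorted(hashlib.sha256(p.encode("utf-8")).hexdigest() for p in patches)
    return hashlib.sha256("\n".join(digests).encode("utf-8")).hexdigest()


def fetch_checked(identifier, offered_patches, identify=toy_identifier):
    """Accept the patches offered by the remote only if they match the identifier."""
    patches = list(offered_patches)
    if identify(patches) != identifier:
        # The "I can't do that" branch: Bob never ends up with a wrong patch set.
        raise CannotFetch("remote repo does not provide the identified patch set")
    return patches


# Alice computes the identifier; Bob checks what Charles's repo offers against it.
alice_patches = ["patch A", "patch B"]
alice_id = toy_identifier(alice_patches)
bob_patches = fetch_checked(alice_id, ["patch B", "patch A"])
```

Because the check is done locally against Alice's identifier, Charles can only cause a refusal, never a silent substitution.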

Desiderata

  • short, secure, fast
  • identify pristine state (i.e. guarantee we have the same pristine)
  • identify patch set (guarantee we have the same patch set)
  • accepts patch reordering (since for darcs it does not matter)
  • rejects false patch contents (patch info for A but something else in contents)
  • rejects false ordering even of true patch contents (clever and malicious)

Proposals

proposal              | id pristine | id patchset | commute-friendly | spots fake patches | spots fake ordering | fast to compute
pristine hash         | yes         | no          | yes              | no                 | no                  | yes (available)
weak hash             | yes         | yes         | yes              | no                 | no                  | yes (O(patches))
context hash          | yes         | yes         | no               | no                 | no                  | yes (hash darcs log --context)
naive inventory hash  | yes         | yes         | no               | yes                | yes                 | yes (hash hashed_inventory)
minimal context hash  | yes         | yes         | yes              | yes                | yes                 | no

Notes:

  • if an identifier can identify a patchset, then it can identify a pristine state since a patchset implies a pristine state (independently of order)
  • pristine hash is on the first line of _darcs/hashed_inventory
  • weak hash is shown in darcs show repo since Darcs 2.10.3; it is the XOR of the metadata hashes of all patches in the repository
  • the output of darcs log --context differs from that of darcs log in its format; also, it shows only the patches since the last tag
  • setprefs are not taken into account
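
The weak hash note can be illustrated with a small sketch of why XOR-ing per-patch metadata hashes is commute-friendly yet cannot spot fake patch contents. The assumptions are loud here: SHA-1 over a made-up metadata string stands in for whatever serialization and hash function darcs actually uses, and `metadata_hash`, `weak_hash` and the sample strings are hypothetical.

```python
import hashlib
from functools import reduce


def metadata_hash(patch_info: str) -> int:
    """Hash one patch's metadata (not its contents) to an integer (SHA-1 is an assumption)."""
    return int.from_bytes(hashlib.sha1(patch_info.encode("utf-8")).digest(), "big")


def weak_hash(patch_infos) -> str:
    """XOR the metadata hashes of all patches; XOR is commutative, so order is irrelevant."""
    combined = reduce(lambda acc, h: acc ^ h, map(metadata_hash, patch_infos), 0)
    return format(combined, "040x")


# Hypothetical metadata strings; real patch info has a different format.
infos = [
    "alice 2016-01-01 add README",
    "bob 2016-01-02 fix typo",
    "alice 2016-01-03 add tests",
]
same_set_reordered = weak_hash(reversed(infos)) == weak_hash(infos)
```

Only the metadata feeds the hash, so a patch carrying the right patch info but different contents yields the same value. That matches the "no" entries for spotting fake patches and fake ordering.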

Notes and questions

  • Do we have a scenario for false ordering with true patch contents?
  • I (bf) think the most useful identifier for the stated purpose is the naive inventory hash; or, for efficiency, a hash of that and the pristine hash.
  • This will come most naturally as a result of adding branches, since a branch is a name that we associate with such a repository state.
  • The fact that these identifiers are not commutation-invariant can be regarded as a feature, as long as we have to deal with buggy patch semantics (darcs-1 and darcs-2), for which attempting certain valid commutations can fail and even crash darcs.
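
A minimal sketch of the identifier suggested above (a hash of the naive inventory hash and the pristine hash), and of why it is not commutation-invariant. The inventory is modelled as plain text with one hypothetical patch hash per line, and SHA-256 is an arbitrary choice. Neither is claimed to match the real hashed_inventory format, and `naive_inventory_hash` and `combined_id` are illustrative names.

```python
import hashlib


def naive_inventory_hash(inventory_text: str) -> str:
    """Hash the inventory text as-is, so the order of entries matters."""
    return hashlib.sha256(inventory_text.encode("utf-8")).hexdigest()


def combined_id(inventory_text: str, pristine_hash: str) -> str:
    """Hash the inventory hash together with the pristine hash."""
    inner = naive_inventory_hash(inventory_text)
    return hashlib.sha256((inner + "\n" + pristine_hash).encode("utf-8")).hexdigest()


# Hypothetical inventories: the same two patches in two different orders.
inventory = "hash-of-patch-A\nhash-of-patch-B\n"
reordered = "hash-of-patch-B\nhash-of-patch-A\n"
pristine = "hypothetical-pristine-hash"

order_sensitive = combined_id(inventory, pristine) != combined_id(reordered, pristine)
```

Two repositories holding the same patches in a different order get different identifiers under this scheme, which is the property the last note describes as a possible feature.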