Revision 189, 8.9 KB, checked in by lele@…, 3 years ago
Darcs backend for Trac
This package implements a darcs backend for Trac that supports the multirepository feature merged into Trac 0.12.
It used to work with Trac 0.11, and even with 0.10, at least up to version 0.8 of the backend, but this has not been retested, so there is no guarantee, sorry!
To use the module you can either install it or make an egg and copy it into the right place.
Installation
You can install the module the usual way:
$ python setup.py install [--prefix /usr/local]
Otherwise you can make an egg:
$ python setup.py bdist_egg
and either install it globally with:
$ easy_install dist/TracDarcs-someversion.egg
or manually copy the egg from the "dist" subdirectory into the environment's "plugins" subdirectory.
In general, follow the directions in TracPlugins.
Specific configuration options
Some features can be altered by using the following trac+darcs specific options in the [darcs] section of the configuration:
- command : string
- This is the darcs executable that will actually be used. By default it's darcs, but you could set it to /usr/local/bin/darcs to use a newer version.
- dont_escape_8bit : boolean
- False by default, maps to the darcs $DARCS_DONT_ESCAPE_8BIT behaviour.
- possible_encodings : string
- By default 'utf-8,iso8859-1'. It's a comma-separated list of string encodings to try one after the other, should a decode error occur while parsing darcs changesets.
- max_concurrent_darcses : integer
- By default 0, meaning no limit; otherwise it is the maximum number of darcs processes allowed to run concurrently.
- eager_annotations : boolean
- False by default. When true, the content and the annotation cache of each modified file are computed immediately after a changeset is added. This moves the heavy computation to pull time, rather than first visit time. Of course, it also enlarges the Trac database.
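As a rough illustration of how these options fit together, the following sketch parses a sample [darcs] section with Python's standard configparser and applies the defaults listed above. The sample values and the read_darcs_options and decode_with_fallback helpers are hypothetical: the backend reads its options through Trac's own configuration machinery, and its actual decoding logic may differ.

```python
import configparser

# Hypothetical sample of the [darcs] section of a trac.ini file.
SAMPLE_INI = """
[darcs]
command = /usr/local/bin/darcs
possible_encodings = utf-8,iso8859-1
max_concurrent_darcses = 2
"""


def read_darcs_options(text):
    """Return the [darcs] options as a dict, using the documented defaults."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section("darcs"):
        parser.add_section("darcs")
    section = parser["darcs"]
    encodings = section.get("possible_encodings", fallback="utf-8,iso8859-1")
    return {
        "command": section.get("command", fallback="darcs"),
        "dont_escape_8bit": section.getboolean("dont_escape_8bit", fallback=False),
        "possible_encodings": [e.strip() for e in encodings.split(",") if e.strip()],
        "max_concurrent_darcses": section.getint("max_concurrent_darcses", fallback=0),
        "eager_annotations": section.getboolean("eager_annotations", fallback=False),
    }


def decode_with_fallback(data, encodings):
    """Try each encoding in turn and return the first successful decode."""
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Assumption: as a last resort, replace undecodable bytes instead of failing.
    return data.decode(encodings[0], errors="replace")


options = read_darcs_options(SAMPLE_INI)
```

With these sample values, a byte string that is not valid UTF-8 would be retried as iso8859-1, which mirrors the "try one after the other" behaviour described for possible_encodings.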
Using a postcommit hook
The recommended way to trigger the sync between the repository and the Trac instance is to use a darcs posthook on apply: this way the database is updated as soon as darcs finishes applying any new changeset.
This can be accomplished by putting something like the following settings into the repository's _darcs/prefs/defaults file:
apply posthook trac-admin TRAC_ENV changeset added $(pwd) $(python -m tracdarcs.changesparser)
apply run-posthook
where of course you should replace TRAC_ENV with the full path of the related trac instance.
Note
python -m tracdarcs.changesparser is just a quick way of extracting the list of changeset hashes from the darcs changes --xml format: it accepts the input either from the $DARCS_PATCHES_XML environment variable or from standard input:
$ darcs changes --xml | python -m tracdarcs.changesparser | head -3
20100611081300-97f81-bc5c1f7acf0c168bbfa9fb911e3cc2a4e71d5eef
20100610150339-97f81-cd1b73f2ba1b1d98c28542ecbd1d5e2bd9052056
20100512164420-97f81-de3fbc73d7c401fb92503ef1b25e19e0f48d2ad1
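For readers curious about what such an extraction involves, here is a minimal sketch, not the actual tracdarcs.changesparser code. It assumes the XML consists of a changelog root holding patch elements that carry a hash attribute; the sample author, dates and patch names are made up, and extract_hashes and read_changes_xml are hypothetical helper names.

```python
import os
import sys
import xml.etree.ElementTree as ET

# Made-up sample shaped like the output of "darcs changes --xml".
SAMPLE_XML = """<changelog>
<patch author='someone@example.com' date='20100611081300'
       hash='20100611081300-97f81-bc5c1f7acf0c168bbfa9fb911e3cc2a4e71d5eef'>
  <name>Second sample patch</name>
</patch>
<patch author='someone@example.com' date='20100610150339'
       hash='20100610150339-97f81-cd1b73f2ba1b1d98c28542ecbd1d5e2bd9052056'>
  <name>First sample patch</name>
</patch>
</changelog>"""


def read_changes_xml(environ=os.environ, stdin=sys.stdin):
    """Take the XML from $DARCS_PATCHES_XML if set, else from standard input."""
    return environ.get("DARCS_PATCHES_XML") or stdin.read()


def extract_hashes(xml_text):
    """Return the hash attribute of every patch element, in document order."""
    root = ET.fromstring(xml_text)
    return [p.get("hash") for p in root.iter("patch") if p.get("hash")]


hashes = extract_hashes(SAMPLE_XML)
```

Printing the resulting hashes separated by whitespace gives a list suitable for the trailing arguments of trac-admin TRAC_ENV changeset added.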
At that point, you can deactivate the per-request sync that Trac still performs by default, by setting repository_sync_per_request to an empty value in the [trac] section of the configuration.
Internals
The entire darcs change history is imported into the database, using the output of darcs changes --xml-output --summary --reverse.
A check for newer patches is performed every time the DarcsRepository object is created, and any new patches are immediately imported into the database.
After that the darcs repository is used only for fetching the contents of a file: with darcs 2.x we use darcs query contents, while with darcs 1.x we have to do ugly tricks; at the extreme, darcs annotate output is massaged by ann2ascii.py to fetch the contents of a file at any given point in time.
Each changeset is assigned a revision number according to its position in the output of darcs changes --xml-output --summary --reverse. The first patch gets revision number 1, the second gets revision number 2, and so on.
This assumes that the patches in a darcs repository NEVER get reordered or deleted. This condition is satisfied as long as commands such as darcs unpull or darcs optimize are not performed.
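The numbering scheme and the assumption behind it can be sketched as follows. This is only an illustration: the function names are hypothetical, and the prefix check is one possible way of noticing that history was rewritten, not something the backend is documented to do.

```python
def assign_revisions(hashes_oldest_first):
    """Number patches 1, 2, 3, ... in the order darcs lists them with --reverse."""
    return {h: rev for rev, h in enumerate(hashes_oldest_first, start=1)}


def is_append_only(stored_hashes, current_hashes):
    """True if the stored history is still an unchanged prefix of the current one."""
    return current_hashes[:len(stored_hashes)] == stored_hashes


def new_patches(stored_hashes, current_hashes):
    """Return the patches not yet imported, refusing to continue on rewritten history."""
    if not is_append_only(stored_hashes, current_hashes):
        raise ValueError("patches were reordered or removed; "
                         "stored revision numbers are no longer valid")
    return current_hashes[len(stored_hashes):]


revisions = assign_revisions(["hash-a", "hash-b", "hash-c"])
```

If a patch is removed or reordered, every later patch would shift to a different revision number, which is why the stored numbers can no longer be trusted in that case.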
Cache
For performance reasons, the backend creates and maintains a few other tables, where it keeps darcs specific information. The following tables are automatically created at upgrade time and populated by sync (see components.py).
darcs_changesets
Each row represents a darcs changeset:
create table darcs_changesets (
repo_id text,
rev integer,
hash text,
name text,
primary key (repo_id, rev));
- repo_id
- repository containing this changeset
- rev
- the revision number assigned
- hash
- the unique patch identifier assigned by darcs
- name
- the name of the darcs patch
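To make the role of this table concrete, the sketch below creates it in an in-memory SQLite database and translates a darcs patch hash into the revision number assigned to it. SQLite, the sample rows and the rev_for_hash helper are assumptions made for illustration only; the backend's own lookup code may look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table darcs_changesets (
        repo_id text,
        rev integer,
        hash text,
        name text,
        primary key (repo_id, rev))
""")

# Made-up sample rows: two patches in a repository called 'somerepo'.
conn.executemany(
    "insert into darcs_changesets (repo_id, rev, hash, name) values (?, ?, ?, ?)",
    [
        ("somerepo", 1, "hash-a", "Initial import"),
        ("somerepo", 2, "hash-b", "Fix typo"),
    ],
)


def rev_for_hash(conn, repo_id, patch_hash):
    """Return the revision number of a patch, or None if it is not imported yet."""
    row = conn.execute(
        "select rev from darcs_changesets where repo_id = ? and hash = ?",
        (repo_id, patch_hash),
    ).fetchone()
    return row[0] if row else None
```

A lookup like this is what allows a hash taken from the darcs changes --xml output to be mapped to a revision number.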
darcs_nodes
Each row represents a single node: a node is either a file or a directory which has its history stored in the repository.
Note
a node doesn't have a particular name or content but, for a given revision, its name and content will be well defined.
create table darcs_nodes (
repo_id text,
node_id integer,
node_type text,
add_rev integer,
remove_rev integer,
primary key (repo_id, node_id) );
- node_type
- is one of (dbutil.NODE_FILE_TYPE, dbutil.NODE_DIR_TYPE)
- add_rev
- is the revision that added this node
- remove_rev
- is the revision that removed this node (possibly NULL)
darcs_node_changes
Each row represents a node change for a particular revision. Only one entry can exist for a node in each revision. Of course, if there are no changes to the node then no entries will be present! :)
create table darcs_node_changes (
repo_id text,
node_id integer,
rev integer,
path text,
parent_id integer,
the_change text,
primary key (repo_id, node_id,rev) );
- the_change
- one of following (defined in dbutil.py): CHANGE_ADDED, CHANGE_REMOVED, CHANGE_MOVED, CHANGE_EDITED, CHANGE_MOVED_EDITED
- parent_id
- the node id for the node's parent directory
- path
- the path of the node at the end of revision 'rev'; when the change is CHANGE_REMOVED, 'path' is the path the node had before its removal.
darcs_cache
A cache of file contents: the first time the content of a file at a particular revision is requested, it is computed and stored here, so subsequent requests do not need to execute darcs at all.
Warning
this may quickly grow in size! On the other hand, you can delete all the rows at any time: the content will be recomputed the next time it is requested.
create table darcs_cache (
repo_id text,
node_id integer,
rev integer,
content blob,
size integer,
primary key (repo_id, node_id,rev) );
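The compute-on-first-request behaviour can be sketched like this. The example uses an in-memory SQLite database, and the cached_content helper and its compute callback are hypothetical stand-ins for whatever the backend does to obtain the content from darcs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table darcs_cache (
        repo_id text,
        node_id integer,
        rev integer,
        content blob,
        size integer,
        primary key (repo_id, node_id, rev))
""")


def cached_content(conn, repo_id, node_id, rev, compute):
    """Return the cached content, calling compute() only on a cache miss."""
    row = conn.execute(
        "select content from darcs_cache"
        " where repo_id = ? and node_id = ? and rev = ?",
        (repo_id, node_id, rev),
    ).fetchone()
    if row is not None:
        return bytes(row[0])
    content = compute()  # assumed to run darcs and return the file as bytes
    conn.execute(
        "insert into darcs_cache (repo_id, node_id, rev, content, size)"
        " values (?, ?, ?, ?, ?)",
        (repo_id, node_id, rev, content, len(content)),
    )
    return content


calls = []


def fake_darcs():
    """Stand-in for the expensive darcs invocation; records each call."""
    calls.append(1)
    return b"hello\n"


first = cached_content(conn, "somerepo", 7, 3, fake_darcs)
second = cached_content(conn, "somerepo", 7, 3, fake_darcs)
```

Deleting every row from darcs_cache simply turns the next request for each file back into a cache miss.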
Some sample queries
Get all existing nodes as of revision r
select dnc.node_id as node_id, max(dnc.rev) as rev
from darcs_node_changes as dnc, darcs_nodes as dn
where dnc.node_id = dn.node_id
  and dnc.rev <= r
  and dnc.repo_id = dn.repo_id
  and dnc.repo_id = 'somerepo'
  and (dn.remove_rev is null or dn.remove_rev > r)
group by dnc.node_id
Get all latest nodes
select dnc.node_id as node_id, max(dnc.rev) as rev
from darcs_node_changes as dnc, darcs_nodes as dn
where dnc.node_id = dn.node_id
  and dn.remove_rev is null
  and dnc.repo_id = dn.repo_id
  and dnc.repo_id = 'somerepo'
group by dnc.node_id
Get node_id of /some/path p, as of revision r
select dnc.node_id as node_id
from darcs_node_changes as dnc, (node_rev(r)) as nr
where dnc.node_id = nr.node_id
  and dnc.rev = nr.rev
  and dnc.repo_id = nr.repo_id
  and dnc.repo_id = 'somerepo'
  and dnc.path = p
Get history of node_id nid, till revision r
select *
from darcs_node_changes as dnc
where dnc.node_id = nid
  and dnc.rev <= r
  and dnc.repo_id = 'somerepo'
Get children of node_id nid, as of revision r
select dnc.node_id as node_id
from darcs_node_changes as dnc, (node_rev(r)) as nr
where dnc.node_id = nr.node_id
  and dnc.rev = nr.rev
  and dnc.parent_id = nid
  and dnc.repo_id = nr.repo_id
  and dnc.repo_id = 'somerepo'
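The queries above can be tried against a small in-memory SQLite database, as in the following sketch. It rests on several assumptions: node_rev(r) is read as a placeholder for the "existing nodes as of revision r" query, that subquery is extended to also select repo_id because the outer queries join on nr.repo_id, and the node type and change strings are invented stand-ins for the constants defined in dbutil.py.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table darcs_nodes (
    repo_id text, node_id integer, node_type text,
    add_rev integer, remove_rev integer,
    primary key (repo_id, node_id));
create table darcs_node_changes (
    repo_id text, node_id integer, rev integer, path text,
    parent_id integer, the_change text,
    primary key (repo_id, node_id, rev));
""")

# Made-up history: a directory, a file renamed in rev 2, a file removed in rev 3.
conn.executemany("insert into darcs_nodes values (?, ?, ?, ?, ?)", [
    ("somerepo", 1, "dir", 1, None),
    ("somerepo", 2, "file", 1, None),
    ("somerepo", 3, "file", 1, 3),
])
conn.executemany("insert into darcs_node_changes values (?, ?, ?, ?, ?, ?)", [
    ("somerepo", 1, 1, "src", None, "added"),
    ("somerepo", 2, 1, "src/a.txt", 1, "added"),
    ("somerepo", 3, 1, "src/old.txt", 1, "added"),
    ("somerepo", 2, 2, "src/b.txt", 1, "moved"),
    ("somerepo", 3, 3, "src/old.txt", 1, "removed"),
])

# Assumed expansion of node_rev(r): latest change of each node still alive at r.
NODE_REV = """
    select dnc.repo_id as repo_id, dnc.node_id as node_id, max(dnc.rev) as rev
    from darcs_node_changes as dnc, darcs_nodes as dn
    where dnc.node_id = dn.node_id and dnc.repo_id = dn.repo_id
      and dnc.rev <= :r and dnc.repo_id = :repo
      and (dn.remove_rev is null or dn.remove_rev > :r)
    group by dnc.repo_id, dnc.node_id
"""


def node_id_at(conn, repo, path, r):
    """Return the node_id found at the given path as of revision r, or None."""
    row = conn.execute(f"""
        select dnc.node_id from darcs_node_changes as dnc, ({NODE_REV}) as nr
        where dnc.node_id = nr.node_id and dnc.rev = nr.rev
          and dnc.repo_id = nr.repo_id and dnc.repo_id = :repo and dnc.path = :p
    """, {"repo": repo, "r": r, "p": path}).fetchone()
    return row[0] if row else None


def children_of(conn, repo, nid, r):
    """Return the node_ids whose parent directory is nid as of revision r."""
    rows = conn.execute(f"""
        select dnc.node_id from darcs_node_changes as dnc, ({NODE_REV}) as nr
        where dnc.node_id = nr.node_id and dnc.rev = nr.rev
          and dnc.parent_id = :nid
          and dnc.repo_id = nr.repo_id and dnc.repo_id = :repo
        order by dnc.node_id
    """, {"repo": repo, "r": r, "nid": nid}).fetchall()
    return [row[0] for row in rows]
```

In this sample data the renamed file keeps node_id 2 under both paths, which illustrates the earlier note that a node has no fixed name of its own, only a well-defined name at each revision.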