Changeset 190 in tracdarcs for README.rst


Timestamp:
06/11/10 17:35:44
Author:
lele@…
Hash name:
20100611153544-97f81-35054ad47fecd187166b44abce2f3a0dd889d9c3
Message:

Split out the internals note from the README.rst, now used as the long_description of the package

File:
1 edited

  • README.rst

    r189 r190  
    100100__ http://trac.edgewall.org/wiki/0.12/TracRepositoryAdmin#ExplicitSync 
    101101__ http://darcs.net/manual/node7.html#SECTION00712000000000000000 
    102  
    103 Internals 
    104 ========= 
    105  
    106 The entire darcs change history is imported into the database, using 
    107 the output of ``darcs changes --xml-output --summary --reverse``. 
    108  
    109 A check for newer patches is performed every time the DarcsRepository 
    110 object is created, and any new patches are immediately imported into 
    111 the database. 
    112  
    113 After that the darcs repository is used only for fetching the contents 
    114 of a file: with darcs 2.x we use ``darcs query contents``, while with 
    115 darcs 1.x we have to do ugly tricks; at the extreme, ``darcs 
    116 annotate`` output is massaged by ann2ascii.py to fetch the contents of 
    117 a file at any given point in time. 
    118  
    119 Each changeset is assigned a revision number according to its order 
    120 in the output of ``darcs changes --xml-output --summary --reverse``. 
    121 The first patch gets revision number 1, the second gets revision 
    122 number 2, and so on. 
    123  
    124 This assumes that the patches in a darcs repository **NEVER** get 
    125 reordered or deleted. This condition is satisfied as long as commands 
    126 such as ``darcs unpull`` or ``darcs optimize`` are not performed. 
    127  
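The numbering rule above can be sketched as follows. This is only an illustration: the sample XML is hand-written and simplified (the hashes, authors and names are made up), and ``assign_revisions`` is a hypothetical helper, not a function from this package.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified sample shaped like the output of
# `darcs changes --xml-output --reverse`; all values are made up.
SAMPLE_XML = """\
<changelog>
  <patch author="a@example.org" date="20100101000000" hash="hash-aaa">
    <name>Initial import</name>
  </patch>
  <patch author="b@example.org" date="20100102000000" hash="hash-bbb">
    <name>Add README</name>
  </patch>
</changelog>
"""


def assign_revisions(xml_text):
    """Return (rev, hash, name) tuples, numbering patches from 1 in document order."""
    root = ET.fromstring(xml_text)
    rows = []
    for rev, patch in enumerate(root.iter("patch"), start=1):
        name = patch.findtext("name", default="")
        rows.append((rev, patch.get("hash"), name))
    return rows


revisions = assign_revisions(SAMPLE_XML)
```

Because the revision number is just the position in the list, reordering or deleting patches would silently shift every later number, which is why the assumption above matters.
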
    128 Cache 
    129 ----- 
    130  
    131 For performance reasons, the backend creates and maintains a few other 
    132 tables, where it keeps darcs-specific information. The following 
    133 tables are automatically created at `upgrade` time and populated by 
    134 `sync` (see components.py). 
    135  
    136 darcs_changesets 
    137 ~~~~~~~~~~~~~~~~ 
    138  
    139 Each row represents a darcs changeset:: 
    140  
    141     create table darcs_changesets ( 
    142         repo_id text, 
    143         rev integer, 
    144         hash text, 
    145         name text, 
    146         primary key (repo_id, rev)); 
    147  
    148 repo_id 
    149   repository containing this changeset 
    150  
    151 rev 
    152   the revision number assigned 
    153  
    154 hash 
    155   the unique patch identifier assigned by darcs 
    156  
    157 name 
    158   the name of the darcs patch 
    159  
    160 darcs_nodes 
    161 ~~~~~~~~~~~ 
    162  
    163 Each row represents a single node: a node is either a file or a 
    164 directory which has its history stored in the repository. 
    165  
    166 .. note:: a node doesn't have a particular name or content but, for a 
    167           given revision, its name and content will be well defined. 
    168  
    169 :: 
    170  
    171     create table darcs_nodes ( 
    172         repo_id text, 
    173         node_id integer, 
    174         node_type text, 
    175         add_rev integer, 
    176         remove_rev integer, 
    177         primary key (repo_id, node_id) ); 
    178  
    179 node_type 
    180   is one of (dbutil.NODE_FILE_TYPE, dbutil.NODE_DIR_TYPE) 
    181  
    182 add_rev 
    183   is the revision that added this node 
    184  
    185 remove_rev 
    186   is the revision that removed this node (possibly NULL) 
    187  
    188 darcs_node_changes 
    189 ~~~~~~~~~~~~~~~~~~ 
    190  
    191 Each row represents a node change for a particular revision. Only one 
    192 entry can exist for a node in each revision. Of course, if there are 
    193 no changes to the node then no entries will be present! :) 
    194  
    195 :: 
    196  
    197     create table darcs_node_changes ( 
    198         repo_id text, 
    199         node_id integer, 
    200         rev integer, 
    201         path text, 
    202         parent_id integer, 
    203         the_change text, 
    204         primary key (repo_id, node_id,rev) ); 
    205  
    206  
    207 the_change 
    208   one of the following (defined in dbutil.py): CHANGE_ADDED, 
    209   CHANGE_REMOVED, CHANGE_MOVED, CHANGE_EDITED, CHANGE_MOVED_EDITED 
    210  
    211 parent_id 
    212   the node id for the node's parent directory 
    213  
    214 path 
    215   the path of the node at the end of revision 'rev': when the change 
    216   is CHANGE_REMOVED, 'path' is the previous path. 
    217  
    218 darcs_cache 
    219 ~~~~~~~~~~~ 
    220  
    221 A cache of file contents: as soon as the content of any file at any 
    222 particular revision is requested for the first time, it's computed and 
    223 stored here, so subsequent requests won't require executing darcs at 
    224 all. 
    225  
    226 .. warning:: this may quickly grow in size! OTOH, you can just delete 
    227              all the rows at any time; the content will be recomputed 
    228              when requested again. 
    229  
    230 :: 
    231  
    232     create table darcs_cache ( 
    233         repo_id text, 
    234         node_id integer, 
    235         rev integer, 
    236         content blob, 
    237         size integer, 
    238         primary key (repo_id, node_id,rev) ); 
    239  
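A minimal sketch of the compute-on-first-request behaviour described above, using an in-memory SQLite database. ``get_content`` and ``fake_compute`` are hypothetical names; in the real backend the computation would run darcs, and the actual code may be organised differently.

```python
import sqlite3


def get_content(conn, repo_id, node_id, rev, compute):
    """Return cached content, computing and storing it on the first request."""
    row = conn.execute(
        "select content from darcs_cache"
        " where repo_id = ? and node_id = ? and rev = ?",
        (repo_id, node_id, rev),
    ).fetchone()
    if row is not None:
        return bytes(row[0])
    # Cache miss: `compute` stands in for whatever fetches the file from darcs.
    content = compute(repo_id, node_id, rev)
    conn.execute(
        "insert into darcs_cache (repo_id, node_id, rev, content, size)"
        " values (?, ?, ?, ?, ?)",
        (repo_id, node_id, rev, content, len(content)),
    )
    return content


conn = sqlite3.connect(":memory:")
conn.execute("""
    create table darcs_cache (
        repo_id text,
        node_id integer,
        rev integer,
        content blob,
        size integer,
        primary key (repo_id, node_id, rev))
""")

calls = []


def fake_compute(repo_id, node_id, rev):
    """Placeholder for the expensive darcs invocation; records each call."""
    calls.append((repo_id, node_id, rev))
    return b"hello\n"


first = get_content(conn, "somerepo", 1, 3, fake_compute)
second = get_content(conn, "somerepo", 1, 3, fake_compute)
```

The second call is served from the table without invoking ``fake_compute`` again. Deleting every row simply makes the next request recompute the content.
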
    240 Some sample queries 
    241 ------------------- 
    242  
    243 Get all existing nodes as of revision r 
    244 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
    245  
    246 :: 
    247  
    248     select dnc.node_id as node_id, max(dnc.rev) as rev 
    249     from darcs_node_changes as dnc, darcs_nodes as dn 
    250     where dnc.node_id = dn.node_id 
    251       and dnc.rev <= r 
    252       and dnc.repo_id = dn.repo_id and dnc.repo_id = 'somerepo' 
    253       and (dn.remove_rev is null or dn.remove_rev > r) 
    254     group by dnc.node_id 
    255  
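To try the query above without a real repository, the following sketch builds the two tables in an in-memory SQLite database and fills them with a made-up history. The ``node_type`` and ``the_change`` strings are placeholders (the real values are the constants in dbutil.py), ``nodes_as_of`` is a hypothetical helper, and an ``order by`` is added only to make the result order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table darcs_nodes (
    repo_id text, node_id integer, node_type text,
    add_rev integer, remove_rev integer,
    primary key (repo_id, node_id));
create table darcs_node_changes (
    repo_id text, node_id integer, rev integer,
    path text, parent_id integer, the_change text,
    primary key (repo_id, node_id, rev));
""")

# Made-up history: node 2 is removed in revision 3, node 3 is moved in revision 4.
conn.executemany("insert into darcs_nodes values (?, ?, ?, ?, ?)", [
    ("somerepo", 1, "dir", 1, None),
    ("somerepo", 2, "file", 1, 3),
    ("somerepo", 3, "file", 2, None),
])
conn.executemany("insert into darcs_node_changes values (?, ?, ?, ?, ?, ?)", [
    ("somerepo", 1, 1, "src", None, "added"),
    ("somerepo", 2, 1, "src/a.txt", 1, "added"),
    ("somerepo", 2, 2, "src/a.txt", 1, "edited"),
    ("somerepo", 3, 2, "src/b.txt", 1, "added"),
    ("somerepo", 2, 3, "src/a.txt", 1, "removed"),
    ("somerepo", 3, 4, "src/c.txt", 1, "moved"),
])

# Same query as above, with r and the repository name as named parameters.
NODES_AS_OF = """
select dnc.node_id as node_id, max(dnc.rev) as rev
from darcs_node_changes as dnc, darcs_nodes as dn
where dnc.node_id = dn.node_id
  and dnc.rev <= :r
  and dnc.repo_id = dn.repo_id and dnc.repo_id = :repo
  and (dn.remove_rev is null or dn.remove_rev > :r)
group by dnc.node_id
order by dnc.node_id
"""


def nodes_as_of(conn, repo, r):
    """Return (node_id, latest change rev <= r) for nodes alive at revision r."""
    return conn.execute(NODES_AS_OF, {"repo": repo, "r": r}).fetchall()
```

With this data, node 2 is still listed at revision 2 but no longer at revision 3, because its ``remove_rev`` is 3.
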
    256 Get all latest nodes 
    257 ~~~~~~~~~~~~~~~~~~~~ 
    258  
    259 :: 
    260  
    261     select dnc.node_id as node_id, max(dnc.rev) as rev 
    262     from darcs_node_changes as dnc, darcs_nodes as dn 
    263     where dnc.node_id = dn.node_id 
    264       and dn.remove_rev is null 
    265       and dnc.repo_id = dn.repo_id and dnc.repo_id = 'somerepo' 
    266     group by dnc.node_id 
    267  
    268 Get node_id of /some/path p, as of revision r 
    269 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
    270  
    271 .. XXX: here "node_rev(r)" means a subquery, see 
    272 ..      ``_nodeid_rev_for_revision()`` in dbutil.py 
    273  
    274 :: 
    275  
    276     select dnc.node_id as node_id 
    277     from darcs_node_changes as dnc, (node_rev(r)) as nr 
    278     where dnc.node_id = nr.node_id 
    279       and dnc.rev = nr.rev 
    280       and dnc.repo_id = nr.repo_id and dnc.repo_id = 'somerepo' 
    281       and dnc.path = p 
    282  
    283 Get history of node_id nid, till revision r 
    284 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
    285  
    286 :: 
    287  
    288     select * from darcs_node_changes as dnc 
    289     where dnc.node_id = nid and dnc.rev <= r 
    290       and dnc.repo_id = 'somerepo' 
    291  
    292 Get children of node_id nid, as of revision r 
    293 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
    294  
    295 :: 
    296  
    297     select dnc.node_id as node_id 
    298     from darcs_node_changes as dnc, (node_rev(r)) as nr 
    299     where dnc.node_id = nr.node_id 
    300       and dnc.rev = nr.rev 
    301       and dnc.parent_id = nid 
    302       and dnc.repo_id = nr.repo_id and dnc.repo_id = 'somerepo' 
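
The two queries that use ``node_rev(r)`` can be exercised with the sketch below. It assumes that ``node_rev(r)`` is the "existing nodes as of revision r" query extended with a ``repo_id`` column; the actual subquery is built by ``_nodeid_rev_for_revision()`` in dbutil.py and may differ. The data, the ``the_change`` strings and the helper names are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table darcs_nodes (
    repo_id text, node_id integer, node_type text,
    add_rev integer, remove_rev integer,
    primary key (repo_id, node_id));
create table darcs_node_changes (
    repo_id text, node_id integer, rev integer,
    path text, parent_id integer, the_change text,
    primary key (repo_id, node_id, rev));
""")
conn.executemany("insert into darcs_nodes values (?, ?, ?, ?, ?)", [
    ("somerepo", 1, "dir", 1, None),
    ("somerepo", 2, "file", 1, 3),
    ("somerepo", 3, "file", 2, None),
])
conn.executemany("insert into darcs_node_changes values (?, ?, ?, ?, ?, ?)", [
    ("somerepo", 1, 1, "src", None, "added"),
    ("somerepo", 2, 1, "src/a.txt", 1, "added"),
    ("somerepo", 2, 2, "src/a.txt", 1, "edited"),
    ("somerepo", 3, 2, "src/b.txt", 1, "added"),
    ("somerepo", 2, 3, "src/a.txt", 1, "removed"),
    ("somerepo", 3, 4, "src/c.txt", 1, "moved"),
])

# Assumed shape of node_rev(r): latest change per live node, plus repo_id.
NODE_REV = """
select dnc.repo_id as repo_id, dnc.node_id as node_id, max(dnc.rev) as rev
from darcs_node_changes as dnc, darcs_nodes as dn
where dnc.node_id = dn.node_id
  and dnc.rev <= :r
  and dnc.repo_id = dn.repo_id and dnc.repo_id = :repo
  and (dn.remove_rev is null or dn.remove_rev > :r)
group by dnc.repo_id, dnc.node_id
"""

CHILDREN = f"""
select dnc.node_id as node_id
from darcs_node_changes as dnc, ({NODE_REV}) as nr
where dnc.node_id = nr.node_id
  and dnc.rev = nr.rev
  and dnc.parent_id = :nid
  and dnc.repo_id = nr.repo_id and dnc.repo_id = :repo
order by dnc.node_id
"""

NODE_FOR_PATH = f"""
select dnc.node_id as node_id
from darcs_node_changes as dnc, ({NODE_REV}) as nr
where dnc.node_id = nr.node_id
  and dnc.rev = nr.rev
  and dnc.repo_id = nr.repo_id and dnc.repo_id = :repo
  and dnc.path = :p
"""


def children(conn, repo, nid, r):
    """Return the node ids whose parent is `nid` as of revision r."""
    params = {"repo": repo, "nid": nid, "r": r}
    return [row[0] for row in conn.execute(CHILDREN, params)]


def node_for_path(conn, repo, p, r):
    """Return the node id found at path `p` as of revision r, or None."""
    row = conn.execute(NODE_FOR_PATH, {"repo": repo, "p": p, "r": r}).fetchone()
    return None if row is None else row[0]
```

Joining on ``dnc.rev = nr.rev`` keeps only the most recent change row of each live node, so a moved node is found under its new path and no longer under the old one.
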