Concurrent Versions System
CVS is one of the oldest revision control systems handled by Tailor. It was the most widely used system in the second millennium, when there were only a few, little-known alternatives. Now there is a horde of competitors trying to replace it; one of them, SubversionRepository, explicitly took CVS as the model for its user experience while mitigating CVS's shortcomings.
Usage
This backend is selected by using the prefix cvs: in the name of the repository, and can be used both as SourceRepository and as TargetRepository:
[pxlib]
root-directory = /wip/sf.net/pxlib
target = darcs:pxlib
source = cvs:pxlib
subdir = pxlib

[darcs:pxlib]

[cvs:pxlib]
repository = :pserver:anonymous@cvs.sf.net:/cvsroot/pxlib
module = pxlib
encoding = iso-8859-1
Specific options
- freeze-keywords : bool
- With this enabled, tailor passes the -kk flag on checkouts and updates to turn off keyword expansion. This may help minimize the chance of spurious conflicts in later merges between different branches.
False by default.
- tag-entries : bool
- By default tailor tags entries automatically, to prevent manual interventions in the CVS working copy. CVS and CVSPS repositories may turn this off by setting tag-entries = False.
True by default.
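As a rough illustration of how these two options sit in a repository section and how their defaults play out, the sketch below reads a tailor-style configuration with Python's standard configparser module. It is not tailor's own parsing code: the helper name cvs_options and the sample text are assumptions made for this example only.

```python
import configparser

# Sample configuration, modelled on the pxlib example above.
SAMPLE = """
[pxlib]
source = cvs:pxlib
target = darcs:pxlib

[cvs:pxlib]
repository = :pserver:anonymous@cvs.sf.net:/cvsroot/pxlib
module = pxlib
freeze-keywords = True
"""


def cvs_options(text, section):
    """Return (freeze_keywords, tag_entries) for a repository section.

    Hypothetical helper: options that are not set fall back to the
    documented defaults (freeze-keywords off, tag-entries on).
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    repo = parser[section]
    freeze = repo.getboolean("freeze-keywords", fallback=False)
    tag = repo.getboolean("tag-entries", fallback=True)
    return freeze, tag


if __name__ == "__main__":
    print(cvs_options(SAMPLE, "cvs:pxlib"))
```

Here freeze-keywords is set explicitly while tag-entries is left out, so the second value comes from the default.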
Why is tailor so slow dealing with CVS?
Tailor uses the standard command line interface to fetch information from the repository. Several people complained that this is slow and suggested using a tool written in C, but there is none yet. The only current alternative is CvspsRepository, which uses cvsps to group commits back into changesets, but the bottleneck with CVS is surely not the log parser. Faster approaches require either using the CVS protocol directly, to avoid the cost of forking cvs commands and creating connections to the same server over and over again (see below), or direct access to the CVS repository and its RCS files (see for example cvs2svn, or parsecvs).
For direct protocol access it would be better to use an existing implementation of the CVS protocol, but other tools (e.g. git-cvsimport, written in Perl) implement their own CVS client. The only Python implementation I could find is PyCVS, whose development seems to have stalled long ago. I could find no C implementation. On the other hand, there is at least one Perl library under more active development (libcvs-perl), which could possibly be glued into Python using PyPerl.
At the end of May 2006 Jeremy Barnes sent me an email that seems to prove that the slowness isn't the fault of tailor, or of Python, as many thought, but is instead built into the CVS sources. He changed his own copy by applying the following single-line patch:
--- BUILD/cvs-1.11.19/src/update.c 2005-01-31 17:18:01.000000000 -0500
+++ BUILD/cvs-1.11.19-nowait/src/update.c 2006-05-27 21:52:27.621272500 -0400
@@ -525,11 +525,11 @@
#endif
/* see if we need to sleep before returning to avoid time-stamp races */
if (last_register_time)
{
- sleep_past (last_register_time);
+ /* sleep_past (last_register_time); */
}
return err;
}
The patch fixes problems with a slow CVS import. The problem is with CVS, not Tailor, but is of interest to Tailor users as it can speed up imports from CVS tremendously. By default (with the version of CVS that I use), CVS will ensure that any update command takes at least one second, by sleeping if necessary until the second counter has ticked over before releasing its locks. I believe that it's due to possible races on filesystems that only have one second precision on their timestamp fields, and CVS tries to be safe in this regard. In interactive use of CVS, it's not a problem. When using an automated tool like Tailor, however, it can add significantly to the import time. For example, in our CVS repository that includes 100,000 different CVS changesets, 100,000 seconds (about 28 hours) are added unnecessarily to the import time. Without the patch, it takes about 30 hours to import, whereas it takes about 2 hours with it.
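The arithmetic behind that estimate can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name is made up for the example, and it assumes exactly one padded cvs update per changeset, which is a lower bound since a changeset touching several files triggers several updates.

```python
def sleep_overhead_hours(update_commands, seconds_per_command=1.0):
    """Hours spent sleeping if every cvs update is padded by the given delay."""
    return update_commands * seconds_per_command / 3600.0


if __name__ == "__main__":
    overhead = sleep_overhead_hours(100_000)
    print(f"{overhead:.1f} hours")  # prints "27.8 hours"
```

The result, a little under 28 hours, is in line with the reported drop from about 30 hours to about 2 hours.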
You can see the difference in the following logs:
before (unpatched CVS):
09:48:44 [I] Bootstrap completed
09:48:44 [I] Updating "jeremy-papers" in "/export/indexes/import/jeremy-papers"
09:48:44 [I] Applying pending upstream changesets
09:48:44 [I] Changeset "2001-02-01 06:23:28 by jeremy"
09:48:44 [I] Log message: Initial revision
09:48:44 [I] .../jeremy-papers $ cvs -d .../mycvsroot/ -q update -kk -d -r 1.1 al-benefits/Makefile
09:48:45 [I] [Ok]
09:48:45 [I] .../jeremy-papers $ cvs -d ...r/mycvsroot/ -q update -kk -d -r 1.1 al-benefits/al-benefits.tex
09:48:46 [I] [Ok]
09:48:46 [I] .../jeremy-papers $ cvs -d .../mycvsroot/ -q update -kk -d -r 1.1 al-benefits/cum-freq-senses.m
09:48:47 [I] [Ok]
09:48:47 [I] .../jeremy-papers $ cvs -d .../mycvsroot/ -q update -kk -d -r 1.1 al-benefits/rationalist-empirical.fig
09:48:48 [I] [Ok]
09:48:48 [I] .../jeremy-papers $ cvs -d .../mycvsroot/ -q update -kk -d -r 1.1 al-benefits/sense_stats.cc
09:48:49 [I] [Ok]
09:48:49 [I] .../jeremy-papers $ cvs -d .../mycvsroot/ -q update -kk -d -r 1.1 bibliography/papers.bib
09:48:50 [I] [Ok]
09:48:50 [I] .../jeremy-papers $ cvs -d .../mycvsroot/ -q update -kk -d -r 1.1 build/papers-include.mk
09:48:51 [I] [Ok]
Notice that exactly one CVS command per second occurs (look at the timestamp at the start of the line).
after (patched CVS):
09:52:11 [I] Bootstrap completed
09:52:11 [I] Updating "jeremy-papers" in "/export/indexes/import/jeremy-papers"
09:52:11 [I] Applying pending upstream changesets
09:52:11 [I] Changeset "2001-02-01 06:23:28 by jeremy"
09:52:11 [I] Log message: Initial revision
09:52:11 [I] .../jeremy-papers $ ~/BUILD/cvs-1.11.19/src/cvs -d .../mycvsroot/ -q update -kk -d -r 1.1 al-benefits/Makefile
09:52:11 [I] [Ok]
09:52:11 [I] .../jeremy-papers $ ~/BUILD/cvs-1.11.19/src/cvs -d .../mycvsroot/ -q update -kk -d -r 1.1 al-benefits/al-benefits.tex
09:52:11 [I] [Ok]
09:52:11 [I] .../jeremy-papers $ ~m/BUILD/cvs-1.11.19/src/cvs -d .../mycvsroot/ -q update -kk -d -r 1.1 al-benefits/cum-freq-senses.m
09:52:11 [I] [Ok]
09:52:11 [I] .../jeremy-papers $ ~/BUILD/cvs-1.11.19/src/cvs -d .../mycvsroot/ -q update -kk -d -r 1.1 al-benefits/rationalist-empirical.fig
09:52:11 [I] [Ok]
09:52:11 [I] .../jeremy-papers $ ~/BUILD/cvs-1.11.19/src/cvs -d .../mycvsroot/ -q update -kk -d -r 1.1 al-benefits/sense_stats.cc
09:52:11 [I] [Ok]
09:52:11 [I] .../jeremy-papers $ ~/BUILD/cvs-1.11.19/src/cvs -d .../mycvsroot/ -q update -kk -d -r 1.1 bibliography/papers.bib
09:52:11 [I] [Ok]
09:52:11 [I] .../jeremy-papers $ ~/BUILD/cvs-1.11.19/src/cvs -d .../mycvsroot/ -q update -kk -d -r 1.1 build/papers-include.mk
09:52:11 [I] [Ok]
Notice that the commands all occur within the same second.
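To check your own import log for the same symptom, a small script can count how many external commands are logged under each timestamp. The sketch below is an assumption-laden helper, not part of tailor: the function names are invented, and the line format (HH:MM:SS, the [I] marker, a working directory, then a $ prompt) is inferred solely from the two excerpts above.

```python
import re
from collections import Counter

# Matches command lines such as:
#   09:48:44 [I] .../jeremy-papers $ cvs -d ... -q update ...
# Result lines like "09:48:45 [I] [Ok]" carry no "$" prompt and are skipped.
COMMAND_LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}) \[I\] \S+ \$ ")


def commands_per_second(log_lines):
    """Count external commands logged under each HH:MM:SS timestamp."""
    counts = Counter()
    for line in log_lines:
        match = COMMAND_LINE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


def peak_rate(log_lines):
    """Largest number of commands started within one second (0 if none)."""
    counts = commands_per_second(log_lines)
    return max(counts.values(), default=0)


if __name__ == "__main__":
    import sys
    with open(sys.argv[1]) as log:
        print(peak_rate(log))
```

Run against the unpatched excerpt, peak_rate gives 1; against the patched excerpt it gives 7. A peak that never exceeds 1 on a long import suggests the one-second sleep is in effect.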
Comparisons with other tools to convert from cvs
Comparison with git-cvsimport
The following point applies only to tailor's cvs: backend. git-cvsimport behaves as described because of a limitation in cvsps 2.1, so tailor will have the same problem when using the cvsps: backend:
- Tailor can give strange results in some cases, but I believe it is more correct than git-cvsimport. One example is when the CVS branch you import does not exist in some files for whatever reason (I had a branch only for the src/ dir, which I have corrected now). In that case tailor correctly shows deletions for those files in the branch's initial commit, whereas git-cvsimport only works on changes and does not do anything special (this may be cvsps doing magic behind its back).
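The branch case can be pictured as a set difference. The sketch below is purely illustrative and is not taken from tailor's sources: the function name and file lists are hypothetical, and it only shows which files one would expect to see deleted in the branch's initial commit when the branch tag covers part of the tree.

```python
def initial_branch_deletions(trunk_files, branch_files):
    """Files present on the trunk that the branch does not carry."""
    return sorted(set(trunk_files) - set(branch_files))


if __name__ == "__main__":
    # Hypothetical tree in which only src/ was branched.
    trunk = ["Makefile", "README", "src/main.c"]
    branch = ["src/main.c"]
    print(initial_branch_deletions(trunk, branch))  # ['Makefile', 'README']
```

A tool that only replays per-file changes on the branch never sees Makefile or README change there, so it has no reason to record their removal.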
