Ticket #1 (new enhancement)

Opened 6 years ago

Last modified 5 years ago

Better two-way sync

Reported by: lele@… Owned by: lele
Priority: major Milestone: VersionOne
Component: tailor Version: 1.0
Keywords: Cc:

Description (last modified by lele)

Nathan Gray <kolibrie@…> suggested the following enhancement:

I would update from CVS to darcs. Do my work in darcs. When I am ready to commit back to CVS, I tell tailor to sync. Tailor would first check to see if there had been any updates to CVS, and pull those, creating darcs patches for each changeset. Then immediately after committing each darcs patch to CVS, tailor would check to see if there had been any updates to CVS. There would be at least one (from the patch just committed), which would be pulled back. CVS would now be happy, and darcs would detect no changes, and would also be happy. Tailor could then continue with the next darcs patch to commit to CVS.

Change History

comment:1 Changed 6 years ago by anonymous

  • Summary changed from Better two-way sync to Better two-way sync

Brian Warner and I figured out how this could be done. It requires two tailor directories -- one for svn->darcs and one for darcs->svn.

Things tailor needs to support this:

  1. Single-step: perform the "suck it in from source, commit it to target" step for a single patch (the next new patch in source) and then stop.
  2. Conflict detection: if the attempt to commit the patch to target would have caused a conflict, then report this situation back to the user/caller of tailor.
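A rough sketch of what those two features could look like from the caller's side. The names `StepResult`, `single_step`, and the `try_commit` callback are hypothetical, not part of tailor:

```python
from enum import Enum


class StepResult(Enum):
    APPLIED = "applied"        # one patch was translated and committed
    NOTHING_TO_DO = "nothing"  # source has no new patches
    CONFLICT = "conflict"      # committing to target would conflict


def single_step(pending, try_commit):
    """Translate at most one patch from `pending`, then stop.

    `pending` is a list of patches not yet in the target, oldest first.
    `try_commit` is a hypothetical callback that returns False when
    committing the patch to the target would cause a conflict.
    """
    if not pending:
        return StepResult.NOTHING_TO_DO
    patch = pending[0]
    if not try_commit(patch):
        # Leave the patch queued so the user can resolve the conflict.
        return StepResult.CONFLICT
    pending.pop(0)
    return StepResult.APPLIED
```

The caller gets an explicit result for every step, so a driver script can decide whether to continue, stop, or ask the user.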

comment:2 Changed 6 years ago by zooko@…

Hopefully Brian Warner will read this and correct any flaws in this design. This design is considerably more complicated than the one Brian and I discussed, and I'm not sure whether I have discovered and fixed unforeseen problems or forgotten a useful simplification.

Let there be two tailor directories, one which ports patches from svn to darcs (called "SD" from here on) and one which ports patches from darcs to svn (called "DS" from here on). Each of these tailor directories contains an svn working dir subdirectory (called "SDs" or "DSs") and a darcs repo subdirectory (called "SDd" or "DSd").

In addition there is (of course) the SVN repo "SO" and an "official" darcs repo "DO".

Let "the internally consistent condition" be such that the contents of all four of the subdirectories SDs, SDd, DSs, and DSd are up-to-date with each other, and furthermore that SDs and DSs are each a subset of SO (i.e., svn diff would show no changes) and SDd and DSd are each a subset of DO (i.e. darcs push DO would have no effect).

Subroutine C will return to the internally consistent state when it completes, if possible. If it is not possible to return to the internally consistent state then there is a conflict which has to be resolved by the user.

Let "the up-to-date condition" be such that the contents of all six of SDs, SDd, DSs, DSd, SO, and DO are up-to-date with each other. The up-to-date condition implies the internally-consistent condition. The complete algorithm, Algorithm D, will result in the up-to-date state unless there is a conflict or unless new patches are added to SO or to DO after the algorithm has checked for new patches and before it terminates.
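To make the two conditions concrete, here is a toy model in which each of the six locations is a set of patch ids. It assumes, for illustration only, that a patch keeps the same id in both the svn world and the darcs world:

```python
def internally_consistent(SDs, SDd, DSs, DSd, SO, DO):
    """The four working copies agree, and none is ahead of its official repo."""
    working_copies_agree = SDs == SDd == DSs == DSd
    return working_copies_agree and SDs <= SO and SDd <= DO


def up_to_date(SDs, SDd, DSs, DSd, SO, DO):
    """All six locations hold exactly the same patches."""
    return SDs == SDd == DSs == DSd == SO == DO
```

In this model the up-to-date condition trivially implies the internally-consistent one, since equal sets are also subsets of each other.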

I'll first show three subroutines and then show the complete algorithm:

Subroutine A: translate the next patch from SVN to darcs

A1. Use tailor in the SD directory to translate the next patch from SVN to darcs. This will update SDs to that revision and add a new patch into SDd.

A2. Use darcs push --match='hash $HASH' to push the resulting darcs patch from the SDd to DO (where HASH is equal to the hash of the patch which was just created). If the push fails for the reason that pushing would cause a conflict in DO then this algorithm must stop and notify the user of a conflict condition.

A3. Run svn up -r$REVNUMBER in DSs (note: not SDs), with REVNUMBER equal to the revision number of the patch that was translated from SVN world in step A1.

A4. Run darcs pull --match='hash $HASH' in DSd (note: not SDd), with HASH equal to the hash of the patch which was created in darcs world in step A1.

Observe that if the internally-consistent condition held before Subroutine A, and there was no conflict, then the internally-consistent condition must hold after Subroutine A.
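Subroutine A can be simulated on a toy model where each location is a list of patch ids, oldest first. This is only a sketch: `state`, `Conflict`, and the `conflicts_in_DO` callback are made-up names, and the real steps would shell out to tailor, svn, and darcs.

```python
class Conflict(Exception):
    """Raised when pushing to DO would conflict (step A2)."""


def subroutine_a(state, conflicts_in_DO=lambda patch: False):
    """Translate the next patch from SO into the darcs world.

    `state` maps each location name to a list of patch ids, oldest first.
    Returns the patch id, or None if SDs is already up-to-date with SO.
    """
    new = [p for p in state["SO"] if p not in state["SDs"]]
    if not new:
        return None
    patch = new[0]
    state["SDs"].append(patch)   # A1: tailor updates SDs to that revision
    state["SDd"].append(patch)   # A1: and adds the new patch into SDd
    if conflicts_in_DO(patch):   # A2: darcs push from SDd to DO
        raise Conflict(patch)
    state["DO"].append(patch)
    state["DSs"].append(patch)   # A3: svn up -r$REVNUMBER in DSs
    state["DSd"].append(patch)   # A4: darcs pull in DSd
    return patch
```

If all six lists were equal before a new revision landed in SO, they are equal again after one call, which matches the observation above.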

Subroutine B: translate the next patch from darcs to SVN

B1. Use tailor in the DS directory to translate the next patch from darcs to SVN. This will pull the next patch from DO into DSd and create a new revision in DSs and attempt to commit it to SO. Let REVNUMBER be the SVN revision number of this patch.

B2. If the attempt to commit the new patch to SO fails due to merge conflict then this algorithm must stop and notify the user of a conflict condition.

B3. Verify that there were no changes in SO between the previous REVNUMBER from the last time step B4 was run (if any) and the current REVNUMBER. If there were changes in SVN between those two revisions then flag a conflict and stop.

B4. Run svn up -r$REVNUMBER in SDs (note: not DSs).

B5. Run darcs pull --match='hash $HASH' in SDd (note: not DSd), with HASH equal to the hash of the patch that was translated from darcs world in step B1.

Observe that starting from the internally-consistent condition is necessary but not sufficient for Subroutine B to result in the internally-consistent condition. In addition, SDs must have been up-to-date with SO. If SDs was not up-to-date with SO before Subroutine B, then step B4 will, in addition to adding the new patch into SDs, also add other patches into SDs which are not also in SDd, violating the internally-consistent condition. This is why step B3 has to be inserted in Subroutine B.
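The same toy model (lists of patch ids per location) shows why the B3 check is needed. This sketch leaves out the B2 merge-conflict case, and `state` and `OutOfSync` are illustrative names only:

```python
class OutOfSync(Exception):
    """Raised by the B3 check: SO holds revisions not yet translated."""


def subroutine_b(state):
    """Translate the next patch from DO into the SVN world.

    `state` maps each location name to a list of patch ids, oldest first.
    Returns the patch id, or None if DSd is already up-to-date with DO.
    """
    new = [p for p in state["DO"] if p not in state["DSd"]]
    if not new:
        return None
    patch = new[0]
    state["DSd"].append(patch)   # B1: pull the next patch from DO into DSd
    state["DSs"].append(patch)   # B1: create a new revision in DSs
    state["SO"].append(patch)    # B1: commit it to SO
    # B3: any revision in SO before ours that SDs has not seen is foreign.
    unseen = [p for p in state["SO"][:-1] if p not in state["SDs"]]
    if unseen:
        raise OutOfSync(unseen)
    state["SDs"].append(patch)   # B4: svn up -r$REVNUMBER in SDs
    state["SDd"].append(patch)   # B5: darcs pull in SDd
    return patch
```

Without the check, B4 would drag the foreign revisions into SDs while SDd never receives them.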

That motivates us to invent this subroutine:

Subroutine C: translate all patches from SVN to darcs and then translate the next patch from darcs to SVN

C1. Iterate Subroutine A until there are no more patches in SO that are not already in SDs, i.e. until SDs is up-to-date with SO.

C2. Run Subroutine B.

So here is Algorithm D: translate all patches between worlds

D1. Iterate Subroutine C until there are no more patches in DO that are not already in DSd, i.e. until DSd is up-to-date with DO.

Now it is safe to invoke Algorithm D whenever you want to synchronize the two worlds. One of three things will happen:

Outcome 1: the worlds will be synchronized

Outcome 2: it will turn out that there is a merge conflict. Algorithm D will stop at that point (either step A2 or step B2) and notify the user.

Outcome 3: it will turn out that another patch was committed to SO between Step C1 and Step C2. Algorithm D will stop at that point and notify the user.
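Putting the pieces together, Algorithm D can be simulated end to end on a toy model. This is a sketch under simplifying assumptions: merge conflicts (Outcome 2) are not modelled, patch ids are shared between worlds, and the `interleave` callback is a hypothetical hook that stands in for someone else committing to SO between step C1 and step C2.

```python
def sync(state, interleave=None):
    """Run Algorithm D on a toy model.

    `state` maps the six location names to lists of patch ids.
    Returns "synchronized" (Outcome 1) or "out-of-sync" (Outcome 3).
    """
    def missing(src, dst):
        return [p for p in state[src] if p not in state[dst]]

    def add(patch, *locations):
        for name in locations:
            state[name].append(patch)

    while True:
        # C1: iterate Subroutine A until SDs is up-to-date with SO.
        while missing("SO", "SDs"):
            add(missing("SO", "SDs")[0], "SDs", "SDd", "DO", "DSs", "DSd")
        # D1: stop once DSd is up-to-date with DO.
        if not missing("DO", "DSd"):
            return "synchronized"
        if interleave is not None:
            interleave(state)  # simulated concurrent commit to SO
        # C2: Subroutine B for one patch, including the B3 check.
        patch = missing("DO", "DSd")[0]
        add(patch, "DSd", "DSs", "SO")
        if missing("SO", "SDs") != [patch]:
            return "out-of-sync"
        add(patch, "SDs", "SDd")
```

For example, starting with two new svn revisions and one new darcs patch, `sync` ends with all six lists holding the same four patches.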

Limitations:

Darcs patches which were originally svn patches created by svn commit to SO and then transferred into darcs world by this algorithm can have the SVN revision number inserted into their patch name (e.g. by using the following two tailor options: patch-name-format = [r%(revision)s] %(firstlogline)s and remove-first-log-line = True). That's nice! However, darcs patches originally created by darcs record and then transferred into svn world by this algorithm will not have the svn revision number in their patch name in darcs world, although of course they will get a revision number assigned to them when translated into svn world. Too bad that there isn't a way to assign revision numbers to the original darcs-world patch without too many sync/conflict issues cropping up.
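The patch-name-format value uses Python's %(name)s interpolation syntax, so its effect can be previewed with a plain dictionary. This is only an illustration of the format string; `darcs_patch_name` is a made-up helper, not how tailor itself builds the name:

```python
def darcs_patch_name(revision, log_message,
                     fmt="[r%(revision)s] %(firstlogline)s"):
    """Build a patch name from an SVN revision and its log message."""
    lines = log_message.splitlines()
    first = lines[0] if lines else ""
    return fmt % {"revision": revision, "firstlogline": first}


print(darcs_patch_name(1234, "Fix off-by-one in parser\n\nDetails follow."))
# prints: [r1234] Fix off-by-one in parser
```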

comment:4 Changed 6 years ago by zooko@…

Hm... Actually steps B2 and B3 could handle the exceptional conditions better!

In *either* case, if step B2 sees that svn commit fails, then it can svn revert, darcs unpull the patch from DSd, and return to Subroutine C. That will mean that either the merge conflict is automatically fixed (because darcs can resolve it automatically where Subversion couldn't) or at least that the user will see the merge conflict in darcs world instead of SVN world (as a darcs patch that has been created in SDd and cannot be pushed to DO due to conflict).

More importantly, doing that in response to conflict in step B2 means that we avoid some of the "out of sync" problem which is discovered in case B3. That is, the merge conflict in B2 is always, or almost always, due to an out-of-sync problem. That is, a new patch was committed to SO since the last time we ran Step C1, and that new patch conflicts with something in the current patch that we are attempting to translate into SVN world. By having step B2 do a fallback of svn revert and darcs unpull, then we stay in sync.

However, if the svn commit succeeds even though we are out of sync, then we have a worse problem, as we discover in Step B3. Basically, the only nice automated way out of this state would be if we could tell tailor that when we get up to that patch in SVN not to translate that one back into darcs. Well, that doesn't sound so hard to implement. But still, this is getting complicated, considering that the problem would be solved much more easily if we could atomically test-and-commit at the SVN repo.

For example, suppose by hook or by crook we could cause Step B2 to say "Verify that the SVN repo SO is currently at the same revision number that it was last time we translated patches from SVN to darcs, then svn commit while preventing anyone else from svn committing to SO before we do."
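The wished-for operation is essentially a compare-and-swap on the repository's head revision. A minimal in-memory sketch, where `ToyRepo` and `commit_if_at` are invented for illustration and do not correspond to any svn command:

```python
import threading


class ToyRepo:
    """In-memory stand-in for SO with an atomic test-and-commit."""

    def __init__(self):
        self.revision = 0
        self._lock = threading.Lock()

    def commit(self):
        """Ordinary commit by any user; returns the new revision."""
        with self._lock:
            self.revision += 1
            return self.revision

    def commit_if_at(self, expected_revision):
        """Commit only if the head is still `expected_revision`.

        Returns the new revision number, or None if the repo moved on.
        """
        with self._lock:
            if self.revision != expected_revision:
                return None
            self.revision += 1
            return self.revision
```

Because the check and the commit happen under one lock, nobody else can commit in between, so a failed `commit_if_at` simply means "run step C1 again".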

Then the out-of-sync problem that we discovered in Step B3 (and indeed also the merge conflict problem that we discovered in Step B2) would go away.

We *could* implement that. For example, turn off all svn access from anyone but our "tailor" user, and then do Step B1, and then turn svn back on.

comment:5 Changed 6 years ago by anonymous

A-ha! svn 1.2.0 introduces "svn lock" and "svn unlock"! Now we can fix step B1 so that steps B2 and B3 are unnecessary! :-)

comment:6 Changed 6 years ago by anonymous

Damn! svn lock works only on individual files. Cannot lock directories, does not naturally offer --recursive.

So we would have to use an ugly workaround such as "sudo mv /usr/bin/svn /usr/bin/svn-locked". :-/

comment:7 Changed 6 years ago by zooko@…

Thanks to maxb on irc.freenode.net #svn for this idea: instead of mv'ing aside the svn executable, install a start-commit hook that always fails, except for us. :-)
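A sketch of such a hook in Python. Subversion runs the start-commit hook with the repository path and the committing user name as its first two arguments, and a non-zero exit status rejects the commit. The user name "tailor" and the message text are assumptions:

```python
import sys

ALLOWED_USER = "tailor"  # assumption: the account tailor commits as


def decide(argv):
    """Return (exit_code, message) for hook arguments [REPOS-PATH, USER, ...]."""
    user = argv[1] if len(argv) > 1 else ""
    if user == ALLOWED_USER:
        return 0, ""
    return 1, "Repository is temporarily locked for synchronization.\n"


def main(argv):
    """Write any rejection message to stderr and return the exit code."""
    code, message = decide(argv)
    sys.stderr.write(message)
    return code

# In the installed hooks/start-commit script, finish with:
#     sys.exit(main(sys.argv[1:]))
```

The hook would be installed only for the duration of step B1 and removed afterwards, so other committers are blocked just while tailor commits.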

comment:8 Changed 6 years ago by lele

  • Description modified

Let's follow up on TwoWaySync
