Ticket #171 (new defect)
converting darcs' darcs to git results in a corrupted repo
| Reported by: | vmiklos | Owned by: | lele |
|---|---|---|---|
| Priority: | major | Milestone: | VersionOne |
| Component: | tailor | Version: | 0.9 |
| Keywords: | | Cc: | |
Description (last modified by lele)
Hi,
Here is the config I used:
$ cat config
[DEFAULT]
encoding-errors-policy = replace

[sandbox]
source = darcs:sandbox
target = git:sandbox

[darcs:sandbox]
subdir = darcs
repository = /path/to/sandbox

[git:sandbox]
subdir = git
repository = /path/to/sandbox.git
Where sandbox is http://code.haskell.org/darcs/big-zoo/darcs-repo-2008-10-31.tar.bz2
I started the conversion in non-verbose mode; it converted 6422 of 6548 changesets and exited without any error. As I guessed, the result does not match the original repo.
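One way to see how the two results differ is to compare the two working trees file by file. The following is a minimal sketch, not part of tailor: the function names `tree_files` and `compare_trees` are made up for illustration, and it assumes both trees are plain checkouts on disk whose metadata lives in `_darcs` and `.git`.

```python
import os

def tree_files(root, ignore=("_darcs", ".git")):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune VCS metadata directories in place so os.walk skips them.
        dirnames[:] = [d for d in dirnames if d not in ignore]
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found

def compare_trees(left, right):
    """Return (only_left, only_right, differing) as sorted lists of paths."""
    left_files, right_files = tree_files(left), tree_files(right)
    differing = []
    for rel in sorted(left_files & right_files):
        # Byte-wise comparison; fine for a sketch, slow for very large files.
        with open(os.path.join(left, rel), "rb") as a, \
             open(os.path.join(right, rel), "rb") as b:
            if a.read() != b.read():
                differing.append(rel)
    return sorted(left_files - right_files), sorted(right_files - left_files), differing
```

With the config above, something like `compare_trees("darcs", "git")` run from the sandbox root directory would list files present on only one side and files whose contents differ.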
Given that the repo is public, I hope you can reproduce the error.
Sadly, I'm not sure where the error occurs; the conversion took 373 minutes on my machine.
$ darcs --version
2.1.2 (+ 266 patches)
$ git --version
git version 1.6.0.1
I am using darcs from the darcs repo (and not the latest release) as I had other problems and the suggested fix on the mailing list was to use the version from the repo.
If I missed any important info, please let me know.
Thanks.
Change History
comment:2 Changed 3 years ago by vmiklos
OK, here is a way I think I can quickly reproduce a similar problem:
git clone git://vmiklos.hu/darcs-fast-export
cd darcs-fast-export/t
sh test2-git.sh
This will create a darcs2 repo under 'test2'.
When I convert this to git with the same config as above, the result differs as well.
Hope this helps. :-)
comment:3 Changed 3 years ago by vmiklos
OK, I have good news and bad news.
The good news is that I figured out what the problem is, using a small test case.
The bad news is that I really have no idea how to solve it.
Create the following repo:
dr init
echo a > a
dr add a
dr rec -a -m i
dr mv a b
rm b
dr rec
echo a > a
echo b > b
dr add a b
dr rec -a -m "add a b"
rm b
dr rec
dr mv a b
dr amend-rec
If you now run dr chan --xml -s, you get the same output for two totally different cases:
1) rename A B, and remove B
2) remove A, rename A B
and there is no way to figure out which one was intended. In other words, as long as darcs puts those "move" lines at the top of the XML output and tailor relies only on the XML output for its information, it can't properly convert this repo.
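The ambiguity can be modelled without darcs at all. The sketch below is a toy model, not tailor's code and not the real darcs XML format: a tree is a set of file names, `apply_ops` and `summary` are hypothetical helpers, and `summary` simply lists moves before removes as a stand-in for the ordering described above. The second sequence follows the shell commands in this comment (remove b, then move a to b), and a move onto an existing name is assumed to overwrite it.

```python
def apply_ops(files, ops):
    """Apply ("mv", src, dst) and ("rm", path) operations, in order, to a set of names."""
    tree = set(files)
    for op in ops:
        if op[0] == "mv":
            _, src, dst = op
            tree.discard(dst)  # assumption: moving onto an existing name replaces it
            tree.remove(src)
            tree.add(dst)
        elif op[0] == "rm":
            tree.remove(op[1])
    return tree

def summary(ops):
    """Order-losing summary: all moves first, then all removes."""
    moves = [op for op in ops if op[0] == "mv"]
    removes = [op for op in ops if op[0] == "rm"]
    return moves + removes

# Patch 1: starting from {a}, rename a to b, then remove b.
case1 = [("mv", "a", "b"), ("rm", "b")]
# Patch 2: starting from {a, b}, remove b, then rename a to b.
case2 = [("rm", "b"), ("mv", "a", "b")]

same_summary = summary(case1) == summary(case2)
true_result = apply_ops({"a", "b"}, case2)               # what the patch really did
replayed_result = apply_ops({"a", "b"}, summary(case2))  # what a replay of the summary does
```

In this model `same_summary` is true while `true_result` and `replayed_result` differ, which is the situation a converter ends up in when it only sees the summary.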
Feel free to prove me wrong. :-S
Thanks.
comment:4 Changed 3 years ago by vmiklos
Before I forget, let me add that this in fact seems to be a darcs bug, so one approach would be to fix it in darcs; then no workaround would be needed in tailor.
The relevant darcs bug is http://bugs.darcs.net/issue1281.

I tried this out, and I clearly see something went wrong, although I can't say in which way...
First of all, it took more than 20 hours here to complete the migration, on a dual-core AMD64 with 2 GB of RAM...
Tailor completed its task without errors, effectively producing 6422 git changesets out of 6548 darcs changesets. I do not have time right now to investigate further, but I bet that many of the "missing" changes are related to darcs operations that have no impact on the source tree, such as setprefs for example.
What makes me sad is seeing the huge difference in the resulting trees: looking at the root directory alone, the darcs side contains just 20 entries while there are 45 in the git side. Quickly inspecting one entry, "/darcs-createrepo.lhs" I see that
while
and apparently the last patch that touched it, effectively removing the file, did not have the right effect on the git side: