
We have a pretty large SVN repository (50 GB, over 100,000 revisions). Working with it is pretty slow, and my guess is that the cause is the flat directory structure in db/revs and db/revprops (where each revision is one file).

We use the FSFS format with SVN 1.5 (on a Linux server), but the repo was created with an older SVN version. I have read that SVN 1.5 supports "sharding", and as I understand it this feature distributes the revisions across multiple directories so that no single directory contains too many files. This sounds pretty useful, but unfortunately it looks like the feature is only used for repositories that are freshly created with SVN 1.5.
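
As a rough illustration of what sharding changes on disk, the sketch below maps a revision number to its file path in both layouts. The function name `rev_path` and the shard size of 1000 are assumptions made for this example; the shard size a real repository uses is recorded in the repository itself.

```python
from pathlib import PurePosixPath
from typing import Optional


def rev_path(revision: int, shard_size: Optional[int] = 1000) -> PurePosixPath:
    """Return the revision file path relative to the repository root.

    shard_size=None models the linear layout (one flat directory).
    Otherwise each shard directory holds up to shard_size revisions.
    The default of 1000 is an assumption for this sketch.
    """
    revs = PurePosixPath("db/revs")
    if shard_size is None:
        # Linear layout: every revision file sits directly in db/revs.
        return revs / str(revision)
    # Sharded layout: the shard directory is the integer quotient.
    return revs / str(revision // shard_size) / str(revision)


if __name__ == "__main__":
    print(rev_path(100000, None))  # linear layout
    print(rev_path(100000))        # sharded layout
```

With 100,000 revisions and a shard size of 1000, the flat directory of 100,000 files becomes about 100 directories of at most 1000 files each.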

How can I convert a large existing linear repo to a sharded repo? The manual mentions the tool "fsfs-reshard.py", but the script itself says "This script is unfinished and not ready to be used on live data. Trust us." So I definitely don't want to use that. Is there an alternative?

bahrep
kayahr
  • Although changing to the new repository format may help, I'm doubtful that it will resolve the performance problems. I'd be interested in knowing whether it helps once you've tried it. – Kaleb Pederson Oct 27 '10 at 15:41
  • I think we already use the newest format, or at least the newest available for 1.5 (we don't use 1.6 yet). If I remember correctly, we already ran "svnadmin upgrade" to get the new merge feature. That may be why svnadmin upgrade exits immediately without changing anything. It looks like "upgrade" doesn't upgrade the directory structure. I'll find out whether a full dump/load helps. – kayahr Oct 27 '10 at 16:30

3 Answers


Will an svnadmin dump and svnadmin load do the trick? http://subversion.apache.org/faq.html#dumpload
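
A minimal sketch of such a dump/load cycle, driven from Python, might look like the following. It assumes `svnadmin` from SVN 1.5 or later is on the PATH, so that the freshly created target repository gets the sharded layout; the function names and the repository paths passed in are hypothetical. Only versioned history is transferred this way, so hook scripts and the conf directory would need to be copied separately.

```python
import subprocess


def dump_load_commands(old_repo, new_repo):
    """Build the three svnadmin invocations for a dump/load cycle."""
    return {
        "create": ["svnadmin", "create", new_repo],
        "dump": ["svnadmin", "dump", "--quiet", old_repo],
        "load": ["svnadmin", "load", "--quiet", new_repo],
    }


def dump_and_load(old_repo, new_repo):
    """Create new_repo and stream old_repo's history into it.

    The dump is piped straight into the load, so no intermediate
    dump file has to be stored on disk.
    """
    cmds = dump_load_commands(old_repo, new_repo)
    subprocess.run(cmds["create"], check=True)
    dump = subprocess.Popen(cmds["dump"], stdout=subprocess.PIPE)
    try:
        subprocess.run(cmds["load"], stdin=dump.stdout, check=True)
    finally:
        dump.stdout.close()
    if dump.wait() != 0:
        raise RuntimeError("svnadmin dump failed")
```

Calling `dump_and_load("/path/to/old", "/path/to/new")` with real paths would run the whole cycle; the old repository is only read, so it stays available as a fallback.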

the_mandrill

The best way is, as mentioned, a dump/load cycle. But you can try the upgrade first:

svnadmin upgrade

Make a copy of your repo first, then try the upgrade on the copy and test it (and don't forget to make a backup).
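
One way to test whether the upgrade (or any other conversion) actually changed the layout is to read the repository's `db/format` file. The sketch below assumes the usual FSFS convention that the first line holds the format number and a later line reads either `layout linear` or `layout sharded N`; the function name `fsfs_layout` is made up for this example.

```python
from pathlib import Path


def fsfs_layout(repo_root):
    """Return (layout, shard_size) as recorded in <repo_root>/db/format.

    A missing layout line is treated as linear, which is an assumption
    about repositories written by older SVN versions.
    """
    lines = (Path(repo_root) / "db" / "format").read_text().splitlines()
    for line in lines[1:]:  # line 0 is the format number
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "layout":
            if parts[1] == "sharded" and len(parts) == 3:
                return ("sharded", int(parts[2]))
            return (parts[1], None)
    return ("linear", None)
```

Running it on the copy before and after `svnadmin upgrade` shows directly whether the layout line changed.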

khmarbaise
  • The upgrade step is pretty quick, and worth doing. For a repository of that size the dump/load cycle would probably take the best part of a weekend, which may be impractical. – the_mandrill Oct 27 '10 at 15:55
  • Upgrade does nothing. It exits immediately and says it has completed, but the repo is still in linear format. I'll find out whether dump/load helps. It may take some hours. – kayahr Oct 27 '10 at 16:24

Because the dump/load process requires a lot of disk space and processing time, I published (in 2010) an improved version of fsfs-reshard.py which includes support for Subversion 1.6 FSFS format 5: https://github.com/ymartin59/svn-fsfs-reshard

It supports switching between linear and sharded layouts, unpacking shards when required. Thanks to its shard statistics computation, you can anticipate packed revision sizes and select an appropriate shard size.
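
To give an idea of what such shard statistics could look like, here is a small independent sketch; it is not taken from the script. It assumes a linear `db/revs` directory in which each file is named after its revision number, and sums the file sizes that would end up in each shard for a candidate shard size. The function name `shard_sizes` is hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def shard_sizes(revs_dir, shard_size):
    """Map shard number -> total bytes of the revision files it would hold.

    Only files whose names are plain revision numbers are counted.
    """
    totals = defaultdict(int)
    for entry in Path(revs_dir).iterdir():
        if entry.is_file() and entry.name.isdigit():
            totals[int(entry.name) // shard_size] += entry.stat().st_size
    return dict(sorted(totals.items()))
```

Comparing the result for a few candidate shard sizes gives a rough upper bound on how large each packed shard could become.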

Of course it must be used with care:

  • First test the procedure on a copy of the repository if possible
  • Have a backup ready to be restored
  • Prevent access to the repository while processing
  • Run svnadmin verify before putting it live
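
The checklist above can be sketched as a small wrapper, assuming `svnadmin` is on the PATH. The names `precheck_commands` and `reshard_on_copy` are hypothetical, and the resharding step is passed in as a callable because the exact invocation of the script is not shown here.

```python
import subprocess


def precheck_commands(live_repo, scratch_copy):
    """Commands that bracket the resharding step on a scratch copy."""
    return {
        "copy": ["svnadmin", "hotcopy", live_repo, scratch_copy],
        "verify": ["svnadmin", "verify", scratch_copy],
    }


def reshard_on_copy(live_repo, scratch_copy, reshard):
    """Copy the repository, reshard the copy, then verify the copy.

    reshard is a caller-supplied function that takes the copy's path
    and performs the conversion; it is a placeholder in this sketch.
    """
    cmds = precheck_commands(live_repo, scratch_copy)
    subprocess.run(cmds["copy"], check=True)    # work on a copy, not live data
    reshard(scratch_copy)
    subprocess.run(cmds["verify"], check=True)  # raises if verification fails
```

Blocking access to the live repository during the real run still has to be handled separately, for example at the server level.
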
Yves Martin