
I've searched a lot about this question, and I already know how to compare two files (hashes, checksums, etc.), but that's not quite what I need. What I need is described below.

Let's assume I have a file and I've backed it up. Later I make some changes to the file and want to apply those changes to the backup. Since the file can be big and the changes small, I don't want to rewrite the whole file, because I plan to back it up over the internet (maybe FTP), which can take a lot of time.

How I see this (sample):

Backup version of file (bytes)

134 253 637 151

Newer version of file (bytes)

134 624 151 890

Instead of rewriting all bytes, we should:

  1. change 253 to 624 (change bytes)
  2. remove 637 (remove bytes)
  3. write 890 at the end of file (insert bytes)
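The three operations above can be expressed as a short list of positioned edits: each edit says which range of the old data to replace and what to put there. Below is a minimal Python sketch of that idea, not a production approach. The names `make_delta` and `apply_delta` are hypothetical, and the standard library's `difflib.SequenceMatcher` is used only for illustration; its matching is far too slow for multi-gigabyte files, where a dedicated binary diff tool would be needed.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Return a list of (tag, start, end, data) edits that turn old into new.

    Hypothetical helper: start/end are offsets into the OLD data, and data
    is the replacement content taken from the new version.
    """
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged ranges are never transferred
        # 'replace', 'delete' and 'insert' all mean: old[i1:i2] -> new[j1:j2]
        ops.append((tag, i1, i2, new[j1:j2]))
    return ops


def apply_delta(old, delta):
    """Apply edits produced by make_delta and return the result as a list."""
    result = list(old)
    # Work from the end so that earlier offsets remain valid.
    for _tag, i1, i2, data in reversed(delta):
        result[i1:i2] = data
    return result


# The sample from the question, treating each number as one opaque unit.
backup = [134, 253, 637, 151]
newer = [134, 624, 151, 890]
delta = make_delta(backup, newer)
assert apply_delta(backup, delta) == newer
```

Only `delta` would need to be transferred; the offsets it carries tell the receiving side where each change belongs.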

Operations 1, 2 and 3 do not necessarily all occur in every case. Note that the backup file could be located somewhere else, and I only have access to it over the internet (the server could return something so that we can compare the files).
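For the remote case, one simplified possibility is that the server returns a hash for each fixed-size block of the backup, and the client sends only the blocks whose hashes differ. The Python sketch below assumes this scheme; `block_hashes`, `changed_blocks`, `apply_patch` and the tiny `BLOCK_SIZE` are all hypothetical choices for illustration. It handles in-place changes and appended data well, but an insertion or removal shifts every later block, so all of them would be resent; rsync-style tools use a rolling checksum to cope with that.

```python
import hashlib

BLOCK_SIZE = 4  # assumption: tiny for illustration; real blocks would be KBs


def block_hashes(data, block_size=BLOCK_SIZE):
    """Server side: one SHA-256 digest per fixed-size block of the backup."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(new_data, remote_hashes, block_size=BLOCK_SIZE):
    """Client side: collect (offset, block) pairs that differ from the backup."""
    patch = []
    for index, start in enumerate(range(0, len(new_data), block_size)):
        block = new_data[start:start + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if index >= len(remote_hashes) or remote_hashes[index] != digest:
            patch.append((start, block))
    # The new length lets the server truncate or grow the backup.
    return patch, len(new_data)


def apply_patch(old_data, patch, new_length):
    """Server side: overwrite the changed blocks and fix the file length."""
    buf = bytearray(old_data[:new_length])
    buf.extend(b"\0" * (new_length - len(buf)))
    for offset, block in patch:
        buf[offset:offset + len(block)] = block
    return bytes(buf)


backup = b"aaaabbbbccccdddd"
newer = b"aaaaXbbbccccddddee"
patch, length = changed_blocks(newer, block_hashes(backup))
assert apply_patch(backup, patch, length) == newer
```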

How can I achieve this? I know it's possible because I know of software where it's implemented (but I couldn't find out how). Any hints, tutorials, etc. are welcome and highly appreciated. Thanks in advance.

GaaRa
  • Have you checked out the implementation of `diff`? (http://en.wikipedia.org/wiki/Diff) – Avner Shahar-Kashtan Jun 18 '12 at 14:40
  • Yep, I've seen this, and I know I can find the difference, but I don't understand how I should transfer some bytes and force them to be inserted at a specific place... Should I transfer these bytes along with an additional position number? – GaaRa Jun 18 '12 at 14:43
  • Are you comparing text or binary data? You might want to look into the algorithms behind binary diff tools like http://www.daemonology.net/bsdiff/ – Slugart Jun 18 '12 at 14:43
  • Um, isn't this the same as just overwriting the backup with the new version? – Raymond Chen Jun 18 '12 at 14:47
  • Slugart - both files can be present | Raymond Chen - nope, it's just swapping the changed bytes. If a 2GB file changes by a few bytes, I shouldn't be forced to transfer 2GB, just those few bytes – GaaRa Jun 18 '12 at 14:55

1 Answer


You're trying to solve the same problem that every MMORPG has solved... creating and applying small patch files to update older versions of large binaries.

This is a well-studied problem and there are a number of solutions out there. For several existing options, see

Binary patch-generation in C#

Eric J.
  • I understand that I can also apply it to make my software updatable? – GaaRa Jun 18 '12 at 14:48
  • Yes, in fact Windows Installer can generate patch files (as can a number of other open source and commercial solutions). – Eric J. Jun 18 '12 at 15:24