I currently use SCCS for source control, but this question applies to version control systems in general.

In order to determine when a piece of code originated, I sometimes open up SCCS s. files to poke around and match up the insertion commands around the code with the version headers. Of course, after a while these files become pretty hard to read that way.

Is there a way to 're-origin' an SCCS file so that the first stored version is whatever was there, say, 5 years ago - and only those deltas since then are preserved? I suppose that could be done by checking out each version you want and re-delta'ing them into a new repo, but somebody must have automated that process, no?

Or, better yet(?), is there a utility that will show the current version of a file annotated with the version # associated with each line? I just came across documentation for an 'annotate' command in BitKeeper. That's the kind of thing I'm asking about. Oops. I see that the "sccs get -m" command does that (and I actually have an option in my sccs wrapper script to do it). Sorry - what a dope.

The first 're-origin' question still stands, though...

littlenoodles

2 Answers

You're mixing together some separate concepts (as you found out with the annotate sub-command). Annotation (git annotate, svn annotate, and so on), sometimes called "blame", asks the VCS to show you one particular revision of the source together with, for each fragment of it, the earliest revision that contained that fragment. The fragment is usually a line, since we mostly deal with "lines of code".

But that's not the part you are still asking about, so let's dive in. All version control systems have a problem: if you are going to store every version of every file, you could require huge amounts of storage space.

Most VCSes therefore use some kind of compression, and most immediately jump straight to using delta encoding, also called delta compression: store one version fully intact, and instead of storing a second version intact, store only a set of instructions that can be used to modify the stored-intact version in order to obtain the second version.

Repeat this for many versions and you get chained deltas, where you start with some initial version and repeatedly modify it to get a final version. Initially, SCCS used this in a "forward" direction: store the initial version, then store the first change to it as the second version, then store the change to the second version as the third version, and so on. Retrieving the latest version this way means applying every delta in the chain, so it gets slower as history grows. RCS therefore uses reverse deltas: store the latest version intact while storing deltas to compute earlier versions from their successors. This makes retrieving the most recent version fast and the oldest version slow, which is usually what one wants, but it has a different drawback: it doesn't work properly with branches. RCS therefore actually uses reverse deltas on its trunk and forward deltas on its branches.
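To make the forward/reverse distinction concrete, here is a minimal Python sketch. It is a toy model, not the SCCS or RCS file format: the names make_delta, apply_delta, get_forward, and get_reverse are made up for illustration, and a "delta" here is just a list of copy/insert instructions computed with the standard difflib module.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Return instructions that rebuild `new` from `old` (both lists of lines)."""
    delta = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))        # reuse old[i1:i2]
        elif j2 > j1:
            delta.append(("insert", new[j1:j2]))  # lines only `new` has
        # a pure deletion needs no instruction: the lines are just not copied
    return delta


def apply_delta(old, delta):
    """Rebuild the target version from `old` and a delta from make_delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(old[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


# Hypothetical history: version 1 is first, version 4 is the latest.
VERSIONS = [
    ["alpha", "beta"],
    ["alpha", "beta", "gamma"],
    ["alpha", "BETA", "gamma"],
    ["alpha", "BETA", "gamma", "delta"],
]
PAIRS = list(zip(VERSIONS, VERSIONS[1:]))
FORWARD = [make_delta(a, b) for a, b in PAIRS]  # FORWARD[i]: version i+1 -> i+2
REVERSE = [make_delta(b, a) for a, b in PAIRS]  # REVERSE[i]: version i+2 -> i+1


def get_forward(n):
    """Version n (1-based) from the first version plus forward deltas."""
    text = VERSIONS[0]
    for delta in FORWARD[:n - 1]:
        text = apply_delta(text, delta)
    return text, n - 1                # second item: deltas applied


def get_reverse(n):
    """Version n (1-based) from the latest version plus reverse deltas."""
    text, steps = VERSIONS[-1], 0
    for delta in reversed(REVERSE[n - 1:]):
        text = apply_delta(text, delta)
        steps += 1
    return text, steps
```

In this model get_forward(4) applies three deltas to reach the latest version, while get_reverse(4) applies none; for version 1 the costs are swapped.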

Since CVS is (or was) built atop the RCS file format, it too uses this reverse-yet-also-forward delta storage format.

For performance, modern SCCS uses a concept known as interleaved deltas, where the repository stores what amounts to a sort of union of all versions. A linear pass through a stored file suffices to extract any particular version. (The linked Wikipedia page says BitKeeper also uses interleaved deltas.)
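The "union of all versions" idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the real s. file layout: history is strictly linear, each stored line carries the version that inserted it and the version that deleted it (or None), and the names add_version, extract, and reorigin are hypothetical. The reorigin function is only an illustration of what the question's 're-origin' would mean in this model; it is not an SCCS command.

```python
from difflib import SequenceMatcher


def extract(weave, version):
    """One linear pass: keep lines inserted at or before `version` and not yet deleted."""
    return [line for line, ins, dele in weave
            if ins <= version and (dele is None or dele > version)]


def add_version(weave, new_lines, version):
    """Return a new weave in which `new_lines` is stored as `version`."""
    old_lines = [line for line, _, dele in weave if dele is None]
    delete_at, insert_before = set(), {}   # keyed by position among live lines
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            delete_at.update(range(i1, i2))
        if tag in ("insert", "replace"):
            insert_before.setdefault(i1, []).extend(new_lines[j1:j2])
    out, k = [], 0                          # k: position among live lines
    for line, ins, dele in weave:
        if dele is None:
            out.extend((new, version, None) for new in insert_before.get(k, []))
            if k in delete_at:
                dele = version              # absent from `version` onwards
            k += 1
        out.append((line, ins, dele))
    out.extend((new, version, None) for new in insert_before.get(k, []))
    return out


def reorigin(weave, base):
    """Forget history before `base`; versions >= base still extract identically."""
    out = []
    for line, ins, dele in weave:
        if dele is not None and dele <= base:
            continue                        # never visible in `base` or later
        out.append((line, max(ins, base), dele))
    return out


# Hypothetical four-version history.
VERSIONS = [
    ["alpha", "beta"],
    ["alpha", "beta", "gamma"],
    ["alpha", "BETA", "gamma"],
    ["alpha", "BETA", "gamma", "delta"],
]
WEAVE = []
for number, lines in enumerate(VERSIONS, start=1):
    WEAVE = add_version(WEAVE, lines, number)
```

Because every line already records where it came from, the same pass that extracts a version can also report the inserting version next to each line, which is what an annotated listing amounts to.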

I don't personally know what Subversion uses internally, but the question "How exactly does subversion store files in the repository?" suggests it uses reverse deltas plus snapshots. Baking in a snapshot (another "fully intact" version) now and then provides a way to set a limit on the number of deltas to apply.
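How a periodic snapshot caps the chain length is simple arithmetic. The sketch below assumes a generic scheme in which every interval-th revision is stored in full and the others are reached by applying deltas from the nearest earlier full copy; it is not a description of Subversion's actual on-disk format, and chain_length is a made-up name.

```python
def chain_length(revision, interval=None):
    """Deltas to apply to rebuild `revision` (0-based) from the nearest earlier full copy."""
    if interval is None:
        return revision          # only revision 0 is stored in full
    return revision % interval   # revisions 0, interval, 2*interval, ... are full copies
```

With no snapshots the cost grows with the revision number; with a snapshot every 50 revisions it never exceeds 49, at the price of storing more full copies.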

Git does something fairly different. Instead of storing each file separately and delta-compressing against previous versions of that file, Git stores what it calls objects. Each object is, at least logically, stand-alone (a snapshot)—so that no version of any file is ever delta-compressed, though each snapshot is zlib-compressed. To save additional space, Git occasionally compresses multiple objects into a single "pack file", and here the pack file internal objects do get delta-compressed ... but against any other objects, not just ones that represent the same source file. (In particular this allows Git to compress tree objects using other tree objects.)
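As a rough illustration of the "stand-alone, zlib-compressed object" idea, the sketch below builds what Git stores for a file's contents (a "blob") before any packing happens. It assumes a SHA-1 repository, and loose_blob is a made-up helper name. Each version becomes its own complete object, with no reference to the other.

```python
import hashlib
import zlib


def loose_blob(content: bytes):
    """Return (object id, compressed bytes) for a blob holding `content`."""
    store = b"blob " + str(len(content)).encode() + b"\0" + content
    return hashlib.sha1(store).hexdigest(), zlib.compress(store)


# Two hypothetical versions of one file: each is stored whole.
VERSION_1 = b"alpha\nbeta\n"
VERSION_2 = b"alpha\nbeta\ngamma\n"
ID_1, DATA_1 = loose_blob(VERSION_1)
ID_2, DATA_2 = loose_blob(VERSION_2)
```

Decompressing either object yields the header plus the full file contents, which is why no other version is needed to read it.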

In the end, what this means for Git is that it uses delta-compression, in potentially any direction—this might be mixed forward and reverse—with snapshots to keep chain lengths limited (--depth or pack.depth), but of objects against objects, not necessarily just one file against another version of the same file. For practical purposes, Git does not choose these objects totally arbitrarily (it's too hard to know which objects will compress well with other ones in advance) but instead does it by object type, object size, file name as found in a tree object, and age (I'm not sure where it gets these ages from), so that Git often winds up with the equivalent of reverse deltas, but it's not guaranteed. (Git also re-uses precomputed delta chains from previous packs unless you tell it not to; see --no-reuse-delta aka -f.)

As a general rule, there's no way to tell a revision system to completely revamp its internal storage format. Some VCSes may have exceptions. For instance, Git is a bit unusual since git repack does exactly that, but not in a way that's useful to your particular use case!

torek

(I have to wonder why on Earth you're using SCCS.)

Such commands vary across different source control systems. SCCS is positively ancient, and is unlikely to have such a command.

In RCS (which is also ancient, but slightly more modern than SCCS), you can use the "-o" option to remove ("outdate") old versions of files. For example, if you have a file with stored revisions 1.1, 1.2, 1.3, ..., then you can use

rcs -o1.1 filename

to remove version 1.1 from the history, or

rcs -o1.1:1.3 filename

to remove revisions 1.1, 1.2, and 1.3.

"is there a utility that will show the current version of a file annotated with the version # associated with each line"

For most modern systems, yes:

  • CVS: cvs annotate filename
  • SVN: svn blame filename (also praise, annotate, ann)
  • Git: git blame filename
  • Mercurial: hg annotate or hg blame

RCS has no such command as far as I know. For SCCS, as the question itself notes, "sccs get -m" does this. It's easy to import RCS files into CVS (just copy filename,v into the CVS repo), and there are probably automated tools to convert an SCCS repo into an RCS repo.

These commands show a line-by-line annotation of the current version of the file, which means they won't give you any information about deleted lines. For that, you can use some diff-like command to compare specified versions (rcsdiff, cvs diff, svn diff, git diff, hg diff).
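What such an annotation computes can be sketched in Python, assuming you already have every revision's full text, oldest first. This is only an illustration of the idea using difflib; it is not how any of the tools above are implemented, and annotate and REVISIONS are made-up names.

```python
from difflib import SequenceMatcher


def annotate(revisions):
    """revisions: non-empty list of (name, lines), oldest first.

    Returns [(name, line)] for the newest revision, where name is the
    earliest revision in which that line appeared in its current place.
    """
    name, lines = revisions[0]
    blame = [name] * len(lines)
    for name, new in revisions[1:]:
        new_blame = []
        matcher = SequenceMatcher(None, lines, new, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                new_blame.extend(blame[i1:i2])        # unchanged: keep origin
            else:
                new_blame.extend([name] * (j2 - j1))  # added or rewritten here
        lines, blame = new, new_blame
    return list(zip(blame, lines))


# Hypothetical three-revision history.
REVISIONS = [
    ("1.1", ["alpha", "beta"]),
    ("1.2", ["alpha", "beta", "gamma"]),
    ("1.3", ["alpha", "BETA", "gamma"]),
]
RESULT = annotate(REVISIONS)
```

Here RESULT pairs "alpha" with 1.1, "BETA" with 1.3, and "gamma" with 1.2. The original "beta" line does not appear at all, which is the deleted-lines limitation described above.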

torek
Keith Thompson
  • I thought those 'remove versions' options removed the associated changes as well. Is that not true? I want to keep everything that's in the current version and just lose ancient history of how it got there. – littlenoodles Sep 28 '17 at 19:53
  • Using SCCS, because we're on AIX and that's what was there when we started writing our system in the 80's. I've got some decent scripts that wrap up SCCS commands to make it easy to use across a project, tailored to the way we've been working - and migration always seemed too hard. But yeah, we're dinosaurs.. ;-) – littlenoodles Sep 28 '17 at 19:55
  • @littlenoodles: I'm not quite sure what you mean by "removed the associated changes". If you have revisions 1.1, 1.2, 1.3, and 1.4 in your repo, then `rcs -o1.1:1.2` will leave you with revisions 1.3 and 1.4 -- and they'll be unchanged from what they were before the command. You'll just lose the old history. I suggest you create a small test repo to verify this. – Keith Thompson Sep 28 '17 at 19:56
  • Oh, then it does do what I asked. The documentation wasn't clear enough - or I was too dense to get it. It read like 'undo that set of changes' - but who would practically want to do that? Thanks for the clarification, though. – littlenoodles Sep 28 '17 at 19:59