This may have been answered already, but I didn't find a good answer.
I come from centralized repositories, such as SVN, where usually you only perform checkouts, updates, commits, reverts, merges and not much more.

Git is driving me crazy. There are tons of commands, but the hardest part is understanding why many things work the way they do.

According to "What is a bare git repository?":

Repositories created with git init --bare are called bare repos. They are structured a bit differently from working directories. First off, they contain no working or checked out copy of your source files.

A bare repository created with git init --bare is for… sharing. …developers will clone the shared bare repo, make changes locally in their working copies of the repo, then push back to the shared bare repo to make their changes available to other users.
– Jon Saints, http://www.saintsjd.com/2011/01/what-is-a-bare-git-repository/

However, from the accepted answer to "what's the difference between github repository and git bare repository?":

Git repos on GitHub are bare, like any remote repo to which you want to push to [sic].
– VonC, https://stackoverflow.com/a/20855207

However, on GitHub there are source files. I can see them. If I create a bare repository, there are no source files, only the contents of the .git directory of a working repository.

How is this possible? What don't I understand?

Can you give an example of why I would need a bare repository, and the motivation for it working that way?

UPDATE

Edward Thomson's answer is, in part, what I wanted to know. Nevertheless, I will rephrase my question:

The first link I posted ("What is a bare git repository?") states:

they [bare repositories] contain no working or checked out copy of your source files.

VonC's answer:

Git repos on GitHub are bare

Both statements imply:

GitHub has no working copy.

Edward Thomson says:

it renders the web page based on the data as you navigate through it - pulling the data directly out of the repo and out to your web browser, not writing it to a disk on the fileserver first

Somehow, a bare repository has to contain all data and source code. If not, it would be impossible to render anything, because I can see all the committed source code, all branches (with their respective sources), the whole log of a repo, etc.

Is all of a repository's data always stored within the .git directory (or in a bare repo), in some format from which every file can be rendered at any time? Is this the reason for bare repositories, while a working copy only has the files as they were at a given time?

Albert
  • the link you provided, http://stackoverflow.com/a/20855207, directly answers your question – Uku Loskit Jun 23 '16 at 13:02
  • A bare Git repository is more like your SVN repository on the server (shared, centralized), while a normal Git repository is more like your SVN working copy. – crashmstr Jun 23 '16 at 13:08
  • @larsks I don't understand why you marked this as a duplicate. I already posted the link you refer to, and it does not answer the question I asked. – Albert Jun 23 '16 at 13:49
  • @UkuLoskit, it does not. It says repositories on `github` are bare, but it contains all source code, which contradicts _they contain no working or checked out copy of your source files_. – Albert Jun 23 '16 at 13:52
  • Okay, you want to work with your friends on a project and all of you agreed to use Git as your source control. You first set up a repo on a server (if you have your own server; if not, you use Git hosting sites like GitHub), then you and your friends clone this repo. This repo doesn't need a working directory! All it needs is just the .git directory. The bare repo is one without a working directory. And this is an example of when you need a repo with no working directory (a bare one). – joker Jun 23 '16 at 14:41
  • @Albert There _is_ no checked out or working copy. GitHub doesn't have a working directory for your repository, it renders the web page based on the data as you navigate through it - pulling the data directly out of the repo and out to your web browser, not writing it to a disk on the fileserver first. – Edward Thomson Jun 23 '16 at 16:20
  • @EdwardThomson, I updated my question to make it more clear. Your comment helped me to rephrase. Thanks. – Albert Jun 23 '16 at 19:45
  • Possible duplicate of [What's the -practical- difference between a Bare and non-Bare repository?](http://stackoverflow.com/questions/5540883/whats-the-practical-difference-between-a-bare-and-non-bare-repository) – Dietrich Epp Jun 23 '16 at 21:23

2 Answers

Is all of a repository's data always stored within the .git directory (or in a bare repo), in some format from which every file can be rendered at any time?

Yes, those files and their complete history are stored in .git/objects, while the branch and tag pointers into that history live in .git/refs and .git/packed-refs.
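As a rough illustration of that layout, here is a minimal sketch that makes one commit in a throwaway repo and then reads it back out of the object store. The directory and file names are made up for the example, and it assumes Git's default on-disk storage.

```shell
#!/bin/sh
# Sketch only: paths, names and the commit identity are hypothetical.
set -e
tmp=$(mktemp -d)

git init -q "$tmp/demo"
echo "v1" > "$tmp/demo/file.txt"
git -C "$tmp/demo" add file.txt
git -C "$tmp/demo" -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "first"

ls "$tmp/demo/.git/objects"                     # the object store (commits, trees, blobs)
git -C "$tmp/demo" count-objects -v             # how many objects are loose vs packed
git -C "$tmp/demo" rev-parse HEAD               # commit id the current branch ref points to
git -C "$tmp/demo" cat-file -p 'HEAD^{tree}'    # directory listing recorded in that commit
git -C "$tmp/demo" cat-file -p HEAD:file.txt    # file content, read from the object store
```

Nothing here reads file.txt from the working tree: the last two commands get everything from the objects under .git.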

When you clone a repo (bare or not), you always get the .git folder (or, for a bare repo, a folder with a .git extension, by naming convention) with its Git administrative and control files (see the glossary).

Git can read whatever it needs from those files at any time; packed objects can even be expanded explicitly into loose objects with git unpack-objects.

The trick is:

From a bare repo, you can query the logs (git log in a bare repo works just fine: no need for a working tree), or list the files it contains.
Or show the content of a file from a bare repo.
That is how GitHub can render a page with files without having to check out the full repo.
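A minimal sketch of those three queries against a bare repo, assuming a throwaway repository built just for the demonstration (all paths and file names are made up):

```shell
#!/bin/sh
# Sketch only: paths, names and the commit identity are hypothetical.
set -e
tmp=$(mktemp -d)

# Build a small ordinary repo with one commit.
git init -q "$tmp/work"
echo "hello from a bare repo" > "$tmp/work/README.txt"
git -C "$tmp/work" add README.txt
git -C "$tmp/work" -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "Add README"

# Clone it as a bare repo: only Git's own data, no checked-out files.
git clone -q --bare "$tmp/work" "$tmp/shared.git"

# Query the bare repo directly, without any working tree.
git --git-dir="$tmp/shared.git" log --oneline                  # history
git --git-dir="$tmp/shared.git" ls-tree -r --name-only HEAD    # file list
git --git-dir="$tmp/shared.git" show HEAD:README.txt           # file content
```

README.txt never exists as a file inside shared.git; its content is produced on request from the stored objects.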

I don't know whether GitHub does exactly that though, as the sheer number of repos forces the GitHub engineering team to do all kinds of optimizations.
See for instance how they optimized cloning/fetching a repo.
With DGit, those bare repos are actually replicated across multiple servers.

Is this the reason for bare repositories, while a working copy only has the files as they were at a given time?

For GitHub, maintaining a working tree would cost too much in disk space and in updates (when each user requests a different branch). It is better to extract from the single bare repo what you need to render a page.

In general (outside of GitHub's constraints), a bare repo is used for pushing, in order to avoid having a working tree out of sync with what has just been pushed. See "but why do I need a bare repo?" for a concrete example.
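The push workflow can be sketched as follows. This is only an illustration with made-up names (central.git, alice, bob), all on the local disk rather than on a server:

```shell
#!/bin/sh
# Sketch only: repo names, file names and identities are hypothetical.
set -e
tmp=$(mktemp -d)

# The shared repo: bare, so there is no working tree that a push could leave stale.
git init -q --bare "$tmp/central.git"

# First developer clones (an empty repo, hence the silenced warning), commits, pushes.
git clone -q "$tmp/central.git" "$tmp/alice" 2>/dev/null
echo "first note" > "$tmp/alice/notes.txt"
git -C "$tmp/alice" add notes.txt
git -C "$tmp/alice" -c user.name="Alice" -c user.email="alice@example.com" commit -q -m "Add notes"
git -C "$tmp/alice" push -q origin HEAD

# Second developer clones and receives the pushed work.
git clone -q "$tmp/central.git" "$tmp/bob"
cat "$tmp/bob/notes.txt"

# The shared repo holds the history, but no notes.txt on disk.
git --git-dir="$tmp/central.git" log --all --oneline
ls "$tmp/central.git"
```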

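That being said, since Git 2.3 you can push to a non-bare repo and have its working tree updated, by setting receive.denyCurrentBranch to updateInstead in that repo.

But that would not be possible for GitHub, which cannot maintain one (or several) working tree(s) for each repo it has to store.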


The article "Using a bare Git repo to get version control for my dotfiles" from Greg Owen, originally reported by aifusenno1, adds:

A bare repository is a Git repository that does not have a snapshot.
It just stores the history. It also happens to store the history in a slightly different way (directly at the project root), but that’s not nearly as important.

A bare repository will still store your files (remember, the history has enough data to reconstruct the state of your files at any commit).
You can even create a non-bare repository from a bare repository: if you git clone a bare repository, Git will automatically create a snapshot for you in the new repository (if you want a bare repository, use git clone --bare).
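To make the difference between the two kinds of clone concrete, here is a small sketch (the repo and file names are invented for the example):

```shell
#!/bin/sh
# Sketch only: paths, names and the commit identity are hypothetical.
set -e
tmp=$(mktemp -d)

# Seed a bare repo with one commit (via a throwaway non-bare repo).
git init -q "$tmp/seed"
echo "content" > "$tmp/seed/file.txt"
git -C "$tmp/seed" add file.txt
git -C "$tmp/seed" -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "Add file"
git clone -q --bare "$tmp/seed" "$tmp/origin.git"

# Plain clone of the bare repo: Git checks out a snapshot for you.
git clone -q "$tmp/origin.git" "$tmp/with-snapshot"
ls "$tmp/with-snapshot"      # file.txt is there

# Bare clone of the bare repo: history only, no snapshot.
git clone -q --bare "$tmp/origin.git" "$tmp/copy.git"
ls "$tmp/copy.git"           # Git's internal files, no file.txt
```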

And Greg adds:

So why would we use a bare Git repository?

Almost every explanation I found of bare repositories mentioned that they’re used for centralized storage of a repository that you want to share between multiple users.

See Git repository layout:

a <project>.git directory that is a bare repository (i.e. without its own working tree), that is typically used for exchanging histories with others by pushing into it and fetching from it.

Basically, if you wanted to write your own GitHub/GitLab/BitBucket, your centralized service would store each repo as a bare repository.
But why? How does not having a snapshot connect to sharing?

The answer is that there’s no need to have a snapshot if the only service that’s interacting with your repo is Git.
Basically, the snapshot is a convenience for humans and non-Git tools, but Git only interacts with the history. Your centralized Git hosting service will only interact with the repos through Git commands, so why bother materializing snapshots all the time? The snapshots only take up extra space for no gain.

GitHub generates that snapshot on the fly when you access that page, rather than storing it permanently with the repo (this means that GitHub only needs to generate a snapshot when you ask for it, rather than keeping one updated every time anybody pushes any changes).
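How GitHub does this internally is not something I can show, but the general idea of materializing a snapshot only on demand can be sketched with git archive against a bare repo. The paths and names below are made up for the illustration:

```shell
#!/bin/sh
# Sketch only: this is not GitHub's implementation; names are hypothetical.
set -e
tmp=$(mktemp -d)

# A bare repo holding one commit.
git init -q "$tmp/seed"
echo "print('hi')" > "$tmp/seed/app.py"
git -C "$tmp/seed" add app.py
git -C "$tmp/seed" -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "Add app"
git clone -q --bare "$tmp/seed" "$tmp/hosted.git"

# Nothing is checked out until somebody asks for a snapshot.
mkdir "$tmp/snapshot"
git --git-dir="$tmp/hosted.git" archive HEAD | tar -xf - -C "$tmp/snapshot"
ls "$tmp/snapshot"           # the files as of HEAD, produced on request
```

The snapshot directory can be thrown away afterwards; the bare repo is unaffected and can produce it again for any commit.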

VonC
  • There is a lot of reading... but I'm sure all these links you provided will make me understand how Git works, because it is much more different from a centralized version control system than I thought. Thanks a lot. – Albert Jun 24 '16 at 02:34
  • @Albert a couple of years ago, when I was starting to learn Git, I found this post on [a successful git branching model](http://nvie.com/posts/a-successful-git-branching-model/) quite useful. – Paul Rougieux Jun 24 '16 at 13:06

Why would I need one?

The link "but why do I need a bare repo?" from VonC's answer could be complemented by two use cases I have found recently.

The first is essential to know imho, while the second could be criticized.

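A - To sync your home dotfiles

No more symlinks pointing to your Git repo. Just use:

    # Create the bare repo that will hold the dotfiles history
    git init --bare "$HOME/.myconf"
    # "config" is git with that repo as git-dir and $HOME as work tree
    alias config='/usr/bin/git --git-dir=$HOME/.myconf/ --work-tree=$HOME'
    # Hide the many untracked files in $HOME from "config status"
    config config status.showUntrackedFiles no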

where my ~/.myconf directory is a git bare repository. Then any file within the home folder can be versioned with normal commands like:

    config status
    config add .vimrc
    config commit -m "Add vimrc"
    config add .config/redshift.conf
    config commit -m "Add redshift config"
    config push

One of the major benefits is that it prevents nested Git repos. More details are in the source.

B - To host a Git project inside a cloud-synced folder

It is not a good idea to create a .git/ dir inside a cloud-synced folder because the synchronization could mess everything up. But using the same technique as above you can use a bare repository outside the synced dir to use versioning and still have the comfort of a synced dir.
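A minimal sketch of that setup, where CloudDrive/project stands in for the cloud-synced folder and repos/project.git for the bare repo kept outside it (both names, and the proj helper, are made up for the example):

```shell
#!/bin/sh
# Sketch only: folder names and the "proj" helper are hypothetical.
set -e
tmp=$(mktemp -d)
synced="$tmp/CloudDrive/project"    # stands in for the cloud-synced folder
gitdir="$tmp/repos/project.git"     # bare repo kept outside the synced folder

mkdir -p "$synced" "$tmp/repos"
git init -q --bare "$gitdir"

# Same technique as the dotfiles alias: point git at both locations explicitly.
proj() { git --git-dir="$gitdir" --work-tree="$synced" "$@"; }

cd "$synced"
echo "draft" > notes.txt
proj add notes.txt
proj -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "Add notes"
proj log --oneline
ls -a "$synced"                     # only notes.txt, no .git directory to be synced
```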

pietrodito