
I have a repo in which I inadvertently committed a number of large files to the source directory, and I didn't notice the error until I checked it out to another location. It is huge. It used to download in two seconds and now takes three minutes.

I have removed directories but don't see how to filter out files of a specific type -- by extension or name pattern.

Is there a standard way to do this? Or a hack?

Stephen Boston
    This is about private data, but it's the same principle. https://help.github.com/en/github/authenticating-to-github/removing-sensitive-data-from-a-repository – EncryptedWatermelon Nov 21 '19 at 19:01

2 Answers


The problem is that even though you removed the files in your latest commit, they still exist in the repository's history: the commit that added them, and every later commit that still contained them, keeps a reference to their contents.
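To see which files are actually bloating the history, something like the following can help. It is only a sketch: the function name list_largest_blobs is made up for illustration, and it assumes you run it from inside the affected repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name is illustrative, not a Git command):
# print "<size in bytes> <path>" for the largest blobs reachable from any ref.
list_largest_blobs() {
    local count="${1:-10}"   # how many entries to show (default 10)

    # rev-list emits "<sha> <path>" for every object in history;
    # cat-file --batch-check adds type and size, %(rest) carries the path through.
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk '$1 == "blob" { sub(/^blob /, ""); print }' |
        sort -k1,1nr |
        head -n "$count"
}
```

Calling list_largest_blobs 20 inside the repo lists the twenty largest blobs, including ones that were deleted in later commits, which is usually enough to decide which name patterns need to be purged.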

BFG

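The easiest option to remove the files completely from your Git repository is to use BFG Repo-Cleaner (Reference). Its --delete-files option takes either a single file name or a glob such as '*.bin', matched against the file name rather than the full path.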

bfg --delete-files name_of_file.txt

Git Filter-Branch

The other option is to use git filter-branch (Reference). The following example completely removes the ./text_files directory from a Git repository.

git filter-branch --tree-filter "rm -rf text_files" --prune-empty HEAD
git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
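Since the question is about files matching an extension or name pattern rather than a whole directory, the same approach can be sketched with --index-filter and a quoted pathspec. Treat this as an illustration, not a drop-in script: purge_pattern_from_history is a made-up name, the pattern is whatever you pass in, and it assumes a clean working tree and a pattern without single quotes in it. Try it on a fresh clone first.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name is illustrative): remove every path matching a
# Git pathspec from all branches and tags, then drop the leftover objects.
purge_pattern_from_history() {
    local pattern="$1"   # e.g. '*.bin' or 'source/*.iso'

    # --index-filter edits each commit's index without checking files out,
    # which is faster than --tree-filter. The pattern stays quoted so Git,
    # not the shell, expands it; '*' also matches across directories.
    # FILTER_BRANCH_SQUELCH_WARNING skips the warning pause in newer Git.
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
        --index-filter "git rm -r -q --cached --ignore-unmatch -- '$pattern'" \
        --prune-empty -- --all || return 1

    # filter-branch keeps backups under refs/original/; delete them.
    git for-each-ref --format="%(refname)" refs/original/ |
        while read -r ref; do git update-ref -d "$ref"; done

    # Expire reflog entries and garbage-collect so the old blobs are dropped.
    git reflog expire --expire=now --all
    git gc --quiet --prune=now
}
```

For example, purge_pattern_from_history '*.bin' would strip every .bin file from every commit. The size of the .git directory only shrinks after the reflog and gc steps, because until then the old commits are still reachable locally.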

Afterwards

After you remove the files, you may want to configure Git to ignore them.

echo text_files/ >> .gitignore
git add .gitignore
git commit -m 'Ignore text_files directory'
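For files of a specific type, the ignore rule can be a glob instead of a directory. A small sketch, where ignore_pattern is a made-up helper name and '*.bin' is just an example extension:

```shell
#!/usr/bin/env bash
# Hypothetical helper (name is illustrative): add a pattern to .gitignore
# in the current directory unless that exact line is already there.
ignore_pattern() {
    local pattern="$1"   # e.g. '*.bin'
    grep -qxF -- "$pattern" .gitignore 2>/dev/null ||
        printf '%s\n' "$pattern" >> .gitignore
}
```

After ignore_pattern '*.bin', running git check-ignore -v source/big.bin shows which rule matches a given path. Note that .gitignore only affects untracked files, so it prevents a repeat of the mistake but does nothing for files already committed.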

You will also need to overwrite the history on your remote repository with a forced push.

git push origin master --force

Note: If anyone else has cloned this Git repository, they will need to reset their local copy to point to the new HEAD.
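What that reset might look like for a collaborator is sketched below. The function name resync_with_rewritten_remote is made up, the defaults (origin, master) simply mirror the push command above, and it throws away local commits and uncommitted changes on that branch, so anything worth keeping should be saved elsewhere first.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name is illustrative): make a local branch match the
# rewritten remote branch. WARNING: discards local commits and uncommitted
# changes on that branch.
resync_with_rewritten_remote() {
    local remote="${1:-origin}"
    local branch="${2:-master}"

    git fetch -q "$remote" &&
        git checkout -q "$branch" &&
        git reset -q --hard "$remote/$branch"
}
```

A fresh clone achieves the same result and is often simpler. Either way, the old commits must not be merged or pushed back, or the large files return to the remote.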

Garrett Hyde
  • I had already removed some unwanted directories using filter-tree. That was not a problem. However, I had large files in the `source` directory, so I could not remove the directories in that way. At EncryptedWatermelon's suggestion I have run bfg, but since you've put it as an answer I'll give you the tick. – Stephen Boston Nov 21 '19 at 21:16

You can use a wildcard to match all files with a given extension. For instance,

git rm *.txt

would remove all files with the .txt extension. The wildcard can replace any text so you might be able to use it for the name pattern too.

    Thank you but I believe this will remove only from the current commit. I want to remove all target files from the history. I will try the suggestion from @EncryptedWatermelon – Stephen Boston Nov 21 '19 at 19:18