In the question "Why is my git repository so big?", I found this script to list the biggest files in a repository:
git rev-list --all --objects | \
sed -n $(git rev-list --objects --all | \
cut -f1 -d' ' | \
git cat-file --batch-check | \
grep blob | \
sort -n -k 3 | \
tail -n40 | \
while read hash type size; do
echo -n "-e s/$hash/$size/p ";
done) | \
sort -n -k1
But it outputs the file sizes in a hard-to-read way, like this:
89076 images/screenshots/properties.png
103472 images/screenshots/signals.png
9434202 video/parasite-intro.avi
I would like it to display the sizes in a more readable way, such as:
89.076 KB - images/screenshots/properties.png
103.472 KB - images/screenshots/signals.png
9,434.202 KB - video/parasite-intro.avi
For example, the last one, 9,434.202 KB, is the same as 9.434 MB or 0.009434 GB. I am also not sure whether it would be better to write 9.434,202 KB instead, i.e., just swapping the comma and the dot.
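To make the unit arithmetic concrete, here is a minimal sketch of the conversion, assuming decimal units (1 KB = 1000 bytes, as in the figures above) and a POSIX `awk`. The function name `human_size` is made up for illustration; it is not part of git or of the original script.

```shell
# human_size BYTES
# Hypothetical helper: divides by 1000 until the value drops below 1000
# (or the largest unit is reached), then prints it with three decimals.
human_size() {
  awk -v bytes="$1" 'BEGIN {
    split("B KB MB GB TB", unit, " ")
    value = bytes
    i = 1
    while (value >= 1000 && i < 5) {
      value /= 1000
      i++
    }
    printf "%.3f %s\n", value, unit[i]
  }'
}

human_size 89076     # 89.076 KB
human_size 9434202   # 9.434 MB
```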
My first idea was to generate the whole list and process it afterwards, but it would be nicer to do the formatting while the list is being generated. In that case the width needed to right-justify the sizes cannot be known in advance, but it would be fine to print the list as below, without right justification:
89.076 KB - images/screenshots/properties.png
103.472 KB - images/screenshots/signals.png
9,434.202 KB - video/parasite-intro.avi
I think the printing is being performed by this line:
echo -n "-e s/$hash/$size/p ";
However, I do not understand how to format the $size parameter.
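One thing worth noting: that `echo` line does not print the final output. It builds `-e s/<hash>/<size>/p` arguments for `sed`, and because the `$(...)` result is split on whitespace, a formatted size containing spaces (such as `9,434.202 KB -`) would likely break the command. A possible alternative, sketched here under the assumption that the sizes are plain byte counts and that a POSIX `awk` is available, is to leave the loop alone and format each `size path` line as it leaves the pipeline. The function name `format_sizes` is hypothetical.

```shell
# format_sizes: reads "SIZE PATH" lines on stdin and prints
# "N,NNN.NNN KB - PATH", treating 1 KB as 1000 bytes (an assumption
# matching the desired output above). Lines are formatted one at a time,
# so no right justification is attempted.
format_sizes() {
  awk '{
    size = $1
    # Everything after the first field and its single separating space
    # is the path, which may itself contain spaces.
    path = substr($0, length($1) + 2)
    whole = sprintf("%d", int(size / 1000))   # whole kilobytes
    frac = size % 1000                        # remaining bytes
    grouped = ""
    # Peel off three digits at a time from the right, adding commas.
    while (length(whole) > 3) {
      grouped = "," substr(whole, length(whole) - 2) grouped
      whole = substr(whole, 1, length(whole) - 3)
    }
    printf "%s%s.%03d KB - %s\n", whole, grouped, frac, path
  }'
}

printf '9434202 video/parasite-intro.avi\n' | format_sizes
```

With this approach the original script would stay unchanged except for its last line, which would become `sort -n -k1 | format_sizes`.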