My repository size has increased after running BFG to remove large files

cpedersen April 20, 2025

Hi support,

I used BFG Repo-Cleaner to permanently remove large files from my repository’s history, following both Bitbucket’s and BFG’s guidelines. However, after pushing the cleaned repo, the reported repository size increased significantly—from around 2.x GB to 3.65 GB.

After the git push, I received a message saying: “large file detection timed out”, which I understand means Bitbucket couldn’t finish scanning the repo within the allowed time. Could this be why the size is being misreported?

Is it possible to request a full cleanup of the repository on Bitbucket’s side to ensure the correct size is shown? Or could there still be dangling large files that weren’t fully removed? I’ve included all the BFG commands I used further down.

Also, this is the fifth time I’ve tried reaching out. Nothing has happened any of the times I posted — my question hasn’t shown up in the Atlassian Community, there’s nothing under my profile, and I haven’t received any emails explaining what might be wrong. Could you please let me know how I should proceed to get help?

Thanks in advance!

Best,
Camilla

git clone --mirror https://myuser@bitbucket.org/myuser/myrepo.git

java -jar bfg-1.15.0.jar --delete-files "{5_Metro_OLD.unity,5_Metro_ShunterWorks.unity,7_HotelElevator.unity,9_Hotel_OLD.unity,9_Hotel_OLDRemove.unity,AmbienceInfirmary.wav,AmbienceOutsideMildWindWithWater.aif,AmbiencePrisonHallway2Remove.wav,AmbiencePrisonHallway3Remove.wav,AmbienceShowerRoom.wav,AmbienceStaffKitchen.wav,AmbienceStorageRoom-Duplicate.wav,AmbienceStorageRoom.wav,ApartmentOld.unity,ChateauTimelineDemo.unity,PrisonStairs.unity,ApartmentFootstepsTest.unity,SoundtrackBoatREMOVE.wav,AmbienceOutsideMildWindNoBirds.aif,TVTicketOffice 1.wav,Amb outdoors.wav,Amb outdoors no animals.wav,Amb outdoor mix w bass.wav,ApartmentBasementAmbience 1.wav,ambience_washing_room.wav,SoundtrackBoat.wav,ambience_hallway.wav,CryptStaircaseAmbience.wav,AmbienceShowerLoop.wav,11_ApartmentRevisitedCille.unity,Crypt ambience.wav,BoatMastRigCalmWaterLoop.wav,blunt-smoke-png.png,apartment_bass1.wav,layer_3.png,AmbienceChurchOrganREMOVE.wav}" myrepo.git

cd myrepo.git

git gc

git reflog expire --expire=now --all && git gc --prune=now --aggressive

git push

4 answers

3 accepted

0 votes
Answer accepted
cpedersen April 24, 2025

Hi Theodora,


I’ve tried a couple of things to clean up the repository, but haven’t succeeded yet. I now have two questions:


1. Git push failed after BFG cleanup
I attempted to delete more files from the repository using BFG, but when I pushed the changes, I received this error:

git push

Output:

Enumerating objects: 144720, done.
error: RPC failed; curl 55 Recv failure: Connection reset by peer
send-pack: unexpected disconnect while reading sideband packet
Writing objects: 100% (144720/144720), 1.41 GiB | 626.00 KiB/s, done.
Total 144720 (delta 0), reused 0 (delta 0), pack-reused 144720
fatal: the remote end hung up unexpectedly
Everything up-to-date

It doesn't seem like it has pushed anything. Do you know why this happens? Is there anything I can do to prevent it so I can push successfully tonight?

2. Git LFS migration didn’t reduce repo size
I also tried moving large files to Git LFS, but the repository size stayed the same. I wanted to check if I did it correctly and understand why the size didn’t change.


Here are the steps I followed:
git clone --mirror https://myuser@bitbucket.org/myuser/myrepo.git

java -jar bfg-1.15.0.jar --convert-to-git-lfs "{Bigfile1.unity,Bigfile2.unity,Bigfile3.unity}" --no-blob-protection myrepo.git

cd myrepo.git
git gc
git count-objects -Hv

Output:

count: 30859
size: 3.10 GiB
in-pack: 145301
packs: 1
size-pack: 1.58 GiB
prune-packable: 0
garbage: 0
size-garbage: 0 bytes

The size of the repo was 1.59 GB before I tried to move the files, and the files I am trying to move are big and have a long history.

Looking forward to your reply!

Best,
Camilla

cpedersen April 24, 2025

Hi Theodora,


Just wanted to update that I’ve now succeeded in pushing the cleaned repository.
I’ll make a separate post to ask for a cleanup on your side, in case you aren't available right now.


I’d still really appreciate an answer to question 2 about Git LFS. I plan to clean up the repository further, and understanding why the size didn’t change would help a lot.

Best,

Camilla

Theodora Boudale
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
April 25, 2025

Hi Camilla,

I see that you created a new post yesterday to request a cleanup of the repo.

Regarding your questions:

Question 1: I've seen this error occur with large repos and/or repos with large binary (non-text) files, and it usually indicates a network-related issue during a large push. Since you created a post asking to clean up your repo, I assume you were able to push at a later time?

Question 2: There is no way to answer this question definitively without looking at the repo's content (which I cannot access) and BFG logs, and running commands to list large files in the repo.

One suggestion I can make is to run the following command in the mirror clone after the cleanup with BFG:

git reflog expire --expire=now --all && git gc --prune=now --aggressive

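instead of only git gc, for a full cleanup. A plain git gc keeps unreachable objects that are younger than its default two-week grace period, so on its own it will not effectively clean up the old objects left behind by BFG.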

After running this command, you can check the size again with

git count-objects -Hv
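When comparing numbers, it may help to know that git count-objects reports loose objects (size) and packed objects (size-pack) separately, so the on-disk total is roughly their sum. The sketch below adds the two. repo_size_mib is a hypothetical helper name, and it assumes the plain -v output (without -H), where both values are in KiB.

```shell
# Hypothetical helper (not a Git command): sums the "size" (loose objects)
# and "size-pack" (packed objects) fields of `git count-objects -v`.
# Assumes the non -H output, where both fields are plain KiB numbers.
repo_size_mib() {
  awk -F ': ' '
    $1 == "size" || $1 == "size-pack" { kib += $2 }
    END { printf "%.1f MiB\n", kib / 1024 }'
}

# Usage, inside the mirror clone:
#   git count-objects -v | repo_size_mib
```

A large "size" value after BFG usually means unreachable objects are still sitting in the clone as loose objects.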

If you still see a large repo size, the next step is to run the following command in the mirror clone:

git rev-list --objects --all \
| git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' \
| sed -n 's/^blob //p' \
| sort --numeric-sort --key=2 \
| cut -c 1-12,41- \
| $(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest

and check a) if the files you removed or migrated to LFS with the BFG command are still listed and b) if any other large files exist in the repo.

Kind regards,
Theodora

cpedersen April 26, 2025

Hi Theodora,


I managed to get the repository size to decrease. It seems like it helped that I installed git-lfs before using the --convert-to-git-lfs command with BFG. Unfortunately, it didn’t decrease as much as I had hoped, probably because some of the files had changed names earlier in the history.
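BFG's file filters match on file name, so content committed earlier under a different name is not selected. One way to see where the remaining bytes are, assuming the same batch-check format as the listing command in this thread, is to total blob sizes per file extension across all history. size_by_extension is a hypothetical helper, a sketch rather than a definitive tool.

```shell
# Hypothetical helper: reads lines of the form
#   <type> <object-id> <size-in-bytes> <path>
# on stdin and prints the total bytes per file extension, largest first.
size_by_extension() {
  awk '$1 == "blob" {
         size = $3
         # Drop the first three fields so paths containing spaces survive.
         sub(/^[^ ]+ [^ ]+ [^ ]+ /, "")
         n = split($0, parts, "/"); base = parts[n]
         ext = "(none)"
         # RSTART > 1 keeps dotfiles such as .gitignore out of the extensions.
         if (match(base, /\.[^.]+$/) && RSTART > 1) ext = substr(base, RSTART)
         total[ext] += size
       }
       END { for (e in total) printf "%.0f\t%s\n", total[e], e }' |
    sort -t "$(printf '\t')" -k1,1nr
}

# Usage, inside the mirror clone:
#   git rev-list --objects --all \
#   | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' \
#   | size_by_extension
```

If one extension dominates, a glob such as "*.unity" passed to BFG would cover every historical name with that extension, at the cost of also selecting files you may have wanted to keep in plain Git.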


I will make a new post to request another cleanup on your side, just in case you’re not available this weekend.


Thanks for your help — I really appreciate it!


Best,
Camilla

0 votes
Answer accepted
cpedersen April 22, 2025

1. I think the BFG command worked fine. I told it to use the mirror clone that was called "myrepo.git". The report I got from BFG looked right.

2. The command didn't work. I got this reply: 

zsh: command not found: numfmt

3. I got this reply: 

count: 0
size: 0 bytes
in-pack: 144784
packs: 1
size-pack: 1.59 GiB
prune-packable: 0
garbage: 0
size-garbage: 0 bytes

I know about the 1 GB limit. I will try to move files to Git LFS. I think there is a post about how to do that somewhere.

Theodora Boudale
Atlassian Team
April 22, 2025

Hi Camilla,

Since the mirror clone is about the same size as what Bitbucket reports, the size comes from the content of the core Git repo itself.

There may be large files other than the ones you removed with BFG, or the same files may still be present, for the following reason: BFG doesn't modify the contents of the latest commit on your master (or 'HEAD') branch. If the large files exist in that commit, BFG won't touch them unless you run it with the flag --no-blob-protection. For more info, see the section "Your current files are sacred..." in BFG's documentation.

You can try running BFG again with the flag --no-blob-protection, and after you do this and run git gc on the mirror clone where you ran BFG, check its size again with git count-objects -Hv.
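Before re-running BFG, it can be useful to check whether any of the targeted files are still present in the latest commit, since those are the ones BFG protects by default. A minimal sketch, assuming the hypothetical helper name paths_named and plain file names as arguments:

```shell
# Hypothetical helper: prints every path read from stdin whose file name
# (the part after the last "/") equals one of the names passed as arguments.
paths_named() {
  while IFS= read -r path; do
    base=${path##*/}
    for name in "$@"; do
      if [ "$base" = "$name" ]; then
        printf '%s\n' "$path"
        break
      fi
    done
  done
}

# Usage, inside the mirror clone:
#   git ls-tree -r --name-only HEAD | paths_named "ApartmentOld.unity" "SoundtrackBoat.wav"
```

Any path it prints is still in the HEAD commit. Note that git ls-tree may quote paths containing unusual characters, which this sketch does not undo.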

Regarding the command that didn't work with the error

zsh: command not found: numfmt

you'll need to install coreutils on your computer for numfmt to work.

On macOS you can install coreutils by running

brew install coreutils

On Linux, the command depends on the distribution and its package manager.

With apt-get:

apt-get -y install coreutils

With apk:

apk add coreutils

With yum:

yum install coreutils

After coreutils is installed, you can run the command again to identify the largest files in the repo and run BFG for additional files, if needed.
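If installing coreutils is not an option, a rough alternative is to format the sizes with awk instead of numfmt. This is a sketch, not the command from this thread: largest_blobs is a hypothetical helper that reads the same batch-check output on stdin and prints the N largest blobs in MiB.

```shell
# Hypothetical helper (not part of Git or BFG): reads lines of the form
#   <type> <object-id> <size-in-bytes> <path>
# on stdin and prints the N largest blobs (default 20), largest first.
largest_blobs() {
  limit="${1:-20}"
  awk '$1 == "blob" {
         size = $3
         # Drop the first three fields so paths containing spaces survive.
         sub(/^[^ ]+ [^ ]+ [^ ]+ /, "")
         print size "\t" $0
       }' |
    sort -t "$(printf '\t')" -k1,1nr |
    head -n "$limit" |
    awk -F '\t' '{ printf "%.1f MiB\t%s\n", $1 / 1048576, $2 }'
}

# Usage, inside the mirror clone:
#   git rev-list --objects --all \
#   | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' \
#   | largest_blobs 20
```

Unlike the original pipeline, this prints the largest files first and omits the abbreviated object IDs.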


If you want to use Git LFS instead of removing the files from the repo's history completely, you can check this guide:

Please keep in mind the available LFS space for each workspace depending on the billing plan:

Regardless of the billing plan, you can always add an overage to a workspace for more LFS and get 100 GB of Git LFS as needed for $10 / month.

Please feel free to reach out if you have any questions.

Kind regards,
Theodora

cpedersen April 22, 2025

Thanks a lot for your help.

Theodora Boudale
Atlassian Team
April 23, 2025

You are very welcome. If you reduce the repo size further (either by removing files from the repo's history or by moving them to LFS), please feel free to let me know so that I can run another git gc for your repo.

Kind regards,
Theodora

cpedersen April 23, 2025

Thanks. Then I will just post it on this thread.

0 votes
Answer accepted
Theodora Boudale
Atlassian Team
April 22, 2025

Hi Camilla and welcome to the community!

When you clean up a repo with BFG and push to Bitbucket, the remote repo's size may go up because the repo will have both old commits (as dangling commits) and new commits (after history rewrite). A git gc is needed to clear the dangling commits. A git gc runs automatically on every push, but with different parameters every time depending on many different conditions. If the automated git gc doesn't reduce your repo's size, you can always create a question in community and ask that we run a manual one.
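To see how much unreachable data a history rewrite leaves behind locally, one option is to tally the output of git fsck --unreachable in the clone where BFG was run (a fresh clone only receives reachable objects, so it will not show them). count_unreachable below is a hypothetical helper that counts those lines by object type. It says nothing about what Bitbucket stores on its side.

```shell
# Hypothetical helper: reads `git fsck --unreachable` output on stdin
# (lines such as "unreachable blob <object-id>") and prints a count
# per object type, sorted by type name.
count_unreachable() {
  awk '$1 == "unreachable" { n[$2]++ }
       END { for (t in n) print t, n[t] }' |
    sort
}

# Usage, inside the clone where BFG was run:
#   git fsck --unreachable --no-reflogs 2>/dev/null | count_unreachable
```

After git reflog expire --expire=now --all && git gc --prune=now --aggressive, the same check should print nothing if the old objects were removed.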

I ran a git gc for your large repo and its size has decreased. Please keep in mind though that it is still over 1 GB and your workspace will be put in read-only mode on 28th April if it's not reduced further or if you don't upgrade to a paid billing plan:

Could you please let me know the following:

1. Looking at the commands you provided, it looks like you took a mirror clone, and then you ran the BFG command without doing a cd to the directory of the mirror clone. Can you please confirm if this is the order that you executed these commands?

BFG has to operate on the mirror clone: either cd to the mirror clone first and run the BFG command there, or pass the clone's path as the final argument to BFG.

2. Can you please take a new mirror clone of the repo, change directory to the mirror clone, and then run the following command:

git rev-list --objects --all \
| git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' \
| sed -n 's/^blob //p' \
| sort --numeric-sort --key=2 \
| cut -c 1-12,41- \
| $(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest

Do you see any large files listed in the output of this command?

3. In the same directory of the new mirror clone, can you please run the command

git count-objects -Hv

and let me know what output you get? This is to check whether there is a mismatch between the size of the new mirror clone and the size Bitbucket currently reports.

Kind regards,
Theodora

cpedersen April 22, 2025

I just used another script to check if the files were removed from the repository, and it looks like they’ve all been successfully deleted. Thank you.

0 votes
cpedersen April 20, 2025

Yay, it finally posted! If anyone else runs into this—make sure there are no invisible tags in your text. You don’t get any warning, it just silently fails.
