Considerable change in repository size when importing

Michael Ligotino March 20, 2023

Hello,

We are about to migrate from GitLab to Bitbucket. In some preliminary tests, we noticed a large difference in repository storage size when importing a project from GitLab into Bitbucket (using Bitbucket's import functionality). As an example, a 640 MB repository becomes 192 MB on Bitbucket. Everything appears to be imported correctly, and the history seems intact. However, this drastic size difference remains a bit mysterious, and we would like to make sure that nothing is lost in the process.

I would like to understand the reason for this change more rigorously, and would gladly take any input or advice on how to debug this.

Best regards,

p.s. we are not using Git LFS

1 answer

Answer accepted
Theodora Boudale
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
March 21, 2023

Hi @Michael Ligotino and welcome to the community!

When you import a repo, all the branches and history of the .git repo will be imported (the only exception is Git LFS objects, which you are not using).

One possible reason for this discrepancy is unreachable commits in the GitLab repo, that is, commits that no branch or tag points to anymore. If you do any of the following:

  • create a branch, add commits to it, push it, and then delete the branch without merging it
  • alter history and force-push it to the remote repo (e.g. after commands like git reset or git rebase)

then certain commits will become unreachable. These unreachable commits remain in the remote repo until a garbage collection (git gc) removes them. When a git gc runs for a repo, it can run with different parameters that determine which unreachable objects will be deleted.
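To see this effect locally, here is a minimal sketch that reproduces the first scenario in a throwaway repository. The directory, file, and branch names are made up for the illustration, and a hosted server such as GitLab runs garbage collection on its own schedule with its own settings, so this only shows the mechanism, not what your GitLab instance actually did.

```shell
#!/bin/sh
set -eu

# Throwaway repository (hypothetical names, not your real repo).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "base commit"

# Create a branch with one commit, then delete it without merging.
git checkout -q -b topic
echo extra > extra.txt
git add extra.txt
git commit -q -m "topic commit"
git checkout -q main
git branch -q -D topic

# Count commits that no ref points to. --no-reflogs ignores the local
# reflog, which would otherwise still keep the deleted commit alive.
count_unreachable() {
  git fsck --unreachable --no-reflogs 2>/dev/null | grep -c '^unreachable commit' || true
}

before=$(count_unreachable)

# Drop the reflog entries and prune immediately, so gc removes the objects.
git reflog expire --expire=now --all
git gc -q --prune=now

after=$(count_unreachable)
echo "unreachable commits before gc: $before, after gc: $after"
```

The deleted branch's commit stays in the object store until the gc with --prune=now runs, which is why the size of a repo can depend on when and how it was last garbage collected.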

A git gc also repacks the repo, storing file revisions as compressed deltas, and there are parameters (for example the --aggressive option) that control how hard it tries to compress.

So, the size of the repo in GitLab depends on when the last garbage collection ran for that repo and with what arguments.

When you import a repo, these unreachable objects will not be imported, because only objects reachable from the branches and tags are transferred.
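If you want some reassurance that nothing reachable is lost, one option is to compare the refs of two bare clones. The sketch below is only an illustration: it uses a throwaway local repository as a stand-in for the GitLab repo and a bare clone over a file:// URL as a stand-in for the import (a plain local path would copy the object store wholesale, unreachable objects included). The helper names list_refs and total_objects are made up for this example.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in for the source repo: one branch plus a deleted, unmerged branch.
git init -q "$work/src"
cd "$work/src"
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"
echo base > file.txt
git add file.txt
git commit -q -m "base commit"
git checkout -q -b topic
echo extra > extra.txt
git add extra.txt
git commit -q -m "topic commit"
git checkout -q main
git branch -q -D topic

# Stand-in for the import: file:// forces git's normal transfer protocol,
# which only sends objects reachable from refs.
git clone -q --bare "file://$work/src" "$work/copy.git"

# Hypothetical helper: every ref with the object it points to.
list_refs() {
  git -C "$1" for-each-ref --format='%(objectname) %(refname)'
}

# Hypothetical helper: loose objects plus packed objects.
total_objects() {
  git -C "$1" count-objects -v |
    awk '$1 == "count:" || $1 == "in-pack:" { n += $2 } END { print n }'
}

if [ "$(list_refs "$work/src")" = "$(list_refs "$work/copy.git")" ]; then
  refs_match=yes
else
  refs_match=no
fi
src_objects=$(total_objects "$work/src")
copy_objects=$(total_objects "$work/copy.git")

echo "refs identical: $refs_match"
echo "objects in source: $src_objects, objects in copy: $copy_objects"
```

For your real repos, you could run the same for-each-ref command in a bare clone of the GitLab repo and in a bare clone of the Bitbucket repo and diff the two outputs. Identical output means every branch and tag points to the same commit on both sides, even if the object counts differ.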

You can try taking a bare clone of the GitLab repo on your machine, with the command

git clone --bare <gitlab_repo_url>

Then navigate to the directory of the bare clone, and run the command

git count-objects -Hv

The sum of the fields size (loose objects) and size-pack (packed objects) in the output will show you the size of the bare clone.
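If you would rather not add the two fields by hand, here is a small sketch that does it. It drops the -H option so that git reports both fields in KiB as plain numbers; the function name sum_kib is made up for this example.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: reads `git count-objects -v` output on stdin and
# prints size + size-pack in KiB (the unit git uses when -H is not given).
sum_kib() {
  awk '$1 == "size:" || $1 == "size-pack:" { kib += $2 } END { printf "%d\n", kib }'
}

# Only runs when the current directory is inside a git repository.
if git rev-parse --git-dir >/dev/null 2>&1; then
  echo "repository size: $(git count-objects -v | sum_kib) KiB"
fi
```

Run it from the directory of the bare clone. Since a fresh clone contains only reachable objects, the result should be closer to the size Bitbucket reports than to the size GitLab shows, although each host may compute its displayed size differently.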

Kind regards,
Theodora

Michael Ligotino March 22, 2023

Dear Theodora,

Thank you so much for your answer. This will definitely help me understand more what is going on.

Kind regards,

Michael

Theodora Boudale
Atlassian Team
March 22, 2023

You are very welcome, Michael. Please feel free to reach out if you ever need anything else!

Kind regards,
Theodora
