How to Remove a Large File from Commit History in Git?

If you develop a game in Unity and store assets such as images or audio files on GitHub without using Git LFS, your repo can grow larger with every change you push. This is because Git history records every version of every file: code, images, audio files, and other binaries.

This can lengthen build times and slow down your workflow, because the CI/CD pipeline takes longer to download the assets from GitHub and to build the game.

How to clean up those large files in Git Commit History?

In this post, we will provide a guide and practical example to resolve this problem.

Step 1: Check the Size of Git History

du -hs .git/objects
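
It can help to record this number now, so that it can be compared with the size after the cleanup. The following is a minimal sketch, not part of the original steps: the helper name `objects_size_kb` is hypothetical, and it assumes a `du` that supports `-k`. `git count-objects -vH` is a standard Git command that summarizes loose and packed objects.

```shell
# Hypothetical helper: print the size of a directory in kB
# (defaults to .git/objects in the current directory).
objects_size_kb() {
  du -sk "${1:-.git/objects}" | cut -f1
}

# When run inside a Git repository, print the size of the objects
# directory followed by Git's own summary of loose and packed objects.
if gitdir=$(git rev-parse --git-dir 2>/dev/null); then
  echo "Objects directory: $(objects_size_kb "$gitdir/objects") kB"
  git count-objects -vH
fi
```

Outside a Git repository the block only defines the helper and prints nothing.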

Step 2: Create a script to find large files

Create the git_find_big.sh script:

vim git_find_big.sh

#!/bin/bash
#set -x
# Shows you the largest objects in your repo's pack file.
# Written for osx.
#
# @see http://stubbisms.wordpress.com/2009/07/10/git-script-to-show-largest-pack-objects-and-trim-your-waist-line/
# @author Antony Stubbs
# set the internal field separator to line break, so that we can iterate easily over the verify-pack output
IFS=$'\n';
# list all objects including their size, sort by size, take top 10
objects=`git verify-pack -v .git/objects/pack/pack-*.idx | grep -v chain | sort -k3nr | head`
echo "All sizes are in kB's. The pack column is the size of the object, compressed, inside the pack file."
output="size,pack,SHA,location"
for y in $objects
do
# extract the size in bytes and convert it to kB
size=$((`echo $y | cut -f 5 -d ' '`/1024))
# extract the compressed size in bytes and convert it to kB
compressedSize=$((`echo $y | cut -f 6 -d ' '`/1024))
# extract the SHA
sha=`echo $y | cut -f 1 -d ' '`
# find the objects location in the repository tree
other=`git rev-list --all --objects | grep $sha`
#lineBreak=`echo -e "\n"`
output="${output}\n${size},${compressedSize},${other}"
done
echo -e $output | column -t -s ', '
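
The script above only inspects pack files. As an alternative, the sketch below lists every object reachable from any ref, which also covers loose objects that have not been packed yet. It is a sketch under assumptions, not the author's method: the function names `list_all_objects` and `top_blobs` are hypothetical, while `git rev-list --objects --all` and `git cat-file --batch-check` are standard Git commands.

```shell
# List every object reachable from any ref as "type sha size path".
# Unlike verify-pack, this also covers loose (not yet packed) objects.
list_all_objects() {
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)'
}

# Hypothetical filter: keep blobs only and print the N largest (default 10).
# Reads "type sha size path" lines on stdin; sizes are uncompressed bytes.
top_blobs() {
  awk '$1 == "blob"' | sort -k3,3nr | head -n "${1:-10}"
}
```

Inside a repository, `list_all_objects | top_blobs 10` prints the ten largest blobs together with their paths.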

Step 3: Find Top 10 Large files

Make the script executable, then run it:

chmod +x git_find_big.sh
./git_find_big.sh

Step 4: Install git-filter-repo

Run

pip3 install git-filter-repo
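
Git runs `git filter-repo` by looking for an executable named `git-filter-repo` on your PATH, so a quick check after installing can save confusion later. This is a minimal sketch; the helper name `have_cmd` is hypothetical.

```shell
# Hypothetical helper: succeed if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd git-filter-repo; then
  echo "git-filter-repo is installed"
else
  echo "git-filter-repo is not on PATH; check where pip3 placed its scripts"
fi
```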

Step 5: Remove large files from the .git history

Take the paths of the largest files from the output of Step 3.

Then run the command below for each path to delete it from the Git history:

git filter-repo --force --path <large file path> --invert-paths

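This command rewrites the Git history and removes the file from every commit. Note that git filter-repo also removes the origin remote, which is why Step 7 adds it back.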
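
When several files need to go, the history can be rewritten once instead of once per file, because git filter-repo accepts the `--path` option multiple times. The helper below is a sketch under assumptions: the name `remove_paths` and the `DRY_RUN` variable are hypothetical, and it keeps the `--force` flag from the command above.

```shell
# Hypothetical helper: remove several paths from history in a single rewrite.
# Set DRY_RUN=1 to print the command instead of running it.
remove_paths() (
  # Rewrite the argument list from "a b" to "--path a --path b".
  n=$#
  while [ "$n" -gt 0 ]; do
    set -- "$@" --path "$1"
    shift
    n=$((n - 1))
  done
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo git filter-repo --force "$@" --invert-paths
  else
    git filter-repo --force "$@" --invert-paths
  fi
)
```

For example, `DRY_RUN=1 remove_paths Assets/a.png Assets/b.wav` prints the command that would run. If the goal is to drop everything above a size limit rather than specific paths, git filter-repo also offers `--strip-blobs-bigger-than`.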

Step 6: Clean Up Unused Files

git reflog expire --expire=now --all
git gc --prune=now
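
To confirm that the cleanup paid off, the size from Step 1 can be compared before and after. The following sketch assumes you noted both sizes in kB (for example with `du -sk .git/objects`); the helper name `report_savings` is hypothetical.

```shell
# Hypothetical helper: given sizes in kB before and after the cleanup,
# print how much space was saved, in kB and as a whole-number percentage.
report_savings() (
  before=$1
  after=$2
  if [ "$before" -le 0 ]; then
    echo "Nothing to compare"
    exit 0
  fi
  saved=$((before - after))
  echo "Saved ${saved} kB ($((saved * 100 / before))%)"
)
```

It is also worth checking that the file is really gone: `git log --all --oneline -- <large file path>` should print nothing once the history has been rewritten.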

Step 7: Add Git Remote URL and Push Back the Result

# Add your GitHub URL
git remote add origin <Your GitHub URL>
# Force-push the rewritten history of every branch
git push origin --all --force
# Set upstream tracking for every branch, then force-push the tags
git push -u origin --all
git push -u origin --tags -f
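
`git remote add origin` fails if a remote named origin already exists, which can happen if you repeat this step. The sketch below handles both cases; the helper name `ensure_origin` is hypothetical, while `git remote get-url` and `git remote set-url` are standard Git commands.

```shell
# Hypothetical helper: point "origin" at the given URL,
# whether or not the remote already exists.
ensure_origin() {
  if git remote get-url origin >/dev/null 2>&1; then
    git remote set-url origin "$1"
  else
    git remote add origin "$1"
  fi
}
```

Call it as `ensure_origin <Your GitHub URL>` before pushing. Because the history was rewritten, anyone else working on the repo will need a fresh clone after the force push.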

Conclusion

This tutorial shows one way to reduce your repo’s size on GitHub. Going forward, use Git LFS to store large files.

If you don’t, remember to check the Git history regularly for large files.
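
One way to catch large files early is to scan the working tree before committing. This is a minimal sketch, assuming a `find` that supports the `k` size suffix (GNU and BSD find both do); the helper name `find_large_files` and the default threshold are hypothetical.

```shell
# Hypothetical helper: list files larger than a threshold (in kB) under a
# directory, skipping the .git directory itself.
# Pass the directory without a trailing slash so the .git path matches.
find_large_files() (
  dir=${1:-.}
  kb=${2:-10240}   # default threshold: 10 MiB
  find "$dir" -path "$dir/.git" -prune -o -type f -size +"${kb}k" -print
)
```

Running `find_large_files . 10240` from the repo root lists candidates for Git LFS before they ever reach the history.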

Keeping your game’s repository as small as possible not only boosts work efficiency, but also helps you deliver a high-quality game on time and on budget, because a lean codebase is easier to maintain and to extend with new features.

Senior Full Stack Engineer & Solution architecture | AWS, GCP | Cloud, Unity Game Development, SDK, DevOps, and more.

Eric Wei
