New developers often commit large files or files containing sensitive data to a Git repository by accident. This usually happens because they are still learning how Git works, do not know which files should be ignored, or do not realize that a file contains sensitive information.
If you accidentally committed a file with sensitive data on GitHub, it is important to act quickly to remove the data from the repository.
Let us look at some common scenarios for deleting files from a Git repository.
Delete File in Git Repository and Filesystem
The easiest way to delete a file from a Git repository is the git rm command. It deletes the file from the Git repository as well as the filesystem.
git rm <filename>
git commit -m "Deleted the file from the git repository"
git push
After you run git rm, the file is removed from your local working directory and the deletion is staged in the Git index, but it is not yet recorded in the repository history. Run git commit to create a new commit that reflects the deletion, and then run git push to publish it to the remote.
Note: git rm deletes the file from your working directory. It refuses to remove a file that has uncommitted changes unless you pass -f, and in that case those changes are lost. Versions of the file that were already committed stay in the Git history and can be restored from an earlier commit. Double-check the filename before running the command, and keep a backup of any uncommitted work you care about.
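The workflow can be tried safely in a throwaway repository. The following is a minimal sketch, not a definitive recipe: the file name notes.txt and the demo identity are placeholders chosen for illustration, and git push is left out because the temporary repository has no remote.

```shell
set -e

# Throwaway repository so the sketch does not touch a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity
git config user.name "Demo User"

# Commit a file (notes.txt is a hypothetical name), then delete it with git rm.
echo "temporary notes" > notes.txt
git add notes.txt
git commit -q -m "Add notes.txt"

git rm -q notes.txt
git status --short        # the deletion is already staged at this point
git commit -q -m "Deleted the file from the git repository"

# The file is gone from the working tree and from the latest commit,
# but the previous commit still contains it, so it can be restored.
git checkout -q HEAD~1 -- notes.txt
cat notes.txt
```

The last two commands show that a committed file can still be brought back from an earlier commit after git rm.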
Delete Files Recursively on Git
To delete files recursively on Git, use the git rm command with the -r option followed by the name of the directory. Git then removes every tracked file and subdirectory inside that directory, both from your local working directory and from the Git repository.
Here’s how you can delete files recursively on Git:
git rm -r <foldername>
git commit -m "Deleted the folder from the repository"
git push
As with git rm on a single file, you need to commit the change and then push it.
Note: git rm -r removes all tracked files and directories inside the specified directory from your working directory. Versions that were already committed remain in the Git history and can be restored from an earlier commit, but uncommitted changes that you force-remove with -f cannot be recovered without a backup. Double-check the directory name before running git rm -r.
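One way to double-check is a dry run: git rm accepts -n (--dry-run), which lists what would be removed without removing anything. The sketch below assumes a throwaway repository; the build folder and the file names in it are hypothetical.

```shell
set -e

# Throwaway repository with a nested folder (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity
git config user.name "Demo User"

mkdir -p build/cache
echo "a" > build/output.log
echo "b" > build/cache/index.bin
echo "keep" > README.md
git add .
git commit -q -m "Add project files"

# Preview: -n lists the files that would be removed and changes nothing.
git rm -r -n build

# Real removal, followed by the commit that records it.
git rm -r -q build
git commit -q -m "Deleted the folder from the repository"

# Only files outside the removed folder are still tracked.
git ls-files
```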
Delete Files on Git Repository only
In some cases, you want to delete a file from the Git repository but keep it on your filesystem. Two common examples:
- You accidentally committed a file that contains sensitive information such as passwords, API keys, or other confidential data. You want to stop tracking the file so that others cannot access it, but you still need it locally.
- You committed large files, or files the project does not need, that take up a lot of space. You want to remove them from the repository to reduce its size and improve performance, but keep them locally in case you need them.
With the --cached option, git rm removes the file from the Git index only and leaves it on the local filesystem. The file is dropped from new commits, but it still exists in earlier commits, so this step alone does not remove sensitive data that has already been committed.
git rm --cached <filename>
git commit -m "Deleted file from repository only"
git push
Note: If there is a file that you never want to be added to the Git repository, list it in the .gitignore file.
touch .gitignore
# Content of the .gitignore file
<filename>
Commit your .gitignore file and you should be good to go!
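Put together, untracking a file and ignoring it might look like the sketch below. It runs in a throwaway repository, and the file names .env and app.py, along with the fake key, are assumptions made for illustration.

```shell
set -e

# Throwaway repository; .env and app.py are hypothetical file names.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity
git config user.name "Demo User"

echo "API_KEY=not-a-real-key" > .env
echo "print('hello')" > app.py
git add .
git commit -q -m "Initial commit"

# Stop tracking .env but leave the file on disk.
git rm -q --cached .env

# Ignore it so that a later `git add .` does not pick it up again.
echo ".env" >> .gitignore
git add .gitignore
git commit -q -m "Deleted file from repository only"

git ls-files               # files that are still tracked
ls -A                      # .env is still present locally
git check-ignore -v .env   # shows the .gitignore rule that matches .env

# Earlier commits still reference the file:
git log --oneline -- .env
```

The final git log command still lists commits that touched the file, which is why a committed secret needs the history rewrite described in the next section.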
Delete Files From Git History
There are several reasons why you may want to delete files from Git history:
- To remove sensitive information
- To reduce the repository size
- To improve readability
- To correct mistakes
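Deleting files from Git history can help keep the repository clean, improve performance, and protect sensitive information. However, rewriting history changes the hash of every affected commit, so collaborators have to re-clone or rebase their work onto the new history. Coordinate with your team before doing it.
To delete a file from Git history, you can use the git filter-branch command, which rewrites commits by running a command you specify on each of them. The example below rewrites the commits reachable from HEAD, that is, the current branch; to rewrite every branch and tag, replace HEAD with -- --all. Note that the Git documentation now discourages filter-branch and recommends the separate git filter-repo tool for this kind of rewrite.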
$ git filter-branch -f --prune-empty --index-filter "git rm -r --cached --ignore-unmatch <path of file>" HEAD
This command can take a while if your history contains many commits, but when it finishes, the file no longer appears in any of the rewritten commits.
After the rewrite, run git log --all -- <path of file> to verify the result. It prints nothing if no remaining commit references the file.
The old commits are not gone yet. git filter-branch keeps a backup of the original refs under refs/original/, and the reflog still points at the old commits. To actually remove the data, delete the backup refs, expire the reflog entries with git reflog expire --expire=now --all, and then run git gc --prune=now to delete the objects that are no longer referenced.
Finally, because the history has changed, you need to force-push the rewritten branches with git push --force to update the remote. Copies of the file may still exist in other clones and forks, so treat any password or key that was committed as compromised.
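The whole sequence can be rehearsed end to end in a throwaway repository before touching a real one. This is a sketch under stated assumptions: secrets.txt and app.txt are hypothetical files, the FILTER_BRANCH_SQUELCH_WARNING variable (Git 2.24 and later) only silences filter-branch's warning and delay, and the force-push is omitted because the temporary repository has no remote.

```shell
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning and delay

# Throwaway repository with a "secret" committed by mistake (hypothetical files).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity
git config user.name "Demo User"

echo "password=not-a-real-password" > secrets.txt
echo "v1" > app.txt
git add .
git commit -q -m "Add app and secrets"
echo "v2" > app.txt
git commit -q -am "Update app"

# Remember the blob that holds the secret so its removal can be checked later.
blob=$(git rev-parse HEAD:secrets.txt)

# Rewrite every branch and tag (-- --all), dropping secrets.txt from each commit.
git filter-branch -f --prune-empty \
  --index-filter "git rm -r -q --cached --ignore-unmatch secrets.txt" \
  -- --all

# filter-branch keeps backups of the old refs under refs/original/; delete them.
git for-each-ref --format="%(refname)" refs/original/ |
  while read -r ref; do git update-ref -d "$ref"; done

# Expire all reflog entries, then prune the objects nothing refers to anymore.
git reflog expire --expire=now --all
git gc -q --prune=now

# Verify: no commit mentions the file, and the blob itself is gone.
git log --all --oneline -- secrets.txt
if git cat-file -e "$blob" 2>/dev/null; then
  echo "blob still present"
else
  echo "blob removed"
fi
```

On a real project, the same steps would be followed by a force-push of the rewritten branches.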
Deleting a file from a Git repository is a straightforward process, and the right approach depends on your specific use case.
In any case, remember to commit after deleting a file so that the change is recorded in the Git history.