How can I find git branches where all branch-local commits are from specific people?
We have a bunch of dead branches in our git repository, and I'd like to clean them up. Ones that were merged (but not deleted at the time) are easy; we can see those in the branch list on Bitbucket. But a lot were abandoned and not deleted, some by people who no longer work here. I want to find and delete those, as nobody else will ever care.
Specifically, I would like to find all branches where the only branch-local commits were from specific people. I can do this one at a time by inspecting the commits on a branch, but I don't want to have to look at them one at a time: I'd like to be able to run a command that will return a list of candidate branches, and then I'll go look at those by hand to confirm their state and delete them.
Is there something I can do, either from Bitbucket or the git command line, to find those branches?
2 answers
From the command line, the following command prints the author name of each local-only commit (a commit not reachable from any remote-tracking branch) on a branch some-branch:
git log some-branch --not --remotes --format="%an"
And to get a clean list of branches suitable for scripting:
git branch --format="%(refname:short)"
If you are working in a Bash-compatible shell (which you are if you're on macOS or Linux, in all likelihood), you can use these to make a scriptlet that prints every branch whose local-only commits were all authored by people in a given set of names:
for b in $(git branch --format="%(refname:short)")
do git log "$b" --not --remotes --format="%an" | grep -qvE 'Jane Doe|Bobby Tables' || echo "$b"
done
(Note that this may include branches that don't have any local-only commits at all! But those are probably also decent candidates for deletion.)
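If branches without any local-only commits should be left out, and author names should match exactly rather than as substrings, the loop above could be adapted along these lines. This is only a sketch: the helper names only_authors and list_candidates are made up for illustration, and the names are still passed as a grep -E alternation.

```shell
# Succeeds only if stdin has at least one line and every line is exactly
# one of the names in the extended-regex alternation passed as $1.
only_authors() {
    local names="$1" authors
    authors=$(cat)
    [ -n "$authors" ] || return 1    # no local-only commits at all
    # grep -v -x selects lines that are NOT a whole-line match; -q only
    # reports whether such a line exists, and "!" inverts that result.
    ! printf '%s\n' "$authors" | grep -qvxE "$names"
}

# Prints each local branch whose local-only commits all come from the
# given names (hypothetical wrapper around the loop shown earlier).
list_candidates() {
    local names="$1" b
    for b in $(git branch --format="%(refname:short)"); do
        git log "$b" --not --remotes --format="%an" | only_authors "$names" && echo "$b"
    done
}
```

Running list_candidates 'Jane Doe|Bobby Tables' inside the repository would then print the candidate branches, one per line.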
Let's consider the following git graph.
- The initial commit is 1.
- The colors identify the authors of each commit.
- Orange has left the project.
- Green, Blue and Pink are still active.
- chat, master and tags are the branches.
- master is to be kept.
- tags is to be kept because Pink is working on it.
- chat is a candidate for removal because only Orange was working on it.
The following script will list the branches which are candidates to be deleted.
To use it, create two text files:
- A file listing the live branches, that is, the branches we know should not be deleted. In the example above the file would contain just "master" (without the quotes). Separate the branch names with spaces and/or newlines.
- A file with the e-mail addresses of the people considered to have left the project, also separated by spaces and/or newlines.
Then invoke the script (which I've named dead.sh) with:
bash dead.sh /route/to/file_with_live_branches /route/to/file_with_emails
It will display the list of branches which are candidates for removal. The script relies on associative arrays (available since bash 4.0) and on [[ -v array[key] ]] tests against array elements, which as far as I know need bash 4.3 or later, so check your interpreter's version with:
user@machine:~$ bash --version
GNU bash, version 5.0.17(1)-release (x86_64-pc-linux-gnu)
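If you would rather have the script refuse to run on an interpreter that is too old, a check along these lines could be placed near its top. It is a minimal sketch: the function name bash_at_least is hypothetical, and it only looks at the major.minor prefix of the version string.

```shell
# Succeeds if the given bash version string is at least major.minor.
# $1 = required major, $2 = required minor,
# $3 = version string to test (defaults to the running shell's $BASH_VERSION)
bash_at_least() {
    local version="${3:-$BASH_VERSION}" major rest minor
    major=${version%%.*}          # text before the first dot, e.g. "5"
    rest=${version#*.}            # text after the first dot, e.g. "0.17(1)-release"
    minor=${rest%%[!0-9]*}        # leading digits of the rest, e.g. "0"
    [ "$major" -gt "$1" ] || { [ "$major" -eq "$1" ] && [ "$minor" -ge "$2" ]; }
}
```

Inside the script it would be called with the minimum major and minor version the script needs, followed by || exit 1, so that an older interpreter stops before any work is done.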
The script:
#!/bin/bash
# Bash script to locate dead branches
# $1 = File with live branches separated by spaces and/or in different lines
# $2 = File with email addresses of people who "have left" separated by spaces and/or in different lines
#
# Definitions:
# A commit is said to be alive if it is the commit directly associated with a live branch or is an ancestor of a live branch.
# The hanging commits of a branch is the set : (ancestors_of_the_branch PLUS commit_directly_associated_to_the_branch) MINUS commits_which_are_alive
# A branch is dead if it has at least 1 hanging commit and all its hanging commits have been written by people in the list of email addresses of people who have left
help() {
    echo "Usage:"
    echo "dead.sh <live_branches_file> <people_who_left_file>"
    echo ""
    echo "Locates branches where all hanging commits are from people who have left."
    echo "live_branches_file : A file with a list of branches which are considered live. All commits outside these branches are considered as hanging. Branch names should be separated with spaces and/or new lines."
    echo "people_who_left_file : A file with a list of emails of people who \"have left\". If all the hanging commits of a non-live branch are authored exclusively by people in this list the branch is considered dead and will be listed by this command. emails should be separated with spaces and/or newlines"
    exit 1
}
check_args() {
    if [ $# -ne 2 ]; then
        help
        exit 1
    fi
}
declare -A LIVE
live_commits() {
    for live_branch in $(cat "$1"); do
        THISLIVE=$(git log --pretty=format:"%H" "$live_branch")
        if [ $? -ne 0 ]; then
            echo "Error running command : git log --pretty=format:\"%H\" $live_branch"
            exit 1
        fi
        for l in $THISLIVE; do
            LIVE[$l]=0
        done
    done
}
collect_branches() {
    BRANCHES=$(git for-each-ref --format='%(refname:short)' refs/heads/)
    RES=$?
    if [ $RES -ne 0 ]; then exit $RES; fi
}
declare -A PEOPLE_LEFT
collect_goners() {
    # Note that emails (like any URL) are guaranteed to have no spaces
    for p in $(cat "$1"); do
        PEOPLE_LEFT["$p"]=0
    done
}
is_abandoned() {
    BRANCH_COMMITS=$(git log --pretty=format:"%H" "$1")
    is_hanging=1 # False
    for commit in $BRANCH_COMMITS; do
        if [[ -v LIVE[$commit] ]]; then
            continue
        fi
        # Look up the author of this hanging commit (not of the branch tip)
        AUTHOR=$(git show -s --format='%ae' "$commit")
        if [[ -v PEOPLE_LEFT["$AUTHOR"] ]]; then
            is_hanging=0 # True
        else
            return 1 # Not abandoned because commit author is not among those who left
        fi
    done
    return $is_hanging
}
list_abandoned_branches() {
    # Note that git branches are guaranteed to not contain spaces
    for b in $BRANCHES; do
        is_abandoned "$b"
        if [ $? -eq 0 ]; then
            echo "$b"
        fi
    done
}
check_args "$@"
collect_branches
live_commits "$1"
collect_goners "$2"
list_abandoned_branches
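Git can also compute the hanging commits itself: git log <branch> --not <live branches> lists the commits reachable from the branch but from none of the live branches. A more compact variant of the same idea might look like the sketch below. The function names hanging_authors and is_dead are hypothetical, and unlike dead.sh it assumes the e-mail file holds exactly one address per line.

```shell
# Prints the author e-mail of every commit reachable from branch $1 but
# not from any of the live branches given as the remaining arguments.
hanging_authors() {
    local branch="$1"
    shift
    git log --format='%ae' "$branch" --not "$@"
}

# Succeeds if branch $1 has at least one hanging commit and every hanging
# commit's author is listed in file $2 (assumed: one e-mail per line).
# The remaining arguments are the live branches.
is_dead() {
    local branch="$1" goners="$2" authors
    shift 2
    authors=$(hanging_authors "$branch" "$@") || return 1
    [ -n "$authors" ] || return 1    # nothing hanging, so not dead
    # -F fixed strings, -x whole-line match, -f patterns from file;
    # -v selects authors NOT in the file and "!" inverts the result.
    ! printf '%s\n' "$authors" | grep -qvxFf "$goners"
}
```

With the example graph, is_dead chat goners.txt master would be expected to succeed, while the same call for tags would fail because Pink is not in the list.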