Lost in the Git Orcus

UPDATE: There is a new post about a much improved version of this script: Lost in the Git Orcus (Part 3)

The Problem

Recently I lost a file in the Git orcus (aka '.git/objects').
I knew it had been there some time ago.
But now it was gone, and Git's built-in tools ('rev-list', 'fsck', 'reflog', ...) didn't help.
They showed a lot of dangling commits, trees and blobs but not the file I was searching for.

The Solution (maybe not the best)

I was doomed to dive into the .git/object-slough.
Based on a shell script by willkil on stackoverflow.com, I created a small Perl script:
  • It has one command line argument: the file to search for (as a regex).
  • In the first phase it looks into each tree object and checks whether it contains the file.
  • In the second phase it looks into each commit object and checks whether it points to one of the tree objects found in phase 1.
  • Finally, it creates a branch ('found/0', 'found/1', ...) for each commit found in phase 2.
  • It does not look into packed objects.
Here is the script:

#!/usr/bin/env perl

use strict;
use warnings;

# filename to search for, as a regex
my $pattern = $ARGV[0];
die "usage: $0 <filename-regex>\n" unless defined $pattern;

##################################################################
# find all loose objects (blobs, trees, commits, tags) in .git/objects
##################################################################

my @AllFiles = `find .git/objects -type f`;
my @AllSha1 = ();

for my $object (@AllFiles) {
	# loose objects are stored as <2 hex digits>/<38 hex digits>
	if($object =~ m{([0-9a-f]{2})/([0-9a-f]{38})$}) {
		my $sha1 = $1 . $2;
		push(@AllSha1, $sha1);
	}
}

##################################################################
# find all trees containing the file
##################################################################

my @FoundTrees = ();

for my $treeSha1 (@AllSha1) {
	my $type = `git cat-file -t $treeSha1`;
	chomp $type;

	if($type eq "tree") {

		# each line describes one direct entry: mode, type, sha1, name
		my @Lines = `git cat-file -p $treeSha1`;
		for my $line (@Lines) {
			if($line =~ /$pattern/) {
				printf "found tree: %s\n", $treeSha1;
				push(@FoundTrees, $treeSha1);
				last; # one match per tree is enough
			}
		}
	}
}

##################################################################
# find all commits pointing to one of the trees found above
# (a commit references only its root tree, so a match inside a
# subdirectory tree is not linked to a commit here)
##################################################################

my @FoundCommits = ();

COMMIT:
for my $commitSha1 (@AllSha1) {
	my $type = `git cat-file -t $commitSha1`;
	chomp $type;

	if($type eq "commit") {

		my @Lines = `git cat-file -p $commitSha1`;
		for my $line (@Lines) {
			for my $foundTreeSha1 (@FoundTrees) {
				if($line =~ /$foundTreeSha1/) {
					printf "found commit: %s\n", $commitSha1;
					push(@FoundCommits, $commitSha1);
					next COMMIT; # record each commit only once
				}
			}
		}
	}
}

##################################################################
# create a branch for each commit found
##################################################################

my $i = 0;
for my $foundSha1 (@FoundCommits) {
	my $branchName = "found/" . $i;
	# system() returns 0 when git succeeded
	if(system("git", "branch", $branchName, $foundSha1) == 0) {
		printf "branch created: %s pointing to %s\n", $branchName, $foundSha1;
	}
	$i++;
}
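
The script only sees loose objects because it walks the files under .git/objects. One way around that might be to let Git list the objects itself. The following is only a sketch: it assumes a Git version that supports 'git cat-file --batch-all-objects --batch-check', 40-digit SHA-1 object names, and the default output format "<sha1> <type> <size>". The helper name 'parse_batch_check' is made up for this example.

```perl
use strict;
use warnings;

# Parse "<sha1> <type> <size>" lines and return a hash of sha1 => type.
# (hypothetical helper, not part of the original script)
sub parse_batch_check {
	my (@lines) = @_;
	my %type_of;
	for my $line (@lines) {
		if($line =~ /^([0-9a-f]{40}) (\w+) \d+$/) {
			$type_of{$1} = $2;
		}
	}
	return %type_of;
}

# Made-up sample lines; in a real repository one would use
#   my %type_of = parse_batch_check(`git cat-file --batch-all-objects --batch-check`);
my @example = (
	"1111111111111111111111111111111111111111 tree 68\n",
	"2222222222222222222222222222222222222222 blob 12\n",
	"3333333333333333333333333333333333333333 commit 230\n",
);

my %type_of = parse_batch_check(@example);
for my $sha1 (sort keys %type_of) {
	printf "tree: %s\n", $sha1 if $type_of{$sha1} eq "tree";
}
```

With such a hash, the two 'git cat-file -t' calls per object would no longer be needed, since the type of every object is already known.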


  • Be aware: this script creates a branch for every commit whose tree contains the file you're searching for. That could be quite a few!
  • Possible improvement: only create a branch for the most recent commit (or track the chain of commits and search for its head).
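
The first improvement could be sketched roughly like this. It is not tested against a real repository, and it assumes that "most recent" means the largest committer timestamp as printed by 'git show -s --format=%ct'. The helper names 'commit_time' and 'most_recent' are made up for this example.

```perl
use strict;
use warnings;

# Return the committer timestamp (seconds since the epoch) of a commit.
# (hypothetical helper; needs to run inside a Git repository)
sub commit_time {
	my ($sha1) = @_;
	my $time = `git show -s --format=%ct $sha1`;
	chomp $time;
	return $time;
}

# Pick the key with the largest timestamp from a list of sha1 => timestamp pairs.
# Returns undef for an empty list.
sub most_recent {
	my (%time_of) = @_;
	my ($newest) = sort { $time_of{$b} <=> $time_of{$a} } keys %time_of;
	return $newest;
}

# Made-up sample data; in the script above one would build the hash with
#   my %time_of = map { $_ => commit_time($_) } @FoundCommits;
my %example = ("aaaa" => 100, "bbbb" => 300, "cccc" => 200);
print most_recent(%example), "\n"; # prints "bbbb"
```

The script would then create a single branch for the SHA-1 returned by 'most_recent' instead of looping over all of @FoundCommits.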

The Question

Isn't there an easier way? Let me know!