Anyone who's tried modifying a git repository with code is likely to have discovered a few problems with the experience.
The most common one I encounter comes from changes having to be made in the git working directory.
This mostly bites when there are uncommitted files, or when a file sits untracked in the directory while a file of the same name exists, tracked, on another branch.
That blocks you as soon as you try to check out that branch.
Git refuses the checkout because of the file system collision.
You also have a problem when 2 processes try to work with the same checkout simultaneously, e.g. a cron job that copies files into the master branch, a cron job that copies files out of some other branch, or programs which rely on a given file from a given branch being the one they see all the time.
Generally, the easy way to get around the file system collision fun is with
git stash, then git checkout $otherbranch, do what you need to do, git checkout $original, and git stash apply to get it back. That, however, is too much of a dance for a human to do even when they know everything, let alone a bit of naïve code with a stack load of broken conditional checks, and let alone getting that code to do it right with all the other fun stuff that can occur.
Do it in the Rams
The great thing about git is that you don't *need* to do everything in the working directory. If you know the internals well enough, you can perform the whole commit on the other branch without ever needing to check it out. This is especially handy when one branch is purely generated automatically from another.
With a helping hand from
Git::PurePerl, it's possible to build everything from scratch without needing to modify files in the working directory. Unfortunately, the documentation is a bit sparse, and the whole dist could do with better documentation, but the good news is that, for the most part, it works great, and it's structured well enough that you can work it out by reading the code.
The Phases of commit generation
A commit is composed of 3 main pieces of data:
- Commit Metadata
  - Comprised of authors (the author and the committer), timestamps (commit timestamp, author timestamp), a commit message, and an optional parent commit.
- A tree object
  - Every commit refers to a tree object. A tree object is essentially a list of files, and metadata about those files. Tree objects can also refer to other tree objects, and this forms a sort of directory structure.
- A File object
  - Essentially a blob of data.
How to build these by hand with the git command line is covered in the Pro Git book (Chapter 9: Git Internals), but this guide will focus more on how to do it with Git::PurePerl, and entirely in-memory.
Injecting the file objects
DIY commits have to be composed in reverse order: you need to create the files, then create the trees, then create the commits.
use strict;
use warnings;
use Git::PurePerl;
use Git::PurePerl::NewObject::Blob;
my $git = Git::PurePerl->new(
    gitdir => '/some/dir/foo/.git'    # Or use the less direct 'directory' form.
);
my $blob = Git::PurePerl::NewObject::Blob->new(
    content => 'String Of File Content',
);
$git->put_object( $blob );
Congratulations. You just stored a file blob in your git database. Nothing refers to it at present, though, so it's not unlike a detached node in a graph, and the next time somebody calls git prune in the repository, that file blob will vanish again.
Now, let's try that again with a scattering of objects:
use strict;
use warnings;
use Git::PurePerl;
use Git::PurePerl::NewObject::Blob;
my $git = Git::PurePerl->new(
    gitdir => '/some/dir/foo/.git'    # Or use the less direct 'directory' form.
);
# 10 Blobs please.
my @blobs = map {
    my $blob = Git::PurePerl::NewObject::Blob->new(
        content => 'String Of File Content, no' . $_,
    );
} 1 .. 10;
$git->put_object( $_ ) for @blobs;
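As a side note on why storing the same content twice is harmless: a blob's name in the object store is the SHA-1 of a short header plus the content itself. The sketch below reproduces that id with nothing but core Perl. The helper name blob_id is made up for illustration and is not part of Git::PurePerl, and it assumes the content is a plain byte string.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Git names a blob by hashing a small header plus the raw content:
#   "blob <length in bytes>\0<content>"
# blob_id is a hypothetical helper, not something Git::PurePerl exports.
sub blob_id {
    my ($content) = @_;
    return sha1_hex( 'blob ' . length($content) . "\0" . $content );
}

my $first  = blob_id('String Of File Content, no1');
my $second = blob_id('String Of File Content, no1');

# Same bytes in, same id out, so storing it a second time adds nothing new.
print "first run:  $first\n";
print "second run: $second\n";
print "identical\n" if $first eq $second;
```

Feeding it "test content\n" should give the same id that the Pro Git internals chapter gets from git hash-object for that input.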
You'll now note that if you execute git prune -n in your working directory, there are 10 objects pending prune that are not attached to anything. This number should NOT change if you re-run the above code multiple times, as content with an identical SHA1 is only added to the data store once.
Building Tree Objects
You now have a scattering of files. Well, more like data that represents files: there's no file name or permissions metadata yet. Tree objects connect these files with their names and attributes. A tree is basically a blob with a specific content and format. The content of this blob is a series of entries.
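Every entry has 3 parts:
- mode
- filename
- object sha1
The sha1 is the sha1 of a thing, either a Blob or another Tree object. The filename is the name given to that object within the tree. The 'mode' I don't fully understand yet; all I know is that Files work with 100644, and Trees work with 040000 (that is how git ls-tree displays it; inside the tree object itself the leading zero is dropped, giving 40000). Pass the mode as a quoted string in Perl, because a bare 040000 is an octal literal and silently becomes the number 16384.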
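To make the mode gotcha concrete, here is a small core-Perl sketch. It first shows what Perl does with a bare 040000, then assembles one raw tree entry by hand. The tree_entry helper is a made-up name for illustration, and the entry layout is my understanding of the on-disk format rather than anything taken from the Git::PurePerl docs.

```perl
use strict;
use warnings;

# A bare 040000 is an octal literal in Perl, so it is really the number 16384.
my $as_number = 040000;
my $as_string = '40000';
print "bare literal: $as_number\n";    # prints 16384
print "quoted:       $as_string\n";

# Sketch of one raw tree entry: "<mode> <filename>\0" followed by the
# 20 raw bytes of the object's SHA-1 (the hex id packed down to binary).
# tree_entry is a hypothetical helper, not a Git::PurePerl function.
sub tree_entry {
    my ( $mode, $filename, $sha1_hex ) = @_;
    return $mode . ' ' . $filename . "\0" . pack( 'H*', $sha1_hex );
}

# The id of the empty blob, used purely as sample data.
my $empty_blob = 'e69de29bb2d1d6434b8b29ae775ad8c2e48c5391';
my $entry = tree_entry( '100644', 'FooFile', $empty_blob );
print length($entry), " bytes in the entry\n";
```

Keeping the mode as a string all the way through means what lands in the tree is exactly what you typed.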
use strict;
use warnings;
use Git::PurePerl;
use Git::PurePerl::NewObject::Blob;
use Git::PurePerl::NewObject::Tree;
use Git::PurePerl::NewDirectoryEntry;
my $git = Git::PurePerl->new(
    gitdir => '/some/dir/foo/.git'    # Or use the less direct 'directory' form.
);
# 10 Blobs please.
my @blobs = map {
    my $blob = Git::PurePerl::NewObject::Blob->new(
        content => 'String Of File Content, no' . $_,
    );
} 1 .. 10;
# only put the first blob on the tree.
my $tree = Git::PurePerl::NewObject::Tree->new(
    directory_entries => [
        Git::PurePerl::NewDirectoryEntry->new(
            mode     => '100644',    # a string, so Perl never reinterprets it
            filename => "FooFile",
            sha1     => $blobs[0]->sha1,
        )
    ],
);
# stash blobs
$git->put_object( $_ ) for @blobs;
# stash tree
$git->put_object( $tree );
Now, as with above, there's still no commit these are bound to. They're just floating bits of data.
Also, we probably want a dir or 2.
use strict;
use warnings;
use Git::PurePerl;
use Git::PurePerl::NewObject::Blob;
use Git::PurePerl::NewObject::Tree;
use Git::PurePerl::NewDirectoryEntry;
my $git = Git::PurePerl->new(
    gitdir => '/some/dir/foo/.git'    # Or use the less direct 'directory' form.
);
# 10 Blobs please.
my @blobs = map {
    my $blob = Git::PurePerl::NewObject::Blob->new(
        content => 'String Of File Content, no' . $_,
    );
} 1 .. 10;
# One directory entry per blob: FooFile_1 .. FooFile_10.
my $i = 0;
my @direntries = map {
    $i++;
    Git::PurePerl::NewDirectoryEntry->new(
        mode     => '100644',
        filename => "FooFile_$i",
        sha1     => $_->sha1,
    )
} @blobs;
my ( @treeblobs );
my ( @dira, @dirb );
# First 5 entries go to SubDir_A, next 3 to SubDir_B, the last 2 stay in the root.
@dira = splice @direntries, 0, 5, ();
@dirb = splice @direntries, 0, 3, ();
my $tree_dira = Git::PurePerl::NewObject::Tree->new(
    directory_entries => \@dira,
);
push @treeblobs, $tree_dira;
my $tree_dirb = Git::PurePerl::NewObject::Tree->new(
    directory_entries => \@dirb,
);
push @treeblobs, $tree_dirb;
my $root_tree = Git::PurePerl::NewObject::Tree->new(
    directory_entries => [
        @direntries,
        Git::PurePerl::NewDirectoryEntry->new(
            mode     => '40000',    # quoted: a bare 040000 is octal for 16384
            filename => 'SubDir_A',
            sha1     => $tree_dira->sha1,
        ),
        Git::PurePerl::NewDirectoryEntry->new(
            mode     => '40000',
            filename => 'SubDir_B',
            sha1     => $tree_dirb->sha1,
        ),
    ]
);
push @treeblobs, $root_tree;
# stash blobs
$git->put_object( $_ ) for @blobs;
# stash trees
$git->put_object( $_ ) for @treeblobs;
Hooray, all going to plan, you now have a simple digraph of data in git!
There is still no root node, so git prune will still delete them all, but we're almost there.
Commit it!
This is the finishing touch. Once you do this, the objects will hold their own in the datastore, with all of the metadata written out.
We need:
- Authors
- Timestamps
- Commit messages
- Optional: parent commit
- Target branch
Important Note about parent.
Although parent is optional, you shouldn't treat it as such unless you know what you're doing, or there is in fact no parent (i.e. it's a brand spanking new branch, aka a new symbolic ref). Git history is represented as a chain:
[branch] -> {commit} -> {commit} -> {commit}
and a "branch" is pretty much a pointer to the head commit of a series, so creating a single commit at the end of a branch with no parent behaves the same as if you had DELETED THE WHOLE BRANCH, created a new, history-less symbolic ref with the same name, and committed to it, leaving a branch history of 1 item.
We strongly recommend you use this field.
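To see what the parent field physically is, the sketch below builds the text of a commit object by hand in core Perl, once without a parent and once with one. The helper names commit_body and commit_id are invented for this illustration, the timestamp and timezone are arbitrary sample values, and the tree id is the well-known id of the empty tree. It is a sketch of the format as I understand it, not how Git::PurePerl does it internally.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Sketch of the text git hashes for a commit. The parent line is simply
# absent for a root commit; everything else stays the same.
# commit_body and commit_id are hypothetical helpers for illustration.
sub commit_body {
    my (%arg) = @_;
    my $who   = "$arg{name} <$arg{email}> $arg{epoch} +0000";
    my @lines = ("tree $arg{tree}");
    push @lines, "parent $arg{parent}" if defined $arg{parent};
    push @lines, "author $who", "committer $who", '', $arg{message};
    return join( "\n", @lines ) . "\n";
}

# A commit id is the SHA-1 of "commit <length>\0<body>".
sub commit_id {
    my ($body) = @_;
    return sha1_hex( 'commit ' . length($body) . "\0" . $body );
}

my %base = (
    tree    => '4b825dc642cb6eb9a060e54bf8d69288fbee4904',    # the empty tree
    name    => 'Bob Smith',
    email   => 'BobSmith@example.com',
    epoch   => 1_300_000_000,                                 # arbitrary sample time
    message => 'This is a commit message',
);
my $root  = commit_body(%base);
my $child = commit_body( %base, parent => commit_id($root) );
print $child;
```

The only thing linking the second commit to the first is that one parent line, which is why leaving it out cuts the branch off from its history.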
use strict;
use warnings;
use Git::PurePerl;
use Git::PurePerl::NewObject::Blob;
use Git::PurePerl::NewObject::Tree;
use Git::PurePerl::NewDirectoryEntry;
use Git::PurePerl::NewObject::Commit;
use Git::PurePerl::Actor;
use DateTime;
my $git = Git::PurePerl->new(
    gitdir => '/some/dir/foo/.git'    # Or use the less direct 'directory' form.
);
# Snip
# ....
# Snip
my $root_tree = something();
# Create the commit author.
my $author = Git::PurePerl::Actor->new(
    name  => "Bob Smith",
    email => 'BobSmith@example.com',    # single quotes, or Perl interpolates @example
);
my $timestamp = DateTime->now();
my @parent;
if ( 0 ) {
    # Optional code block to determine parent commit id.
    my $p = $git->ref_sha1('refs/heads/thebranchname');
    @parent = ( parent => $p );
}
my $commit = Git::PurePerl::NewObject::Commit->new(
    tree           => $root_tree->sha1,
    @parent,    # empty unless a parent was looked up above
    author         => $author,
    authored_time  => $timestamp,
    committer      => $author,
    committed_time => $timestamp,
    comment        => <<'EOF',
This is a commit message!11!!.
EOF
);
# stash blobs
$git->put_object( $_ ) for @blobs;
# stash trees
$git->put_object( $_ ) for @treeblobs;
# stash the commit, and point the branch at it
$git->put_object( $commit, 'thebranchname' );
And that's it! There is a new commit in the repository with the data you ascribed =)
There has been only one visible change as far as the filesystem is concerned: the currently checked out branch has been changed from whatever it was to 'thebranchname'. For our purposes this is not what we want, as it defeats the whole thing we were trying to achieve: "nothing outside of this code should have any substantial visible effect".
Disabling the branch switch
The comments in the code indicate there may be a future time at which we don't have to work around this behaviour, but until then, here is a good WorksForMe™ way to do it.
use Path::Class ();
sub Git::PurePerl::put_object_noswitch {
    my ( $self, $object, $refname ) = @_;
    $self->loose->put_object($object);
    return unless ( $object->kind eq 'commit' );
    $refname = 'master' unless $refname;
    # Write the branch ref by hand, and deliberately leave HEAD alone.
    my $ref = Path::Class::file( $self->gitdir, 'refs', 'heads', $refname );
    $ref->parent->mkpath;
    my $ref_fh = $ref->openw;
    $ref_fh->print( $object->sha1 ) || die "Error writing to $ref";
}
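If you want to convince yourself that writing the ref by hand really leaves the checkout alone, the following core-Perl sketch does the same ref write against a throwaway directory and compares HEAD before and after. Everything here is illustrative: write_branch_ref and slurp are made-up helpers, the temp directory only stands in for a real .git, and the commit id is a placeholder.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);

# Point refs/heads/<branch> at a commit id and leave HEAD alone.
# write_branch_ref is a hypothetical helper for illustration.
sub write_branch_ref {
    my ( $gitdir, $branch, $sha1 ) = @_;
    my $ref = File::Spec->catfile( $gitdir, 'refs', 'heads', $branch );
    make_path( dirname($ref) );
    open my $fh, '>', $ref or die "Cannot write $ref: $!";
    print {$fh} "$sha1\n";
    close $fh or die "Cannot close $ref: $!";
    return $ref;
}

# Read a whole file into a string.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot read $path: $!";
    local $/;
    return scalar <$fh>;
}

# Stand-in for a real .git directory, so the sketch is safe to run anywhere.
my $gitdir = tempdir( CLEANUP => 1 );
my $head   = File::Spec->catfile( $gitdir, 'HEAD' );
open my $out, '>', $head or die "Cannot write $head: $!";
print {$out} "ref: refs/heads/master\n";
close $out or die "Cannot close $head: $!";

my $fake_commit = 'a' x 40;    # placeholder id, not a real commit
my $before      = slurp($head);
my $ref_path    = write_branch_ref( $gitdir, 'thebranchname', $fake_commit );
my $after       = slurp($head);
print $before eq $after ? "HEAD untouched\n" : "HEAD changed\n";
```

The same before and after comparison of HEAD is a cheap sanity check to run around put_object_noswitch in a real repository.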