I've contemplated this for a while. I might get a round tuit, and do this myself, so this blog entry is here to jog my brain, jot down ideas, possibly collect info.

If you look at the repositories behind any of my CPAN dists ( well, most of them ), you'll see I maintain both release and source branches for the entire history ( http://github.com/kentfredric/ELF-Extract-Sections/network ), and more recently I've been maintaining a sort of "pre-release/release" sub-system, where stuff I build just for testing/preview purposes may have a life on its own branch, sort of like release candidates.

This is essentially to provide a branch that contains a full copy of all the generated code, as posted on CPAN, as opposed to the source that it is generated from. It's mostly for posterity, and so that I can deprecate versions on CPAN for incompatibility reasons one day without leaving people in the lurch when they need an identical copy: it will always be in the git history, just grab the right tag and you're set.

They could always use the backpan, but that has 2 caveats in my experience.

  1.  No diff mechanism. This feature is very important to people who do release maintenance for distributions, as it's the only good way to conclusively see what exactly changed between 2 consecutive versions, in order to update their internal dependency data that controls the shipping of the built copies.

    For this reason also, I loathe it every time somebody deletes an older copy of their dist when it has been outdated for less than 3 months, because it can take that long to notice that the shipped copy is outdated and for somebody to request a version bump. Not being able to use CPAN's diff feature makes this task much more challenging. ( At least for me, for that is how my work-flow goes, and I kind-of help out lots with Gentoo's perl-experimental overlay ).

  2. Sometimes, versions live too short a time to be backed up on backpan. This is very problematic, for the above reason, and because you have no historical record of what happened outside the Changes and the original commit history.  You could probably argue there's no reason to ever want these versions that never made it to backpan, and you'd probably be right.

So to remedy this problem, here is what I do.
  1. Commit, and tag the exact source tree that was used to generate the released code in the notation %v-source. This theoretically guarantees that anyone can check out that exact same release, run "dzil release", and produce more or less the exact same output, with the only difference possibly being the version numbers emitted if you're using an [AutoVersion] or [AutoVersion::Relative] plugin.

    Here is the code snippet I use to do this, which uses Jerome Quelin's [Git] plugin suite.
    filename = Changes


    filename = Changes
    tag_format = %v-source


    This order is important. In the Build phase, [NextRelease] formats the Changes template into an exportable form, and puts the datestamp in it.

    In the pre-release phase, [Git::Check] makes sure there's nothing in the tree that isn't committed.

    [UploadToCPAN] uploads the dist to CPAN, and the post-release phase kicks in.

    [NextRelease] then kicks in again, and reformats the Changes so it resembles the previously released Changes, except with that {{$NEXT}} stuff in it ready for hacking on.

    [Git::Tag] tags the last commit ( that is, not the current tree with the modified Changes, which isn't committed yet, but the commit we were still at when we released ) with %v-source, and then [Git::Commit] commits the updated Changes as a new commit ( with a copy of the first segment of the Changes file as its commit message ).

  2. Have a separate commit history just for releases to be copied into.
    git symbolic-ref HEAD refs/heads/releases
    The first commit of this is built, generally, from the first release's files. At present, I do this first release like so:
    rsync -avp Some-Dist-Name-0.010101/ ./
    then weed out, by hand, all the files I'm pretty sure weren't in the generated tree. ( I had some code that did it all with rsync, and had an ignore list so that the --delete-after argument didn't accidentally erase all of .git, which would be very sad, but I accidentally deleted it :[  )

    This tree now represents an exact copy of the generated code, and it is committed as follows:
    git commit -m "Build of 0deadbeef0, version 0.010101 on cpan"
    or similar, to ensure that every commit on the release branch is a direct derivative of a commit on the source branch, and there's an intrinsic link between them.  ( I avoided having a direct link, because that gives cleaner histories ).

  3. That commit is tagged as the released version, ( ie: 0.010101 )
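
Steps 2 and 3 above could be scripted roughly as follows. This is only a sketch under assumptions: import_release is a hypothetical helper name, it expects to be run from the root of a checkout of the releases branch, and it relies on rsync not deleting excluded paths on the receiving side (which holds unless --delete-excluded is given), so .git survives --delete-after.

```shell
#!/bin/sh
# Hypothetical helper, not part of Dist::Zilla or the [Git] plugin suite.
# Run it from the root of a working tree that has the releases branch checked out.
# usage: import_release BUILD_DIR VERSION SOURCE_COMMIT
import_release() {
    build_dir=$1 version=$2 source_commit=$3

    # Mirror the built dist into the working tree. Anything not in the build
    # is deleted, except .git, which the exclude rule protects.
    rsync -a --delete-after --exclude=/.git/ "$build_dir"/ ./ || return 1

    # Record additions, changes and deletions in one go.
    git add -A . || return 1

    # The message carries the link back to the source commit.
    git commit -q -m "Build of $source_commit, version $version on cpan" || return 1

    # Step 3: tag the build commit with the released version.
    git tag "$version"
}
```

For example: import_release ../Some-Dist-Name-0.010101 0.010101 0deadbeef0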

Now all this is wonderfully tedious. At present, the best I have is a script that makes the "commit and tag" phase on the release branch reasonably painless, but what I want is a nice way to automate all of the above, every single bit of it, with a plugin.

Here is some proposed syntax.

[Git::CommitBuild / prerelease ]
branch = prereleases
autocreate = 1
phase = build

[Git::CommitBuild / release]
branch = releases
autocreate = 1
phase = after_release

Why this notation? Well, I guess it just seems the right amount of flexible to me.
The text after the / is totally optional, and it's just a way to let dzil differentiate between copies of the same plugin.

branch is to tell it what git branch to work with. I figured I could just use the name part after the /, but it seemed nasty to me ( spaces for instance ). At very best, it could default to that value if branch = is not specified.

autocreate = 1 would magick the branch out of thin air the first time you tried to commit to it and it wasn't there. This would be off by default, as it could be annoying if you'd already created a branch for that purpose, then typoed its name in the config and had a second branch created for you. This way it fails instead of annoying you.

phase = build is sadly the scariest bit, and the one I'm trying to eliminate the stink of. Essentially, I have one plugin that does only one thing, but there are 2 different times I may want to run it. ( And there are possibly more places people might want to say "stop the build, store this somewhere, then continue" ).

In the above scenario, 'prerelease' I envisage as only getting run when I call "dzil build" explicitly. NOT 'dzil test' and NOT 'dzil release', only 'dzil build'.

Also, ideally, the whole commit phase should be done, magically, entirely in memory, with some magical git magic, to eliminate the whole "write it out to the filesystem before creating the actual commit data" part of the equation, so that nowhere does there ever transpire something like
git checkout releases
git commit stuff
git checkout master
which causes anarchy in the event anything else happened to be using the file-system.
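
Not quite "entirely in memory", but git's plumbing commands can get close: the build directory can be staged into a throwaway index and committed straight onto the target branch, without ever switching the checked-out branch or touching the real index. What follows is a rough sketch only. commit_build_to_branch is a hypothetical helper name, its fourth argument merely imitates the proposed autocreate option, and it assumes the target branch is not the one currently checked out.

```shell
#!/bin/sh
# Hypothetical helper, not an existing plugin or git command.
# usage: commit_build_to_branch BUILD_DIR BRANCH MESSAGE [AUTOCREATE]
# The body runs in a subshell, so variables and cd do not leak to the caller.
commit_build_to_branch() (
    build_dir=$1 branch=$2 message=$3 autocreate=${4:-0}

    # Absolute path to the repository, because we cd into the build dir below.
    git_dir=$(cd "$(git rev-parse --git-dir)" && pwd) || exit 1

    # Refuse to invent a branch unless asked to (imitates autocreate = 1).
    parent=$(git rev-parse --verify --quiet "refs/heads/$branch^{commit}") || parent=
    if [ -z "$parent" ] && [ "$autocreate" != 1 ]; then
        echo "branch '$branch' does not exist and autocreate is off" >&2
        exit 1
    fi

    # Stage the build output into a temporary index and write a tree object.
    tmp_dir=$(mktemp -d) || exit 1
    tree=$(
        cd "$build_dir" || exit 1
        export GIT_DIR="$git_dir" GIT_WORK_TREE="$PWD" GIT_INDEX_FILE="$tmp_dir/index"
        git add -A . && git write-tree
    )
    status=$?
    rm -rf "$tmp_dir"
    [ "$status" -eq 0 ] || exit 1

    # Create the commit object directly, then move the branch ref to it.
    commit=$(git commit-tree "$tree" ${parent:+-p "$parent"} -m "$message") || exit 1
    git update-ref "refs/heads/$branch" "$commit" || exit 1
    echo "$commit"
)
```

Called from the source checkout as, say, commit_build_to_branch Some-Dist-Name-0.010101 releases "Build of 0deadbeef0, version 0.010101 on cpan" 1, this leaves HEAD, the working tree and the real index exactly as they were. Files matched by the repository's ignore rules would be skipped by git add, so a real plugin would have to decide how to treat those.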

Thoughts/Suggestions anyone?
