2009-12-03

Dist-Zilla-Plugin-Git-CommitBuild

I've contemplated this for a while. I might get a round tuit, and do this myself, so this blog entry is here to jog my brain, jot down ideas, possibly collect info.

If you look at the repositories behind any of my CPAN dists ( well, most of them ), you'll see I maintain both release and source branches for the entire history ( http://github.com/kentfredric/ELF-Extract-Sections/network ). More recently, I've also been maintaining a sort of "pre-release/release" sub-system, where stuff I build just for testing/preview purposes may live on its own branch, sort of like release candidates.

This is essentially to provide a branch that contains a full copy of all the generated code, as posted on CPAN, as opposed to the source it is generated from. It's mostly for posterity: if I one day deprecate versions on CPAN for incompatibility reasons, people won't be left in the lurch trying to get an identical copy somewhere, as it will always be in the git history. Just grab the right tag and you're set.

They could always use the backpan, but that has 2 caveats in my experience.

  1.  No diff mechanism. This feature is very important to people who do release maintenance for distributions, as it's the only good way to conclusively see what exactly changed between 2 consecutive versions, in order to update the internal dependency data that controls the shipping of the built copies.

    For this reason also, I loathe it every time somebody deletes an older copy of their dist when it has been outdated for less than 3 months, because it can take that long to notice that the shipped copy is outdated and for somebody to request a version bump. Not being able to use CPAN's diff feature makes this task much more challenging. ( At least for me, for that is how my work-flow goes, and I kind-of help out lots with Gentoo's perl-experimental overlay ).

  2. Sometimes, versions live too short a time to be backed up on backpan. This is very problematic, for the above reason, and because you then have no historical record of what happened outside the Changes and the original commit history.  You could probably argue there's no reason to ever want these versions that never made it to backpan, and you'd probably be right.

So to remedy this problem, here is what I do.
  1. Commit, and tag the exact source tree that was used to generate the released code in the notation %v-source. This theoretically guarantees that anyone can check out that exact same release, run "dzil release", and produce more or less the exact same output, with the only difference possibly being the version numbers emitted if you're using an [AutoVersion] or [AutoVersion::Relative] plugin.


    Here is the code snippet I use to do this that uses Jerome Quelin's [Git] plugin suite.
    [Git::Check]
    filename = Changes

    [NextRelease]

    [Git::Tag]
    filename = Changes
    tag_format = %v-source

    [Git::Commit]

    This order is important. In the build phase, [NextRelease] formats the Changes template into an exportable form, and puts the datestamp in it.

    In the pre-release phase, [Git::Check] makes sure there's nothing in the tree that isn't committed.

    [UploadToCPAN] uploads the dist to CPAN, and the post-release phase kicks in.

    [NextRelease] then kicks in again, and reformats the Changes so it resembles the previously released Changes except with that {{$NEXT}} stuff in it ready for hacking on.

    [Git::Tag] tags the last commit ( that is, not the current tree with the modified Changes, which isn't committed yet, but the commit we were still at when we released ) with %v-source, and then [Git::Commit] commits the updated Changes as a new commit ( with a copy of the first segment of the Changes file as its commit message ).

  2. Have a separate commit history just for releases to be copied into.
    git symbolic-ref HEAD refs/heads/releases
    The first commit on this branch is generally built from the first release's files. At present, I do this first release like so:
    rsync -avp Some-Dist-Name-0.010101/ ./
    then weed out by hand all the files I'm pretty sure weren't in the generated tree. ( I had some code that did it all with rsync, with an ignore list so that the --delete-after argument didn't accidentally erase all of .git, which would be very sad, but I accidentally deleted it :[  )

    This tree now represents an exact copy of the generated code, and it is committed as follows:
    git commit -m "Build of 0deadbeef0, version 0.010101 on cpan"
    or similar, to ensure that every commit on the release branch is a direct derivative of another commit on the source branch, and there's an intrinsic link between them.  ( I avoided having a direct link, because that gives cleaner histories ).


  3. That commit is tagged as the released version ( ie: 0.010101 ).
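
For what it's worth, the rsync part of step 2 could be sketched roughly like this. It is only an illustration, not the script that got deleted: sync_build is a made-up name, and it assumes the releases branch is already checked out in the current directory with the build directory sitting directly inside it. The two exclude rules play the role of the ignore list, so --delete-after cannot touch .git or the build directory itself.

```shell
# Hypothetical helper: mirror a built dist into the current checkout.
# Assumes the releases branch is checked out in the current directory and
# the build directory (e.g. Some-Dist-Name-0.010101) sits directly inside it.
sync_build() {
    build_dir=${1%/}
    build_name=$(basename "$build_dir")

    # Excluded paths are neither copied nor deleted, so .git and the build
    # directory survive --delete-after. Anything else in the checkout that
    # is not part of the build is removed once the copy has finished.
    rsync -a --delete-after \
        --exclude='/.git' \
        --exclude="/$build_name" \
        "$build_dir/" ./
}
```

Called as "sync_build Some-Dist-Name-0.010101/", this should leave the checkout matching the built dist. The build directory is left in place, so it still has to be kept out of the commit by some other means.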

Now all this is wonderfully tedious. At present, the best I have is a script that makes the "commit and tag" phase on the release branch reasonably painless, but what I want is a nice way to automate all of the above, every single bit of it, with a plugin.

Here is some proposed syntax.

[Git::CommitBuild / prerelease]
branch = prereleases
autocreate = 1
phase = build

[Git::CommitBuild / release]
branch = releases
autocreate = 1
phase = after_release

Why this notation? Well, I guess it just seems the right amount of flexible to me.
The text after the / is totally optional, and it's just a way to let dzil differentiate between copies of the same plugin.

branch is to tell it what git branch to work with. I figured I could just use the name part after the /, but it seemed nasty to me ( spaces for instance ). At the very best, it could default to that value if branch = is not specified.

autocreate = 1 would magick the branch out of fat air the first time you tried to commit to it and it wasn't there. This would be off by default, as it could be annoying to you if you'd already created another branch with a different name for that purpose, and typoed and it created another branch. This way it fails instead of annoying you.
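
As a rough illustration of that fail-by-default behaviour, the check could look something like the following. This is a sketch only: branch_allowed is a hypothetical name, and the real decision would live inside the plugin rather than a shell function.

```shell
# Hypothetical check mirroring the proposed autocreate option.
# Usage: branch_allowed <branch> [autocreate]
branch_allowed() {
    branch=$1
    autocreate=${2:-0}   # off by default, as proposed

    # The branch already exists: fine to commit onto it.
    if git show-ref --verify --quiet "refs/heads/$branch"; then
        return 0
    fi

    # The branch is missing, but the first build commit may create it.
    if [ "$autocreate" = 1 ]; then
        return 0
    fi

    # Missing and not allowed to create it: fail instead of guessing.
    echo "branch '$branch' does not exist and autocreate is off" >&2
    return 1
}
```

With this, a typo in branch = stops the run with an error instead of quietly growing a second release branch.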

phase = build is sadly the most scary bit I'm trying to eliminate the stink of. Essentially, I have one plugin that does only one thing, but there are 2 different times I may want to run it at. ( And there are possibly more places people might want to say "stop the build, store this somewhere, then continue" ). 

In the above scenario, 'prerelease' I envisage as only getting run when I call "dzil build" explicitly. NOT 'dzil test' and NOT 'dzil release', only 'dzil build'.

Also, ideally, the whole commit phase should be done entirely in memory, with some magical git magic, to eliminate the whole "write it out to the filesystem before creating the actual commit data" part of the equation, so that nowhere does there transpire something like
git checkout releases
git commit stuff
git checkout master
which causes anarchy in the event anything else happens to be using the file-system.
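
One way this might be approached is with git's plumbing commands and a throwaway index, so the checked-out branch is never switched. What follows is a minimal sketch under assumptions, not a finished plugin: commit_build and its arguments are hypothetical names, and it assumes it is run from inside the source checkout with the built dist already sitting in a directory. It is not strictly "in memory" either, since the blobs still get written to the object store, but HEAD, the real index and the working tree are left alone.

```shell
# Hypothetical helper: record a built dist as a commit on another branch
# without touching the checked-out tree, HEAD, or the real index.
# Usage: commit_build <build_dir> <branch> <message>
commit_build() {
    build_dir=$(cd "$1" && pwd) || return 1
    branch=$2
    message=$3
    git_dir=$(git rev-parse --git-dir) || return 1
    git_dir=$(cd "$git_dir" && pwd) || return 1
    tmp_dir=$(mktemp -d) || return 1

    # Stage the build directory into a throwaway index and write it out as
    # a tree object. The exports stay inside the command substitution.
    tree=$(
        export GIT_DIR="$git_dir" GIT_WORK_TREE="$build_dir"
        export GIT_INDEX_FILE="$tmp_dir/index"
        git add -A -f . && git write-tree
    )
    status=$?
    rm -rf "$tmp_dir"
    [ "$status" -eq 0 ] || return 1

    # Chain onto the branch tip if the branch exists, else start a new root.
    if parent=$(git rev-parse --verify --quiet "refs/heads/$branch"); then
        commit=$(git commit-tree "$tree" -p "$parent" -m "$message") || return 1
    else
        commit=$(git commit-tree "$tree" -m "$message") || return 1
    fi

    # Point the branch at the new commit; HEAD stays where it was.
    git update-ref "refs/heads/$branch" "$commit"
}
```

Something like "commit_build Some-Dist-Name-0.010101 releases 'Build of 0deadbeef0, version 0.010101 on cpan'" would then add a commit to releases while whatever is checked out stays as it is.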

Thoughts/Suggestions anyone?

2009-12-02

Code Wanted: Abstract Syntax Tree to Perl Code compiler

WANTED:

use AST::Assembler qw( :all );

# Code Generation Via AST.

my $code = context( 
  package_def('Foo', context(
        use_declaration('Moose'),
        call_sub('with', package => CURRENTCONTEXT, args => list( 'Some::Role' )),
        def_sub('bar', context( 
           def_var(['x','y','z'], 'context' => CURRENTCONTEXT),
           assign(['x','y','z',], STACK ),
           assign('z', add('x','y')),
           return('z'),
        ))
  ))
);

# AST Augmentation.
$code->find('def_sub')->grep(sub{ $_[0] eq 'bar' })->find('assign')->grep(sub{ $_[0] eq 'z' })->before(assign('x',sub(0,'x')));

my $codestr = $code->to_perl; 

# -->
package Foo;
use Moose;
with "Some::Role";
sub bar { 
  my ( $x,$y, $z);
  ($x,$y,$z) = (@_);
  $x = 0 - $x;  # inserted by augmentation.
  $z = ( $x + $y );
  return $z;
}

$code->optimise->to_perl;

# -->
package Foo;
use Moose;
with "Some::Role";
sub bar { 
  return ( ( 0 - $_[0] ) + $_[1] )
}

$code->find('package_def', [ 0 , 'eq' , 'Foo' ])->child('context')->append(call_sub('bar',args=>list('1','2','3')));

$code->optimise->to_perl;

# -->
package Foo;
use Moose;
with "Some::Role";
sub bar { 
  return ( ( 0 - $_[0] ) + $_[1] )
}

( ( 0 - '1' ) + '2' )

It's just an insane starting point for code generation. Somebody run with it and make it not suck :)

Once we get a working AST to Perl code thing, maybe somebody can consider doing the inverse ;)

2009-12-01

Initial Metablog/The State of the blogsphere

Well, this is a new blog. One of the standard initiation rites is to write a blog about blogging [1].

Instead of going on a fork about blogging techniques and whatnot, I'm just going to lament about the sad state of blogging platforms that don't suck and JustWork, and why I eventually chose Blogger.

Do It Yourself.

People, I've seen, have this penchant for hard-coding and rewriting their own blogs from scratch.

This concept to me is Made With Fail™.

Sure, you get exactly what you want, but you also have the blissfully joyful task of maintaining everything yourself, and making all the fun calls about how to handle commenting, aggregation, and all that sort of stuff, and it's just too much work.

Doing that sort of stuff for a living is challenging enough, let alone having to do it for a living and yourself at the same time, which is just too much work.

{{ $InsertProjectNameHere }} Hosted On Your Own Server

This is a substantially better option versus Do It Yourself: you don't have to worry so much about code maintenance.

However, you still have to worry about host security, and how safe the code really is. After all, you're paying for that server, and have pissy entry-level grade database support [2], and much of the time you have the worry of keeping both your host platform and your blogging software up to date and secure.

Additionally, the average designed-for-self-hosting project appears to have lots of associated stupidity: the arbitrary hoop-jumping install phase, the arbitrary hoop-jumping configuration, and the bizarre do-it-all-by-hand database setup stuff, which I'm really sick of. Combine that with the fact they often don't even have documentation or support for non-apache web-servers [3], or require some magically odd setup for apache which has since been deprecated by the distribution you're trying to install it on.

3rd Party Service

Eliminating the above 2 choices leaves me with only the logical conclusion of utilising some 3rd party service. This absolves me of the need to worry about the hosting and software requirements of the platform, and lets their team of dedicated staff handle those problems.

Sure, there's the caveat of "if something doesn't work, you can't fix it yourself", but they know more about their software than I do. You get about as much benefit here as with {{$ProjectName}} built from code you don't grok, and don't have time to grok.

Then we go down to the list of features that matter to you:
1. DNS Support
Being able to map the blog under your own site of choosing is a must have feature. Some services charge for the luxury of doing this; with others, it's standard issue.
2. Free
This is also important for me: if you can't get the most out of a service for free, it's not worth it. Pay-for blogging services are a huge turnoff.
3. All The Mundane things worked out for you
It should be easy by default, and you can just Start Using It, and have capacity for power later.
4. Feed Production
A must have: not everybody wants to be forced to browse your site via a web-browser, and feeds are much more convenient for them.
5. Simple Advertisement Integration
I know ads are a bit nasty, but sometimes a guy isn't making enough money to get by on, and throwing up some adverts can help contribute some much needed dinero. Simplicity is also important, because one day if I decide I want to change them, or rip them out altogether, I want minimal migration pain.
6. Simple themes by default that look great
There's nothing worse than a site with a theme that screams "I was born in the geocities era.", and as a coder, not a graphic designer, my eye for style tends towards looking pretty brutal. I want something that looks good, and requires none of my time to maintain it.
7. No HTML Restrictions
Ideally, you should be able to avoid manually hacking html for the most part, but should you feel the need to extend it, there should be no limitations on how you can lay out stuff, theme it, and format it. Anything that puts restrictions on what I can put in the code or tries to "magic" my insertions into something else gets epic beatings from me.

And now for the blogging services.
1. LiveJournal
I've been using this off and on for mundane stuff, but it's hopeless for a blog of technical merit. Pointing somebody to your LJ blog invokes the whole "hurr" mental result. It's too arcane, too old, and over-the-top restrictive with decade-old use models. It contravenes goals 1, 2, 5, 6 and 7 in my experience, and it has this nasty "skin" oriented theme thing which sucks epically hard. And you have to be a premium member ( $$$ ) to get the maximum use out of it. The best you can do is get a semi-premium account by proliferating it with adverts that work for LiveJournal, not for you. You can't get a DNS association without being a premium member.
2. WordPress
I don't want to go into this one, but eww. What's not to hate about WordPress? It's far too noisy. Like LJ, it plunders you with adverts, and you have to pay to get rid of them. And it appears to me there's not much in the way of goals 1, 5 and 7. The whole plugin architecture they have is just an abuse farm festering under the hood.
3. blogs.perl.org
Too new, and hideously broken for me all the time. Apparently no support for proper per-domain blogs, and you're all like different bloggers using the same site. Sorry, the signal-to-noise ratio is far too low for me.
4. use.perl.org blogs
Ugly as sin, user interface like a leather bag over the head. Hopeless RSS. No DNS. No themes. Overcomplicated.
5. TypePad
Pay for only. Won't even attempt to use it. Tiered pricing for different features is a bigger turnoff still. I don't own a VISA, and don't want one, which makes it simply impossible to do anything that incurs international payments ( I might cover this at a later time ).
6. Vox
This seems the best of breed with regard to the Perl options I've seen so far, but the interface is too heavy, with too much chrome that can't be turned off. The site is so dizzying in complexity that it's not "Just a frickn blog" like I want, it's a whole damned social blogging network thing, and that's awful. It's looking too much like it's trying to be LiveJournal. As for the features I really want, I can't tell whether it has them: it doesn't appear to mention them anywhere, and I'd only be able to work it out if I signed up. I highly doubt DNS control though.

So I've settled for Blogger. It just seems to suck less for the things I want to do.
I heavily encourage people to contradict me and tell me where I am frankly wrong. If you're not even sure I'm wrong but think I might be, and have some reasons, go ahead and point them out. I actively encourage criticism, because I actively wish to correct my own failings.

[1] I'm now blogging about blogging about blogging, so the meta-level just keeps growing.
[2] That is, I'm betting you don't have replication and backups to high-heaven for your average self-hosted project.
[3] Which is especially problematic if you can't even run apache because the bloated thing consumes all the available memory on your cheap Xen VPS, and then starts hardcore swapping to disk, and the site becomes inaccessible as soon as a crawler hits it.