Showing posts with label perl. Show all posts

2011-02-22

Testing your File::ShareDir based dist now possible.

For the last few months, every time I've had a dist that needed File::ShareDir to do its dirty work, I've used various tricks to make it work.
  • Simply not test it:
    Sad as this may seem, this is pretty much the primary approach because it was too confusing
  • Invest a bit of code into overriding sharedir behaviour
    This usually involved having some coderef or lazy-loaded attribute that was normally populated by File::ShareDir, but was hard-coded during tests instead.

Enter Test::File::ShareDir

I uploaded Test::File::ShareDir this morning to CPAN, which lets you do this:
use Test::More;

use FindBin;

use Test::File::ShareDir
  -root => "$FindBin::Bin/../",
  -share => {
    -module => { 'My::Module' => 'share/MyModule' }
  };
This configuration would be sufficient to use in a test in t/ for a distribution shipping My::Module along with its corresponding shared directory share/MyModule. If you were shipping this dist with Dist::Zilla, you'd have something like this in dist.ini:
[ModuleShareDirs]
My::Module = share/MyModule
I plan to add support for 'package' style ShareDir support, but for now, module share dirs are sufficient. Enjoy =).

2011-01-08

Making a Minting Profile as a CPANized Dist.

... or one more reason why Dist::Zilla is awesome.

Minting With Dzil

Minting is a term we use for "Creating a new distribution from a template of sorts". Since Dist::Zilla 2.101230 ( 2010-05-03 23:42:27 America/New_York ), there has been a command for Dist::Zilla to facilitate setting up a new distribution.

$ dzil new Acme-An-Example
# [DZ] making target dir /tmp/Acme-An-Example
# [DZ] writing files to /tmp/Acme-An-Example
# [DZ] dist minted in ./Acme-An-Example
$ find Acme-An-Example/
# Acme-An-Example/
# Acme-An-Example/dist.ini
# Acme-An-Example/lib
# Acme-An-Example/lib/Acme
# Acme-An-Example/lib/Acme/An
# Acme-An-Example/lib/Acme/An/Example.pm
$ cat Acme-An-Example/dist.ini
# name    = Acme-An-Example
# author  = Kent Fredric 
# license = Perl_5
# copyright_holder = Kent Fredric 
# copyright_year   = 2011
# 
# version = 0.001
# 
# [@Basic]
$ cat Acme-An-Example/lib/Acme/An/Example.pm
# use strict;
# use warnings;
# package Acme::An::Example;
# 
# 1;

And while this is a good starting point, once you've gotten into Dist::Zilla, you'll probably find it somewhat lacking and find yourself doing things over and over again.

For this, we have "Minting Profiles".

Basic Minting Profiles

Minting Profiles have been in Dist::Zilla since 4.101780 ( 2010-06-27 14:30:55 America/New_York ).

The first thing you'll tend to do, is make a minting profile in ~/.dzil

dzil.org's minting-profile tutorial adequately covers the innards of how this works, but in summary, it's a lot like dist.ini. You have a profile.ini in a directory at ~/.dzil/profiles/$profilename/ that controls how your dist is created, and you can then throw together new dists as simply as:

$ dzil new -p $profilename Acme-An-Example

In the absence of -p $profilename, if there is a ~/.dzil/profiles/default/, that will be used instead.

This is pretty convenient.

Me, I had a very simple default profile for a while that did most of what I needed to do:

; ~/.dzil/profiles/default/profile.ini
[Author::KENTNL::DistINI]

That behaves more or less identically to how normal dzil new operates, but it generates a custom dist.ini from my custom Author::KENTNL::DistINI plugin. This is based on DistINI, just tuned for how I like it. It's more or less a template ;).

; Generated by Dist::Zilla::Plugin::Author::KENTNL::DistINI version 0.01023312 at Fri Jan  7 20:05:56 2011
name             = Acme-An-Example
author           = Kent Fredric 
license          = Perl_5
copyright_holder = Kent Fredric 

; Uncomment this to bootstrap via self 
; [Bootstrap::lib]

[@KENTNL]
version_major     = 0
version_minor     = 1
; the following data denotes when this minor was minted
version_rel_year  = 2011
version_rel_month = 1
version_rel_day   = 7
version_rel_hour  = 20
version_rel_time_zone = Pacific/Auckland
twitter_hash_tags = #perl #cpan

[Prereqs]

CPAN Centralization

However, many people like to use CPAN as their file storage mechanism, and centralise all their perly bits they might need in different locations so they can just get their current favourite configuration with a handy :

cpanm --interactive --verbose Dist::Zilla::MintingProfile::Author::KENTNL 

A Word on Name-spaces

Before we go further, I think it important to plead my rationale behind the "Author::" prefix I'm using now, as I feel it's an important concern.

As we have seen with Dist::Zilla, there has been a slew of PluginBundles with CPAN IDs in their names, to the point that the PluginBundle name-space is heavily polluted and holds more author bundles than task bundles, which is really what the name-space was designed for. I'll freely admit I am guilty of this crime as well, but I'm petitioning you to help reduce this annoyance in future modules.

From a CPAN tester's perspective, the annoyance of lots of CPANID-dists is similar to the annoyance of the whole DPCHRIST:: subspace: if this pattern continues, testers who do not wish to test everyone's personal modules will have to work hard to avoid them. If DPCHRIST:: had used something like Author::DPCHRIST:: instead, I doubt so many people would be horrified by it, because you can just have a policy/rule that excludes ^Author::, and everyone else who goes that way can be quietly ignored.

Then we could probably rationally add that same restriction to the irc announce bots, the "recent modules" list and so forth, and possibly even apply special indexing restrictions or something so people wouldn't even have to know those modules exist on CPAN!

So, for the sake of cleanliness, semantics, and general global sanity, I ask you to join me with my Author:: naming policy to voluntarily segregate modules that are most likely of only personal use from those that have more general application.

Dist::Zilla::Plugin::Foo                    # [Foo]                 dist-zilla plugins for general use
Dist::Zilla::Plugin::Author::KENTNL::Foo    # [Author::KENTNL::Foo] foo that only KENTNL will probably have use for
Dist::Zilla::PluginBundle::Classic          # [@Classic]            A bundle that can have practical use by many
Dist::Zilla::PluginBundle::Author::KENTNL   # [@Author::KENTNL]     KENTNL's primary plugin bundle
Dist::Zilla::MintingProfile::Default        # A minting profile that is used by all
Dist::Zilla::MintingProfile::Author::KENTNL # A minting profile that only KENTNL will find of use.

Making Dist::Zilla::MintingProfile::Author::YOURID

First, we'll start with the above default empty profile with a basic dist.ini

$ dzil new Dist-Zilla-MintingProfile-Author-YOURID
And then the guts of that Minting Profile is pretty basic.
use strict;
use warnings;

package Dist::Zilla::MintingProfile::Author::YOURID;

# ABSTRACT: YOURID's Minting Profile

use Moose;
use namespace::autoclean;
with 'Dist::Zilla::Role::MintingProfile::ShareDir';


__PACKAGE__->meta->make_immutable;
no Moose;
1;

This is a deceptively simple piece of magic that took me a little while to fully grok.

Next, we make a 'profile' directory in our dist.

$ mkdir -p Dist-Zilla-MintingProfile-Author-YOURID/share/profiles/
And in that, we can put one or many profiles. For simplicity, we'll just start off with a 'default' profile.
$ mkdir -p Dist-Zilla-MintingProfile-Author-YOURID/share/profiles/default/
$ pushd Dist-Zilla-MintingProfile-Author-YOURID/share/profiles/default/

Now, you could just stick the same profile.ini you had above in ~/.dzil there, and you're mostly done with the hard part. However, I'll go a little more into detail so you get an idea of what use it is.

Make a 'skel' dir

Make a 'skel' dir holding a selection of templated-files you want in all new dists

Here are the most notable ones in mine:

$ mkdir skel
# stuff directory with files
$ find skel/
# skel/
# skel/.gitignore
# skel/.perltidyrc
# skel/Changes
# skel/perlcritic.rc
# skel/weaver.ini
$ cat skel/Changes
# Release history for {{ $dist->name }}
# 
# {{ '{{$NEXT}}' }}
#
#        First version, released on an unsuspecting world.
#
$ cat skel/.gitignore
# .build
# {{ $dist->name }}-*

{{ $dist->name }} is expanded during 'new' to the name of the intended dist, and {{ '{{$NEXT}}' }} is just a nasty hack so that the literal string {{$NEXT}} turns up in the changelog file so the Changelog plugin keeps working.

For more in-depth understanding, please read The dzil minting profile tutorial

Adding support for the skel directory

You now need to tell your profile.ini about that skel directory if you want it to mean anything.

[GatherDir::Template]
root = skel
include_dotfiles = 1 ; I want the dot files! 
./profile.ini    # this file ---> indicates inclusion of
./skel                           #      /
./skel/.gitignore          <-----------/
./skel/.perltidyrc         <----------/
./skel/Changes             <---------/
./skel/perlcritic.rc       <--------/
./skel/weaver.ini          <-------/

Other misc profile additions

Now most people will probably just put in
./skel/dist.ini

With some template variables expanded, but I myself prefer the dedicated module approach for dist.ini, so I add that to profile.ini

[GatherDir::Template]
root = skel
include_dotfiles = 1 ; I want the dot files! 

[Author::KENTNL::DistINI]
And for good measure, I want every new dist to automatically have git set up with it, so I add that plugin too.
[GatherDir::Template]
root = skel
include_dotfiles = 1 ; I want the dot files! 

[Author::KENTNL::DistINI]
[Git::Init]

Gluing it together

Now if you've been following me, you'll have a directory that looks a lot like this:
dist.ini
lib/
lib/Dist
lib/Dist/Zilla
lib/Dist/Zilla/MintingProfile
lib/Dist/Zilla/MintingProfile/Author
lib/Dist/Zilla/MintingProfile/Author/YOURID.pm
share/
share/profiles
share/profiles/default
share/profiles/default/profile.ini
share/profiles/default/skel
share/profiles/default/skel/.gitignore
share/profiles/default/skel/.perltidyrc
share/profiles/default/skel/Changes
share/profiles/default/skel/perlcritic.rc
share/profiles/default/skel/weaver.ini

And you're probably wondering how the 'share' directory and its contents will be seen by 'YOURID.pm' after installation.

The magic is this stanza in your dist.ini

[ModuleShareDirs]
Dist::Zilla::MintingProfile::Author::YOURID = share/profiles 

Yeah, at first that baffled me too.

Into the guts of File::ShareDir

Dist::Zilla utilizes sharedir in 2 places: (1) in the module that does the minting, and (2) via some instructions injected into your install tool.

During Install

Let's pretend there is a directory, we'll call it $x, which is the "base directory" for everything File::ShareDir::Install installs, and which is also known to File::ShareDir.

For example:

/usr/lib64/perl5/vendor_perl/5.12.2/auto/share/    # $x 
/usr/lib64/perl5/vendor_perl/5.12.2/               # module install dir.

The ModuleShareDirs plugin tells File::ShareDir::Install that the directory 'share/profiles' should be installed in a path associated with the module, i.e.:

$x/module/Dist-Zilla-MintingProfile-Author-YOURID/ <= share/profiles/ 

So, during install, the files in share/profiles/* will be copied to that directory.

.../auto/share/module/Dist-Zilla-MintingProfile-Author-YOURID/default/skel/perlcritic.rc
.../auto/share/module/Dist-Zilla-MintingProfile-Author-YOURID/default/skel/weaver.ini
.../auto/share/module/Dist-Zilla-MintingProfile-Author-YOURID/default/skel/.perltidyrc
.../auto/share/module/Dist-Zilla-MintingProfile-Author-YOURID/default/skel/.gitignore
.../auto/share/module/Dist-Zilla-MintingProfile-Author-YOURID/default/skel/Changes
.../auto/share/module/Dist-Zilla-MintingProfile-Author-YOURID/default/profile.ini
.../Dist/Zilla/MintingProfile/Author/YOURID.pm

And that is not all that scary =)
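
If it helps to see the naming convention as code, here is a minimal sketch of the mapping from module name to share path described above. It is only an illustration, not File::ShareDir's actual implementation: module_share_path is a made-up helper, $base stands in for $x, and the path is joined with '/' for readability rather than portability.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of File::ShareDir): build the per-module
# share path under a base directory, following the layout shown above.
sub module_share_path {
    my ( $base, $module ) = @_;
    ( my $dashed = $module ) =~ s/::/-/g;    # Foo::Bar -> Foo-Bar
    return join '/', $base, 'module', $dashed;
}

# 'auto/share' here plays the role of $x, relative to the install dir.
print module_share_path( 'auto/share',
    'Dist::Zilla::MintingProfile::Author::YOURID' ), "\n";
```

The profile name given with -p is then just one more directory level below that path.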

How your profile uses that

Having seen how it installs to the file-system, how the plug-in works should be a bit of a no brainer.

  • dzil new -P Author::YOURID -p default Some-Module-Name
  • Dist::Zilla sees -P, and loads the respective profile from @INC, Dist::Zilla::MintingProfile::Author::YOURID
  • Your minting profile uses that sharedir role, and asks File::ShareDir for the directory associated with 'Dist-Zilla-MintingProfile-Author-YOURID', which of course returns that '.../auto/share/module/Dist-Zilla-MintingProfile-Author-YOURID/' we installed above.
  • The Role then determines which sub-profile to use ( default, as specified by -p ) and then returns that path ( now .../auto/share/module/Dist-Zilla-MintingProfile-Author-YOURID/default/ )
  • .../auto/share/module/Dist-Zilla-MintingProfile-Author-YOURID/default/profile.ini is then read and a new distribution is constructed using the instructions therein
  • And presto, you have a newly minted dist =):
    $ cd Some-Module-Name;
    $ find
    # ./.git 
    # ... a bunch of .git files ...
    # ./dist.ini
    # ./weaver.ini
    # ./perlcritic.rc
    # ./Changes
    # ./.perltidyrc
    # ./.gitignore
    # ./lib
    # ./lib/Some
    # ./lib/Some/Module
    # ./lib/Some/Module/Name.pm
    

Hopefully that's enough to get you started, but...

There is one warning you must heed when using File::ShareDir.

ONCE YOU INSTALL A FILE IN A SHAREDIR FROM A CPAN DIST, IT IS THERE FOR GOOD

Or rather, it will be there until somebody manually deletes it, at least in the current implementation of CPAN.

This means if you decide to put a file in skel/, and then later decide you don't want that file in skel/, you're going to have a problem when something decides to "add all of skel/ to your new dist".

For vendor-wrapped/packaged stuff ( ie: Redhat, Debian, Gentoo ) you're fine, those systems have package management that guarantee atomicity somewhat, but the CPAN toolchain ( at least, in my experience ) provides very little guarantee in this regard. This may change in the future, but that's how it seems to be now. But now I've warned you, you'll hopefully not bump into that =).

2010-12-19

Introducing Data::Handle

Coming to a mirror near you, soon, is Data::Handle.

What does Data::Handle do?

Data::Handle solves 2 very simple problems that occur with the __DATA__ section and the associated *DATA Glob, and both of them are to do with "multiple modules trying to access the section".

1. Provide a reliable way to get a file-handle with the position at the start of the __DATA__ segment

  1. *DATA is really a pointer to the entire file, and not just the data segment
  2. The Perl interpreter sets the current position in the file to be after the __DATA__ line

The first time you read from *DATA this of course works fine, but the issue is that once you read it, the internal file cursor moves, and if you read the whole section, the cursor points to EOF after the first complete read. For a second block of code to re-read this data without communicating with the first block of code, it has to rewind the file cursor back to the start prior to reading, and there is no natural way to know which offset to rewind to.

Other modules so far have remedied this by trying to rewind to the start of the file, and manually emulate various parts of the Perl Parser to re-find the start of the __DATA__ section before re-reading its contents.

This module however takes a different approach, and assumes that hopefully, the first person to read that file handle will know what they're doing, and use this module to do it. This module will then record the file offset the __DATA__ section began at, so from that point onwards, rewinding to the start is a trivial exercise.
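
The idea of recording the offset can be sketched in plain Perl. This is not Data::Handle's real code: open_at_data is a hypothetical helper, and it uses an in-memory handle as a stand-in for a real source file so the sketch runs anywhere.

```perl
use strict;
use warnings;

# Hypothetical helper: open a handle on some source text and leave the
# cursor just after the __DATA__ line, as perl does for a real *DATA.
# Returns the handle plus the offset where the data section starts.
sub open_at_data {
    my ($source) = @_;
    open( my $fh, '<', \$source ) or die "open failed: $!";
    while ( my $line = <$fh> ) {
        last if $line eq "__DATA__\n";
    }
    return ( $fh, tell($fh) );
}

my ( $fh, $data_start ) = open_at_data("print 'hi';\n__DATA__\nfirst\nsecond\n");

my @first_pass = <$fh>;           # reads everything, cursor is now at EOF
my @leftovers  = <$fh>;           # a second naive reader gets nothing
seek( $fh, $data_start, 0 );      # rewind to the recorded offset
my @second_pass = <$fh>;          # the same lines again

print "second pass saw: ", join( '', @second_pass );
```

Whoever records the offset has to get there before anybody else reads, which is exactly the assumption the module makes.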

And all this happens for you simply by you doing :

my $handle = Data::Handle->new( __PACKAGE__ ); 

instead of doing

my $handle = do { no strict 'refs'; \*{ __PACKAGE__ . "::DATA"} };
( Note: Side perk, the new syntax is simpler, more straightforward, easier to remember, and no dicking around with strict! ;D )

2. Provide a reliable way for 2 separate logical code units to access the same __DATA__ segment without interfering with each other

Because *DATA is a filehandle, and there is only one of them, seeking around in it can be problematic.

Especially if you have 2 code units that are trying to read it from different places. For a contrived example, prior to this module if you wanted to go back and re-read the start of the section, or skip forwards and read something later in the section, without forgetting where you are now, you'd need a contrived dance of seek/tell. Instead, now, you can just create another worker that will read that stuff for you, and the original handle will retain its position.

my $handle = Data::Handle->new( 'Foo' );
while( <$handle> ){ 

   if ( $_ =~ /something/ ){ 
       # get line 1. 
       my $slave = Data::Handle->new('Foo');
       my $firstline = <$slave>;
       do_stuff_with_first_line($firstline);
   }
   
   # continue as normal.
}

Internally, there is a lovely dance of seek() going on there, but from an interface perspective, you don't need to know it's seeking, all you need to know is "Get reference to DATA, get data from it".

Sure, you can probably argue you could do it easily with lots of seek() in a nice way, but that logic falls apart when you have code in 2 separate places reading the same *DATA.

It's much smarter to be defensive about it, and have some assurance that you can read a file descriptor in a safe way without something evil like this tampering with it.


my $handle = do { no strict 'refs'; \*{ __PACKAGE__ . "::DATA"} };
while(<$handle>){ 
   do_something_with($_);
   evil_function();   
}

....
sub evil_function { 
  my $handle = do { no strict 'refs'; \*{ __PACKAGE__ . "::DATA"} };
  seek $handle, 0, 2; # whence 2 is SEEK_END: jump to EOF.
}

That is spooky action at a distance!

Data::Handle solves this by meticulously tracking position in each instance, and re-seeking the file handle to the place it was at the end of the last tracked read, so regardless of how much seeking around some other module did, as long as you got on the scene first, you should be unstoppable ;)
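
A rough sketch of that position tracking, under the assumption that each reader simply remembers its own offset and re-seeks before every read. make_reader is a made-up name and this is not how Data::Handle is actually written; it only shows why a rogue seek elsewhere stops mattering.

```perl
use strict;
use warnings;

# Hypothetical mini-reader: every reader keeps a private offset and puts the
# shared handle back there before each read, so nobody else's seeking matters.
sub make_reader {
    my ( $fh, $start ) = @_;
    my $pos = $start;
    return sub {
        seek( $fh, $pos, 0 ) or die "seek failed: $!";
        my $line = readline($fh);
        $pos = tell($fh);
        return $line;
    };
}

my $text = "one\ntwo\nthree\n";
open( my $shared, '<', \$text ) or die "open failed: $!";

my $first  = make_reader( $shared, 0 );
my $second = make_reader( $shared, 0 );

print $first->();          # one
print $first->();          # two
print $second->();         # one
seek( $shared, 0, 2 );     # something "evil" jumps to EOF
print $first->();          # three
```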

2010-11-14

Handling optional requirements with Class::Load

In a previous blog post I discovered Class::Load and its awesomeness.

Here is one practical application of it:

--> UPDATES

--> UPDATES2

Automatic Optional Requisites

Say you have a library which provides some form of extensibility to consuming modules, and you want a way to "magically" discover a class to use, but use a fallback if it's not there.

Here is how to do it with Class::Load

use strict;
use warnings;
package Some::Module;
use Class::Load qw( :all );

sub do_setup { 
    ...
}
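sub import {
    my $caller = caller(); 
    # ${caller} needs the braces, or perl looks for a variable $caller::Controller
    my $maybemodule = "${caller}::Controller";
    # require names the file, not the package, in its error message
    ( my $maybefile = "$maybemodule.pm" ) =~ s{::}{/}g;
    if( try_load_class( $maybemodule ) ){ 
        do_setup( $maybemodule ); # it's there, and it works.
    } else { 
        if ( $Class::Load::ERROR =~ qr/Can't locate \Q$maybefile\E in \@INC/ ){ 
           do_setup("Some::Module::Default");
        } else { 
           die $Class::Load::ERROR;
        } 
    }
}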
To do it the right way without Class::Load is extraordinarily complicated.
use strict;
use warnings;
package Some::Module;
use File::Spec;

sub do_setup { 
    ...
}
sub import {
    my $caller = caller(); 
    # ${caller} needs the braces, or perl looks for a variable $caller::Controller
    my $maybemodule = "${caller}::Controller";

    # see rt.perl.org #19213
    my @parts = split '::', $maybemodule;
    my $file = $^O eq 'MSWin32'
             ? join '/', @parts
             : File::Spec->catfile(@parts);
    $file .= '.pm';
      
    my $error;
    my $success;
    { 
       local $@;     
       $success = eval { 
          local $SIG{__DIE__} = 'DEFAULT';
          require $file;
          'success';
       };
       $error = $@;
    }
    
    if( defined $success and $success eq 'success' ){ 
        do_setup( $maybemodule ); # it's there, and it works.
    } else { 
        # require names the file, not the package, in its error message
        if ( $error =~ qr/Can't locate \Q$file\E in \@INC/ ){ 
           do_setup("Some::Module::Default");
        } else { 
           die $error;
        } 
    }
}

And even then, you still have a handful of sneaky bugs lurking in there :/

  1. With the second code, if somebody dynamically created the ::Controller class and didn't create a file for it, it will not work properly, and they'll have to tweak %INC somewhere for it to work
  2. If somebody loaded the ::Controller class manually beforehand, but it failed, and they didn't report the error, on 5.8, the above code will behave as if the code did load successfully. ( Truly nasty )

Class::Load has a lot of heuristics in it to try to avoid both these situations ( well, the latter one will be soon once a 1-line patch goes in )

There are a few things still that I don't like doing that way, but for now, that's the best I can get

  1. Using a regular expression to determine what type of load failure occurred is nasty, but the only alternative approaches are either
    1. more complicated
    2. prone to be wrong on 5.8

What I'd like to be able to do

and may write a patch for

use strict;
use warnings;
package Some::Module;
use Class::Load qw( :all );

sub do_setup { 
    ...
}
sub import {
    my $caller = caller(); 
    my $maybemodule = "${caller}::Controller";
    if( try_load_working_class( $maybemodule ) ){ 
        do_setup( $maybemodule ); # it's there, and it works.
    } else {
        do_setup("Some::Module::Default"); # it's not there.
    }
}

The idea being "Syntax errors are syntax errors, there's no good reason to suppress them , at all", so in the above code, if Whatever::Controller existed, but was broken, it would die, instead of treating it as if it were absent.
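
For the curious, those semantics can be approximated in plain Perl today. This is only a sketch under stated assumptions: load_if_present is a hypothetical name, not anything Class::Load provides, it still leans on the message regex I complained about above, and it is only trustworthy on 5.10 or newer.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the wished-for semantics:
#   returns 1 if the module loaded,
#   returns 0 if the module file simply is not in @INC,
#   rethrows anything else (syntax errors and friends).
sub load_if_present {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;
    my $error;
    {
        local $@;
        return 1 if eval { require $file; 1 };
        $error = $@;
    }
    # Only "this very file was not found" counts as absent.
    return 0 if $error =~ /\ACan't locate \Q$file\E in \@INC/;
    die $error;
}

print load_if_present('File::Spec')            ? "loaded\n" : "absent\n";
print load_if_present('No::Such::Module::XYZ') ? "loaded\n" : "absent\n";
```

Anchoring the match on the requested file name means a module that is present but depends on something missing still blows up, rather than being mistaken for absent.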

UPDATE

Module Patched and on github! =). Waiting on an authoritative update =)

package App;
use Class::Load qw( :all );

sub import { 
   my $caller = caller();
   my $baseclass = load_optional_class("${caller}::Controller") ? "${caller}::Controller" : "App::Controller";
   no strict 'refs'; # needed to reach the caller's @ISA by name
   push @{"${caller}::ISA"}, $baseclass;
}

UPDATE #2

On CPAN: http://search.cpan.org/~sartak/Class-Load-0.06/lib/Class/Load.pm.
Thanks Sartak =)

2010-11-13

Searching / Design spec for the Ultimate 'require' tool.

Perl's de facto require mechanism hides a confusing amount of complexity, and that complexity is often overlooked in the edge cases. It looks straightforward:

require Class::Name;

And you think you're done, right?

Not so.

Most of the problems come from one of 2 avenues.

  1. Things that happen when the module specified cannot, for whatever reason, be sourced
  2. Things that happen when you want to require a module by string name

Point 2 is probably the most commonly covered one, and it seems to be the primary objective of practically every require module I can find on CPAN.

However, many of the existing modules, in attempting to solve the string-name issue, result in the handling of 'this module cannot be sourced' becoming WORSE!

Module Sourcing Headaches

The mysterious Perl 5.8 double-require hell

The following code, in my testing, works without issue:

     eval "require Foo;1"; 
     require Foo;

Now, if Foo happens to be broken, and cannot be sourced, on Perl < 5.10, then nothing will happen in the above code! Scary, but true. It's even scarier if those 2 lines of code are worlds apart.

A quirk of how Perl 5.8 functions ( which is now solved in 5.10 ) is that once a module is require'd, as long as that file existed on disk, %INC will be updated to map the module name to the found file name. This doesn't seem too bad, until you see how it behaves when require is called again for the same module somewhere else. Take a look at this sample code from perlfunc:

sub require {
    my ( $filename ) = @_;
    if ( exists $INC{$filename} ){
       return 1 if $INC{$filename};
       die "Compilation failed in require";
    }
    ...
}

Now on 5.10 this is fine, because $INC{$filename} is set to 'undef' if an error was encountered. But on everything prior to 5.10, the value of $INC{$filename} is in every way identical to the value it would have if the module loaded successfully. And as you do not want to require the module again once it has loaded, this behaviour falsely thinks "Oh, that's already loaded" and doesn't tell anyone anywhere that there is a problem.

If that's too much reading for you, here's the executive summary of the problem: You need everyone, everywhere, who either directly or indirectly calls require inside an eval, to make sure any compilation/parsing errors from require are handled immediately. Failing this, everything else that requires that same broken file will treat the file as successfully loaded, will not error, and you'll just get some confusing problem where the module's contents will not be anywhere you can see them.

From a debugging perspective, this behaviour frankly scares me, and I'm very glad it's fixed in 5.10, and glad I can use 5.10, but for you poor suckers stuck working with 5.8, or trying to make 5.8 backwards compatible modules, this problem will crop up eventually, if not for you, for somebody who uses your modules.
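
If you want to see which behaviour a given perl has, something like the following sketch will show it. BrokenSketch and try_require are made-up names for the illustration; the script writes a deliberately broken module into a scratch directory and requires it twice inside eval.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# Write a module with a deliberate syntax error into a scratch directory.
my $dir = tempdir( CLEANUP => 1 );
open( my $out, '>', "$dir/BrokenSketch.pm" ) or die "open failed: $!";
print {$out} "package BrokenSketch;\nsub {\n";    # unclosed block on purpose
close($out) or die "close failed: $!";

# Hypothetical helper: swallow the error the way careless code does.
sub try_require {
    my ($file) = @_;
    local $@;
    return eval { require $file; 1 } ? 'loaded' : 'failed';
}

unshift @INC, $dir;
print "first:  ", try_require('BrokenSketch.pm'), "\n";
print "second: ", try_require('BrokenSketch.pm'), "\n";
```

On 5.10 and newer both attempts should report a failure; on 5.8 the second one is where the silent "success" described above would show up.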

Awful exceptions are awful

The following code looks fine at first glance, but there are many things wrong with it:

     if( eval "require Foo; 1" ){ 
         # behaviour to perform if there is a Foo
     } else {
         # behaviour to perform if there is no Foo
     }

A nice and elegant way of saying "Try to use this module, and if it's not there, resort to some default behaviour"

But what about the magical middle condition, where it's there, but it's broken? In this code, it will silently fall back to the default behaviour, and nothing anywhere will tell you that Foo is broken, and you'll spend several hours with a dumb look on your face while you prod completely unrelated code.

What we really need is a way to disambiguate between "it's not there" and "it's broken", because ideally, if it's there, and broken, we want a small nuclear explosion.

On Perl 5.10 and higher, this isn't so hard: we can just prod %INC to see what happened.

Test                      Value            Implication
exists $INC{'Foo.pm'}     a false value    The module couldn't be found on disk, or nobody required it yet
exists $INC{'Foo.pm'}     a true value     The module exists on disk, and somebody has required it
defined $INC{'Foo.pm'}    a false value    The module exists, somebody required it, but it failed ( >= 5.10 only )
defined $INC{'Foo.pm'}    a true value     The module loaded successfully ( >= 5.10 ), or absolutely nothing useful ( < 5.10 )

So that approach is not exactly very nice, or very portable.
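
For what it's worth, the %INC checks boil down to a few lines. This is a sketch only: inc_state is a hypothetical helper, and its answers mean something only on 5.10 or newer.

```perl
use strict;
use warnings;

# Hypothetical helper: describe what %INC says about a module file.
# Takes the file form of the name, e.g. 'Foo.pm' or 'Foo/Bar.pm'.
sub inc_state {
    my ($file) = @_;
    return 'not required yet, or not found' unless exists $INC{$file};
    return 'required, but failed'           unless defined $INC{$file};
    return 'loaded';
}

require File::Spec;
print inc_state('File/Spec.pm'),     "\n";    # loaded
print inc_state('No/Such/Thing.pm'), "\n";    # not required yet, or not found
```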

The next option, if you're fortunate enough to actually get require to die for you when it should, is regexing the exception it throws. But that is just horrible. Regexing messages from die is stupid: it's limited and prone to breaking. Proper object exceptions are our salvation. What we really need for this situation is different exceptions that indicate the type of problem encountered, so we're not left guessing with kludgy code.

Stringy require headaches

This is the lesser evil, but not without its perils.

At some stage, if you write anything moderately interesting, you'll find the need to programmatically divine the name of a module to require. This is where require tends to bite you in the ass.

sub load_plugin { 
    my $plugin = shift;
    my $fullname = 'MyPackage::' . $plugin;
    require $fullname;
}

This is simply prohibited by the Perl Gods of Yore. You have to find some other way, and there are many modules targeted at this. There are some simple approaches, but they're also somewhat dangerous approaches too sometimes.

Bad Approach

Here is something you should really avoid if you're expecting the code to be used anywhere worth having any security. DO NOT DO THIS:

sub load_plugin { 
    my $plugin = shift;
    my $fullname = 'MyPackage::' . $plugin;
    eval "require $fullname;1" or die $@;
}

Firstly, you just pretty much wrote a wide open security hole. Somebody just needs to call:

   load_plugin( 'Bobby; unlink "/etc/some/important/document";' ); 

and the show is pretty much over. That's not necessarily so tragic if it's your own code, and you're the only person who ever invokes it, but if it's public facing ( and especially if the code is published somewhere ), then avoid that style like cancer, because in my opinion, it's not "if" it gets exploited, but "when". Taint mode may help you a little bit, but don't bet on it.

Secondly, if you were foolish enough to have accidentally left out that 'or die $@' part, then you will have just created an invisible bug to be discovered later for everyone using Perl 5.8. Congratulations.

Less insane approach

The less insane approach is to emulate how perl maps package names to file names internally, and pass that value to require. ( Because when you pass something as a string to require, it's expecting a path of sorts, not a module name ).

sub load_plugin { 
    my $plugin = shift;
    my $fullname = 'MyPackage::' . $plugin;
    $fullname =~ s{::}{/}g;
    $fullname .= '.pm';
    require $fullname;
}

This is good, because there's no room for accidentally forgetting to call die $@, and the worst somebody can do is specify an arbitrary file on disk to read, which is what you were doing to begin with anyway. This is way way less dangerous than allowing execution of arbitrary code. Both these code samples are still plagued by the 5.8 double-require situation, if somebody manages to require() the broken code before you do and hide the error, but that's substantially less likely to happen.
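
If you want to narrow that down further, you can refuse anything that doesn't look like a package name before mapping it to a path. The check is my own addition for illustration, not something the modules reviewed below do, and plugin_file is a made-up helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a plugin short name into the path require() wants,
# refusing anything that is not a plain Perl package name.
sub plugin_file {
    my ($plugin) = @_;
    die "Invalid plugin name: $plugin\n"
      unless $plugin =~ /\A[A-Za-z_]\w*(?:::\w+)*\z/;
    ( my $file = "MyPackage::$plugin.pm" ) =~ s{::}{/}g;
    return $file;
}

print plugin_file('Foo::Bar'), "\n";    # MyPackage/Foo/Bar.pm

my $ok = eval { plugin_file('Bobby; unlink "/etc/some/important/document";'); 1 };
print "rejected: $@" unless $ok;
```

The result is what you would hand to require, so the string never goes anywhere near eval.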

Existing Modules, and what is wrong with them

I've seriously looked at many many modules on CPAN for this task. And sadly, none fit the bill perfectly.

UNIVERSAL::require

This seems to be the most popular one. But it only solves the stringy-require issue, and in reality, adds MORE potential for failure.
  • Victim to the double-load on 5.8 issue.

    This one line of code is enough to make it weak to the double-require issue.

    return eval { 1 } if $INC{$file};
    
    As discussed above, on 5.8, if the file has already been 'required' but failed, $INC{$file} will be set to the path to that file. As a result, UNIVERSAL::require will just respond with "Oh right".

  • No Exceptions

    This module doesn't help us at all with regard to exception objects. It relies entirely on Perl's native ( virtually non-existent ) exception system

  • Actually exacerbates the 5.8 issue

    In my opinion, this module actually makes us take a step backwards in progressive coding. It replaces useful, informative exception throwing with silence, and requires you to check a return value. The result is that whenever somebody does Foo->require() without checking the return value, the very next thing that tries to require Foo, and expects an exception when it's broken, will silently succeed, but there will be no "Foo".

  • 2005 called and wants its Perl style back

    Seriously, we've been trying to encourage people to use stuff like 'autodie' because checking the return value of every open, every close, and every print ( yes, print can fail! ) is tedium that lazy people often forget, and adding 'die "$@ $? $!" ' at the end of everything SUCKS, let alone throwing actual exceptions that explain /what/ the problem was.
    Try working out whether the reason open failed was the file just wasn't there, or there was a permissions issue, or one of the other dozens of possible reasons, via code, and you're stuck using regular expressions. Yuck.

  • Monkey patching

    A lot of people really dislike the monkey-patch style that bolts into UNIVERSAL. Magically turning up everywhere on every object is really nasty, and really magical, and far too much magic for something that could be achieved by using an exported method instead. Seriously, string_require("Foo::Bar") vs "Foo::Bar"->require(); the difference is not big enough to warrant the nastiness of the latter.

Module::Load

  • 5.8 Double-Load Weak

    Still relies entirely on require to die if it can't load something.

  • No Exceptions

    Relies on $@ being a useful enough value to the user

  • Implicitly treats Exceptions like scalars

    Even if, in some future fantasy land, Perl's require started throwing useful Exceptions ( backtrace, attributes that explained the problem type, introspection, and so forth ), the code concatenates it into another scalar, so any exception objects that may exist will get squashed.

Module::Locate

  • Holy hell, what?

    The code is from 2005 and has 2005 written all over it. If it were less chaotic, I might be able to see how it works.

  • Doesn't invoke require

    It doesn't use require anywhere, so it doesn't even populate $@

  • Recommended use is to pass discovered variables to require

    Doesn't sound like much of a win to me. Probably prone to the 5.8 issue.

  • Doesn't throw exception objects

    Seems in 2005, nobody had discovered exceptions yet really.

Module::Require

  • Module is not really designed for one-off module requires
  • Code is weak vs 5.8 issues.
  • Code is pretty high on the wtfometer
  • Code aggravates 5.8 issues with suppressed failures by ignoring $@ after failures
  • No exception objects

Mrequire

  • Mangles $@ with chomp
  • No exception objects
  • 5.8 double-require weak
  • Oh dear, please , not AUTOLOAD :(

File::Where

  • Mostly an over the top file finding library, doesn't handle any of the require stuff
  • the usual, no exceptions, 5.8 double-require-weak

Module::Use

  • Not for requiring modules at all.

autorequire

  • Not really for this job, but...
  • Has a method for detecting package loading, however....
  • That method is subject to the 5.8 double-require weakness and its friends

Acme::RequireModule

  • pro: Despite being Acme::, it sucks less than everything else so far!
  • Still depends on native require for exceptions
  • XS
  • Still defers to the internal require() op, so probably still suffers the 5.8 problems.
  • Depends on >= 5.10 anyway

ClassLoader

  • Just as bad interface wise as UNIVERSAL::require
  • But worse, AUTOLOAD magics
  • Documented in German
  • eval "use $string", very bad
  • Substitutes Perl's string-only require-failure exceptions with alternative, German, string-only exceptions. Joy.
  • Prone to 5.8 issues

Module::Runtime

  • Prone to 5.8 issues
  • Standard Perl native exceptions only

THE BEST SO FAR

Class::Load

  • pro: Actually appears to have work-arounds in place for heuristically solving the 5.8 problem!
  • pro: Tests for the above claimed fact!
  • pro: Tests pass!
  • Still no exception objects ( perl default exceptions )
  • pro: Reasonably sane API
  • pro: No need to check silly return values

tl;dr summary

Class::Load is awesome, you should use it everywhere you need require to actually work sanely with possibly-missing or possibly-broken classes (ie: everywhere that there is a user-part in a require ).
You can probably use it for more, but that might be overkill =).

The only way I can see something being better than it is if something decides to implement object exceptions with failure metadata in them, instead of needing to re-explore the failure manually.
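As a rough usage sketch, assuming Class::Load is installed from CPAN ( it is not a core module ): load_class dies when a class is missing or broken, and try_load_class returns a success flag plus the error text in list context. The first_loadable helper and the runtime guard around loading Class::Load are my own illustrative scaffolding, not part of the module.

```perl
use strict;
use warnings;

# Assumption: Class::Load is installed from CPAN. Loading it at runtime lets
# this sketch say so politely instead of failing to compile without it.
my $have_class_load = eval { require Class::Load; 1 };

# Hypothetical helper: return the first candidate class that loads.
sub first_loadable {
    my (@candidates) = @_;
    die "Class::Load is not installed\n" unless $have_class_load;
    for my $class (@candidates) {
        # In list context try_load_class returns ( success flag, error text ).
        my ( $ok, $error ) = Class::Load::try_load_class($class);
        return $class if $ok;
    }
    return;
}

if ($have_class_load) {
    # load_class dies on failure, so there is no return value to forget.
    Class::Load::load_class('File::Spec');

    my $found = first_loadable( 'No::Such::Module::Hopefully', 'File::Spec' );
    print "first loadable class: $found\n";
}
else {
    print "Class::Load is not installed, nothing to demonstrate\n";
}
```

In real code you would simply write use Class::Load ':all' at the top and call load_class directly.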

2010-09-17

Installing Multiple Perls with App::perlbrew and App::cpanminus

Having learnt from my previous mistakes, this is a simplistic way to set up multiple somewhat isolated installs of Perl in a user profile.

App::perlbrew is a very handy tool for managing several user-installs of Perl, and facilitates the easy switching between Perl versions.

App::cpanminus is the most straight-forward and lightweight cpan client I've ever seen, and it just works, and works well, and leads to relatively pain-free installation 80% of the time.

1. Install A Bare copy of Perlbrew

Getting a copy of Perlbrew should be the very first thing you do. No cpanm, no local::lib, just straight perlbrew.
$ cd ~ 
$ curl -LO http://xrl.us/perlbrew

2. Setup Perlbrew

Once we have a copy of perlbrew, we run its install command, which completes the bootstrapping of perlbrew. Then all that's needed is to update your profile with the right magic line so that new shells will have the right environment set up.
$ perl ~/perlbrew install
$ rm ~/perlbrew
$ ~/perl5/perlbrew/bin/perlbrew init
# use the line perlbrew spits out.
$ echo "source /home/test_1/perl5/perlbrew/etc/bashrc" | tee -a ~/.bashrc

3. Enter your new perlbrew ENV

Now we enter our new shell so that we can test the change to our configuration. We run env and grep the PATH value just to double check perlbrew has worked properly.
$ bash
$ env | grep PATH
PATH=/home/test_1/perl5/perlbrew/bin:/home/test_1/perl5/perlbrew/perls/current/bin:/home/test_1/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/X11R6/bin:/usr/local/bin:/usr/local/sbin:/usr/games:.

4. Choose a mirror

This step is mostly optional, but it lets you choose which mirror perlbrew will download Perl sources from, so a local one is best for speed's sake.
$ perlbrew mirror

5. Install your wanted perl versions

Now we perform the slow installation of our Perls. In my case, I'm installing a copy of the current stable ( 5.12.2 ) and the current development release ( 5.13.4 ). The -v is optional, but you'll want it if you do not wish to die of boredom, because without it the install generally just sits there doing nothing for 10+ minutes.
$ perlbrew -v install perl-5.12.2
$ perlbrew -v install perl-5.13.4

6. Setup 'cpanm' for each perl

This step appears to be the most important step. If you previously had cpanm installed with system perl you do NOT want to be using that at all. When cpanm is installed, the bin/ script hard-codes a path to the perl it was installed with, so using a cpanm built with system perl will build installed modules using that system perl instead, using its install paths and so forth, and you do not want this. So, you must install a cpanm for each perl using this bootstrap technique.
$ perlbrew switch perl-5.12.2
$ curl -L http://cpanmin.us | perl - App::cpanminus
$ perlbrew switch perl-5.13.4
$ curl -L http://cpanmin.us | perl - App::cpanminus
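If you want to double check which perl a given cpanm is tied to, something like the following sketch can help. It is only a rough check under a few assumptions: that the script's #! line names the interpreter directly ( a "#!/usr/bin/env perl" line would report env instead ), and that $Config{perlpath} is a fair description of the perl running the check. The helper names are made up for this example.

```perl
use strict;
use warnings;
use Config;
use Cwd qw( realpath );
use File::Spec;

# Return the interpreter named on a script's #! line, or undef if there is none.
sub shebang_interpreter {
    my ($script) = @_;
    open my $fh, '<', $script or die "Can't open $script: $!";
    my $first = <$fh>;
    close $fh;
    return unless defined $first and $first =~ /\A#!\s*(\S+)/;
    return $1;
}

# Find an executable called $name somewhere in PATH, or return undef.
sub find_in_path {
    my ($name) = @_;
    for my $dir ( File::Spec->path ) {
        my $candidate = File::Spec->catfile( $dir, $name );
        return $candidate if -f $candidate and -x _;
    }
    return;
}

my $cpanm = find_in_path('cpanm');
if ( defined $cpanm ) {
    my $wanted = shebang_interpreter($cpanm);
    my $this   = $Config{perlpath};

    # Resolve symlinks on both sides before comparing the two paths.
    my $same = defined $wanted
        && -e $wanted
        && -e $this
        && realpath($wanted) eq realpath($this);
    printf "%s runs under %s, which %s the current perl ( %s )\n",
        $cpanm, ( defined $wanted ? $wanted : 'an unknown perl' ),
        ( $same ? 'is' : 'is NOT' ), $this;
}
else {
    print "No cpanm found in PATH\n";
}
```

Run it once after each perlbrew switch; each perl should report its own cpanm.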

7. Configure local cpans

Strangely, I've found a few modules I try to install tend to expect a working CPAN install, regardless of what tool I'm actually using. This should be fixed, but there is a practical work-around until then. Simply configure cpan!
$ perlbrew switch perl-5.12.2
$ cpan
# Answer all setup instructions
» o conf commit
» q
$ perlbrew switch perl-5.13.4
$ cpan
# Answer all setup instructions
» o conf commit
» q

8. Test your installs

This is a list of things I've found to trip up various corner cases and indicate you've built it wrong.
$ perlbrew switch perl-5.12.2
$ cpanm --interactive -v App::cpanoutdated
$ cpan-outdated
$ cpanm --interactive -v App::CPAN::Fresh

$ perlbrew switch perl-5.13.4
$ cpanm --interactive -v App::cpanoutdated
$ cpan-outdated
$ cpanm --interactive -v App::CPAN::Fresh
With all things going to plan, those 2 things at least should build and be runnable. cpan-outdated and cpanf should both be runnable in both perls without complaining that they can't find their modules, and CPAN::Inject and Compress::BZip2 should install without strange failures. ( Those 2 modules led me in prior cases to discover broken setups that needed fixing to work, so hopefully following the instructions above will avoid this havoc. )

9. Profit!

That's all there is to it. Note we do NOT use local::lib for this setup. Using each Perl's default local module installation directory should be perfectly satisfactory, and as long as you're in a properly configured ENV and you're using 'perlbrew' to select perls that are not system perl, everything should be sweet =).
Ok, lots of things on my machine fail to build still, but those peskynesses I'm convinced are unrelated to the Perl setup.

10. Credit

Props to the people who helped me out with working out this configuration ( brian d foy, miyagawa, John Napiorkowski ) and to the authors of cpanm ( miyagawa ) and perlbrew ( gugod ). These are awesome tools, and once you learn them, they really can make working with Perl a much more pleasurable experience! And also props and ♥ to the Perl Community for simply existing, and fostering this development path.

2010-09-15

I ♥ the Perl Community

miyagawa++,jjnapiork++,bdfoy++.

Perl is awesome, but the community is better, with nothing even in competition as far as I know.

Where else can you blog about a confusing corner case you hit in a seemingly rare operating system and get excellent answers from not only great people, but the author of the module the problem was in, one of the people who wrote or contributed a lot of the other useful tools you use, and the author of many recognized Perl books?

And then, not only did I get the right solution for my problem, but many other alternative good approaches, as well as answering parts of the question I didn't even ask with side tips that seem "related enough" that I'd likely encounter in similar ways, and how to make my life easier when that happens.

I ♥ this positive approach to programming, where people are not only caring about solving my specific problem, but suggesting things that can help me become a better programmer as a whole, and I'm frankly proud simply to be involved with a community which has such a valuable work-ethic.

Frankly, it's a shame it's so hard to sell Perl on the community aspect, because it is just awesome in ways I've never seen before in a Programming Language, and it by far trumps technical aspects in terms of awesomeness. If brainf**k had a community as awesome as Perl has, it would probably be better than many languages simply because of the community aspect, at least in my opinion. It's just a shame you can't convey how great it is to have such a community present to newbies to the language without first immersing them in the culture and community, because to understand and appreciate it, I think you must first experience it.

OpenBSD + Perl + Modern Tools and Approaches -> Me = Confused :(

So, I'm doing my first attempt at a hand-holding free install of Perl. I'm used to the niceties of Gentoo and being able to do everything through its package manager, so I thought I'd try doing it the way everyone else in the world apparently uses as "The most practical".

I'm going to walk you through what I did, mostly constructed from memory, so you have an idea of what the problem I have is, or, if you're in a similar situation, you can get some progress and learn from my mistakes once I've worked out what I need.

Normally, I'd ask about this on #perl@irc.freenode.org or something on irc.perl.org, or if appropriate, file a bug. However, in this case, I can't even conceive of which would be the right place to target my question. OpenBSD is in my estimation a very niche market at the moment, as are lots of the modern tools for Perl, and I don't know where the appropriate place to solicit help for them is. So, I approach the ALL MIGHTY LAZY-WEB.

The Setup

  1. Installed OpenBSD 4.7
    This shall be left as an exercise to the reader as to how this works.  It's too much to cover here, and it really is pretty straightforward =).
  2. Install cpanm
    Everyone I see in Perl these days seems to be ranting about this, so I used the prescribed instructions:
    $ curl -L http://cpanmin.us | perl - --sudo App::cpanminus
  3. I don't want to be stuck using Perl 5.10.1, which is great and all, but I'd rather be doing work with 5.12.2 and 5.13.* . And I keep getting recommendations NOT to use system Perl for ANYTHING other than getting your custom Perl running. ( Using system Perl is fine in Gentoo, at least how I use it, we've got 5.12.2 in tree now, and stuffing Perl dists into Package Management JustWorks™ ). The new sex for this is allegedly perlbrew, so I'm firing that baby up next.
    $ cpanm --sudo App::perlbrew
  4. All appears good! Now from here on is where I think a few things start to drift south, but I'm not entirely sure WHERE.
    $ perlbrew init
    # add instructed line to bash
    $ bash
    $ perlbrew install perl-5.13.4 -v
    $ perlbrew install perl-5.12.2 -v
  5. All this appears to run smoothly.
    $ perlbrew switch perl-5.13.4
  6. Here is where I do the stupid things that possibly lead to my downfall. First, you must understand how I want my setup:
    1. I want my primary development user (kent) to have 2 copies of Perl available, 5.13.4 and 5.12.2
    2. I want the modules for each install of Perl to follow their respective installs so I can just switch between Perls and have the modules switch over too
    3. "Production" will repeat this process, except with fewer versions of Perl, and probably with fewer modules installed.

    To achieve this, I insert lines in my .bashrc until it resembles this
    source /home/kent/perl5/perlbrew/etc/bashrc
    export PERLDIR=/home/kent/perl5/perlbrew/perls/current
    export MODULEBUILDRC=/home/kent/perl5/perlbrew/etc/.modulebuildrc
    export PERL_MM_OPT="INSTALL_BASE=${PERLDIR}"
    export PERL5LIB="${PERLDIR}:${PERLDIR}/i386-openbsd"
    export PERL_CPANM_OPT="--local-lib=${PERLDIR}"
    
    and .modulebuildrc of course contains this:
    install  --install_base  /home/kent/perl5/perlbrew/perls/current/
    
  7. For the most part this works perfectly, and I'm off installing modules happy as Larry.
  8. And then a few hours later, something depends on IO::Compress::BZip2. Now is the beginning of sorrows.

The Problem:

Can't find libbz2!

I'm sure as eggs I have bzip2 and family installed and working.
However, this worrisome notice appears during build:
 Entering Compress-Bzip2-2.09
Configuring Compress-Bzip2-2.09 ... Running Makefile.PL
Parsing config.in...
/usr/bin/ld: cannot find -lbz2
collect2: ld returned 1 exit status
compile command 'cc -fno-strict-aliasing -fno-delete-null-pointer-checks -pipe -fstack-protector -I/usr/local/include -Wl,-E  -fstack-protector -o show_bzversion show_bzversion.c -lbz2' failed
system bzip2 not found, building internal libbz2
Ah .... ok.
$ bzip2 -h 2>&1 | head -n 1
bzip2, a block-sorting file compressor.  Version 1.0.5, 10-Dec-2007.
$ /usr/bin/ldd $(which bzip2)
/usr/local/bin/bzip2:
        Start    End      Type Open Ref GrpRef Name
        1c000000 3c006000 exe  1    0   0      /usr/local/bin/bzip2
        065b5000 265b9000 rlib 0    1   0      /usr/local/lib/libbz2.so.10.4
        07295000 272ce000 rlib 0    1   0      /usr/lib/libc.so.53.1
        0643c000 0643c000 rtld 0    1   0      /usr/libexec/ld.so
Ok, so maybe it is a bit geriatric
That should be fine though right? WRONG

Something magical keeps finding Perl 5.10.1 :(

Surely, this abomination will not end well:
Building and testing Compress-Bzip2-2.09 for Compress::Bzip2 ... cp lib/Compress/Bzip2.pm blib/lib/Compress/Bzip2.pm
AutoSplitting blib/lib/Compress/Bzip2.pm (blib/lib/auto/Compress/Bzip2)
cd bzlib-src && make 
cc -c    -fno-strict-aliasing -fno-delete-null-pointer-checks -pipe -fstack-protector -I/usr/local/include -O2     -DVERSION=\"\"  -DXS_VERSION=\"\" -DPIC -fPIC "-I/usr/libdata/perl5/i386-openbsd/5.10.1/CORE"   blocksort.c
cc -c    -fno-strict-aliasing -fno-delete-null-pointer-checks -pipe -fstack-protector -I/usr/local/include -O2     -DVERSION=\"\"  -DXS_VERSION=\"\" -DPIC -fPIC "-I/usr/libdata/perl5/i386-openbsd/5.10.1/CORE"   huffman.c
cc -c    -fno-strict-aliasing -fno-delete-null-pointer-checks -pipe -fstack-protector -I/usr/local/include -O2     -DVERSION=\"\"  -DXS_VERSION=\"\" -DPIC -fPIC "-I/usr/libdata/perl5/i386-openbsd/5.10.1/CORE"   crctable.c
cc -c    -fno-strict-aliasing -fno-delete-null-pointer-checks -pipe -fstack-protector -I/usr/local/include -O2     -DVERSION=\"\"  -DXS_VERSION=\"\" -DPIC -fPIC "-I/usr/libdata/perl5/i386-openbsd/5.10.1/CORE"   randtable.c
cc -c    -fno-strict-aliasing -fno-delete-null-pointer-checks -pipe -fstack-protector -I/usr/local/include -O2     -DVERSION=\"\"  -DXS_VERSION=\"\" -DPIC -fPIC "-I/usr/libdata/perl5/i386-openbsd/5.10.1/CORE"   compress.c
cc -c    -fno-strict-aliasing -fno-delete-null-pointer-checks -pipe -fstack-protector -I/usr/local/include -O2     -DVERSION=\"\"  -DXS_VERSION=\"\" -DPIC -fPIC "-I/usr/libdata/perl5/i386-openbsd/5.10.1/CORE"   decompress.c
cc -c    -fno-strict-aliasing -fno-delete-null-pointer-checks -pipe -fstack-protector -I/usr/local/include -O2     -DVERSION=\"\"  -DXS_VERSION=\"\" -DPIC -fPIC "-I/usr/libdata/perl5/i386-openbsd/5.10.1/CORE"   bzlib.c
ar cr libbz2.a  && ranlib libbz2.a
cc -c    -fno-strict-aliasing -fno-delete-null-pointer-checks -pipe -fstack-protector -I/usr/local/include -O2     -DVERSION=\"\"  -DXS_VERSION=\"\" -DPIC -fPIC "-I/usr/libdata/perl5/i386-openbsd/5.10.1/CORE"   bzip2.c
/usr/bin/perl /usr/libdata/perl5/ExtUtils/xsubpp  -typemap /usr/libdata/perl5/ExtUtils/typemap -typemap typemap  Bzip2.xs > Bzip2.xsc && mv Bzip2.xsc Bzip2.c
cc -c  -Ibzlib-src  -fno-strict-aliasing -fno-delete-null-pointer-checks -pipe -fstack-protector -I/usr/local/include -O2     -DVERSION=\"2.09\"  -DXS_VERSION=\"2.09\" -DPIC -fPIC "-I/usr/libdata/perl5/i386-openbsd/5.10.1/CORE"   Bzip2.c
In file included from Bzip2.xs:7:
ppport.h:231:1: warning: "PERL_UNUSED_DECL" redefined
In file included from Bzip2.xs:4:
/usr/libdata/perl5/i386-openbsd/5.10.1/CORE/perl.h:330:1: warning: this is the location of the previous definition
Running Mkbootstrap for Compress::Bzip2 ()
Um. Um. Um.
How about NO
$ perl -v  | grep version 
This is perl 5, version 13, subversion 4 (v5.13.4) built for OpenBSD.i386-openbsd
That's going to go down like a houseboat on fire.
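A quick way to see which headers and tools a given perl will hand to a build is to ask that perl's own Config. This is only a diagnostic sketch using the core Config module; it does not explain why the build above picked up /usr/bin/perl, it just shows what the perl at the front of PATH believes about itself, so the two can be compared. The core_include_dir helper is a name made up for this example.

```perl
use strict;
use warnings;
use Config;
use File::Spec;

# Directory holding perl.h and friends for the perl running this script.
sub core_include_dir {
    return File::Spec->catdir( $Config{archlibexp}, 'CORE' );
}

print "running perl : $^X\n";
print "version      : $Config{version}\n";
print "CORE headers : ", core_include_dir(), "\n";
print "xsubpp from  : ",
    File::Spec->catfile( $Config{privlibexp}, 'ExtUtils', 'xsubpp' ), "\n";
```

If the paths printed here live under perlbrew but the compile lines still mention /usr/libdata/perl5, then whatever generated the Makefile was not run by this perl.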

What comes next is only natural

t/010-useability.t ...... 1/3 /usr/bin/perl:/home/kent/.cpanm/work/1284524774.31144/Compress-Bzip2-2.09/blib/arch/auto/Compress/Bzip2/Bzip2.so: undefined symbol 'BZ2_bzDecompressInit'
/usr/bin/perl:/home/kent/.cpanm/work/1284524774.31144/Compress-Bzip2-2.09/blib/arch/auto/Compress/Bzip2/Bzip2.so: undefined symbol 'BZ2_bzDecompress'
/usr/bin/perl:/home/kent/.cpanm/work/1284524774.31144/Compress-Bzip2-2.09/blib/arch/auto/Compress/Bzip2/Bzip2.so: undefined symbol 'BZ2_bzBuffToBuffDecompress'
/usr/bin/perl:/home/kent/.cpanm/work/1284524774.31144/Compress-Bzip2-2.09/blib/arch/auto/Compress/Bzip2/Bzip2.so: undefined symbol 'BZ2_bzDecompressEnd'
/usr/bin/perl:/home/kent/.cpanm/work/1284524774.31144/Compress-Bzip2-2.09/blib/arch/auto/Compress/Bzip2/Bzip2.so: undefined symbol 'BZ2_bzCompress'
/usr/bin/perl:/home/kent/.cpanm/work/1284524774.31144/Compress-Bzip2-2.09/blib/arch/auto/Compress/Bzip2/Bzip2.so: undefined symbol 'BZ2_bzBuffToBuffCompress'
/usr/bin/perl:/home/kent/.cpanm/work/1284524774.31144/Compress-Bzip2-2.09/blib/arch/auto/Compress/Bzip2/Bzip2.so: undefined symbol 'BZ2_bzlibVersion'
/usr/bin/perl:/home/kent/.cpanm/work/1284524774.31144/Compress-Bzip2-2.09/blib/arch/auto/Compress/Bzip2/Bzip2.so: undefined symbol 'BZ2_bzCompressInit'
/usr/bin/perl:/home/kent/.cpanm/work/1284524774.31144/Compress-Bzip2-2.09/blib/arch/auto/Compress/Bzip2/Bzip2.so: undefined symbol 'BZ2_bzCompressEnd'
And more and more of that explosion until you see:
Files=25, Tests=33,  7 wallclock secs ( 0.35 usr  0.21 sys +  4.74 cusr  1.44 csys =  6.74 CPU)
Result: FAIL
Failed 25/25 test programs. 30/33 subtests failed.
Oh crap. That's not good.
Something Seriously wrong is going on here, but hell knows what it is, and I'm the least qualified to work it out.

Call For Halp

If you know what I've done wrong, and how to correct this fatal flaw, please, point me straight. I can only reward you with Karma Cookies and a blog of response and update.
I acknowledge that CPANTS lists many many passes for this module, so it must be I who is at fault, right?

perl -V

Summary of my perl5 (revision 5 version 13 subversion 4) configuration:
   
  Platform:
    osname=openbsd, osvers=4.7, archname=OpenBSD.i386-openbsd
    uname='openbsd stridor.lan 4.7 generic#558 i386 '
    config_args='-de -Dprefix=/home/kent/perl5/perlbrew/perls/perl-5.13.4 -Dusedevel'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=y, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='3.3.5 (propolice)', gccosandvers='openbsd4.7'
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags ='-Wl,-E  -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib
    libs=-lm -lutil -lc
    perllibs=-lm -lutil -lc
    libc=/usr/lib/libc.so.53.1, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
    cccdlflags='-DPIC -fPIC ', lddlflags='-shared -fPIC  -L/usr/local/lib -fstack-protector'


Characteristics of this binary (from libperl): 
  Compile-time options: MYMALLOC PERL_DONT_CREATE_GVSV PERL_MALLOC_WRAP
                        PERL_USE_DEVEL USE_LARGE_FILES USE_PERLIO
                        USE_PERL_ATOF
  Built under openbsd
  Compiled at Sep 14 2010 11:31:21
  %ENV:
    PERL5LIB="/home/kent/perl5/perlbrew/perls/current:/home/kent/perl5/perlbrew/perls/current/i386-openbsd"
    PERLDIR="/home/kent/perl5/perlbrew/perls/current"
    PERL_CPANM_OPT="--local-lib=/home/kent/perl5/perlbrew/perls/current"
    PERL_MM_OPT="INSTALL_BASE=/home/kent/perl5/perlbrew/perls/current"
  @INC:
    /home/kent/perl5/perlbrew/perls/current
    /home/kent/perl5/perlbrew/perls/current/i386-openbsd
    /home/kent/perl5/perlbrew/perls/perl-5.13.4/lib/site_perl/5.13.4/OpenBSD.i386-openbsd
    /home/kent/perl5/perlbrew/perls/perl-5.13.4/lib/site_perl/5.13.4
    /home/kent/perl5/perlbrew/perls/perl-5.13.4/lib/5.13.4/OpenBSD.i386-openbsd
    /home/kent/perl5/perlbrew/perls/perl-5.13.4/lib/5.13.4
    .

2010-08-06

Why Am I not using Perl 6 Yet?

I'm not here to deride it. I think it's pretty, the syntax is nice, and it lacks some of the annoyances I currently have with Perl 5. It's got great features, and I wholeheartedly want them to keep on trucking with that project.

My problem is not a petty squabble over things like Hurr, not perl5 enough or Derp, uses too much rams!, or Its too slow! or qualms about its completedness or its buggyness.

To a pragmatic person, none of those things really matter that much, you have to be doing really heavy work for speed and ram to be a problem on a Modern machine, and for a lot of things, I could not care less if startup time was a WHOLE SECOND LONGER. Hell, the total amount of time spent bitching about load time and speed now, in the real world, is likely to exceed the net total amount of time spent actually waiting for Perl6 to start. And the volumes of text and debate on this issue is almost certain to be a much larger waste of memory ( considering how much a single bit of information is replicated everywhere, and how it has to be replicated just to be *read*, and all the transport stuff that makes that possible ).

Back on the subject!

I think my biggest reason for not using Perl6 yet is that I'm not using Perl6 yet. I guess this is somewhat circular reasoning, but the problem is that when I think "Oh, I have a task to achieve", my brain instantly starts forming it with regard to Perl5 and its idioms and methods.

Additionally, When I use Perl5, I'm not really spending a great deal of time messing around with its syntactical nuances. What I'm spending more time doing, is importing and using code and modules that already exists. I have a good mental understanding of all those great Perl modules from CPAN, and which ones I can JustUse to do whatever it is I want to be doing.

When I want to be doing something I don't already know how to do, the first thing I'm doing is hitting up CPAN to see if somebody's done it already in a way I need, or to see if there are a few aggregate parts I can scrape together to make what I want.

Also, most of my coding these days revolves around my various Perl5 modules, enhancing, maintaining, etc, and all this of course requires Perl5 to be employed. It's silly to consider depending on Perl6 to make a Perl5 module. And although I know I probably should be helping to reduce this problem by making Perl6 ports of my modules, it's a bit chicken-egg because many of my modules are extensions for other Perl5 modules.

So, essentially, going Perl6 would require me to basically throw out everything I know, and then resort to doing things myself? If this is not the case, I don't understand/see how else I'm expected to do something in Perl6.

There's lots of fun examples of people doing raw hacking in Perl6, but I don't see boatloads of people using modules, and I don't see boatloads of Perl6 modules on CPAN when I'm searching for things I need to do.

If there's a secret second c6pan somewhere I'm just not seeing that these magically awesome Perl6 modules are being served on instead, Somebody should post a link to somewhere I'm likely to stumble over it.

Because presently, the gut reaction is barely better than suggesting I move back to PHP, where I have to reinvent every wheel myself in the event my behaviour is not implemented by a core PHP feature.

And the idea of being stuck back in that mindset is less than inspiring to me.

What would it take me to switch?

In a nutshell:
  • A much more obvious path to adoption
    • Obvious path to learning core syntax
    • Obvious path to finding extensions/modules
  • A More Comprehensive Archive of Perl6 modules.
  • The things I currently use being available on Perl6 in similar ways to how they are now, so I can jump ship, start using those versions instead, and then start hacking on/improving those things with my own modules.

2010-07-30

Git Internals: An Executive Summary in 30 Lines of Perl, for smart newbies.

Update: Modified code a bit to handle the 'pack' specials. They're not so straight forward, will blog more on that later.

This blog post is not intended as a replacement for a real in-depth understanding of Git's command line interface, but it does aim to maximise the exposure of how it works internally, as really, its internal logic is astoundingly simple, and anyone with a good background in graph theory and databases will pretty much be able to quickly see the elegance in it. For more details, check out the excellent book, Pro Git, especially the internals chapter.

The code

Git's core essentials are almost nothing more than a bunch of deflated ( zlib ) text files. I'm going to assume you've got enough intelligence to RTFM and get a copy of something gitty and text based checked out. Perl Modules are good examples of this. I'm using my Dist::Zilla::PluginBundle::KENTNL::Lite tree.

git clone git://github.com/kentfredric/Dist-Zilla-PluginBundle-KENTNL-Lite.git /tmp/SomeDirName

I'm going to show you the core of git's system, which is just the "object" store.

cd /tmp/SomeDirName/.git
find objects/

Woot, there are all your files and stuff in git. How does it work? That's where the perl script comes in.

#!/usr/bin/perl
use strict;
use warnings;

use Compress::Zlib;
use Carp qw( croak );

sub inflate_file {
    my ( $filename , $OFH ) = @_;
    # In list context inflateInit returns ( $stream, $status ); $stream is undef on failure.
    my ( $inflator, $status ) = Compress::Zlib::inflateInit();
    croak("Cannot create inflator: $status") unless defined $inflator;
    my $input = '';
    open my $fh, '<', $filename or croak("Can't open $filename: $!");
    binmode $fh;
    binmode $OFH;

    my ( $output );
    # Feed the file through the inflator in 4096 byte chunks.
    while ( read( $fh, $input, 4096 )) {
        ( $output , $status ) = $inflator->inflate( \$input );
        print { $OFH } $output if $status == Compress::Zlib::Z_OK or $status == Compress::Zlib::Z_STREAM_END;
        last if $status != Compress::Zlib::Z_OK;
    }
    # Anything other than a clean end of stream means the object was not valid zlib data.
    croak( "Inflation of $filename failed: $status" ) unless $status == Compress::Zlib::Z_STREAM_END;
}

for ( @ARGV ) {
    next if $_ =~ /\.(idx|pack)|packs/;
    print qq{<--------BEGIN $_ --------->\n};
    inflate_file( $_ , *STDOUT );
    print qq{<--------END $_ --------->\n};

}

Pretend you cargo-cult dump that code to /tmp/deflate.pl

Now check this out:

perl /tmp/deflate.pl $( find objects/ -type f ) | cat -v | less

Awesome, you're now seeing the guts of how your repository works. For real. All we did was inflate ( that is, un-deflate ) each and every object. You'll see 3 types of object ( each object says at the front what type it is, before the ^@ ): trees, blobs, and commits ( with trees being the most complicated of all ).

Blobs: they're just a file's contents.

Commits: all they are is a blob of text, with commit messages and stuff, timestamps, etc, and with text references ( pretend it's like an a-href in a web page or something ) to preceding ( parent ) commits, and a commit tree.

Trees are probably the hardest to work out just by looking at them. A tree is more or less just another text file, with another list of text references, except the references are pointing at either blobs, or other trees. So, you can pretend a "tree" is like a "dir" in some ways. There's data besides this, like file/dir names, and permissions, but that's the gist of it.
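To make the layout a little more concrete, here is a small sketch that picks apart one already-inflated object. It assumes the loose-object layout as I understand it: a "type size" header, a NUL byte ( the ^@ you see in cat -v ), then the body, and for trees a run of "mode name NUL" followed by 20 raw bytes of reference. The function names and the demo data are made up for this example.

```perl
use strict;
use warnings;

# Split an inflated object into ( type, size, body ).
sub parse_object {
    my ($raw) = @_;
    my $nul = index $raw, "\0";
    die "No NUL-terminated header found\n" if $nul < 0;
    my $header = substr $raw, 0, $nul;
    my $body   = substr $raw, $nul + 1;
    my ( $type, $size ) = $header =~ /\A([a-z]+) (\d+)\z/
        or die "Unrecognised header '$header'\n";
    die "Header says $size bytes, body is " . length($body) . "\n"
        unless length($body) == $size;
    return ( $type, $size, $body );
}

# Split a tree body into entries: "mode name\0" followed by 20 raw bytes.
sub parse_tree {
    my ($body) = @_;
    my @entries;
    while ( length $body ) {
        my $nul = index $body, "\0";
        die "Malformed tree entry\n"
            if $nul < 0 or length($body) < $nul + 21;
        my ( $mode, $name ) = split / /, substr( $body, 0, $nul ), 2;
        my $sha = unpack 'H40', substr( $body, $nul + 1, 20 );
        push @entries, { mode => $mode, name => $name, sha => $sha };
        $body = substr $body, $nul + 21;
    }
    return @entries;
}

# Hand-built demo tree with one entry; the reference is just example data.
my $sha  = 'ce013625030ba8dba906f756967f9e9ca394464a';
my $body = "100644 README\0" . pack( 'H40', $sha );
my $raw  = 'tree ' . length($body) . "\0" . $body;

my ( $type, $size, $content ) = parse_object($raw);
print "$type object, $size bytes\n";
printf "%s %s %s\n", @{$_}{qw( mode sha name )} for parse_tree($content);
```

The raw 20 bytes are why trees look like line noise in less, while commits and most blobs read as plain text.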

This has been your executive summary =)

2010-07-19

Current Limitations In Exception Driven Perl: Stringy Core Exceptions

Let's just assume for one moment that we have a proper Exception Hierarchy, and that this wasn't a huge gaping hole in the current Exception landscape.

There's still the other problem of so much Perl code being not designed in Exception friendly ways.

die "$string" and croak "$string" is about as detailed as you get from most things.

And I'm sure everyone agrees that only passes for the bare minimum of exception handling techniques. No benefits of runtime stack introspection ( Edit: Ok, not without mangling sigdie, yuck ), re-throwing exceptions without losing the source failure point ( Edit: to clarify, not all 'die' calls are represented in the error ), let alone problem classification without resorting to regexing the failure string. ( And that's far from reliable, considering those strings are targeted at humans, not machines, so are prone to being modified at a later time in a way your regex won't recognise, breaking your code. )

autodie is a good start to solving this problem, but it doesn't have all the bells and whistles I'd hoped for. It has an error hierarchy, but it doesn't appear very flexible or extensible to other projects ( the whole thing is defined in a 'my' variable in Fatal.pm it seems ). Additionally, it doesn't supplement any of the things in Perl that already just die by throwing their own stringy exception, because as far as autodie appears concerned, if it's already throwing an exception, why replace it?

One builtin with this type of problem is require.

There are at least 3 unique separate failure conditions that I know 'require' can spit out.

  • File not Found in @INC
  • require returned false value
  • compilation failed in require

All of the above being reported merely as strings leaves much to be desired. Sure, it's great when things fail in obvious ways, but handling it in code is far too pesky.
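To show the kind of thing I mean, here is a minimal sketch of a require wrapper that throws an object carrying the failure type. Everything named here ( My::RequireError, require_module, the kind values ) is hypothetical, and the wrapper still has to regex Perl's native message internally. The only win is that the string matching lives in one place instead of in every caller.

```perl
use strict;
use warnings;

# Hypothetical exception class carrying failure metadata.
package My::RequireError;

sub new     { my ( $class, %args ) = @_; return bless {%args}, $class }
sub throw   { my $class = shift; die $class->new(@_) }
sub kind    { return $_[0]{kind} }
sub module  { return $_[0]{module} }
sub message { return $_[0]{message} }

package main;

# Hypothetical wrapper: require a module by name, throwing an object on failure.
sub require_module {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;
    return $module if eval { require $file; 1 };

    my $error = $@;

    # Classify using the three failure conditions listed above.
    my $kind
        = $error =~ /\ACan't locate \Q$file\E in \@INC/ ? 'not_found'
        : $error =~ /did not return a true value/      ? 'false_value'
        :                                                'compile_failed';

    My::RequireError->throw(
        kind    => $kind,
        module  => $module,
        message => $error,
    );
}

my $ok = eval { require_module('No::Such::Module::Hopefully'); 1 };
unless ($ok) {
    my $e = $@;
    printf "%s failed to load, kind: %s\n", $e->module, $e->kind;
}
```

A caller can then branch on $e->kind and never look at the human-readable message at all.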

Not everyone will have experienced this problem of course, but let me demonstrate a scenario.

sub findFirst { 
  my $plugin = shift;
  my $parent = "SomeApp";
  my @guessOrder = ( $plugin . "::" . $parent , $plugin );
  my @fails;
  for( @guessOrder ){
     local $@;
     eval "require $_; 1";
     if ( $@ ) {
        die $@ if $@ !~ /\ACan't locate / ; # anything but Perl's "not found in @INC" message is a real failure
        push @fails, $@;
     } else { 
        return $_ ; 
     }
  }
  die "Couldn't load any of @guessOrder : @fails ";
}

my $plug = findFirst("Foo::Bar");

This is about as semantically clean as I can get it. The goal here is to permit the "Not Found" family of require failures, but upon encountering something that exists but is merely broken, to push that failure up to userland, and, in the event none are found, to dump all the errors out showing all the attempted paths that were searched and what was searched for.

But there are several problems with this code. The most obvious is that stringy eval is a really bad idea. I had hoped that at least one of the workarounds for this silliness on CPAN threw an Exception object instead of a string.... but no, all I can find are ones that rely on the stock Perl system, and ones that go contrary to all logic and require you to check a return value for failure.
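For what it's worth, the stringy eval itself can be avoided with nothing but core Perl, by turning the module name into a file name and handing that to require inside a block eval. This is only a sketch, try_require is a made-up helper name, and it does nothing about the error still being a string.

```perl
use strict;
use warnings;

# Hypothetical helper: load a module by name without string eval.
# Returns '' on success, or the (still stringy) error on failure.
sub try_require {
    my ($module) = @_;
    die "Not a module name: $module\n"
        unless $module =~ /\A[A-Za-z_]\w*(?:::\w+)*\z/;

    # Foo::Bar becomes Foo/Bar.pm, which is what require looks up in @INC.
    ( my $file = "$module.pm" ) =~ s{::}{/}g;

    my $error = '';
    {
        local $@;
        $error = $@ || 'unknown error' unless eval { require $file; 1 };
    }
    return $error;
}

print try_require('File::Spec') eq '' ? "loaded\n" : "failed\n";
print try_require('No::Such::Module::Here'), "\n";
```

The name check up front is there because the module name no longer passes through the parser, so anything odd has to be rejected by hand.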

Another problem is the check for a string in the error. This is not as big a problem, but somebody malicious I guess could break something by explicitly crafting a death message that matched that line.

Another lovely problem is that death-rethrowing thing. Finding everywhere that the problem occurred in a non-insane way is hard. Ideally, not only should you have a trace depth from top level down to the point of the failure, but also a trace of everywhere the error was re-thrown, because the failure is really a domino effect, and not being able to see how it propagates without dropping into a debugger is hell. You tend to need more complex cases to see why this is happening though.

#!/usr/bin/perl

use strict;
use warnings;

sub fail {
  die "Hurp Durp!";
}

sub maybfail {
  unless ( eval { fail; 1; } ) {
    die "maybfail: $@";
  }
}

sub moarfail {
  unless ( eval { maybfail; 1; } ) {
    die "Moarfail: $@";
  }
}

moarfail;
What I'd like to be able to see is that:
  • The root error occurred at main:22 { moarfail:17 { maybfail:11 { fail:7 { die } } } }
  • The error was rethrown at main:22 { moarfail:17 { maybfail:12 } }
  • The error was rethrown at main:22 { moarfail:18 }
At present, here's the best I can get out of that simple structure:
$ perl -MCarp::Always /tmp/die.pl 
Moarfail: maybfail: Hurp Durp! at /tmp/die.pl line 18
 main::moarfail() called at /tmp/die.pl line 22
$ perl /tmp/die.pl 
Moarfail: maybfail: Hurp Durp! at /tmp/die.pl line 7.
$ perl -MDevel::SimpleTrace /tmp/die.pl 
Moarfail: maybfail: Hurp Durp!
 at main::fail(/tmp/die.pl:7)
 at (/tmp/die.pl:11)
 at main::maybfail(/tmp/die.pl:11)
 at (/tmp/die.pl:17)
 at main::moarfail(/tmp/die.pl:17)
 at main::(/tmp/die.pl:22)

Note how none of those traces reflect the fact I call "die" on line 12? Be glad the die isn't like 30 lines away in a different method where it might go completely unnoticed.

In fact, each and every one of these backtraces confuses me, because I can't work out why some know about the failure origin, and others don't ... ( Carp::Always seems to let you down by being completely unable to see a stack. :/ )

I would, in fact, much prefer something like this that actually worked:

#!/usr/bin/perl

use strict;
use warnings;

sub fail {
    BasicException->throw( error => 'HurpDurp' );
}

sub maybfail {
  try { 
      fail;
  } catch ( BasicException $e ) { 
     MoreComplexException->adopt( $e )->throw( error => 'Maybfail');
  }
}

sub moarfail {
  try { 
      maybfail;
  } catch ( MoreComplexException $e ) { 
     EvenMoreComplexException->adopt( $e )->throw( error => 'Moarfail');
 }
}

moarfail;

Nothing I've seen handles that "adopt" thing, but it's my little way of saying "We are in fact creating a new exception, because we want to provide more information about the problem, and increase the meaning of the problem relative to this context, but we also want to recognise that this problem is likely caused by another problem ( or problems ) that we identify here."
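As a rough idea of what "adopt" could look like, here is a hand-rolled sketch. The class names are taken from the wish-list code above, but the implementation is entirely hypothetical: it is not the API of any CPAN module, and it uses plain block eval in place of the try/catch syntax. Each exception remembers the line it was thrown from and the exception it adopted, so the whole rethrow chain can be walked afterwards.

```perl
use strict;
use warnings;

# Hypothetical classes; none of this is the API of an existing module.
package BasicException;
use Scalar::Util qw( blessed );

# Wrap an earlier failure so it travels along with the new exception.
sub adopt {
    my ( $class, $cause ) = @_;
    return bless { cause => $cause }, $class;
}

# Works on a class ( fresh exception ) or on the result of adopt().
sub throw {
    my ( $proto, %args ) = @_;
    my $self = ref($proto) ? $proto : bless( {}, $proto );
    my ( undef, $file, $line ) = caller;    # where throw() was called
    %{$self} = ( %{$self}, %args, file => $file, line => $line );
    die $self;
}

sub error { return $_[0]{error} }
sub line  { return $_[0]{line} }
sub cause { return $_[0]{cause} }

# This exception first, then whatever it adopted, down to the root.
sub chain {
    my ($self) = @_;
    my @chain;
    my $e = $self;
    while ( blessed($e) && $e->isa('BasicException') ) {
        push @chain, $e;
        $e = $e->cause;
    }
    return @chain;
}

{ package MoreComplexException;     our @ISA = ('BasicException'); }
{ package EvenMoreComplexException; our @ISA = ('BasicException'); }

package main;

sub fail { BasicException->throw( error => 'HurpDurp' ) }

sub maybfail {
    return 1 if eval { fail(); 1 };
    my $e = $@;
    MoreComplexException->adopt($e)->throw( error => 'Maybfail' );
}

sub moarfail {
    return 1 if eval { maybfail(); 1 };
    my $e = $@;
    EvenMoreComplexException->adopt($e)->throw( error => 'Moarfail' );
}

my @chain;
unless ( eval { moarfail(); 1 } ) {
    my $e = $@;
    @chain = $e->chain;
}
printf "%s: %s, thrown at line %d\n", ref($_), $_->error, $_->line
    for @chain;
```

This only records the throw site of each link, not a full stack per link, and it does nothing for the stringy exceptions Perl itself throws, which is the real complaint here.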

In case you TL;DR'd here, ( and because my train of thought was just snapped -_- ), the summary of this is: It's really challenging doing proper exception-oriented Perl when so many code features still throw those nasty stringy exceptions. :(

2010-07-18

Current Limitations In Exception Driven Perl: Exception Base Classes.

I've started re-attempting to do Exception Oriented Perl Programming recently, and quickly discovered a whole raft of things that got in my way.

This is the first of such things.

I was very much appreciative of Exception::Class. It looks to mostly Do The Right Thing, and it's mostly simple and straightforward. It has some apparent limitations of its own with regard to exception driven code, but I'll cover those later.

The biggest annoyance I have at present is there is no apparent de-facto base set of Exception classes to derive everything else from. I was expecting some sort of Exception Hierarchy much like Moose's Type Hierarchy, but none is to be found anywhere, and this stinks.

Is everyone to have their own base hierarchy for everything? The idea of every project shipping its own FileException class feels like Fail to me, and I feel this problem will need to be addressed before more people start taking exception driven Perl seriously.

Adding to this fun, at present all the exception classes share the same name-space as everything else in Perl, because they're just Perl packages. I accept this limitation is mostly Perl's fault, but I still dislike it. The 'Type' name-space suffers a similar problem, but it's not quite so bad.

The challenge here is having adequate classes to accurately represent all the classes of exception one wishes to provide, while keeping them sanely organised, and without people needing to type out 100-character incantations just to throw an exception.

Something akin to MooseX::Types, which injects subs into the context, would be nice-ish. The only problem there is when you do something stupid like create/import an exception with a name identical to a child namespace, i.e.:

   package Bar;
   use SomeTypePackage qw( Foo );
   use Bar::Foo; # Hurp durp. Bar::Foo->import() ==> Bar::Foo()->import() 
   Bar::Foo->new(); # moar hurp durp. Bar::Foo()->import() 

It's reasonably easy to work around, but discovering you've failed in this way is slightly less than obvious.
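Here is a self-contained sketch of the collision and the two usual workarounds, at least for the method-call half of the problem. Some::Type stands in for whatever the imported sub returns, and the hand-written sub Foo stands in for an imported one, so all the names are hypothetical.

```perl
use strict;
use warnings;

# All package names here are hypothetical stand-ins.
{
    package Some::Type;
    sub new { return bless {}, shift }
}
{
    package Bar::Foo;
    sub new { return bless {}, shift }
}

package Bar;

# Stand-in for a sub imported by a MooseX::Types-style exporter.
sub Foo { return 'Some::Type' }

my $surprise = Bar::Foo->new;      # parsed as Bar::Foo()->new
my $quoted   = 'Bar::Foo'->new;    # a quoted string is always a class name
my $colons   = Bar::Foo::->new;    # trailing :: forces a package name

print ref($_), "\n" for $surprise, $quoted, $colons;
```

The first call quietly builds an object of the wrong class, which is why this sort of mistake is less than obvious to discover.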

2010-06-27

Today's amusing Perl parser confusion

Have a look at this very simple code and see what you expect it will do:

#!/usr/bin/perl
use strict;
use warnings;


print "hello";

1

=pod

=cut
__END__

It looks trivial, right?

Not so.

$ perl /tmp/pl.pl 
Can't modify constant item in scalar assignment at /tmp/pl.pl line 13, at EOF
Bareword "cut" not allowed while "strict subs" in use at /tmp/pl.pl line 8.
Bareword "pod" not allowed while "strict subs" in use at /tmp/pl.pl line 8.
Execution of /tmp/pl.pl aborted due to compilation errors.

Wait.

Wut?

Running it through Deparse reveals the culprit:

$ perl -MO=Deparse /tmp/pl.pl 
Can't modify constant item in scalar assignment at /tmp/pl.pl line 13, at EOF
Bareword "cut" not allowed while "strict subs" in use at /tmp/pl.pl line 8.
Bareword "pod" not allowed while "strict subs" in use at /tmp/pl.pl line 8.
/tmp/pl.pl had compilation errors.
use warnings;
use strict 'refs';
print 'hello';
1 = 'pod' = 'cut';
__DATA__

Pesky indeed!

The solution? Insert the humble ; like your mother taught you to.

#!/usr/bin/perl
use strict;
use warnings;


print "hello";

1;

=pod

=cut
__END__
$ perl /tmp/pl.pl 
hello

Perhaps this is worthy of applying a bugfix. Perl version = 5.12.1 =).
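As far as I can tell from perlsyn, the parser only treats a line starting with "=" and a word as POD when it is expecting the start of a new statement, which would explain why the missing semicolon turns =pod into an assignment. A small sketch that checks both variants by compiling them from strings ( compile_error is a made-up helper, and string eval is used here on purpose ):

```perl
use strict;
use warnings;

# Hypothetical helper: returns '' if the source compiles and runs,
# otherwise the error text.
sub compile_error {
    my ($source) = @_;
    local $@;
    eval $source;
    my $error = $@;
    return $error;
}

my $pod = "\n\n=pod\n\n=cut\n";

my $without = compile_error( "1" . $pod );     # no semicolon before the POD
my $with    = compile_error( "1;" . $pod );    # statement properly ended

print "without semicolon: ", ( $without eq '' ? "ok\n" : $without );
print "with semicolon:    ", ( $with eq ''    ? "ok\n" : $with );
```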

2010-06-25

Any good advice on focusing on the one scope in this massively metarecursive language?

The recursivity of the meta-programming these days in Perl is astounding.

This is not necessarily a bad thing, but it has its drawbacks in various fields.

While I love authoring modules, and I love contributing to various projects, I often find this becomes a necessity when I would rather be focusing on the thing I actually need to get done.

An Example

Let me give you an example: one of my family members requested that I work on a website for them, for one of their businesses, and as a result I want to produce the best product I possibly can.

The first concern I encountered was shipping it. I need to be able to develop this website in a way that I can ship it somewhere ( target unknown ) and have a relatively quick, relatively hassle-free installation that Just Works, so in the event I have to hand the code over to somebody else to work with, or ship it to a different server where I may have less control over the environment or distribution it runs on, it will still mostly just work.

This led me to my state-of-packaging post, where I started wasting time trying to work out the best way to bundle, package and otherwise get the software to just work.

This need sort-of emerged out of the want to use the latest and greatest tools, such as Plack, the latest editions of Moose, etc.

However, as discovered in the aforementioned article, the state of Linux distributions with regard to Perl in the larger scale largely sucks, and pretty much the "best" option tends to result in "using CPAN".

CPAN is great and all, don't get me wrong, but compared to existing Linux distribution package management techniques, Perl dependency and file management leaves much to be desired. Sure, it's miles ahead of Ruby and Python ( not to mention evolutions of species better than PHP, Java and C/C++'s native package management ), but since when do we use the lesser tools as our measure of standard?

So anyhow, after musing for several days on this dilemma, researching various options, talking to various people, blogging about it, and not getting very far, I decide I'm just wasting my time again and I should just hack something up on my box, and worry about this package management crap later.

Distraction 2.0

So, I decide to get it working on my machine first, worry about everywhere else later, you know, when it matters. This is of course a potentially dangerous decision from a reliability standpoint, because you may discover whatever technique you decided to use on your system is completely non-viable on another.

On my machine, the first thing I do is go through my toolkit and update all the various packages I'll need using my distribution's package management tools. ( In my experience this surprisingly sucks less than it does on the other distributions I've tried. )

Then I discover a discrepancy in how another developer has mapped Perl dependencies to Package Manager dependencies, that is different to how I've been doing them, and I then have to work out if it's merely an error, or if it's intentional. ( The specifics of this I won't bore you with here ). As part of diagnosis, while I'm waiting for a response on IRC from the developer who wrote that mapping, I of course write a Perl script to work out where else this style of mapping is being used, in an attempt to gauge how often it is used.

This eventually diverges until I'm parsing individual build scripts with Perl and am trying to extract balanced bracket sets from these files with context. ( Bad me, I should have just used Text::Balanced )

Fortunately, I disregarded that script eventually, because I realised how much of the day I'd wasted on this problem already. Argh. Still no closer to even starting the actual code :|

Other times, when doing the update phase, I discover a package incompatibility with Perl, for whatever reason. A recent example is some bizarre failure with Eval::Context. This failure is being a bit hard to trace down, because the failure occurs, as far as I can make out, in Carp. The usual techniques such as -MCarp::Always or -MDevel::SimpleTrace do not want to work, as for some reason, their presence causes the wonderful Heisenbug scenario: the bug vanishes! ( well, and a new one appears in its place ). And to make matters worse ( much much worse ), when I run the build + test by hand instead of under the packager sandbox installation system, the bug also vanishes. Pesky indeed. ( I haven't filed a bug for the above yet, in case you're asking. There's simply no point filing one until I can reliably recreate the scenario in a sterile way. And as a general rule I've found with Perl, most of the time, if I figure out what the problem is, I figure out a solution at the same time. )

Let's assume for a moment I was able to actually work out what was going on. After dicking around for a few hours, I'd probably have found a patch that worked too, possibly submitted a bug report and patch to upstream, applied the workaround to the Perl overlay, and then I'd be able to get on my way to the next package.

Granted, at the moment, the number of failing packages I'm encountering is much much higher, as I'm helping test the Perl 5.12.1 release preceding its integration into the main tree, and I'm voluntarily fixing these things because somebody has to test this stuff before it hits Luser land.

More Recursion

It's not the case this time, I mean, not yet with this project ( mostly because it's yet to have any code! ), but I often find myself swimming deeper and deeper into the metaprogrammy sea.

In the beginning, it was just writing modules that made my life easier.

Then comes the fun of distributing those modules to make others' lives easier

Then comes the want to make distribution of modules easier

Then comes the awesome madness that is Dist::Zilla

Then comes you writing plugins for Dist::Zilla

Then you're writing plugin bundles for the above

Then you're working on Dist::Zilla itself ( Patches ! :D )

Then you're contributing code to other people's Dist::Zilla plugins

Then you're contributing code to fix various packages that other people's Dist::Zilla plugins use.

All this is great stuff, really, community++, but something in the back of my mind says "Hey, you're lost in the meta, you're so far removed from what you were actually trying to achieve you can no longer see the woods for the trees, in fact, you can't even see trees, all you're seeing is carbon atoms and you're trying to compute the spin on their electrons!"

My Problem Really

I think my problem is really I don't see a viable way of staying strictly a "high-level abstraction" consumer, and just using the abstractions that exist to achieve my goal, and I'm always drilling down into the guts of things, patching their core, getting all low-level into the implementation of things and forgetting my original goal for weeks.

The best I can come up with is "hey, perhaps you'll have to be anti-contributive a bit, and er, yuck, write code that is probably redundant somewhere in a way that's not really optimally reusable, because the long-term maintenance requirements of publicly shared code are a bit high"?

I think I just sicked up in my mouth at the idea of that :/

But I have to find some way to focus on the project level, food doesn't put itself on the table!

Some basic statistics on "Line Noise"

I was reading another blog about somebody intending to analyse what amount of Perl code constitutes "Line Noise", but they didn't appear to have Actually Done It.

I took a naïve approach and didn't make any assumptions about what "line noise" constitutes, and just did basic statistics on the prevalence of various characters for the sake of interest.

Partial Dump
  0.2 % :   511319 x char   64 : "\@"
  0.2 % :   564540 x char   55 : 7
  0.2 % :   593117 x char   79 : "O"
  0.2 % :   601710 x char   77 : "M"
  0.3 % :   675072 x char   92 : "\\"
  0.3 % :   684986 x char   68 : "D"
  0.3 % :   698665 x char   78 : "N"
  0.3 % :   709768 x char   76 : "L"
  0.3 % :   712074 x char   80 : "P"
  0.3 % :   763426 x char   56 : 8
  0.3 % :   784577 x char  107 : "k"
  0.3 % :   797560 x char   82 : "R"
  0.3 % :   833723 x char   54 : 6
  0.4 % :   912737 x char   52 : 4
  0.4 % :   920716 x char   93 : "]"
  0.4 % :   921001 x char   91 : "["
  0.4 % :   924075 x char   73 : "I"
  0.4 % :   947539 x char  118 : "v"
  0.4 % :   956653 x char   67 : "C"
  0.4 % :   996323 x char   65 : "A"
  0.4 % :  1000637 x char   83 : "S"
  0.5 % :  1125435 x char  119 : "w"
  0.5 % :  1151874 x char   46 : "."
  0.5 % :  1220735 x char   34 : "\""
  0.5 % :  1222341 x char    9 : "\t"
  0.5 % :  1222927 x char   51 : 3
  0.5 % :  1241600 x char   69 : "E"
  0.5 % :  1243448 x char   53 : 5
  0.5 % :  1332828 x char   84 : "T"
  0.6 % :  1443662 x char   57 : 9
  0.6 % :  1491434 x char  120 : "x"
  0.6 % :  1499376 x char  125 : "}"
  0.6 % :  1500792 x char  123 : "{"
  0.7 % :  1718028 x char  103 : "g"
  0.7 % :  1739054 x char   40 : "("
  0.7 % :  1739695 x char   41 : ")"
  0.7 % :  1792258 x char   59 : ";"
  0.7 % :  1825133 x char  121 : "y"
  0.8 % :  1837291 x char   98 : "b"
  0.8 % :  1842316 x char   35 : "#"
  0.8 % :  1960600 x char   50 : 2
  0.9 % :  2149806 x char   62 : ">"
  1.0 % :  2410416 x char   49 : 1
  1.1 % :  2594921 x char   61 : "="
  1.1 % :  2684166 x char   95 : "_"
  1.1 % :  2709633 x char  112 : "p"
  1.2 % :  2818643 x char   58 : ":"
  1.2 % :  2952175 x char  104 : "h"
  1.2 % :  2995621 x char   45 : "-"
  1.3 % :  3151943 x char  109 : "m"
  1.3 % :  3283418 x char   36 : "\$"
  1.3 % :  3291138 x char  102 : "f"
  1.4 % :  3339529 x char   39 : "'"
  1.4 % :  3355931 x char  117 : "u"
  1.5 % :  3638254 x char   99 : "c"
  1.6 % :  4016055 x char  100 : "d"
  1.9 % :  4598003 x char   44 : ","
  2.0 % :  4786703 x char  108 : "l"
  2.2 % :  5472272 x char   48 : 0
  2.6 % :  6279579 x char  110 : "n"
  2.6 % :  6306811 x char  111 : "o"
  2.7 % :  6625715 x char  105 : "i"
  2.8 % :  6872608 x char  114 : "r"
  3.0 % :  7315145 x char  115 : "s"
  3.1 % :  7522087 x char   97 : "a"
  3.6 % :  8711403 x char   10 : "\n"
  3.7 % :  8972142 x char  116 : "t"
  5.4 % : 13289205 x char  101 : "e"
 24.2 % : 59186425 x char   32 : " "

I find it quite intriguing how the various bracketings are unbalanced. Also the significantly greater use of ">" vs "<" indicates people write more than they read. Edit: probably more =>

Also, what is extremely amusing, is in this sort order, ignoring "r" "a" and "t" and all whitespace going down, a word is formed. That word.... is "noise". Weird.

For a full dump of my diagnostics, see my github gist

The code I used to generate these stats is pretty straightforward, and I would be interested in seeing what sort of results other people get, and possibly the result of adapting the code to work for C and other non-Perl languages to work out how much "line noise" they are.

#!/usr/bin/perl
use strict;
use warnings;

use 5.12.1;
use File::Find::Rule            ();
use File::Find::Rule::Perl      ();
use Data::Dumper                qw( Dumper );

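# Show which directories are about to be scanned.
say $_ for ( @INC );

# Every Perl file reachable from @INC.
my @pmfiles = File::Find::Rule->perl_file->in( @INC );

# Character => number of occurrences.
my %stats;

for my $file ( @pmfiles ){
    say "scanning $file";
    open my $fh, '<', $file or next;
    my $char;
    # Read one character at a time and tally it.
    while( read $fh, $char, 1 ){
        $stats{$char}++;
    }
#    last;
}

# [ count, char ] pairs, least frequent first.
my @data = sort { $a->[0] <=> $b->[0] } map { [ $stats{$_} , $_ ] } keys %stats;

# Make Dumper print each character as a compact, escaped literal.
$Data::Dumper::Terse = 1;
$Data::Dumper::Useqq = 1;

# Total number of characters seen, for the percentages.
my $numchars;
$numchars += $_ for values %stats;

# Dumper's output already ends in a newline, so the format has none.
for( @data ){
    printf "%5.1f %% : %8d x char %4d : %s" ,
       ( $_->[0] / $numchars * 100 ) , 
       $_->[0] , 
       ord( $_->[1] ),
       Dumper( $_->[1] );
}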
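For anyone wanting to try this on C or some other language, here is a rough sketch of one way to adapt it. It swaps File::Find::Rule for the core File::Find module and slurps each file instead of reading it character by character. The char_stats helper and the demo.c file are made up for the example, and it counts bytes, so multi-byte characters would be split up.

```perl
use strict;
use warnings;
use File::Find qw( find );
use File::Temp qw( tempdir );

# Hypothetical helper: tally every byte in the files under @dirs whose
# path matches $pattern, e.g. qr/\.[ch]\z/ for C sources and headers.
sub char_stats {
    my ( $pattern, @dirs ) = @_;
    my %stats;
    find(
        {
            no_chdir => 1,
            wanted   => sub {
                return unless -f $_ && $_ =~ $pattern;
                open my $fh, '<:raw', $_ or return;
                local $/ = undef;    # slurp the whole file
                my $content = <$fh>;
                return unless defined $content;
                $stats{$_}++ for split //, $content;
            },
        },
        @dirs
    );
    return \%stats;
}

# Tiny demonstration on a scratch directory holding one C file.
my $dir = tempdir( CLEANUP => 1 );
open my $out, '>', "$dir/demo.c" or die "Can't write demo.c: $!";
print {$out} "int main(){return 0;}\n";
close $out or die "Can't close demo.c: $!";

my $stats = char_stats( qr/\.c\z/, $dir );
printf "%3d x char %3d\n", $stats->{$_}, ord $_ for sort keys %{$stats};
```

Pointing it at a real source tree is just a matter of passing a different pattern and directory list; the sorting and percentage code from the script above would work unchanged on the resulting hash.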

2010-06-17

The Search for the Perfect Project Setup

I feel a bit like a retard today.

Perhaps, a spectacular one. I don't even know what to search for with regard to my problem as follows, and I guess I don't have the best idea of what I want, so I'm blogging about it in the hope I can linearise my thought process a bit and work out what to do, and perhaps, somebody can point me in the right direction.

NB. There's a fair bit of "TL;DR" content here, but it stands in case people try to suggest I use these solutions instead. It's primarily a demonstration of what I've tried, and the logic I used to reach my current conclusion, and thus, my actual request.

Firstly, My current situation

At the moment, I install all my modules, not via any of the CPAN clients, but through my distribution. This yields a much cleaner system, and dependency tracking is more reversible, which files were installed by which distribution is more reliable, and distribution collisions are explicitly barred.

This is moderately straightforward. In Gentoo, we have these ebuilds which automate most of the hard work, and the technical debt of building a CPAN module and installing it is pretty much 0. An ebuild is a single 30 line text file, most of which is boiler-plate ( and generated ), and it's essentially bash code, almost FreeBSD in nature.

I'm not a fan-boy for Gentoo for any of the traditional reasons people ascribe to it ( i.e. as funrolloops portrays ). I actually like how the package management works, I like having access to all the source, I like being able to break stuff and report reasonable bug reports to get actual bugs fixed, and I like being able to Just Fix It myself when I want to. I'm not going to go and rubbish anybody else for their distribution choices or why they choose them, just for me, Gentoo is the sweet spot in my taste system. ( I just expect people to return the favour and not treat me like the retard because I'm not using $THEIR_SYSTEM )

As a general rule, other distributions have given me various headaches for various reasons. I haven't tried Arch yet, so I can't write that off as unfit for my way of working, but from what I see it's mostly nice.

Perceived Obstacles: In walks Deb/Buntu

For various reasons, my way of working with Perl on Gentoo is not very friendly on some other distros. At present, I have a box running Ubuntu, which I initially set up to JustWork and be pretty simple for flatmates to use as an Internet terminal. It has since lost this role, and it's really too much effort for me to wipe it and install $OtherDistro from scratch on it. And fundamentally, needing to do that just to work in Perl on that distro in a satisfactory manner is either a failure in that distro ( Snarky comments about Ubuntu here ), a failure of Perl ( I hope not ), or a failure of myself ( Pretty likely ).

I've seen and tried using dh-make-perl and its behaviour is very unsatisfactory. Unfortunately, the most recent Perl I can get on Ubuntu is 5.10.1, and the most recent version of dh-make-perl I can get on Ubuntu is the geriatric 0.62, which is goodness knows how many versions behind Debian's equivalent.

dh-make-perl problems

  1. Non Recursive nature

    I can handle this, that's OK. I'm used to walking deps by myself on Gentoo where needed and satisfying them, and it's not challenging. But that said, on Gentoo the generated build scripts are just text files, essentially generated from a naive template, and this is *really fast*. The dh-make-perl script by comparison takes as long to generate and build the .deb file as it would take me to generate and edit the text file by hand!

    Additionally, at present I only generate my files by hand by choice. I only do it by hand to guarantee quality in the generation, so that I can redistribute it.

    I could just use Vincent Pit(VPIT)++'s marvellous CPANPLUS::Dist::Gentoo which for the most part JustWorks™. It does all the cool recursive traversal, generation of ebuilds where needed, and its hands free, and fast.

    I attempted to use CPANPLUS::Dist::Deb, and that kinda just failed, which I'll go into later.

  2. On half the things I've tried to build with it so far, it's failed

    Again, possibly I'm a retard, or possibly Ubuntu is failing again, but it keeps dying with weird problems trying to find dependencies, or computing dependencies, and sometimes even can't detect things that have been built earlier and installed. ( For the record, I've been banging my head against the wall trying to get Plack to build )

    Sure, due to the nature of Perl stuff it's a bit hellish to extract dependencies reliably in all cases, but even then, this is Plack man, it's pretty straightforward.

    Gentoo dependencies are reasonably simple to sort out when automation gets it wrong. The Debian format? I don't even know where to start.

    Granted, I haven't spent much time reading the Debian Developer Guides to learn how to fix this sort of problem, and what sort of incantations to call to get something to build once I've manually fixed the problem, but it's really overkill to even need to do that. I didn't need to read anything to start hacking on ebuilds. It's all self-contained and it's bash, a language I already know, and extremely straightforward. Sure, I needed to learn a bit for supremely advanced edge cases, but I don't see demand for those on a regular basis.

I guess the obvious solution to the above would be learning more about Debian? But I've already exercised more than my share of WTF quota in this avenue.

CPANPLUS::Dist::Deb

Either this module sucks, or it's just terribly broken, or it's sucking due to Ubuntu-isms. My impression is it's starting to be a little under-maintained, but I'm not sure. The first time I tried to use it ( well, install it that is ), the majority of its tests just failed hard. So, I upgraded from Karmic to Lucid, and as a result, the tests just hang instead for about 5 minutes, before running again and failing most of them. Brilliant.

make[1]: Entering directory `/home/anyone/pl/CPANPLUS-Dist-Deb-0.12'
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/00_constants.t .. ok     
t/01_load.t ....... ok    
t/02_debs.t ....... 1/? # Taking care of Build / xs  # massive hang here.

And then the rest of the Massive Failure is too big to include even in this inordinately large blog. I eventually managed to get it to build and install, but I had to use --notest to get it to work.

Actually, I had to define DEB_BUILD_OPTIONS="nocheck" .. because for some lovely reason, --notest, despite being very helpful, is deprecated!

Then the real fun started

Running cpan2dist --format CPANPLUS::Dist::Deb Plack went off and decided to build packages with stupid names ( 'cpan-libplack-perl' anyone? ), which then fubared for some reason I still don't even want to understand. Hell, it makes Java back-traces look simple.

Conclusion: Perhaps relying on distro-packaged CPAN packages on most distros still sucks too hard

I've come to understand at long last why people JustUseCpan™ instead of relying on their distros. Just look at the massive hell-hole of problems I encountered on just one distribution of Linux. Woe be unto him who wants to develop a Perl Project and then ship it and hope it's easy to install using the tools provided by the recipient's distribution of choice. I've been lulled into a false sense of security by my lovely system which is so simple to use.

So, You're doing a Project and relying on CPAN.pm and friends

There's a variety of goals a person like myself wants to achieve with this scenario.

  1. Low Pollution

    Pooping over /usr and friends is unacceptable. Especially if it's not 100% Guaranteed reversible. No 2 Modules should be able to modify each other's files, either by intent or accident. In some distributions, this is guaranteed by building and installing into a clean directory-tree with a "sandbox" mechanism that prohibits writing outside the build environment, and then collision-testing all the files in the clean-install directory prior to unpacking them into the file-system, and then bailing if a collision occurs. I like to have this degree of certainty with modules, and in fact, all software, which is the primary reason I rely on my distro's package manager, because it can give me these guarantees.

    You should NOT need elevated permissions to ever perform configure/build/test or install. Final application to the file-system should be performed by an externality with the needed permissions, that has no way of being "scripted" during the install phase by the package that is being installed.

    If another mechanism can exist within a context ( think perhaps something like local::lib ) that gives me this same certainty without resorting to, say, putting the whole bastard in git and relying on the ability to revert commits, then ThatConcept++, I want it! ( It's not that I'm averse to gitifying an install tree, it's just that when you install lots of modules, you don't want to have to halt things between installations just to maintain a 1:1 commit:distribution ratio -_-. I tried something like this once, and it was masochism. )

  2. Ease of Roll-Out/Distribution

    Ideally, you want Some Way to minimise the amount of work one needs to do on any given target to make sure the installed modules are the very same ones that were on the platform it was developed on. Having to do the above dicking around on various distributions with their rubbishy package management crap is a real nightmare. Especially if you don't have the luxury of knowing in advance what the target machine will be running. Sure, you try to know, but sometimes requirements change, and sometimes you don't get much choice about the machine you're working with, so it's great to have it completely not matter where you're taking it.

    If you can assume it's going to have a working version of some recent version of Perl, and that it's not a completely different platform to the original ( i.e.: transitioning from Linux to Win32 ( or worse, Win64 ) is a nightmare; it would be nice to be unilaterally transformable, but that's too much "dream" at the moment ), then you can dump your code tree on it and have it more-or-less JustWork without having to waste more time working out how to get the bastard up and running.

    For me, this means I'd want a way to have a mostly-perl-version agnostic local::lib-ish installation, which essentially requires

    1. Checkout
    2. Some way to rebuild .XS stuff for $arch_target without needing to reinstall everything from scratch
    3. Optionally run t/* tests for everything that's installed
    4. Run/Serve up the code

  3. Somehow avoid the need to build a second instance of Perl on the target machine

    Having to do this is both very annoying, and very time consuming. Having a system, a methodology, that avoids this need and Just Works for everyone who uses it would be great.

Kicking around the idea

/
 build/
      tars/
         Source tar.gz's 
      tmp/
         "Scratch" directory where things are configured/built/fake-installed
      installed-t/
        dist-name-version/
          Some attempt at extracting t/ from each dist
 cpan/
      main/
        primary @INC Path
      profile_a/
        supplementary @INC for experiments
 project/
      project_code*

These are some theoretical layout ideas, some borrowed from how CPAN currently works.

To facilitate this layout however, some theoretical tools are needed:

  1. Firstly, some way to create an @INC path that includes only the modules shipped with Perl itself, if that. This would be like local::lib, except we explicitly do not want modules that are provided by the system to be visible. This is to ensure that when new modules are added to the project's dependencies, they have to be installed in the project's custom @INC path in order to work, to avoid the issue of going later on to a different machine, and then and only then discovering you need them.
    If there is no practical way to modify @INC that satisfies these criteria, then a combination of Module::CoreList and require hijacking would be needed to prohibit loading non-core modules from the system.
  2. Secondly, some way to "bootstrap" an environment for anything that might be using the project, be it hacking up $ENV vars like local::lib does, or something that loads itself via perl -M to mess with stuff before the rest of the code runs.
  3. A variation on the above to be able to run a cpan client without vision of "system" Perl libraries, in order to install things as if they were nowhere on the system already.
  4. Optionally, some tool that hooks into the cpan client to extract information to facilitate rebuilding XS files and running tests at a later install
  5. Some method to bundle an entire project tree for network-redistribution ( Git is the most logical option to me, but Rsync or tar.gz + scp would suffice here too )
  6. A recipient tool on the receiving end that can re-inflate the code directory back in place ( git checkout for example )
  7. An ability to, like on the design machine, "bootstrap" into the controlled environment scenario.
  8. Optional/Nice to have: Automated XS Rebuild for all applicable items if needed
  9. Optional/Nice to have: Automated re-test of everything installed, preferably without having to re-unpack, re-configure, re-build and re-install every single package. ( The idea is to have the system be able to make itself useful in the shortest possible time, without having to connect to the internet to download more data at any stage )
  10. Run the "bootstrapped" services.
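For the first item on that list, one possible starting point is to rebuild @INC from the directories perl itself records for its core libraries in %Config, plus the project's own directory. This is only a sketch under assumptions: core_only_inc is a made-up helper, 'cpan/main' is borrowed from the layout above, and vendor-packaged perls may keep some core modules in directories other than archlibexp and privlibexp.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: build an @INC holding only the project's own
# library directories plus the directories where this perl keeps its
# core modules, with duplicates and empty entries dropped.
sub core_only_inc {
    my (@project_dirs) = @_;
    my %seen;
    return grep { defined($_) && length($_) && !$seen{$_}++ }
        @project_dirs, @Config{qw( archlibexp privlibexp )};
}

my @inc = core_only_inc('cpan/main');
print "$_\n" for @inc;
```

A bootstrap script could then do something like @INC = core_only_inc(...) before loading any project code, so that a dependency missing from cpan/main fails immediately on the development machine instead of later on the target.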

This is about as far as I've gotten in my fleshing out of my desirables, let alone building a solution that works. I am sort-of hoping there is something simple and straightforward that already exists that I can just go use and then recommend to everyone else I see because it's just so damn awesome. But as I stated half-an-hour of reading ago, I don't have a good idea how to look :/

In the famous words of one too many lazy coders: "Plz Halps"

In case something in the above has made you want to mock me, please remember, I already said I feel like a retard.