######################################################################
    Archive::Tar::Merge 0.01
######################################################################

NAME
    Archive::Tar::Merge - Merge two or more tarballs into one

SYNOPSIS
        use Archive::Tar::Merge;

        my $merger = Archive::Tar::Merge->new(
            source_tarballs => ["a.tgz", "b.tgz"],
            dest_tarball    => "c.tgz"
        );

        $merger->merge();

DESCRIPTION
    "Archive::Tar::Merge" takes two or more tarballs, merges their files and
    writes the resulting directory/file structure out to a destination
    tarball.

    In the easiest case, there's no overlap between files. Then the result
    is just a union of all files in all source tarballs.

    If there's overlap, but the overlapping files are identical in the
    source and destination tarballs, there's no conflict and
    "Archive::Tar::Merge" just puts the files into the destination tarball
    without further ado.

  Deciders
    If there is a conflict, e.g. two files under the same path in two source
    tarballs which have different content or permissions, a decision needs
    to be made which of the source files is going to be used in the
    destination tarball. The "decider" parameter takes a reference to a
    function, which receives both the logical path and the physical
    locations of the two (or more) conflicting files. Based on this, it
    decides what to do:

    *   Return the array index of the winning file. If the decider gets a
        reference to an array of 2 source tarballs, returning

            return { index => 0 }

        will pick the first file while

            return { index => 1 }

        will pick the second one.

    *   Return the content of the winning file. How it comes up with this
        content is up to the decider, typically, it applies a filter to one
        or more of the source files, derives the content of the destination
        file from them and returns its content as a string.

            return { content => "content of destination file" }

    *   Ignore the file. Leave it out of the destination tarball:

            return { action => "ignore" };

    As parameters, the decider receives the file's logical source path in
    the tarball and a list of paths to the unpacked source file candidates.
    Here's an example of a decider that resolves conflicts by prioritizing
    files from the second source tarball ("b.tgz"):

        my $merger = Archive::Tar::Merge->new(
            source_tarballs => ["a.tgz", "b.tgz"],
            dest_tarball    => "c.tgz"
            decider         => \&decider,
        );

          # If there's a conflict, let the source file of the
          # last candidate win.
        sub decider {
            my($logical_src_path, $candidate_physical_paths) = @_;
          
              # Always return the index of the last candidate
            return { index => "-1" };
        }

        $merger->merge();

    If a decider wants to make a decision based on the content of one or
    more of the source files, it has to open and read them. A somewhat
    contrived example would be a decider which appends the content of all
    conflicting source files and adds them under the given logical path to
    the destination tarball:

        use File::Slurp;

        sub decider {
            my($logical_src_path, $candidate_physical_paths) = @_;
          
            my $content;

            for my $path (@$candidate_physical_paths) {
                $content .= read_file($path);
            }

            return { content => $content };
        }

    A decider receives a reference to the outgoing "Archive::Tar::Wrapper"
    object as a third parameter:

        sub decider {
            my($logical_src_path, $candidate_physical_paths, $out_tar) = @_;

    This allows for adding extra files to the outgoing tarball and perform
    all kinds of scary manipulations.

  Hooks
    Sometimes not all of the source files are supposed to be merged to the
    destination. A hook, called with the logical source path and a list of
    source file candidate physical paths, can act as a filter to suppress
    files, modify their content, or store them under a different location.

        my $merger = Archive::Tar::Merge->new(
            source_tarballs => ["a.tgz", "b.tgz"],
            dest_tarball    => "c.tgz"
            hook            => \&hook,
        );

          # Keep only "bin/*" files
        sub hook {
            my($logical_src_path, $candidate_paths) = @_;
          
            if($logical_src_path =~ /^bin/) {
                  # keep
                return undef;
            }

            return { action => "ignore" };
        }

    Just like deciders, hooks receive the file's logical source path in the
    tarball and a reference to an array of paths to the source file
    candidates. Hooks *differ* from deciders in that they are called with
    *every* file that is merged and not just with conflicting files.

    If a merger defines both a decider and a hook, in case of a conflict, it
    will <only> call the decider. It doesn't make sense to call the hook in
    this case because the decider already had the opportunity to totally
    render the content of the file itself, making things like source paths
    useless.

    If a hook returns "undef", the file will be kept unmodified. On

            return { action => "ignore" };

    the file will be ignored. And a hook can, just like a decider, filter
    the original file or create its content from scratch by using the
    "content" parameter.

    A hook also receives a reference to the outgoing "Archive::Tar::Wrapper"
    object.

  Using a decider to add multiple files
    If you have two tarballs which both have a different version of
    "path/to/file", you might want to add these entries as "path/to/file.1"
    and "path/to/file.2" to the outgoing tarball.

    Since the decider gets a reference to the outgoing tarball's
    "Archive::Tar::Wrapper" object, it can easily do that:

        my $merger = Archive::Tar::Merge->new(
            source_tarballs => ["a.tgz", "b.tgz"],
            decider => sub {
              my($logical_src_path, $candidate_physical_paths, $out_tar) = @_;
          
              my $idx = 1;
              for my $ppath (@$candidate_physical_paths) {
                $out_tar->add($logical_src_path . ".$idx", $ppath);
                $idx++;
              }

              return { action => "ignore" };
            }
        );

    Note that the decider code not only adds the different versions under
    different paths to the outgoing tarball but also tells the main
    "Archive::Tar::Merge" code to ignore the file to prevent it from adding
    yet another version.

  Post-processing the outgoing tarball
    If the "Archive::Tar::Merge" object gets the name of the outgoing
    tarball in its constructor call, its "merge()" method will write the
    outgoing tarball into the file specified right after the merge process
    has been completed.

    If "dest_tarball" isn't specified, as in

        my $merger = Archive::Tar::Merge->new(
            source_tarballs => ["a.tgz", "b.tgz"],
        );

    then writing out the resulting tarball can be done manually by using the
    "Archive::Tar::Wrapper" object returned by "merge()":

         my $out_tar = $merger->merge();
         $out_tar->write("c.tgz", $compressed);

    For more info on what to do with the "Archive::Tar::Wrapper" object,
    read its documentation.

TODO
    * Copy directories? * What if an entry is a symlink in one tarball and a
    file in another? * Permissions of target files created by content * N
    different symlinks (hash dst file)

LEGALESE
    Copyright 2007 by Mike Schilli, all rights reserved. This program is
    free software, you can redistribute it and/or modify it under the same
    terms as Perl itself.

AUTHOR
    2007, Mike Schilli <cpan@perlmeister.com>