Distribution Assembly Process


This document describes the process by which we create a runnable, installable GlassFish distribution zip file and its related artifacts.

Design Goals

The primary design goal of the distribution assembly process is to simplify creating multiple distributions with different contents, so that we can support multiple profiles of GlassFish.

Toward this goal, the system is designed to reduce the amount of descriptor configuration needed to create a distribution, and to infer as much as possible from the POMs of the Maven modules.

How it works

At the highest level, a distribution is built by Maven as a Maven module, using the maven-antrun-extended-plugin. Much of the heavy lifting actually happens in this plugin.

A distribution is assembled from multiple Maven modules, so the first step is to figure out which set of modules is used for building a particular assembly. This is determined simply by computing the transitive dependencies of the distribution Maven module.
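As an illustration of this first step, the following is a minimal sketch of a transitive dependency computation over a small graph. The module names and the graph itself are hypothetical, and the sketch ignores details that real Maven resolution handles (scopes, exclusions, version mediation); it only shows the reachability idea.

```python
from collections import deque

# Hypothetical dependency graph: module -> modules it directly depends on.
# These names are illustrative, not the real GlassFish module list.
DEPENDENCIES = {
    "distribution": ["ejb-all", "web-all"],
    "ejb-all": ["ejb-container", "ejb-timer-databases"],
    "web-all": ["web-core"],
    "ejb-container": ["common-util"],
    "web-core": ["common-util"],
    "ejb-timer-databases": [],
    "common-util": [],
}


def transitive_dependencies(root, graph):
    """Return every module reachable from root (assumes an acyclic graph)."""
    seen = set()
    queue = deque(graph.get(root, []))
    while queue:
        module = queue.popleft()
        if module in seen:
            continue  # already reached through another path
        seen.add(module)
        queue.extend(graph.get(module, []))
    return seen


modules = transitive_dependencies("distribution", DEPENDENCIES)
print(sorted(modules))
```

Note that common-util is reached through two different paths but appears only once in the resulting set.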

The process then looks at those modules more carefully, analyzes the relationships among them, and pays attention to the type of each module. During this phase, it determines which module goes where. For example, an HK2 module A might be placed as-is in glassfish/modules, another module might go to glassfish/modules/ejb, while yet another might be unzipped and placed into glassfish/install.

The process then places all these files under target/stage and packs them up into a single zip file, which becomes the final output of the distribution assembly.

Main Concepts

Maven Modules

All the pieces that eventually get into a distribution come as artifacts from other Maven modules. A module's packaging type in Maven plays a key role in determining how its artifact is handled.

  • hk2-jar type: HK2 modules generally go under glassfish/modules or one of its sub-directories. For more details about how the exact location is determined, refer to the advanced topics section. For more details about how to write an HK2 module, refer to V3EngineeringGuide.
  • jar type: When an HK2 module X depends on a plain jar module Y, Y will be placed into the same directory as X. Note that this means you sometimes get multiple copies of Y in different directories.
  • pom type: POM modules are only used for bundling multiple modules into one, and as such the POM module itself will not be placed into the final distribution. See the "Features" section for more details about how this is used.
  • distribution-fragment type: These modules are used to bundle up files to be placed as-is into the distribution. These include documents, shell scripts, data files, etc.
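The handling rules in the list above can be sketched roughly as follows. All module names are hypothetical, the directories for the HK2 modules are simply given as input (how they are actually chosen is an advanced topic), and the sketch assumes that only direct jar dependencies follow an HK2 module, which the text above does not specify.

```python
# Hypothetical modules: name -> (packaging type, direct dependencies).
MODULES = {
    "ejb-all": ("pom", ["ejb-container", "ejb-docs"]),
    "ejb-container": ("hk2-jar", ["helper-lib"]),
    "web-core": ("hk2-jar", ["helper-lib"]),
    "helper-lib": ("jar", []),
    "ejb-docs": ("distribution-fragment", []),
}

# Assumed target directories for the two HK2 modules.
HK2_DIRS = {
    "ejb-container": "glassfish/modules/ejb",
    "web-core": "glassfish/modules/web",
}


def plan_placement(modules, hk2_dirs):
    """Return a sorted list of (module, action) pairs."""
    plan = []
    for name, (packaging, deps) in modules.items():
        if packaging == "hk2-jar":
            directory = hk2_dirs[name]
            plan.append((name, "copy to " + directory))
            # A plain jar follows each HK2 module that depends on it,
            # so it may be copied more than once.
            for dep in deps:
                if modules[dep][0] == "jar":
                    plan.append((dep, "copy to " + directory))
        elif packaging == "distribution-fragment":
            plan.append((name, "unzip into distribution"))
        # "pom" modules only group other modules and are never placed;
        # "jar" modules are placed only via the HK2 modules that use them.
    return sorted(plan)


for module, action in plan_placement(MODULES, HK2_DIRS):
    print(module, "->", action)
```

In this example helper-lib ends up in both glassfish/modules/ejb and glassfish/modules/web, which is the duplication the jar rule warns about, while the ejb-all POM module itself never appears in the plan.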

Features

Maven modules are normally too fine-grained for distribution assembly purposes, so if we start listing the individual modules that are to be included in a distribution, the list will quickly become unmanageable.

So in v3 we adopted a convention of grouping a series of related Maven modules together, so that we can work at a higher level. This grouping is called a "feature." For example, the EJB feature already consists of 4 Maven modules, and there will likely be more. But they are all grouped together into one EJB feature (called "ejb-all"), so that distributions/glassfish/pom.xml only depends on this ejb-all module and doesn't list the individual pieces.

This simplifies the distribution assembly process, and it leaves the component team in control of what becomes part of the distribution.

At the Maven level, a "feature" is nothing but a POM module that lists its members as dependencies. Maven supports transitive dependencies, so you need not list all the member modules individually, as long as they come in as transitive dependencies. See the ejb-all feature for an example.

Distribution Fragments

A distribution fragment is a Maven module which creates a zip file as an artifact. The contents of this zip file will become a part of the final distribution (hence the name "fragment").

A distribution fragment is typically used for delivering resources (documents, scripts, data files, etc.) into the distribution, but it is also often used to place binaries into specific locations (such as a system web or EJB application).

A distribution fragment can be built any way you want, but the most common way is to create a src/main/resources directory and place the resource files there, laid out as they should appear in the final distribution. See ejb-timer-databases for an example that follows this pattern.

In more complicated situations, you need to perform some processing on the resource files. The following are typical use cases:

  1. Perform token replacements
  2. Run XSLT transformation
  3. Generate scripts
  4. Execute a program to create data files
  5. Rebundle 3rd party distribution of something (e.g., Ant, JavaDB) with minor modifications

For these kinds of needs, the maven-antrun-extended-plugin is useful, as it lets you use Ant for procedural, sequential processing. See the ejb-timer-databases POM for an example of how this can be done.

Once a fragment is defined, have the appropriate feature POM (or in some rare cases, the distribution POM) list your fragment as a dependency like this:

<dependency>
    <groupId>org.glassfish.external</groupId>
    <artifactId>javadb</artifactId>
    <version>...</version>
    <type>distribution-fragment</type>
</dependency>

See the ejb-all feature for an example of this.

Distribution Inheritance

It's often convenient to define a distribution in terms of its differences from another distribution. For example, one might want to say "the EE bundle is the web bundle + EJB." Distribution inheritance is the mechanism for doing this.

To define a distribution by inheriting another distribution, simply declare a dependency on another distribution like this:

<dependency>
    <groupId>org.glassfish.distributions</groupId>
    <artifactId>web</artifactId>
    <version>...</version>
    <type>zip</type>
</dependency>

Because inheritance is expressed as an ordinary dependency, you can inherit from multiple distributions if necessary. All the modules included in any of the base distributions will become a part of the new distribution. See the glassfish distribution for a complete example.
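In terms of module sets, inheritance amounts to a union. The sketch below illustrates this with hypothetical distribution contents; the names "minimal", "kernel", "admin-cli", and so on are made up for the example, and the real assembly works on zip artifacts rather than in-memory sets.

```python
# Hypothetical module sets for two base distributions.
BASE_DISTRIBUTIONS = {
    "web": {"kernel", "web-core"},
    "minimal": {"kernel", "admin-cli"},
}


def inherited_modules(own_modules, bases, base_distributions):
    """Union of a distribution's own modules and those of every base."""
    result = set(own_modules)
    for base in bases:
        result |= base_distributions[base]
    return result


# "EE bundle is web bundle + EJB", with a hypothetical EJB module.
ee_bundle = inherited_modules({"ejb-container"}, ["web"], BASE_DISTRIBUTIONS)
print(sorted(ee_bundle))
```

Inheriting from both "web" and "minimal" would include the shared kernel module only once.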

How-tos

I need to add XYZ to a distribution

  1. If it's an HK2 module to be used in application server JVM...
    1. Think about why/when your HK2 module needs to be in the distribution. For example, is it a utility or a library to be used by another module? If so, you don't need to do anything, because when that other module is a part of the distribution, your module will be pulled in automatically.
    2. Think about whether your HK2 module should belong to a "feature" cluster, like EJB or web-tier. If so, locate that feature POM in the V3 workspace and add your module in there. Your module will automatically show up in the appropriate distributions.
    3. If you are developing a brand-new feature cluster, where you expect to have several modules, write a feature POM and add your HK2 module in there, instead of adding it directly to the distribution POM.
    4. Finally, if you are positive that your HK2 module doesn't fit any of these categories, add the dependency directly to the distribution POM.
  2. If it's a file that needs to be placed somewhere else in the distribution, then write a distribution fragment. See the "Distribution Fragments" section above for more details.
  3. If it's something else, contact knowledgeable folks for discussion.

I need to create a new distribution

TBD.

Advanced topics for distribution maintainers

How do we decide which modules go in which directory?

This involves a fairly straightforward dependency graph analysis.

First, the process starts with the entire dependency graph between HK2 modules, and several "head" HK2 modules. Intuitively speaking, a head HK2 module is a module that brings in a cluster of other HK2 modules through its dependencies. Each head HK2 module is placed into its own directory, such as "web" or "ejb".

With this input, the computation goes like this: for each HK2 module, we look at the dependency paths that lead to that module. If all the dependency paths from the root go through one and only one head, then this HK2 module is placed into the same directory as that head (the justification is that this module is only "used" within the cluster led by this head, not shared). If there are dependency paths that go through multiple heads, then the module goes directly under the modules directory. This also applies to HK2 modules that are directly referenced by the distribution POM.
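The following is a minimal sketch of this analysis, not the plugin's actual implementation. The graph, the module names, and the head-to-directory mapping are hypothetical. It also assumes that when one head depends on another, a path is attributed to the nearest head above the module, a case the description above does not cover.

```python
# Hypothetical HK2 dependency graph, rooted at the distribution POM.
GRAPH = {
    "distribution": ["web-head", "ejb-head", "kernel"],
    "web-head": ["web-core", "common-util"],
    "ejb-head": ["ejb-container", "common-util"],
    "web-core": ["servlet-api"],
    "ejb-container": ["ejb-timer"],
}

# Hypothetical head modules and the sub-directory each one owns.
HEADS = {"web-head": "web", "ejb-head": "ejb"}

MODULES_DIR = "glassfish/modules"


def place_modules(root, graph, heads):
    """Map every module reachable from root to its target directory."""
    placement = {}
    owners = {}      # non-head module -> heads (or None) its paths pass through
    visited = set()  # (module, owner) pairs already explored
    stack = [(dep, None) for dep in graph.get(root, [])]
    while stack:
        module, owner = stack.pop()
        if (module, owner) in visited:
            continue
        visited.add((module, owner))
        if module in heads:
            placement[module] = MODULES_DIR + "/" + heads[module]
            owner_below = module  # everything below is attributed to this head
        else:
            owners.setdefault(module, set()).add(owner)
            owner_below = owner
        stack.extend((dep, owner_below) for dep in graph.get(module, []))

    for module, seen in owners.items():
        if len(seen) == 1 and None not in seen:
            (head,) = seen  # all paths go through exactly one head
            placement[module] = MODULES_DIR + "/" + heads[head]
        else:
            # Shared between heads, or reachable without passing any head.
            placement[module] = MODULES_DIR
    return placement


for module, directory in sorted(place_modules("distribution", GRAPH, HEADS).items()):
    print(module, "->", directory)
```

In this example common-util is reachable through both heads and kernel is referenced directly by the distribution, so both land directly under glassfish/modules, while servlet-api and ejb-timer follow their respective heads.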