Beyond Continuous Integration

A Holistic Approach to Build Management (Part 1 of 2)

In this first installment of a two-part series, we are going to try to answer the following question:
What are the types of builds that you would typically see during the lifetime of a project?

But before we can answer this question, we need to agree on what happens during the lifetime of a project. We could spend a lot of time debating whether analysis and design are required stages (your view here would depend on whether you subscribe to agile or traditional methodologies), but there would be little value in doing so, as there is nothing to build in these stages anyhow. And I'm reasonably sure that we can all agree on at least a rough outline of the lifecycle of a project once coding actually begins. So here it goes.

Once coding begins, the developers are busy implementing the planned features while project managers are running about printing out productivity charts and scheduling meetings (sorry, as a long-time developer I could not resist). At some point, the QA team (whether formal or not) is brought in to test the project. This typically starts the QA loop, in which the QA team tests a snapshot of the project and reports issues to the development team, which in turn works on fixing the bugs. A new iteration through the loop starts when the QA team is given a new snapshot of the project to test, presumably with some of the issues fixed. This loop can go on for a while and ends when the quality of the project is deemed "good enough" to ship. At this point the project is complete.

I know the above is an overly simplified description of what goes on in a typical project. It does not acknowledge that subsystems may be developed and QA'd individually and later integrated and QA'd at the system level, that development on the next version may start after a project is complete, or many other details found in real-world projects. But I think that our simple model suffices for the purposes of our discussion.

Getting back to the question at hand, let's identify the types of builds that take place at each stage in the lifecycle of a project. During the development stage, when developers are busy implementing features, there are at least two types of builds: local development builds and Continuous Integration builds. Some teams may also choose to have nightly builds. Once the project enters the QA stage, we will continue to have local development builds and CI builds, but we will also add QA builds. And finally, at the completion of the project we will have a release build. Let's take a look at each one of these builds in more detail now.

Local Development Builds

Local development builds are performed by developers in their own development environments. These builds include the sources and dependencies that exist in the development environment at the time the build is performed. It is important to note that these sources do not have to match the contents of the source code repository. Typically, local development builds include at least some sources that were locally modified by the developer.

Continuous Integration Builds

We had some great articles on Continuous Integration (CI) last month so I'm only going to repeat the parts that are pertinent to our discussion. Continuous Integration is aimed at decreasing the pain of integrating changes made during development. The idea is that the more time developers spend going off in their own separate directions, the more painful it will be for them to merge their changes once they do decide to do so. Thus, CI proposes that in order to minimize the pain of integration, we should lessen the time between integrations. And perhaps holding true to its eXtreme Programming roots, CI urges us to go to the extreme here as well -- to such an extreme that we should integrate after every change we make.

There is another part to CI that comes in at this point: the verification part. As developers integrate their code after every change they make, there needs to be some mechanism to ensure that the last change integrated was not actually a step backwards. We need to ensure that an application that compiled and passed all its unit tests before the last integration does not, all of a sudden, fail to compile or pass unit tests after the integration. If that happens, then the last change integrated needs to be backed out and fixed. In a non-Continuous Integration setting, the development team has the luxury of passing off the application to the QA team for testing after an integration marathon, but that is not really possible in the case of CI. That's because the QA team would be asked to retest the application several times per day, potentially after every change that was committed.

Continuous Integration builds are there to verify that the latest integration was not a step backwards. In order to do that, CI builds need to happen often, and they typically run on a schedule so that each build encompasses a small number of integrated changes. Also, CI builds need to build the entire project and run unit tests (and perhaps more involved tests such as smoke tests or functional tests) on it. If neither unit tests nor other types of tests are run, then essentially only one test is performed during the CI build, the compile test: does everything still compile, or did the last integration break something so that the project no longer compiles? As we can see, the more tests we make part of the CI builds, the more valuable they become. Another important feature of CI builds is that they need to be produced from the latest snapshot of the sources as they exist in the source code repository. That is to say, CI builds cannot include any uncommitted code because such code has not been integrated.

Nightly builds are a variant of CI builds. There are two forces pulling in opposite directions on Continuous Integration builds. On the one hand, CI builds should be very quick so that they can happen often. The more often the CI builds happen, the shorter the feedback loop between integrations and the verification of their quality, and thus the more continuous the integration. On the other hand, the more tests the CI builds contain the better, because additional tests allow us to verify the quality of the integrations more accurately. These forces pull in opposite directions: we want to decrease the time between builds, but adding more tests to the build increases the length of time the build takes to complete. Nightly builds may provide a happy medium whereby the CI builds taking place during the day do not include a full armament of tests and can thus run very quickly, while the nightly builds do include a full assortment of tests that very thoroughly verify the quality of the integrations performed in the last day. Teams that make use of nightly builds have made a tradeoff, sacrificing accuracy of verification for speed during the day, but balancing that with a very thorough verification at less frequent intervals.
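To make the tradeoff concrete, here is a minimal sketch in Python. The suite names and running times are made-up numbers for illustration, not measurements from any real project, and the helper functions are hypothetical. It simply shows how adding a thorough test suite to a build cuts down the number of builds that fit into a working day.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suite:
    name: str
    minutes: int  # assumed running time, illustrative only


# Hypothetical suites and durations.
UNIT = Suite("unit", 5)
SMOKE = Suite("smoke", 10)
FUNCTIONAL = Suite("functional", 90)

# Daytime CI builds favor speed; nightly builds favor thoroughness.
SUITES_BY_BUILD = {
    "ci": [UNIT, SMOKE],
    "nightly": [UNIT, SMOKE, FUNCTIONAL],
}


def suite_minutes(build_type: str) -> int:
    """Total minutes spent running tests for the given build type."""
    return sum(suite.minutes for suite in SUITES_BY_BUILD[build_type])


def max_builds_per_day(build_type: str, hours_available: float) -> int:
    """Upper bound on back-to-back builds that fit in the available hours."""
    return int(hours_available * 60 // suite_minutes(build_type))


if __name__ == "__main__":
    for build_type in SUITES_BY_BUILD:
        print(build_type, suite_minutes(build_type), max_builds_per_day(build_type, 8))
```

With these assumed numbers, the quick CI configuration could run far more often during an eight-hour day than the full configuration, which is why the full assortment of tests is pushed to the nightly build.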

We'll come back to Continuous Integration and its impact on build management in the second installment of this column, but now let's move on to QA builds.

QA Builds

The goal of QA builds is to produce artifacts that will be handed off to the QA team for testing. It is very important that QA builds provide traceability from the build artifacts all the way to the sources, dependencies, and environment used to produce those artifacts. The reason for this is that any QA build may be deemed "good enough" to ship, in which case a production build that reproduces the QA build artifacts exactly has to be prepared. We'll talk more about the need for this traceability when we talk about Release Builds below. For now, let's note that in order to provide this traceability, QA builds need to be produced from a snapshot of the sources as they exist in the source code repository.
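One way to picture this traceability is as a small manifest stored alongside the artifacts. The sketch below is only an illustration under my own assumptions: the function names, the manifest fields, and the example revision label are all hypothetical, and the environment section records the Python interpreter and operating system merely as a stand-in for whatever compilers and tools a real build would use.

```python
import hashlib
import json
import platform
import sys


def sha256_of(data: bytes) -> str:
    """Fingerprint of an artifact's contents."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(source_revision, dependencies, artifacts):
    """Record what went into a build and what came out of it.

    source_revision: repository revision or label the build was made from
    dependencies:    mapping of dependency name -> exact version
    artifacts:       mapping of artifact file name -> artifact bytes
    """
    return {
        "source_revision": source_revision,
        "dependencies": dict(sorted(dependencies.items())),
        "environment": {
            # Stand-in for compiler/JDK/tool versions in a real build.
            "python": sys.version.split()[0],
            "platform": platform.platform(),
        },
        "artifacts": {
            name: sha256_of(data) for name, data in sorted(artifacts.items())
        },
    }


def same_artifacts(manifest_a, manifest_b) -> bool:
    """True when two builds produced byte-identical artifacts."""
    return manifest_a["artifacts"] == manifest_b["artifacts"]


if __name__ == "__main__":
    # Hypothetical revision label, dependency, and artifact contents.
    manifest = build_manifest("r1042", {"libfoo": "1.2.3"}, {"app.jar": b"abc"})
    print(json.dumps(manifest, indent=2))
```

Comparing the artifact fingerprints of a QA build with those of a later rebuild is one simple way to check whether the QA build artifacts were reproduced exactly.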

It is also very important that each QA build produce artifacts that are uniquely identifiable. Basically, I'm talking about a build number or some sort of build/version identifier. Being able to identify the build that artifacts came from can go a long way in helping avoid confusion about whether a particular bug has been fixed or not. Not to mention the fact that it can save the entire team the headache of having to compare file sizes and timestamps.
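As a rough illustration, a build identifier can be stamped directly into the artifact file names. The naming scheme below (version, then a build number, then a source revision) is just one convention I am assuming for the example, and all of the function names are hypothetical.

```python
def build_id(version: str, build_number: int, revision: str) -> str:
    """Combine version, build number, and source revision into one identifier."""
    return f"{version}-b{build_number}-{revision}"


def artifact_name(base: str, extension: str, identifier: str) -> str:
    """Stamp the identifier into the artifact's file name."""
    return f"{base}-{identifier}.{extension}"


def parse_build_id(identifier: str):
    """Recover (version, build_number, revision) from an identifier.

    Assumes the revision contains no hyphen and the build number
    is prefixed with 'b', as produced by build_id above.
    """
    version, build, revision = identifier.rsplit("-", 2)
    return version, int(build[1:]), revision


if __name__ == "__main__":
    identifier = build_id("2.1.0", 57, "r1042")
    print(artifact_name("myapp", "zip", identifier))
    print(parse_build_id(identifier))
```

With an identifier like this quoted in every bug report, there is no need to guess which build a tester was looking at.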

QA builds are not performed on a schedule like CI builds. Rather, once a release to QA is to be made, the QA build is performed.

Release Builds

The goal of release builds is to produce artifacts that will go into production. The phrase "go into production" is definitely polymorphic: in some cases it may mean that the artifacts will be burned onto a CD and shipped to customers, in other cases it will mean that the artifacts will be deployed to the production servers, and in yet other cases it may have other meanings. As with QA builds, it is critical that release builds provide traceability from the build artifacts all the way down to the sources, dependencies, and environment used to produce the artifacts. There are many reasons for this, but perhaps the easiest one to illustrate is the maintenance scenario where a bug discovered some time after the release has to be fixed. One of the goals when fixing bugs in production software is to change as little as possible. This is really a risk management strategy aimed at minimizing the risk of the bug fix release breaking something that was working in the original release. The best way to ensure that the bug fix changes as little as possible is to start with the sources that were included in the original release. At least then we can measure and control the extent of the changes.

As with QA builds, it is very important that each release build produce artifacts that are uniquely identifiable. In the case of release builds this is typically accomplished with the use of a unique version number and sometimes an additional build number.

Another similarity between release and QA builds is that they both need to be produced from a snapshot of the sources as they exist in the source code repository. Typically, the same snapshot of the sources that was used for the QA build that was deemed "good enough" is used for the corresponding release build. Also, release builds and QA builds are initiated manually rather than by a schedule. That's because despite best planning and project management efforts, I have yet to see a project/team that can predict ahead of time when a "bug free" snapshot of the code will be ready to go.

Conclusion

There are at least four different types of builds that take place during the lifecycle of a project. In this installment we have identified their goals and requirements. In the next installment we will take a look at the mechanisms by which we can meet the requirements and achieve the goals of each of the different build types.
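The requirements identified in this installment can be summarized as a small data structure. This is only my shorthand sketch of the discussion above: the type and field names are hypothetical, the "manual" trigger for local development builds is an assumption (the text only says that developers perform them), and a False value means only that this installment did not state the requirement for that build type.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    WORKING_COPY = "developer's working copy"
    REPOSITORY = "snapshot from the source code repository"


class Trigger(Enum):
    MANUAL = "manual"
    SCHEDULED = "scheduled"


@dataclass(frozen=True)
class BuildType:
    name: str
    source: Source
    trigger: Trigger
    requires_traceability: bool
    requires_unique_id: bool


BUILD_TYPES = [
    # Trigger for local builds is assumed: the developer starts them by hand.
    BuildType("local development", Source.WORKING_COPY, Trigger.MANUAL, False, False),
    # Nightly builds are treated here as a variant of CI builds.
    BuildType("continuous integration", Source.REPOSITORY, Trigger.SCHEDULED, False, False),
    BuildType("QA", Source.REPOSITORY, Trigger.MANUAL, True, True),
    BuildType("release", Source.REPOSITORY, Trigger.MANUAL, True, True),
]


def names_requiring_traceability(build_types):
    """Names of the build types that must be traceable back to their inputs."""
    return [b.name for b in build_types if b.requires_traceability]


if __name__ == "__main__":
    for b in BUILD_TYPES:
        print(f"{b.name}: built from {b.source.value}, {b.trigger.value} trigger")
```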

On a side note, this is the first appearance of the Build Management column in CM Crossroads. I am very happy to be able to play a bigger role in this community and excited about its prospects for the future. I invite your feedback, questions, and suggestions. Please let me know what topics you would find most interesting.




© 2006-2007 Urbancode, Inc.
Anthill, AnthillPro, and AnthillOS are trademarks of Urbancode, Inc.
All other trademarks are owned by their respective owners.
tel: (216) 858-9000 fax: (216) 858-6902 email:info@urbancode.com