
SCM Best Practices and Continuous Integration Go Hand-in-Hand

June 15th, 2011 by AccuRev

This has been the Agile decade for the software development industry, and in this tenth year since the Agile Manifesto was created, the evidence is all around us. Most companies and development organizations today have implemented some form or aspect of Agile methodology in their software development processes. Whether you’re aiming for pure Agile or a mixed/hybrid approach, proven best practices in all phases of the software development lifecycle are crucial to success.

This is especially true in the case of continuous integration, one of the foundational aspects of the Agile methodology. The concept of continuous integration, as defined by Martin Fowler, is “a fully automated and reproducible build, including testing, that runs many times a day.  This allows each developer to integrate daily, thus reducing integration problems.”

With this approach, developers can work more closely in parallel while identifying problems and debugging on the fly, accelerating the development process and improving the quality of the finished product. The benefits of continuous integration are tremendous, but they can quickly be lost if software configuration management (SCM) best practices are not carefully followed.

There are a handful of SCM best practices that can optimize continuous integration. Let’s start with a quick look at the first two:

  • Using an SCM system to store and version all source code
  • Utilizing private developer workspaces

Best Practice: Using an SCM System to Store and Version all Source Code

Parallel development and distributed software teams can make tracking changes a daunting task, especially with the frequent changes that occur when using continuous integration methods.

For this reason, it is important to employ a software configuration management (SCM) system to strictly version changes to the code base. In addition to versioning source code, everything needed to build the system should be placed under version control, including the following:

  • Third-party libraries
  • Properties files
  • Database schema
  • Test scripts
  • Install scripts

All developers should have at least read-only access to all files needed for the build and should obtain all such files directly from the SCM system. This approach ensures that developers are working with the latest build environment, and is preferable to the common but error-prone practice of placing such files on a shared file server.

To effectively implement continuous integration, all development groups should work from the same central source code repository so that the latest changes from other developers are easily and immediately available.

Best Practice: Utilizing Private Developer Workspaces

To fully realize the benefits of continuous integration, software development organizations need to ensure that developers can remain productive regardless of the overall state and stability of the project source code. To achieve this, developers should use private workspaces that give them full SCM capability. Private workspaces enable developers to:

  • work in isolation
  • revert to known “good” states when needed
  • checkpoint their changes
  • share only mature, well-tested code with other team members

Isolation works in both directions: it protects developers from incoming changes, and it protects the shared code configuration from incomplete or incorrect changes made by any one developer. By creating private workspaces, developers receive all the benefits of SCM for their personal use, including the ability to revert to a previous state, view and track changes between software configurations, and set aside changes to begin work on a different task.

Once a new known good state is reached (for example, when a developer completes engineering and testing work on a feature), developers should checkpoint their work, typically by “checking in” or “keeping” the local changes in the SCM system. The checkpoint ensures that the developer’s work is safe on the SCM server and that the checkpoint can be revisited at any time. However, since the changes have not been shared, other developers and teams are not affected.

When a developer breaks isolation and decides to share a code change, he or she is essentially making an assertion that the change has reached a higher level of maturity. This, coupled with the use of local developer builds, helps to ensure that only mature, well-tested code is passed on to the rest of the development team, a primary benefit of continuous integration.

Arming Software Development Project Managers with Real Data

November 30th, 2007 by lorne cooper

Managing software development is tough enough as it is. Managers take responsibility for the results their teams deliver. They’re on the hook for hitting schedules, delivering quality, and meeting customer requirements. Worst of all, if they’re saddled with slacker engineers, they still have to get the work done, and done right.

Tools are needed to help managers stay on top of their projects and find problems when there’s still time to fix them, before the schedule pressures explode. Project management tools are valuable, but their schedules only reflect the quality of the information that went into making them. Garbage in, garbage out.

The SCM system is the source of the information that drives smart decisions. Here are four sources of real data that can come out of any quality SCM system. I’ll illustrate with AccuRev because, well, because it’s the best.

  1. Bug Arrival Rate
  2. Churn Rate
  3. Task Estimate Accuracy
  4. Complexity Creep

1. Bug Arrival Rate

A key graph in any project manager’s arsenal compares bugs reported over time against bugs closed over time. Here we use AccuRev’s AccuWork issue tracking tool to store all issues, whether reported by QA, sent in from customer service’s CRM, or entered by the powers that be as customer requirements. Since our iterations are four weeks long, we use the week as a reporting time period. Some AccuRev customers use days.

In AccuWork, queries are easy to create, and we get a report from AccuWork in XML covering all the issues opened or verified against a particular iteration. [Footnote: In our terminology, developers “close” issues, and QA “verifies” them. Other companies have developers “submit” issues and QA “close” them.]

The XML slides right into Microsoft’s excellent Excel 2003. I use PivotTables and Excel graphs for all my reports. Others might leverage their superior Perl skills.

Here, we add a “week” field based on the AccuRev time field.

[Figure: chart of bugs reported versus bugs closed per week]
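For anyone who would rather script this than pivot it, here is a minimal Python sketch of the weekly bucketing. The XML layout (an `issue` element with `opened` and `closed` date attributes) is a made-up stand-in, not the real AccuWork export format, so the element and attribute names would need adjusting to match an actual report.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from datetime import date

# Hypothetical export: element and attribute names are assumptions,
# not the actual AccuWork schema.
SAMPLE_XML = """<issues>
  <issue id="101" opened="2006-01-02" closed="2006-01-10"/>
  <issue id="102" opened="2006-01-03" closed="2006-01-05"/>
  <issue id="103" opened="2006-01-11"/>
</issues>"""


def week_of(date_text):
    """Return the ISO (year, week) pair for a YYYY-MM-DD string."""
    iso = date.fromisoformat(date_text).isocalendar()
    return (iso[0], iso[1])


def weekly_counts(xml_text):
    """Return [(week, opened_count, closed_count), ...] sorted by week."""
    opened, closed = Counter(), Counter()
    for issue in ET.fromstring(xml_text).iter("issue"):
        opened[week_of(issue.get("opened"))] += 1
        if issue.get("closed"):  # still-open issues have no close date
            closed[week_of(issue.get("closed"))] += 1
    weeks = sorted(set(opened) | set(closed))
    return [(week, opened[week], closed[week]) for week in weeks]


for week, n_opened, n_closed in weekly_counts(SAMPLE_XML):
    print(week, "opened:", n_opened, "closed:", n_closed)
```

A week where the opened count keeps outrunning the closed count is the early warning this graph is meant to surface.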

2. Churn Rate

A key metric for quality is how often different parts of the code base have to be modified to address issues. Here, the AccuRev code repository has all the information we need. I use the AccuRev Hist command

accurev hist -a -s "release candidate stream" -fx -t "2006/1/8 - 2006/1/1"

to get an XML file of all the “Promotes” (sort of like “check-ins”) to a release candidate stream within a time range. Some use a separate “hist” command for each release period. Personally, I take a big time range and then use the XML file support in Microsoft’s Excel 2003 to slice and dice it into tasty chunks. I tried it once with Excel 2007. Once.

Since we component-ize our code base based on directories, the “path” field is particularly interesting, as directories can be as significant as individual files.

Loading the XML file into Excel, my macro deletes some unneeded columns and creates a Pivot Table like the following:

[Figure: Excel PivotTable of promote counts by path, top ten in descending order]

Here I’ve filtered the data to exclude the binary files, and used the Excel PivotTable’s advanced feature to show the top ten in descending order. AccuRev.c takes a beating every release, so its place on the list isn’t significant. Server/diff.c and Client/stat.c seem to be at the center of a lot of changes that release, and are probably candidates for some code reviews.
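The same top-ten count can be sketched in a few lines of Python. This assumes the hist XML contains `transaction` elements with a `type` attribute and `version` children carrying a `path` attribute; treat that layout, the sample data, and the list of binary suffixes as assumptions to check against a real export.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import PurePosixPath

# Assumed shape of the hist XML; verify against real "hist -fx" output.
SAMPLE_HIST = """<AcResponse>
  <transaction id="10" type="promote" user="alice">
    <version path="/./Server/diff.c"/>
    <version path="/./Client/stat.c"/>
  </transaction>
  <transaction id="11" type="promote" user="bob">
    <version path="/./Server/diff.c"/>
    <version path="/./bin/tool.exe"/>
  </transaction>
  <transaction id="12" type="promote" user="alice">
    <version path="/./Server/diff.c"/>
    <version path="/./Client/stat.c"/>
  </transaction>
  <transaction id="13" type="keep" user="bob">
    <version path="/./Server/diff.c"/>
  </transaction>
</AcResponse>"""

# Illustrative only; extend to whatever counts as binary in your code base.
BINARY_SUFFIXES = {".exe", ".dll", ".jar", ".png"}


def top_churn(xml_text, limit=10):
    """Count promotes per path, skipping binaries, most-changed first."""
    counts = Counter()
    for txn in ET.fromstring(xml_text).iter("transaction"):
        if txn.get("type") != "promote":
            continue  # only promotes count toward churn
        for version in txn.iter("version"):
            path = version.get("path")
            if PurePosixPath(path).suffix.lower() not in BINARY_SUFFIXES:
                counts[path] += 1
    return counts.most_common(limit)


for path, promotes in top_churn(SAMPLE_HIST):
    print(promotes, path)
```

Swapping the key from the full path to its parent directory would give the per-component view instead of the per-file one.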

3. Task Estimate Accuracy

Along with dates of issue opens and issue closes, we have chosen to include a user field for the original estimate of the length of time it will take to complete the issue. That gives us something to compare to when we “close” an issue and send it to QA, and an opportunity to correlate against the length of time before an issue gets verified.

As a best practice, we try to normalize issues to a couple of days in length. Longer and you haven’t really figured out what needs to be done. Shorter and you’re liable to strangle in your own spit, a management practice I can’t recommend, having seen the results.

We get this data out of AccuWork again, and can slice it by group, release, or even individual developer. This is a quick way to quantify the subjective feel we all have for who is slow, who’s fast, and who’s clueless. Here’s a scatter plot for a group. Note that they are generally on the high side of their estimates, and that they haven’t done a very good job normalizing tasks to 2-3 days:

[Figure: scatter plot of task estimates versus actual time for one group]
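The numbers behind a plot like this are simple to compute. The following Python sketch takes made-up (estimate, actual) pairs in days and reports how far actuals run over estimates and how many tasks were sized inside the 2-3 day target. The field choice and the sample values are illustrative assumptions, not data from our projects.

```python
from statistics import mean

# Hypothetical (estimate_days, actual_days) pairs for one group.
SAMPLE_TASKS = [(2.0, 3.0), (3.0, 3.0), (1.0, 4.0), (5.0, 6.0)]


def estimate_report(pairs, low=2.0, high=3.0):
    """Summarize estimate accuracy and how well tasks were normalized."""
    ratios = [actual / estimate for estimate, actual in pairs]
    overruns = sum(1 for ratio in ratios if ratio > 1.0)
    normalized = sum(1 for estimate, _ in pairs if low <= estimate <= high)
    return {
        "mean_ratio": mean(ratios),                   # 1.0 means spot-on
        "overrun_share": overruns / len(pairs),       # tasks that ran long
        "normalized_share": normalized / len(pairs),  # sized within target
    }


report = estimate_report(SAMPLE_TASKS)
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

Run per group, per release, or per developer, the three figures give a rough numeric counterpart to eyeballing the scatter plot.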

4. Complexity Creep

As software gets modified, and especially when managers become conscious of the amount of time taken to fix a bug or implement a feature, there is a tendency for developers to make the code base less maintainable. There are some metrics for code complexity that can benefit the manager when trying to assess when it is time to refactor. A frequently used metric is cyclomatic code complexity.
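For reference, McCabe’s cyclomatic complexity is computed from a routine’s control-flow graph, where $E$ is the number of edges, $N$ the number of nodes, and $P$ the number of connected components (1 for a single routine):

```latex
M = E - N + 2P
```

For a single routine this works out to the number of decision points plus one, so each added branch or loop raises the score by one.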

While we don’t do any code complexity measurements here, some of our larger customers do. This is easy to collect using a trigger on changes being sent to a release candidate (or QA candidate) stream, where the trigger calls the analysis tool and puts the results into a tab-separated file for later plotting from Excel.

A simple (Perl) implementation of the server side “server_post_promote” trigger creating a file “complexity.txt” looks like:

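# Assumes $user, $date, $transaction, $stream, $version, $workspaceDir,
# $filePath and $issueNum were already parsed from the trigger input, and
# that complexity() is your own wrapper around the analysis tool.
if ($stream eq "acmeProj_QA") {
    open(my $compout, ">>", "complexity.txt") or die "Can't open complexity.txt: $!";
    my $complexity = complexity($stream, $version, $workspaceDir, $filePath);
    print {$compout} join("\t", $user, $date, $transaction, $stream, $version, $filePath, $complexity, $issueNum), "\n";
    close $compout;
}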
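Once the trigger has been appending to complexity.txt for a while, the file can be summarized outside Excel too. This Python sketch assumes the column order written by the trigger above and that rows were appended in chronological order; the sample rows are invented for illustration.

```python
import csv
import io

# Column order as written by the trigger's print statement.
COLUMNS = ["user", "date", "transaction", "stream", "version",
           "file_path", "complexity", "issue_num"]

# Invented sample rows standing in for the contents of complexity.txt.
SAMPLE_LOG = (
    "alice\t2006/01/03\t1201\tacmeProj_QA\t4\tServer/diff.c\t12\t101\n"
    "bob\t 2006/01/04\t 1207\t acmeProj_QA\t 2\t Client/stat.c\t 8\t 102\n"
    "alice\t2006/01/09\t1215\tacmeProj_QA\t5\tServer/diff.c\t17\t103\n"
)


def complexity_growth(lines):
    """Return {file_path: latest complexity minus first recorded value}."""
    first, last = {}, {}
    for raw in csv.reader(lines, delimiter="\t"):
        if not raw:
            continue  # skip blank lines
        # strip() tolerates a stray space after each tab
        row = dict(zip(COLUMNS, (field.strip() for field in raw)))
        value = float(row["complexity"])
        first.setdefault(row["file_path"], value)
        last[row["file_path"]] = value
    return {path: last[path] - first[path] for path in first}


for path, growth in complexity_growth(io.StringIO(SAMPLE_LOG)).items():
    print(f"{path}: {growth:+.0f}")
```

To run it against the real file, pass `open("complexity.txt", newline="")` in place of the `io.StringIO` sample. Files with steadily positive growth are the refactoring candidates.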

Together, these reports provide a good mix of data around schedule, estimation and code quality, giving our beleaguered project manager a pretty nice set of armor.

Please post your favorite management reports, even if you don’t (yet!) use AccuRev.