A Case of Architectural Refactoring

Some weeks ago one of my customers decided that one of its biggest ASP.NET web intranet projects needed an architectural revision, mainly to better support its own customers with built-in fault tolerance, but also to decouple the development of the various sub-projects through better separation between software modules.

By “architecture” I mean a broad concept that also includes the processes tied to a specific software architecture.

A bit of background: the architecture served a project (called PORTAL) for the creation of big intranets called portals that utilised the software modules produced by different sub-projects (also called applications or contexts).

Before the revision, all sub-projects were tightly bound together, both in a shared code base and at runtime; this was a big problem in every respect, much like an original sin. This situation arose because management, under strict delivery deadlines, had little time to make the right choice and assumed that a shared setup would cost much less than one that included isolation from the start. The decision was further constrained by the worry that, with more freedom, every sub-project would go its own way.

From the Software Configuration Management (SCM) point of view, the artifacts of all sub-projects were treated as belonging to a single project, so they lived in the same branch of the versioning system: every code change, even the smallest one, was available to every sub-project immediately after the check-in/commit. It was easy and quick to fix problems, but it was equally easy to break every sub-project with a single line of code (leaving ~50 upset engineers waiting for the fix).

For the sake of completeness, the architecture included concepts of runtime independence from the start (e.g. every application had its own logical database), but they weren’t enforced over the years of development, so their implementation was incomplete or outright broken.

To make things more complex, there was a sub-project (called here ARCH) that produced all of the architectural and shared modules, from the Content Management System (CMS) to the Object-Relational Mapping (O/RM) modules. The fixes requested by one sub-project were immediately available to all of the others, but the pace of change also produced instability in ARCH, and that instability was constantly distributed to every sub-project. This led to a system that was at best unstable, untestable, and kept running only by the efforts of individual developers who lost nights of sleep over it: ARCH was often seen as a “bug broker” and not as a service/module supplier.

After some (much needed, I must say) analysis and prototyping, we introduced changes that led to a system where every application runs independently of the others, i.e. if a database fails, only the corresponding application fails; the other applications continue to work seamlessly even if they are inside the same portal as the failed application.
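The isolation described above can be illustrated with a minimal sketch. It is not the PORTAL implementation (which was ASP.NET); the names `Application`, `render_portal` and `failing_render` are hypothetical and only show the idea that a failure in one application's database is contained at the application boundary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Application:
    """A hypothetical sub-project hosted inside a portal."""
    name: str
    render: Callable[[], str]  # may raise if the application's own database is down


def render_portal(apps: List[Application]) -> Dict[str, str]:
    """Render every application, containing failures per application."""
    results: Dict[str, str] = {}
    for app in apps:
        try:
            results[app.name] = app.render()
        except Exception as exc:
            # Isolation boundary: one failing application must not
            # take down the others hosted in the same portal.
            results[app.name] = f"unavailable ({exc})"
    return results


def failing_render() -> str:
    """Simulates an application whose logical database is offline."""
    raise ConnectionError("database offline")
```

Calling `render_portal` with one healthy application and one that uses `failing_render` returns the healthy page unchanged and an "unavailable" placeholder for the failed one.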

The new SCM process involves dedicated per-application branches and a new level of indirection for the ARCH project, which takes the form of a versioned deliverable (named ARCH.Dependency) that includes all the “products” of ARCH. Where possible this is in binary form, so that the “customers” of ARCH are discouraged from making changes directly to ARCH’s code base (a bad habit that had taken hold before). Where needed, there is a dedicated branch of the ARCH.Dependency sources for every project that makes specific requests; this means that a specific version of ARCH.Dependency is not set in stone, so sub-projects don’t have to worry too much about rigidity.
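The indirection introduced by a versioned deliverable can be sketched as follows. This is an illustration under assumptions, not the actual ARCH.Dependency tooling: `DependencyFeed`, `resolve_for`, the version strings and the per-application `pins` mapping are all hypothetical.

```python
from typing import Dict


class DependencyFeed:
    """Hypothetical store of immutable, versioned ARCH.Dependency releases."""

    def __init__(self) -> None:
        self._releases: Dict[str, str] = {}

    def publish(self, version: str, contents: str) -> None:
        # A published version is never overwritten; a change needs a new version.
        if version in self._releases:
            raise ValueError(f"version {version} is already published")
        self._releases[version] = contents

    def resolve(self, version: str) -> str:
        # Raises KeyError if the requested version was never published.
        return self._releases[version]


def resolve_for(feed: DependencyFeed, pins: Dict[str, str], application: str) -> str:
    """Return the release that the given application has pinned."""
    return feed.resolve(pins[application])
```

With such a setup, ARCH can publish a new version while each application keeps resolving the version it pinned, and moves to the new one only when it updates its own entry in `pins`.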

This new setup guarantees that every project (even ARCH) can work at its own pace, with its own delivery dates and a much leaner management process (not every date and feature must be discussed with every other project leader). It makes it possible to deliver features in a more “agile” way than before: no single feature needs to be held back because it might break the other projects. ARCH can continue to serve its customers, but now it also has some out-of-band space to do something new, like refactoring or implementing big/breaking features.

I will let everyone draw their own conclusions; from my point of view, however, the most important lesson here is that it would have been cheaper to invest in architecture design, methodology and process, SCM and tools from the outset, for at least two reasons:

  1. every mistake in the architecture (i.e. in the building blocks like a framework or the tools) multiplies, mutates and propagates itself exponentially, so more initial investment should translate into fewer total errors in the entire project
  2. more information on the project means more feedback (on the project's health) and so more control (to let people take the right actions at the right time)