Monday, June 30, 2008

I'm One-Dimensional?

Wow, I guess my use of del.icio.us is pretty near one-dimensional.  Either that, or I'm pretty near one-dimensional.  I prefer the former.  ;)



(via silk and spinach)

Sunday, June 29, 2008

Why Hiring Technologists is Difficult

I've just written up a brief post applying Malcolm Gladwell's mismatch problem to hiring people in the technology sector for Toronto Technology Jobs. Since it's not Toronto-specific, I thought I'd share it with some of you here.

Wednesday, June 18, 2008

The Pitfalls of Technology-Driven Architecture

Software architecture is about technology, and it's nearly impossible to talk about software architecture without talking about technology. But software is meant to serve the needs of a particular set of users, to solve the problems in a particular domain. Likewise, the architecture of a particular piece of software should be intended to solve the problems that face the software within a particular domain. Architecture is driven by business need.

Architecture is Driven by Business Need
For instance, the domain of mass-market internet search requires searching for words within a vast quantity of data, returning the results very quickly, and doing so for a large number of people. Accordingly, the architecture is one of work-division and work-sharing, scale and data-handling, as is evident in the details of the architecture that Google has occasionally shared.

Technology-Driven Architecture
Unfortunately, it's common to see architectural discussions revolve solely around technology, where the technology and the architecture are not clearly linked to the problems facing the business, at least in the near term.

You can often see this when technologists of one shade or another (architects, chief technology officers, developers, whatever) get excited about a particular technology. That technology starts to show up in discussions without being grounded in business need. For example, if a technologist believes that .NET has better tooling than Java, he or she might start suggesting that any new project be built in .NET without having considered the cost and benefit to the company or organization in which he or she works.

Warning Signs
The most concrete and obvious warning sign for technology-driven architecture is when new software designs and architectures are proposed and discussed, and the technologies and their generalized benefits are described at length but are never tied directly to the problems facing the business.

A secondary warning sign is when the architecture is tied to problems facing the business, but not the most immediate ones. If the architecture addresses future-facing or secondary problems, you have to ask yourself: are these the right problems to be tackling right now? For instance, cloud computing architectures address scale, but many businesses have much more immediate problems than how to scale.

These choices are less black-and-white. It's true that many successful software businesses will eventually face the problem of how to scale, and if you ignore that potential problem throughout the architecture and design of your application, you may end up paying a heavy cost at some later date. On the other hand, if you decide that scale is a problem you will eventually face, you might spend a lot of time building scale-out solutions into early product designs, slowing down development, restricting choices, and generally making it that much harder to succeed. These kinds of long-term architectural choices can be insidious in both directions.

You'll also find warning signs whenever anyone argues that a particular technology is just better than the alternatives, as if there's no trade-off to be made. There's always a trade-off to be made, and ignoring the costs in a new technology has sunk many a project into unexpected quagmires.

Common Examples
Some technologies make regular appearances in technology-driven architectures. They're technologies that sound good, that cause some people to believe they're just better, without seeing the inherent trade-off. These examples are probably no surprise to anyone:

  • Enterprise JavaBeans
  • Service-Oriented Architecture
  • Rule Engines
  • Workflow Engines
  • Business Process Management (BPM / BPEL)
  • Enterprise Service Bus
Each one of these can have benefits when compared to some of their alternatives, but each comes at a cost. They have a tendency to complicate the architecture, add more moving parts, and reduce the predictability and testability of the final solution, all while slowing development. They can require a different mindset to use effectively. And then there are the more visible costs: learning curves, licensing fees, support contracts.

Exceptions to the Rule
There are exceptions to every rule, including the rule about there being exceptions to every rule. This rule is no exception. ;)

There are some instances where technology-driven architectural choices make sense:
  • Good prognostication is very hard, but if your team has a great track record of taking on future-facing problems in advance, you might trust that when the team wants to explore a technology, they probably have a good reason for it, and that it'll work out in the end ... well, that's a judgement call you can make.
  • Choosing an architecture based on a technology for a project that isn't on the critical path for a business can expose the development team to a new technology, allow them to explore it, find its strengths and weaknesses. This will allow them to make business-tied decisions about that technology in the future, and increase the odds of success if that technology does prove valuable on a more critical project at a later date. Even here, though, if the technology doesn't have an obvious link to the problems facing the business, it may be a waste of time to explore it at length.
There are all sorts of reasons to make technology and architecture choices, and I can't possibly tell you that your architectural choices are right or wrong. Mostly, I'd just like anyone proposing an architecture to stop and ask themselves what business problems this architecture addresses, and if those are the right problems to be tackling.

Monday, June 16, 2008

SproutCore Down

It's always amusing to me when the websites for web frameworks have visible downtime that appears to be related to the "application" (or at least its handling of errors) rather than to, say, hardware.

SproutCore is currently showing an HTTP 500 - Internal Server Error.

Saturday, June 7, 2008

JAXB and Collections

Am I missing something? Does JAXB really require you to expose a naked collection in mapped objects?

If I expose a Collections.unmodifiableSet(property), JAXB invokes .clear(), which throws an UnsupportedOperationException.

If I return a shallow copy of the collection, when JAXB is converting XML to Java, JAXB requests the collection, clears it and modifies it, without so much as calling the setter again. As a result, the collection values don't appear in the final Java object.

That seems pretty awkward, so my first assumption is that I've made a mistake somewhere along the line, but ... perhaps JAXB really does want me to expose a mutable object, with the risk that entails. :/
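The unmodifiable-set failure is easy to reproduce with nothing but JDK collections; the clear() call below stands in for what JAXB does to a mapped collection during unmarshalling (the class and values here are just illustrative):

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class UnmodifiableDemo {
    public static void main(String[] args) {
        Set<String> backing = new HashSet<>(Set.of("a", "b"));
        Set<String> exposed = Collections.unmodifiableSet(backing);
        try {
            // JAXB clears the collection it gets from the getter, then adds
            // unmarshalled values directly -- it never calls the setter again
            exposed.clear();
        } catch (UnsupportedOperationException e) {
            System.out.println("clear() refused: " + e.getClass().getSimpleName());
        }
    }
}
```

Because the collection is mutated in place, the getter has to hand back the live, mutable backing collection; a defensive copy doesn't fail loudly, it just silently loses the unmarshalled values.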

Thursday, June 5, 2008

Equality and Object-Relational Mapping

Hibernate (and apparently JPA) uses the same object instance to represent the same record within the context of a transaction/session. This ensures that the default Object implementation of equality (Object.equals(object) compares instance identity) is sufficient for comparisons of persistent classes within a given transaction.

Outside of the transaction, this is no longer true -- one record in the database might have multiple instances, and if the values are changing, the values might not be the same. This leads one to all sorts of questions about how best to implement equality for persistent objects. There are discussion threads and wiki pages on the Hibernate site that go on about this at length, and I've read them a number of times over the years.

This morning, I think I've finally decided that all the mental effort I've put into that over the years is mostly wasted time. I'm coming around to the belief that instance comparison is, in fact, a good behavior for the equals() method of persistent (or persistable) object instances.

There are basically a limited number of areas where the equals() method of a persistent object comes into play for me, in rough order of frequency of use:

  1. Regular database-interacting code within the application. In this case, I'm typically working within the context of a particular transaction, and instance comparison is sufficient.
  2. Test code, often persistence tests or integration tests where I'm expecting an object's fields to be the same before and after a particular operation (e.g. create object, save to database, load fresh instance, verify same fields). In these cases, I can use a Commons ReflectionEqualsBuilder without altering the implementation of equals().
  3. Long-running database-interacting code. In this case, I might have detached instances between transactions, but mostly comparisons will occur within the context of a transaction, and I don't mind reattaching / refreshing to make this work.
  4. Database-interacting application code that needs to compare objects from one transaction with objects from another. This does come up, but it's infrequent. In these cases, I think it's probably reasonable that the default comparison (equals()) fail fast, encouraging me to think carefully about why I'm doing this kind of comparison, what I'd like to compare, and how I'd like to go about it. This is rare enough that putting a little extra effort into these cases and having to put thought into why I'm doing it is probably a good thing.
So, I'm thinking for my next project, I'm going to stick with instance equality for my persistent objects and see how it goes.
Most of the time, I'm working with objects in the context of a transaction. When I want to compare objects across multiple transactions, I'm typically doing this in a test (e.g. a test of the persistent object mapping), and I can easily handle this with a reflection equals builder without actually altering the implementation of equals().
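A minimal sketch of that approach, using a hypothetical Customer entity: the class deliberately keeps the default identity-based equals(), and test code compares field values reflectively (a hand-rolled stand-in here for the Commons ReflectionEqualsBuilder):

```java
import java.lang.reflect.Field;

// Hypothetical persistent class: equals()/hashCode() are deliberately
// not overridden, so equality is instance identity -- which Hibernate
// guarantees is sufficient within a single session/transaction.
class Customer {
    Long id;
    String name;
    Customer(Long id, String name) { this.id = id; this.name = name; }
}

public class EqualityDemo {
    // Test-only field-by-field comparison, standing in for a
    // reflection equals builder; not meant for production equals().
    static boolean sameFields(Object a, Object b) {
        try {
            for (Field f : a.getClass().getDeclaredFields()) {
                f.setAccessible(true);
                Object va = f.get(a), vb = f.get(b);
                if (va == null ? vb != null : !va.equals(vb)) return false;
            }
            return true;
        } catch (IllegalAccessException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Customer loaded = new Customer(1L, "Alice");
        Customer reloaded = new Customer(1L, "Alice"); // fresh instance, second session
        System.out.println(loaded.equals(reloaded));      // false: different instances
        System.out.println(sameFields(loaded, reloaded)); // true: same field values
    }
}
```

Within one session the identity comparison is all you need; across sessions, the field comparison lives in the tests, and equals() failing fast is the point.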

Monday, June 2, 2008

Detecting Intermittent Build Failure with Timed Builds

Almost every project I've been on eventually reaches a point where one or more tests are failing sporadically. Usually, this indicates there's a problem with the test, such as relying on timing, but occasionally it's a problem with the unit under test, or with the testability. For instance, integration testing of concurrency in a web-application is difficult, as you cannot control the timing of the threads inside the application server.

These problems often make themselves known in subtle ways by failing on occasion. If the developers working on the project aren't proactive, it's not uncommon to reach a point where any build that fails on "known broken" tests is simply run again, which can consume a lot of potential productivity.

In order to find these kinds of intermittent failures, we often run timed builds in the off-hours. In the periods we're unlikely to be developing (at night and on weekends), we have a build trigger set to run project builds repeatedly (in Bamboo, this is a scheduled build with a cron expression like: 0 0/20 0-7,19-23 ? * *). When we come back the next day or after a weekend, we have a large number of build results waiting, and if we have an intermittent failure, it's likely to have come up at least once, often more than once.

This does take a little effort, but it's a step towards promoting the overall health of the build.