Monday, May 26, 2008

SubEtha SMTP, Maven and a Community Repository

We'd been using Dumbster to run some email integration tests. I've used it before, and it gets the job done. However, when running the overnight timed build (to drive out problems, we run the build repeatedly overnight), the email integration tests would occasionally lock up, never exit, and cause the build to run forever. A few logging statements drove out the source of the problem -- the Dumbster.start() method was going away and not coming back.

I'd previously seen the suggestion that SubEtha's Wiser was just as capable and better code (less flaky, fewer problems), so I quickly swapped out Dumbster for SubEtha. Or so I hoped. The swap itself was easy; the hard part was getting a Maven-ready version of SubEtha with:

  • A versioned JAR: Grab the JAR from the download, upload it to our local repository, give it a group id and a version number (org/subethamail/subethasmtp/2.0.1/subethasmtp-2.0.1.jar).
  • A versioned JAR of sources: ZIP the source directory starting with the top-level package, rename it to a JAR, upload it to the local repository with the same group and artifact id as well as a version number (org/subethamail/subethasmtp/2.0.1/subethasmtp-2.0.1-sources.jar).
  • A POM listing the transitive dependencies: Create a POM, check all the dependencies in the lib directory, look up each to find out the appropriate artifact and group identifier, spend a little time evaluating the smallest set of dependencies that are necessary (if one includes another, such as was true for parts of Mina), then upload that as the POM (org/subethamail/subethasmtp/2.0.1/subethasmtp-2.0.1.pom).
This isn't a vast amount of work, and tooling could simplify it a little (that last step in particular), but it's a pain, and it's entirely preventable if someone -- ideally the project itself -- provides the JARs and POM for you.
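For reference, the POM from that last step ends up looking roughly like this (the dependency list is illustrative -- the real set comes from inspecting the lib directory and trimming anything another entry already pulls in transitively):

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.subethamail</groupId>
  <artifactId>subethasmtp</artifactId>
  <version>2.0.1</version>
  <dependencies>
    <!-- Illustrative: derived by checking the lib directory; version assumed -->
    <dependency>
      <groupId>org.apache.mina</groupId>
      <artifactId>mina-core</artifactId>
      <version>1.1.6</version>
    </dependency>
  </dependencies>
</project>
```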

Since I'd already done the work, I posted it to the SubEtha issue tracker in the hope that they'd adopt it. No such luck. Their response:

That's great and all, but I don't really care about Maven, nor do I really want to add or maintain a pom.xml in the project. Sorry.

I admit, if I weren't using Maven on my project, the idea of keeping a POM up to date and uploading it to a remote repository would seem like a pain, and I've seen that attitude before. That said, if you want your project to be adopted, this is one way to ease the transition for a whole group of users, so it's unfortunate when projects take this stance.

A Community Repository
With that in mind, it'd be great if there were a public community repository for Maven -- like the existing public repository, but holding artifacts for projects that are unwilling to supply them themselves, yet willing to let someone else maven-ize their work. Members of the community who'd created a Maven-enabled version for their own use could post it for others to pick up.

This could be as simple as a second public repository, or, perhaps even simpler, a top-level folder in the Maven public repo called unofficial where unofficial artifacts could be posted. The latter would make it fairly clear to Maven users up front, by virtue of the group id, that they were pulling in an unofficial artifact.
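With the second-repository approach, consumers would opt in explicitly in their POM (the repository id and URL here are hypothetical):

```xml
<repositories>
  <repository>
    <id>unofficial</id>
    <!-- hypothetical location for community-supplied, unofficial artifacts -->
    <url>http://example.org/maven2/unofficial</url>
  </repository>
</repositories>
```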

The primary danger in this approach is that people would fail to respect the license terms of the software -- packaging a piece of software and uploading it to a public repository is redistribution, so you'd have to be careful: some projects' licenses might not permit it, and some projects might simply not wish it to happen.

As long as it were possible for a project to have its artifacts removed from the repository, and the community tried to respect projects' wishes, this still seems like it could be a very helpful addition.

Thursday, May 22, 2008

Spring Can Hide Transaction Demarcation Failures

When you define a TransactionProxyFactoryBean containing TransactionAttributes, the method names you use for transactional boundaries need not exist. The transaction attributes are examined to determine which methods on the target class need to have advice applied, but nothing prevents you from configuring transactions for methods that do not exist.
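As a sketch of the hazard (bean and method names hypothetical), Spring will happily accept a pattern that matches nothing:

```xml
<bean id="orderService"
      class="org.springframework.transaction.interceptor.TransactionProxyFactoryBean">
  <property name="target" ref="orderServiceTarget"/>
  <property name="transactionManager" ref="transactionManager"/>
  <property name="transactionAttributes">
    <props>
      <!-- Typo: the target actually defines placeOrder(), so this pattern
           matches nothing -- no advice is applied, and no error is raised -->
      <prop key="placeOrdr*">PROPAGATION_REQUIRED</prop>
    </props>
  </property>
</bean>
```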

That sounds pretty innocuous, but it means you may think you've applied a transaction to a service method when you haven't, or that a renamed service method is silently no longer transactional -- with no warning either way. This has bitten me a few times, and I've seen very little discussion of it, so I thought I'd record it here.

This is of particular import if the underlying persistence implementation (ORM and database) is willing and able to create transactions for you, or to operate non-transactionally, when you fail to demarcate them yourself. Without that leniency, service methods lacking a transaction boundary would simply error out as soon as they attempted any serious work against the database. With these "simplifying" assumptions in place, your service methods may be written as if they were transactional, but without the safety of an actual transactional boundary.
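One defensive sketch (not a Spring feature; this only handles the simple trailing-* style of transactionAttributes keys) is a unit test that verifies each configured name pattern matches at least one method on the target type:

```java
import java.lang.reflect.Method;

// Sketch of a guard against silent demarcation failures: check that a
// method-name pattern from the transaction configuration matches at least
// one public method on the target type. Supports exact names and the
// trailing-* prefix style used in transactionAttributes keys.
public class TxNameGuard {
    public static boolean matchesSomething(Class<?> target, String pattern) {
        boolean prefixMatch = pattern.endsWith("*");
        String name = prefixMatch
                ? pattern.substring(0, pattern.length() - 1)
                : pattern;
        for (Method m : target.getMethods()) {
            if (prefixMatch ? m.getName().startsWith(name)
                            : m.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }
}
```

Run one assertion per configured pattern and a rename or typo fails the build instead of silently dropping the transaction.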

With this in mind, I lean towards using the @Transactional annotation that Spring provides, which is more visible when you're doing work in the service class and less likely to get lost during simple refactorings.

Alternately, if you're using Spring and Hibernate, you might want to turn off HibernateTemplate's allowCreate flag, which lets it create a session when one isn't present and thus helps mask the problem.
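For example (bean names hypothetical), wiring the template with allowCreate off so it fails fast instead of quietly opening a non-transactional session:

```xml
<bean id="hibernateTemplate"
      class="org.springframework.orm.hibernate3.HibernateTemplate">
  <property name="sessionFactory" ref="sessionFactory"/>
  <!-- fail fast instead of silently creating a new session -->
  <property name="allowCreate" value="false"/>
</bean>
```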

I hope this saves some of you future pain.

Wednesday, May 21, 2008

Maven Build Profiling

I've often wished I had a tool that could provide a high-level profile of the build execution. I'm not talking about running the entire build on a profiler -- while that might be the ultimate goal, it's probably overkill, and harder to read than something simpler.

For instance, Maven 2 has a build lifecycle and plugins that tie into that. If Maven 2 could generate a report that indicated how long each phase of the build lifecycle took, possibly breaking out each pre- and post- phase, as well as each of the plugin executions, that'd be a terrific way to make a quick start on reducing the time of a particular build.

For instance, if your project has a fifteen-minute build but you're spending thirteen of those minutes in the integration-test phase, you have a pretty good idea right away what your options are for cutting down the build time (profiling the integration tests, reducing their number, running them only in CI, etc.).

I don't have enough Maven plugin development experience to say whether this is something that could be plugged into Maven easily -- it seems as if the Maven plugin approach is better suited to things that execute at particular places in the lifecycle than to things that interleave with the lifecycle.
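As far as I know, no such report exists today. But assuming you could get timestamped phase-start events out of the build somehow, the post-processing is simple. A rough sketch (the event format -- seconds-since-start followed by a phase name -- is invented):

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: given ordered phase-start events like "60 test" (seconds since the
// build began, then the phase name -- an assumed format, not real Maven
// output), compute how long the build spent in each phase.
public class PhaseProfiler {
    public static Map<String, Integer> profile(List<String> events,
                                               int totalSeconds) {
        Map<String, Integer> durations = new LinkedHashMap<>();
        for (int i = 0; i < events.size(); i++) {
            String[] parts = events.get(i).split(" ", 2);
            int start = Integer.parseInt(parts[0]);
            // A phase ends when the next one starts (or the build finishes).
            int end = (i + 1 < events.size())
                    ? Integer.parseInt(events.get(i + 1).split(" ", 2)[0])
                    : totalSeconds;
            durations.merge(parts[1], end - start, Integer::sum);
        }
        return durations;
    }
}
```

Feeding it the fifteen-minute build above would immediately show integration-test dominating the total.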

Friday, May 16, 2008

AMM: CMM for Architecture?

You know, I'm surprised that I haven't encountered an effort to create a CMM-like maturity model for architecture: an Architecture Maturity Model.

Although I'm not personally a big believer in CMM, nor do I put the same weight on big-A Architecture that some do, I guess it just seems like the sort of thing that the management boards of enterprises would like to hear. "Our new billing system is AMM Level 5!"

Is there something like that lurking in TOGAF, Zachman, ITIL etc. that I haven't encountered because I've studiously avoided learning these? :)

A Google search suggests the phrase isn't unheard of, but just because someone, somewhere has used it doesn't mean it's an actual movement with any weight behind it.

Thursday, May 15, 2008

Rated Syndication for RSS/Atom

I've often wished that content syndication via Atom and RSS came with ratings. Basically, there are feeds that contain good content, but have entirely too much of it. Consider the prolific output of some person or site like Robert Scoble or Engadget, and imagine that instead of having to choose between ignoring them entirely or drinking from the firehose, you could subscribe to the top X% of their syndicated content.

So when Engadget delivers the news about iPhone 2.0 and rates it 95%, you'll read it, but when they unbox a Samsung Blackjack 2.2 Silver and rate it 30%, you might not even have to skip past it with 'j' in Google Reader.

Heck, Google Reader could even offer to tweak your settings based on usage. "I've noticed you only read about 15% of Robert Scoble's postings. Would you like to subscribe to his top 20% syndicated content instead?" "I see you've read everything that Clay Shirky wrote this week. Would you like to expand your subscription?" "I see you're writing a memo ... "

The ratings could be delivered entirely by the content author/syndicator or through tools they supply for readers to modify the ratings, a la Digg / Reddit.
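As a sketch of what that might look like on the wire -- the rating namespace and element here are entirely hypothetical, since no such extension exists -- a score could ride along on each Atom entry:

```xml
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:rate="http://example.org/ns/rating">
  <title>iPhone 2.0 announced</title>
  <id>tag:example.org,2008:entry-1</id>
  <updated>2008-05-15T12:00:00Z</updated>
  <!-- hypothetical publisher-supplied rating, 0..1 -->
  <rate:score>0.95</rate:score>
</entry>
```

A reader could then filter any feed to entries whose score clears the subscriber's chosen threshold.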

So, if you're out there working on the next syndication format, please take some kind of content rating into account and save me from the never-ending river of content.

Tuesday, May 13, 2008

Viewing Logs as Trees using NDC/Viewer

One of the problems one tends to find when attempting to diagnose an issue using logs is the sheer volume of information to digest. I've been thinking about that a little over the last couple of days, and it seems to me that with a little work you could display logs in tree form, like a call stack. That would give you the ability to do many of the kinds of things you can only do with a profiler right now, such as:

  • Track down the Hibernate logging that relates to a particular HTTP request so that you can diagnose a tricky persistence problem.
  • Analyze the number of SQL calls made by a single HTTP request.
  • Get a rough outline of the time involved at each layer of a request.
In order to make that happen, it seems like you'd want:
  • Judicious application of the nested diagnostic context (NDC) in Log4J or Logback to record a sense of the layer. This is probably most easily done with an aspect applied at the various tiers (services, DAOs, etc.)
  • Some kind of request-tracking key, whether this is the thread, or some kind of context-free identifier stored in the mapped diagnostic context. Again, this could be applied via an aspect.
  • Better log-parsing/viewing tooling. Using the above pieces of information, you could display logs as a tree, allow for querying, etc.
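Log4J's NDC already provides the push/pop stack. As a minimal stdlib sketch of the idea (no Log4J dependency; the class and layer names are invented), indenting each message by its current nesting depth is enough to make a request's log read as a tree:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// NDC-style sketch: a per-thread stack of context labels, with each log
// line indented by its nesting depth so the output reads like a call tree.
public class TreeLog {
    private static final ThreadLocal<Deque<String>> CTX =
            ThreadLocal.withInitial(ArrayDeque::new);
    private static final List<String> LINES = new ArrayList<>();

    public static void push(String layer) { CTX.get().push(layer); }

    public static void pop() { CTX.get().pop(); }

    public static void log(String message) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < CTX.get().size(); i++) sb.append("  ");
        LINES.add(sb.append(message).toString());
    }

    public static List<String> lines() { return LINES; }
}
```

Wrapping the push/pop pair in an aspect at each tier, and keying on a per-request identifier rather than the thread, would get you the rest of the way.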
It's an interesting idea. Like most interesting ideas, it would take someone actually following through on it, and I'm not convinced that's going to be me. ;)

Monday, May 12, 2008

Project Wiki or Project Blog?

I've seen a lot of technology development projects adopt wikis as a way of sharing information. Unless you give someone the job of keeping the information both up-to-date and organized, these often get out of hand pretty quickly. Without a common organization to keep things in check, it gets difficult to find information, and without someone regularly reviewing the documentation for relevance, some of it goes stale quickly while the rest stays useful.

I'm starting to wonder if projects should seek a platform that emphasizes a blog style rather than a wiki style. The blog-entry metaphor is better suited to writing information about a point in time. When you search your project blog and find information from May 2006, you're much more likely to think to yourself, "Hey, I wonder if this is still relevant."

That's not to say that a wiki isn't a good idea, just that wiki content should be reserved for information that is both relevant long-term and worth keeping up-to-date -- and then someone should, in fact, regularly review it for relevance and make any necessary updates.

In keeping with my agile mindset, this kind of regularly updated documentation should be limited to information that is truly valuable, so that a team whose job is creating value through software doesn't end up spending a lot of its time reviewing and updating documentation that isn't necessary.

Wednesday, May 7, 2008

Scala Lift-Off

Interesting to see Scala and Lift gain a little momentum, with Scala Lift-Off, an unconference.