Tuesday, November 27, 2007

Performance: Groovy/Grails vs. Ruby/Rails?

So, in the comment thread that followed The Pain of Java after Ruby, I offhandedly commented that I'd been given the impression that Groovy's or Grails' performance was worse than that of Ruby or Rails. I was, inevitably, challenged on that point.

Let me back up a little and cover some of the ground from that comment: I haven't attempted to benchmark Groovy and/or Grails at all; Grails didn't have great support for REST services when I last looked at it, so I didn't include it in my performance explorations at the time. We were able to get 600 requests per second from Restlet, and something on the order of 40 requests per second on the same hardware using Ruby on Rails. I'm sure both of those results could be improved upon. (At the time, JRuby was scoring about 28 requests per second; I gather performance has improved since then.)

So, this raises the question: does Groovy and/or Grails outperform Ruby and/or Rails? You'd certainly like to think so, given the Java VM underneath, but what have people said?

Well, the Grails folks would probably argue that their solution performs better, although it's still clearly a tight race. There's a little grumbling from some people with more Rails expertise, but there doesn't seem to be any outright disagreement.

That said, if you compare either of those solutions back to a Java equivalent, the results are usually pretty striking. There are a few examples of that out there.

So for now, I'll say that it seems like Groovy may have the edge on performance compared to Ruby, although it's still very much a two-horse race, and both have many further opportunities to gain ground.

However, it's also clear that both of these solutions are, in the overall spectrum of development platforms, relatively slow. Depending on what you're building, that may not be a big deal, but it can certainly be a concern.

Monday, November 26, 2007

The Pain of Java after Ruby

Having done several months of Ruby in the past year, I occasionally find myself looking at a page of Java code and thinking to myself, "Wow, that's Java alright."

For instance, within the past few weeks, I needed a URI for testing. I made it a constant (private static final). Since the URI constructor can throw a checked exception, I had to assign the constant in a static initializer. Since the exception might be thrown, the compiler can't prove the variable gets initialized, and so on and so forth. Eventually, you end up with this:


// Requires java.net.URI, java.net.URISyntaxException, and JUnit's Assert.
private static final URI SOURCE;

static {
    URI uri = null;
    try {
        uri = new URI("/");
    } catch (URISyntaxException e) {
        Assert.fail("Couldn't create a URI for testing; please fix the test. This shouldn't happen.");
    }
    SOURCE = uri;
}


Although I understand the rationale behind each of the steps that gets you here, it's a bit of a boiled frog syndrome. By solving one problem with slightly more complex code, and repeating that pattern a few times, you eventually end up with pretty horrific code. Thanks, Java!

Now, to be fair, there are other ways around this code. For instance, I could easily create this as an instance variable in the test setup code, which in many ways would be a more normal approach, and it'd take all the above code away. That's probably the most obvious resolution for the problem at hand, but there are certainly others.
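As a minimal sketch of that alternative (assuming JUnit 4; the class and field names here are mine):

import java.net.URI;
import java.net.URISyntaxException;

import org.junit.Before;

public class SourceTest {

    private URI source;

    @Before
    public void setUp() throws URISyntaxException {
        // Declaring the exception is fine here: if the URI were somehow
        // invalid, JUnit would simply report the test as an error.
        source = new URI("/");
    }
}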

In any case -- because there are alternate solutions, this isn't really a problem; but having painted yourself into this particular corner, it's easy to think to yourself, "Wow, this code would be much nicer in Ruby."

Monday, November 19, 2007

Individual and Collective Responsibility and Standup Meetings

Although I believe that stand-up meetings can be a powerful tool, I think they often go astray: they turn into status meetings. Others have covered the patterns and anti-patterns of standup meetings before, but I'd like to delve into this one point in more detail.

When standup meetings turn into status meetings, I'm inclined to think the problem often traces back to a sense of where the responsibility lies: with the individual or with the team.

Individual Responsibility
Does each member of the team think that he or she is responsible for a task and must let the rest of the team (or a subset of the team) know where he or she is at? That's a status meeting, and it stems from a sense of individual responsibility.

With this mindset, each team member may feel that he or she is personally late in bringing a task to completion mid-iteration, that the tasks others are working on are interesting but basically unrelated to the task at hand, and that a standup meeting is simply a way of keeping in touch with the rest of the team and, more importantly, of making sure that some "important" members of the team know what's going on. The danger here is that each individual focuses on their own task to the exclusion of the team's goals.

Collective Responsibility
What you should be striving for is a sense of collective responsibility. The team, as a whole, should understand where they're trying to go, what's important and what isn't, whose tasks may be on the critical path, and what those tasks mean. Without this collective responsibility, a team will often not act like a team, but rather like a collection of individuals.

When a team member on an important task is stymied, a team with a sense of collective responsibility will react naturally to that, attempting to route around the damage. Rather than worrying about their own tasks, they'll react to the needs of the team, trying to accomplish the goals set for the team rather than worrying about a task that may or may not have been assigned to (or, better, selected by) them.

It's from this collective responsibility that "self-organizing" arises, and without it, an agile team is simply a set of very small teams managed by a central command structure.

Lazy Web
Have you experienced both mindsets? Have you managed to move a team from individual to collective responsibility? Does this make sense to some of you?

Thursday, November 15, 2007

Mis-Attributed

Once upon a time, when I was in university, I spent a fair amount of time on Usenet and in some writing groups (I put together the proposal for rec.arts.sf.written, and maintained the FAQ for a while). One of the taglines I used in a signature was cribbed from an essay written by, I believe, Ursula K. LeGuin. My memory fades, but it was a good essay:

"On the whole, audiences prefer that art be not a mirror held up to life, but a Disneyland of the soul, containing Romanceland, Spyland, Pornoland and all the other escape lands which are so much more agreeable than the complex truth."
I guess I failed to attribute it as well as I should have, because the Internet has decided, in all its vast wisdom, to attribute that quote to me.

I'd love to take credit. It's a nice quote. I wish I had written it. But I didn't. And I'm pretty sure, although I don't have the essay at hand, that Ursula K. LeGuin did. So if you run a quote site, aim the attribution at her. It may not be accurate, but it's closer than me. Better still, either track it down or remove it entirely.

Wednesday, November 14, 2007

Eclipse 3.4M3

I was a little distracted by QCon when Eclipse 3.4M3 hit the street. There are a few interesting bits in the New and Noteworthy:

  • Eclipse will sense when you've verified that an object is an instance of something more specific, and offer code completions for that more specific type with an automatic cast (see the sketch after this list). I don't need this a lot, but when I do, I find the process a little irritating, so it's nice to have quick support for it.
  • Personally, I think with the Save Actions in 3.3, projects should bite the bullet: define a project-specific formatter and set Eclipse to format on every save. That said, for people not willing to go that route, the new ability to format only edited lines offers a way to reduce the chaos on check-in (at the potential cost of having a file that is partially formatted).
  • The ability to detect unnecessary @SuppressWarnings continues to improve the job Eclipse can do to clean up after the mess we make for ourselves.
  • The improvements to Call Hierarchy are welcome; I've often wanted the ability to get a call hierarchy for a member field, not just a member method. If you haven't used Call Hierarchy, think about hitting Ctrl-Alt-H any time you might otherwise have reached for Ctrl-Shift-G. By offering a multi-level trace of what code accesses a method or field, it can often tell you more, faster, than a find-usages search.
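On the first of those items, the boilerplate it targets looks roughly like this (a trivial example of my own, not Eclipse's):

public class InstanceofAssistExample {

    static Object lookup() {
        return "some value";
    }

    public static void main(String[] args) {
        Object value = lookup();
        if (value instanceof String) {
            // Before 3.4, content assist on 'value' here offered only
            // Object's methods; the new assist proposes String's methods
            // and inserts the ((String) value) cast for you.
            int length = ((String) value).length();
            System.out.println(length);
        }
    }
}
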
This is in addition to some of the highlights from M2 and M1:
  • Differentiating between read and write access on Find Usages output.
  • Quick assists to create getters/setters and extract methods.
  • Replace all with preview diff looks really nice.
  • The extract class refactoring to group method parameters into an object: often, when a given set of parameters is passed around to more than one method, it's a sign that there's really an object at work that just hasn't been created yet (see the sketch below).
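On that last point, here's a minimal before-and-after sketch (the names are invented):

class Invoicing {
    // Before: the same trio of parameters travels together through
    // several methods.
    void printOwing(String name, String street, String city) { /* ... */ }
    void sendInvoice(String name, String street, String city) { /* ... */ }

    // After: the trio turns out to be an object that hadn't been
    // created yet.
    void printOwing(Address address) { /* ... */ }
    void sendInvoice(Address address) { /* ... */ }
}

class Address {
    final String name;
    final String street;
    final String city;

    Address(String name, String street, String city) {
        this.name = name;
        this.street = street;
        this.city = city;
    }
}
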
Eclipse 3.4 seems to be part of the Ganymede simultaneous release. There's a list of projects in that simultaneous release, but very few of them seem to have released much information about what's to come. Looks like Eclipse 3.4 will be followed by 4.0, according to the draft plan.

Friday, November 9, 2007

QCon: LinkedIn Architecture

Notes from the LinkedIn: Lessons Learned in Growth and Scalability session at QCon, with Jean-Luc Vaillant.

Their architecture includes:

  • Java (trying out some Ruby, adding some C++, as little as possible)
  • Oracle 10g and MySQL
  • Spring
  • ActiveMQ (tried OracleMQ, doesn't recommend it)
  • Tomcat & Jetty
  • Lucene

Graph computations don't perform very well in a relational database: with large numbers of members, and large numbers of connections, the combinatorics can be staggering. Add to this that simple approaches to storing this information would require extensive joining. The best way to get performance was to run the algorithms on the graph in RAM.

That raises the question of how to keep the in-RAM database in sync at all times. One option is to update the database and inform the other engines of changes through direct RPC, reliable multicast, or JMS. This has the typical problems of two-phase commit.

An alternate approach that LinkedIn has used is to log changes in a transaction log which can be pulled from each graph engine into RAM as necessary. The approach is currently Oracle-specific, but it is applicable to just about any database.
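A rough sketch of that pull model (the names and types here are mine, not LinkedIn's):

import java.util.List;

// Each graph engine polls the change log for entries past its own
// high-water mark and applies them to the in-memory graph; no
// two-phase commit is needed between the database and the engines.
class ChangeLogPoller implements Runnable {

    interface ChangeLog {
        List<Change> fetchAfter(long sequence);
    }

    interface Change {
        long sequence();
    }

    interface InMemoryGraph {
        void apply(Change change);
    }

    private final ChangeLog changeLog;
    private final InMemoryGraph graph;
    private long lastSeen = 0;

    ChangeLogPoller(ChangeLog changeLog, InMemoryGraph graph) {
        this.changeLog = changeLog;
        this.graph = graph;
    }

    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            for (Change change : changeLog.fetchAfter(lastSeen)) {
                graph.apply(change);
                lastSeen = change.sequence();
            }
            try {
                Thread.sleep(1000); // poll interval
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}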

Once that's in place, the in-memory techniques for traversing the graph are far less painful. Breadth-first traversal to get connections of various degrees. Using symmetry to find connections from both sides.
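As a toy sketch of that breadth-first traversal (my own illustration, nothing like LinkedIn's actual engine):

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ConnectionSearch {

    // Collects every member reachable within maxDegree hops of 'start'
    // by walking an in-memory adjacency map one degree at a time.
    static Set<Long> connectionsUpTo(Map<Long, List<Long>> graph,
                                     long start, int maxDegree) {
        Set<Long> seen = new HashSet<Long>();
        seen.add(start);
        List<Long> frontier = Collections.singletonList(start);
        for (int degree = 1; degree <= maxDegree; degree++) {
            List<Long> next = new ArrayList<Long>();
            for (long member : frontier) {
                List<Long> neighbors = graph.get(member);
                if (neighbors == null) {
                    continue;
                }
                for (long neighbor : neighbors) {
                    if (seen.add(neighbor)) {
                        next.add(neighbor);
                    }
                }
            }
            frontier = next;
        }
        seen.remove(start);
        return seen;
    }
}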

Having run into issues with read-write locks, he prefers copy-on-write.

QCon: eBay Architecture

Notes taken during Randy Shoup's session on eBay's architecture at QCon.

Partition Everything
Functional and horizontal segmentation. Partitioning data based on modulo of key, ranges, etc, depending on the data. Load balance pools of application servers, divided functionally. Search index separate from read-write listings.
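As a trivial sketch of the modulo style of partitioning (illustrative only; eBay's actual routing is surely richer):

class ShardRouter {

    private final int shardCount;

    ShardRouter(int shardCount) {
        this.shardCount = shardCount;
    }

    // Every key deterministically maps to one shard, so any application
    // server can compute where a row lives without a lookup table.
    // floorMod keeps the result non-negative even for negative keys.
    int shardFor(long key) {
        return (int) Math.floorMod(key, (long) shardCount);
    }
}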

Avoid database transactions - seek consistency through other approaches: BASE.

Absolutely no session state - transient state maintained/referenced by:

  • URL Rewriting for Small Data
  • Larger data in Cookies (up to 4k)
  • Largest state in a Scratch Database (e.g. Multi-page flows, like listing an item)

Asynchronous Everywhere
By pushing dependencies off into asynchronous calls, you decouple their availability and performance, and you can retry. The user's experience of latency improves, even though the end-to-end data/execution latency may actually grow: you can allocate more time to processing than a user would tolerate.

More interestingly, this allows you to spread the cost of load over time. Spikes are less important, because much of the processing can queue up, then catch up in off-peak cycles.

Message dispatch works through pub-sub: listing an item triggers an ITEM.NEW event, which can be consumed by the summary update, user metrics, image processing, etc. They have over 100 logical consumers consuming ~300 events. As described the other day, they use at-least-once, unordered delivery, rather than trying for exactly-once, ordered delivery.

Event consumers go back to the primary source of the data rather than relying on the data in the event.
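Sketched in code (with hypothetical names), those last two points look something like this:

class ItemEventConsumer {

    interface ItemStore {
        Item load(long itemId);
    }

    interface SearchIndex {
        void update(Item item);
    }

    static class Item {
        final long id;

        Item(long id) {
            this.id = id;
        }
    }

    private final ItemStore itemStore;
    private final SearchIndex searchIndex;

    ItemEventConsumer(ItemStore itemStore, SearchIndex searchIndex) {
        this.itemStore = itemStore;
        this.searchIndex = searchIndex;
    }

    // The ITEM.NEW event is treated as a notification only: the consumer
    // re-reads the item from the primary store, so a stale, duplicated,
    // or reordered event does no harm under at-least-once delivery.
    void onItemNew(long itemId) {
        Item item = itemStore.load(itemId);
        searchIndex.update(item);
    }
}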

eBay also uses periodic batch processing for infrequent, periodic or scheduled processing, and for problems that are difficult to partition (e.g. full table scan), such as: generating recommendations, importing third-party data, computing sales ranks, archiving and purging deleted items. Often drives further downstream processing through message dispatch events.

Automate Everything
Machines are cheaper than humans, and they scale better and more cheaply. They also adapt to a changing environment.

One approach here is adaptive configuration. Define SLAs for logical event consumers (e.g. 99% of events processed in 15 seconds), then allow each consumer to dynamically adjust to meet its SLA by tweaking event polling size, polling frequency, and number of threads, and to minimize cost by adjusting to a changing environment (e.g. when more instances are added to a consumer pool, each can ramp down its polling frequency).
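A toy sketch of that adaptive loop (the thresholds and knobs here are invented):

class AdaptiveConsumer {

    private long pollIntervalMs = 1000;

    // Called periodically with the fraction of recent events that met
    // the SLA: the consumer polls faster when it falls behind, and
    // backs off (saving cost) when it's comfortably within the SLA.
    void adjust(double fractionWithinSla) {
        if (fractionWithinSla < 0.99) {
            pollIntervalMs = Math.max(50, pollIntervalMs / 2);
        } else {
            pollIntervalMs = Math.min(5000, pollIntervalMs + 100);
        }
    }

    long pollIntervalMs() {
        return pollIntervalMs;
    }
}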

eBay also employs machine learning: collect user behavior, aggregate it and make decisions, then redeploy the metadata results of the learning. This can be used to choose the pages/modules/inventory that provide the best experience for a given user and context. The system needs to be perturbed to try alternatives, in order to avoid getting stuck on local maxima.

Remember that Everything Fails
In order to be as available as possible: assume everything can fail and that all resources will become unavailable. Detect failure and recover from it as rapidly as possible. Do as much as possible even when a failure has been detected.

eBay logs all activity (requests, exceptions, application-generated information), especially around databases and resources, onto a messaging bus (1.5TB of log messages per day!). Listeners automate failure detection and notification. Failure scenarios can be compared against data-warehoused history for root-cause detection (did we roll out new code? which database partitions are affected? etc.)

Make sure that all changes to the site can be rolled back: in every two-week period, eBay rolls out 100,000 lines of code. Many changes involve dependencies between pools, so rollout plans contain an explicit transitive set of dependencies. Automated tools execute a staged rollout with checkpoints and immediate rollback if necessary, including full rollback of dependent pools.

Audience question: "How do you test this?"
Randy: "We test it every two weeks, with our blood, sweat, effort."
Dan: "When the rollout plan is risky, there will be an explicit test of the rollout, not just the features."

Also, all features have on/off state driven by central configuration, so features can be turned off for operational or business reasons. This decouples code deployment from feature deployment. Applications can check for the availability of features in the same way that they check for the availability of resources.
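In code, such a check might look as simple as this (a hypothetical API, not eBay's actual one):

interface FeatureConfig {
    boolean isEnabled(String featureName);
}

class CheckoutPage {

    private final FeatureConfig config;

    CheckoutPage(FeatureConfig config) {
        this.config = config;
    }

    // Deployment put the code on the servers; the central flag decides
    // whether the feature is actually live, and can be flipped off for
    // operational or business reasons without a rollback.
    void render() {
        if (config.isEnabled("new-checkout-flow")) {
            renderNewCheckout();
        } else {
            renderLegacyCheckout();
        }
    }

    private void renderNewCheckout() { /* ... */ }

    private void renderLegacyCheckout() { /* ... */ }
}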

They often don't go back and remove the on/off capability, although this does eventually result in configuration and code bloat. They allocate "a pretty decent percentage" of time to head-room ("we're going to bump our head up against that wall unless we fix this" work, such as developer productivity and refactoring) in which some of that clean-up can be addressed.

In failure detection, it's much easier to detect 'failed' than 'slow'. Once a server or resource is considered failed, it is "marked down" by sending alerts and no longer sending requests to the resource. If the resource isn't critical, that functionality is suspended. If it is critical, the functionality is retried (against an alternate resource) or deferred (as a guaranteed asynchronous message). Explicit mark-up allows a resource to be restored and brought online in a controlled way: it can be marked up for one part of the infrastructure at a time, which is very important when the failure was load-induced. If the entire system were told the resource was back online, it might result in an immediate return to the failure state.
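A skeletal version of that mark-down/mark-up bookkeeping (invented names; the real machinery is far richer):

import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ResourceRegistry {

    private final Set<String> markedDown = ConcurrentHashMap.newKeySet();

    // Mark-down is automatic on detected failure; mark-up is explicit,
    // and can be done for one part of the infrastructure at a time so
    // a recovering resource isn't flooded straight back into failure.
    void markDown(String resource) {
        markedDown.add(resource);
        // ... send alerts here ...
    }

    void markUp(String resource) {
        markedDown.remove(resource);
    }

    boolean isAvailable(String resource) {
        return !markedDown.contains(resource);
    }
}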

Thursday, November 8, 2007

Qcon in Review: November 7th

Summarizing November 7th, for those not following my tumbl'd thoughts.

Kent Beck's opening keynote was well-presented, but I didn't take much away from it other than the reason behind his shaved head.

A panel discussion on Architecting for Performance and Scalability was good, but it's hard to cover that kind of topic in an hour, particularly with a panel. Of particular note was Ari Zilka arguing that scale-out at relatively small scale should focus on load balancing rather than partitioning.

After lunch, I attended Designing for Testability, with Cedric Beust and Alexandru Popescu. This was perhaps a mistake for me: having done a lot of pervasive testing and having used TestNG, I didn't really learn very much. This was the second time I'd felt that way, so I realized I was going to have to watch which topics I attend -- the ones I'm most interested in are often the ones I know the most about, and the ones I'm least likely to learn something about from an hour-long primer. TestNG contributed a great deal to the world of Java testing, but I feel JUnit has been revitalized by that competition, and the choice between the two is less clear than it was between TestNG and JUnit 3.8.1. Cedric also showed his usual bias against agile and TDD, which is fine; as far as I'm concerned, you use the process that works for you.

Next up was Eric Evans on Strategic Design. This was a good talk, although he's a measured speaker rather than one overflowing with energy. Still, he's a clear communicator, and there were some interesting points. I particularly liked the metaphor of maps as models. His anecdote about pudding, which relies on language context (American pudding: a gooey, custard-like dessert; British pudding: any dessert), was also entertaining. Some of the detailed elements seemed vague, but it was still enjoyable.

I followed that up with Cameron Purdy's Java Scalability and Reliability. I'd seen this presentation (or a heavily related one) before, but he's such an entertaining speaker, I didn't mind seeing it again. He's also not afraid to stir up a little humorous controversy: taking a shot at Bob Lee and calling FTP the precursor to SOAP.

Finally, a panel discussion on What will the future of Java be? was really great. There was a lot of hunger in the room to see Java evolve in some fashion, whether as part of its current platform or in the shape of a new platform on the JVM. The panel comprised a good range of viewpoints, and there were some pithy comments and good debates.

Joshua Bloch opined that Java is mature and shouldn't go around in a pink miniskirt with a pierced navel, but should continue to evolve in a very measured fashion, making room for a new platform on the JVM, one that solves some of the existing fundamental problems. This was an interesting perspective, but I wish I could see evidence of serious effort being put into that concept (sorry, Charles Nutter, I don't think JRuby's that solution, as interesting as it is). A fair amount of time was spent talking about static versus dynamic languages, backwards compatibility, and the size of the Java download. Erik Meijer concluded with an interesting statement about DSLs: a disaster waiting to happen. This was probably my favorite session so far; both the topic and the speakers were great. Well worth watching when InfoQ puts it up.

QCon November 6th: DSL Tutorial

For those not following my more stream-of-consciousness reporting on QCon via Tumblr, I thought I'd summarize on a daily basis here.

On November 6th, I attended the Domain Specific Languages tutorial put on by Martin Fowler and Kent Beck. They started the presentation by letting us know they were only prepared for a half-day tutorial and had only recently realized they were scheduled for a full day. They then proceeded to get through the first 'hour' of material by lunch, and took the afternoon for the rest, so all in all it seems like the content was well-suited to the timeframe.

They did a good job of describing the kinds of domain specific languages (they used the terms internal and external, rather than the 'embedded' term that's been used elsewhere), showed some good examples of DSLs in Java and Ruby as well as completely custom DSLs, the tiny languages that the Pragmatic Programmers talk about. They also spent some time describing the boundaries of DSLs: What's a DSL vs. an API; a DSL is usually a thin veneer over a framework; the hardest part in writing a DSL is writing the framework beneath it.
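To make the "thin veneer" point concrete, here's a toy internal DSL in Java (the names are invented), in the spirit of Ruby's 2.days.from_now:

import java.util.Calendar;
import java.util.Date;

class Dates {

    static DaysBuilder days(int n) {
        return new DaysBuilder(n);
    }

    static class DaysBuilder {
        private final int n;

        DaysBuilder(int n) {
            this.n = n;
        }

        // The fluent veneer; the "framework" underneath is just Calendar.
        Date fromNow() {
            Calendar calendar = Calendar.getInstance();
            calendar.add(Calendar.DAY_OF_MONTH, n);
            return calendar.getTime();
        }
    }
}

// Usage: Date deadline = Dates.days(2).fromNow();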

The last portion focused on those external DSLs, using Antlr as an example on how you might build and use a parser for a domain-specific language. This was the most interesting to me, because although I've read a little on the subject, including some of Martin Fowler's exploration of Antlr already, it's the area that I knew the least about (having used jMock's Java DSL for years, and having done a few months of Ruby in anger).

There were a few interesting moments:

  • Charlie Poole indicated that some domain-specific languages make it harder for non-English speakers to use the API, because they rely on the idioms of the natural language. For instance, is 2.days.from.tomorrow really easier for a German speaker than getting a Calendar instance, calling add(Calendar.DAY_OF_MONTH, 2), and calling getTime()? (I dunno, I'm not German.)
  • Neal Ford, describing the muddy boundary between an API and a DSL, said, "DSLs are like pornography: Hard to define, but I know it when I see it."
  • Martin Fowler and Neal Ford have slightly different takes on open classes, the ability to add methods and behavior to classes you don't control, easy to do in Ruby. Martin argued that it was reasonable within the context of an application you control, but riskier in a library that will be integrated with someone else's application (and other libraries). Neal Ford argued that experienced Rubyists find this conflict arises very rarely, and that the benefits outweigh the risks.
In the end, I don't know that I actually learned that much, but that's to be expected when you already know a fair bit about the area. I expect that when they release a book on the subject, it'll be a good one, with the kind of detail you can't stuff into a day-long presentation, and particularly valuable to those of you who are interested in, but have little experience with, DSLs.

I ended the day by having some Mexican food from Mijita Cocina Mexicana, possibly the food highlight of my trip (so far).

Monday, November 5, 2007

Tumbling QCon

Blogger isn't ideal for short snippets; I'd been considering tumblr for a while, and since they've recently released a new version, and I'm going to want to put out snippets at QCon, this seems like the ideal time to give it a whirl.

So, in the meantime, I'm thinking of using this blog for longer article (or articella) pieces, and tumblr for links, quotes, and other very simple posts.