Wednesday, August 12, 2009

Now If We Could Just Get Them To Stop Selling Entourage...



Microsoft barred from selling Word in U.S.

Saturday, August 08, 2009

Exciting news! "Mule in Action" Released


I haven't blogged in a loooong time. It's partly because I've been busy and also because I haven't felt very inspired by anything in software development lately. I'm not a natural blogger...I'm not even on Twitter (gasp!).

However, today I have something to get excited about! The mailman dropped off my shiny new copy of "Mule in Action". It turns out "Some Guy" named David Dossot wrote a book. You can expect my completely biased, totally non-objective review soon (sneak peek: I loved it!).

Congrats David! Dom, I still wish we'd managed to get that pantomime horse...

Friday, December 12, 2008

When Is a Unit Test Not a Unit Test?

One area of debate that I often get drawn into is the difference between unit tests and integration tests. Obviously there is value in drawing a distinction between the two, but where to draw the line is definitely a point of discussion amongst developers who care about this sort of thing (those who don't care should likely be in a different profession).

For example, is a test that fires up the Spring container really a unit test? Well, no, not really, but is there a point in making this distinction? Debatable.

The last couple of weeks I've been working on trying to claw some legacy code back from the abyss (servlets and EJB2) and bring it in line with something resembling modern application design (anything but servlets and EJB2). I was shocked and delighted to find that this code actually had some unit tests, even if some of them weren't testing what they purported to be testing. However, twice now I've been bitten by the same problem, which has provided at least one clear way to distinguish unit tests from integration tests. It's very easy to verify. Simply disconnect from the network and run your test suite:

IF YOUR TESTS FAIL WITHOUT A NETWORK CONNECTION, THEY ARE NOT UNIT TESTS!

I've been refactoring this code on the bus (sure, coding on the bus is dorky, but then again so is riding the bus, so what can I say?) and have been unable to run the test suites because the tests are trying to connect to the network. In the worst cases, a simple network connection wouldn't do and I needed a VPN connection just to test the code.

Basically this comes down to designing your classes for extensibility. You should be able to mock or stub any integration point in your code to allow you to test independently of outside systems.
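Here is a minimal sketch of what that can look like in Java. All of the names (`RateService`, `PriceConverter`, `StubRateService`) are made up for illustration and have nothing to do with the legacy code in question. The idea is simply that the class under test talks to an interface, so a test can hand it a canned implementation that never opens a socket.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical integration point: the real implementation would cross the network.
interface RateService {
    double rateFor(String currency);
}

// The class under test depends on the interface, not on a concrete remote client.
class PriceConverter {
    private final RateService rates;

    PriceConverter(RateService rates) {
        this.rates = rates;
    }

    double convert(double amount, String currency) {
        if (amount < 0) {
            throw new IllegalArgumentException("amount must not be negative");
        }
        return amount * rates.rateFor(currency);
    }
}

// Stub used only by tests: canned answers, no sockets, no VPN.
class StubRateService implements RateService {
    private final Map<String, Double> canned = new HashMap<>();

    // Registers a canned rate and returns this stub so calls can be chained.
    StubRateService with(String currency, double rate) {
        canned.put(currency, rate);
        return this;
    }

    @Override
    public double rateFor(String currency) {
        Double rate = canned.get(currency);
        if (rate == null) {
            throw new IllegalStateException("no canned rate for " + currency);
        }
        return rate;
    }
}
```

A test then builds `new PriceConverter(new StubRateService().with("EUR", 2.0))` and checks the result, and it passes just as well on the bus as it does in the office.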

It's the least you can do for some poor sucker like me who volunteers to refactor your code!

Thursday, November 13, 2008

Book Review: Release It!: Design and Deploy Production-Ready Software - Michael Nygard



Michael Nygard's "Release It!: Design and Deploy Production-Ready Software" is the sometimes harrowing tale of software in the wild. It draws on the author's experience with real production systems and the pain of dealing with them once they are live. Although the names are changed to protect the innocent (or guilty), you can tell that Mr. Nygard has worked on some pretty large-scale deployments. The book details the lessons he has learned along the way and turns them into ideas and strategies the reader can apply to production systems. By taking these lessons to heart, one hopes to avoid some of the nightmare scenarios that are detailed in the book.

This book had me hooked from the opening story regarding some poorly written JDBC code that took down an entire airline. I have personally corrected this code several times in my professional life and every time I've done it, I've said to myself "I'm kind of being picky...this is never going to happen". To see that it does happen and that when it does it can be catastrophic was a real eye-opener.

My one complaint with the book is the lack of code samples. One could argue that code samples are not useful in a book like this, as every production system is different and may require custom solutions to its particular production issues. Personally, I find that I learn best through examples. I also find that code samples get my creative "juices" flowing. For example, I would have been interested in seeing some fully implemented "circuit breaker" code (if only to see the author's chosen implementation...I have my own ideas about how I'd implement such a feature...). This is really a minor complaint as there are many other books out there filled with reams of code.

I would recommend this book to anyone interested in building scalable, fault-tolerant systems (who isn't?). My only caution is that after reading it you'll start to realize all the faults, weaknesses and ticking time-bombs in your own applications.

Sweet (bang! bump!) dreams!

Thursday, October 02, 2008

Stored Procedures vs ORM: Cutting through the FUD

The Problem

I was working on a project that used a stored procedure to perform database inserts. This project was in the process of migrating from EJB2 Entity Beans to a JPA implementation using Hibernate. This stored procedure was the only one in the whole application and I could not see a need for it. There were several reasons why this stored procedure was not desirable:
  1. It was database specific. This meant that each developer needed access to an enterprise database in order to run any integration testing. Developers were not able to set up their own databases for integration testing (e.g. HSQLDB or even a local MySQL).
  2. It contained business validation logic. This meant that the full application logic could not be tested without being connected to the enterprise database. No working on the bus on the way into work, etc. Say hello to VPN connections and every developer needing a copy of the database schema...not a very scalable approach to development.
  3. Dealing with inserts into this particular table with a stored procedure, while using Hibernate for everything else, muddied up the overall design of the code, making it an overly complicated mixed bag instead of a concise abstraction.
When I proposed getting rid of the stored proc to simplify the design I was told: "Stored procedures are faster than ORM. Use the stored procedure." When I persisted and asked for further information to support this claim I was told "they just are".

I find this kind of attitude more common than I would hope in the Computer Science discipline. After all, the word "science" is right there in the title. Anyway, with the help of one of my colleagues I put together a test suite to exercise the code using both JPA/Hibernate and stored procedures. The results are summarized below:



Conclusions

Based on this information we can draw a few conclusions:
  1. The fastest single time was achieved with stored procedures
  2. The slowest single time was recorded with JPA/Hibernate
  3. Overall performance was very comparable with a maximum variance of ~150ms
In fact, when the results were averaged, the results were:
  • Stored Procedures: 99.98ms
  • ORM: 124ms
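For anyone who wants to run a similar comparison, figures like fastest, slowest and average can come out of a very small harness. The sketch below is an assumption about the general shape, not the suite used for the numbers above. `Timing`, `summarize` and `measure` are hypothetical names, and the `Runnable` stands in for whichever insert path (JPA or stored procedure) is being timed.

```java
import java.util.Arrays;

// Hypothetical harness: times a task over repeated runs and summarizes the samples.
class Timing {
    final long fastestMs;
    final long slowestMs;
    final double meanMs;

    private Timing(long fastestMs, long slowestMs, double meanMs) {
        this.fastestMs = fastestMs;
        this.slowestMs = slowestMs;
        this.meanMs = meanMs;
    }

    // Summarizes already-recorded samples, one elapsed time in milliseconds per run.
    static Timing summarize(long[] samplesMs) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long fastest = Arrays.stream(samplesMs).min().getAsLong();
        long slowest = Arrays.stream(samplesMs).max().getAsLong();
        double mean = Arrays.stream(samplesMs).average().getAsDouble();
        return new Timing(fastest, slowest, mean);
    }

    // Runs the task 'runs' times and records the wall-clock time of each run.
    static Timing measure(Runnable task, int runs) {
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = (System.nanoTime() - start) / 1_000_000; // nanos to millis
        }
        return summarize(samples);
    }
}
```

Calling `Timing.measure(task, 100)` once per approach gives directly comparable numbers. A few untimed warm-up runs beforehand are worth adding, since the first calls on a JVM pay for class loading and JIT compilation.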
There are a couple of techniques that could be implemented that would potentially boost the performance of the JPA implementation. One is PreparedStatement caching, the other is the use of a second-level cache such as Ehcache. I will likely re-run the test with these enhancements (in my spare time....ha!) and post my findings.

So the question becomes: is a less than 25ms differential worth the added complexity and maintenance costs of coupling your development to a particular enterprise database platform? Largely it depends on the throughput demands of your application. If this were a trading application with ridiculously high volume and real-time requirements then every single millisecond might be worth the cost. For most of us, though, the value of simplified design and portable development outweighs the potential costs many times over.

One added gotcha...

One thing we discovered while writing the test suite was that our model is not thread-safe. The model contains a running balance and a per-user counter that needs to be incremented for each insert. To further complicate things, this application is clustered, so even if we did make the Java objects thread-safe, there is nothing to stop different nodes in the cluster from corrupting each other's data. While this is very unlikely to occur in production, as a responsible developer it's my job to consider and avoid these situations. It turns out that the stored procedure locks the row in the database, essentially providing synchronization at the database level (and therefore across the cluster). While this may seem like a benefit, it actually limits throughput as it is pessimistic locking. JPA/Hibernate would be able to perform optimistic locking with the addition of a version column to the table. This, however, would require changes to another monolithic stored procedure that I recently discovered was also making inserts to the same table. I guess I'm stuck with the stored proc for the foreseeable future.
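To make the optimistic approach concrete, here is a plain-Java simulation of what a version column buys you. It is only a sketch under assumptions: `AccountRow`, `AccountTable` and `BalanceUpdater` are hypothetical names, and an `AtomicReference` stands in for the database row. In JPA itself, a field annotated with `@Version` asks the provider to perform the equivalent check and to throw an `OptimisticLockException` when it fails.

```java
import java.util.concurrent.atomic.AtomicReference;

// Immutable snapshot of one row: the running balance plus its version number.
final class AccountRow {
    final long balance;
    final long version;

    AccountRow(long balance, long version) {
        this.balance = balance;
        this.version = version;
    }
}

// In-memory stand-in for the table. update() mimics
// UPDATE ... SET balance = ?, version = version + 1 WHERE id = ? AND version = ?
class AccountTable {
    private final AtomicReference<AccountRow> row =
            new AtomicReference<>(new AccountRow(0, 0));

    AccountRow read() {
        return row.get();
    }

    // Returns false when another writer has bumped the version since 'seen' was read.
    boolean update(AccountRow seen, long newBalance) {
        AccountRow current = row.get();
        if (current.version != seen.version) {
            return false; // stale read: the write is rejected, nothing is locked
        }
        return row.compareAndSet(current, new AccountRow(newBalance, seen.version + 1));
    }
}

class BalanceUpdater {
    // Optimistic retry loop: read, compute, try to write, start over on conflict.
    static long deposit(AccountTable table, long amount) {
        while (true) {
            AccountRow seen = table.read();
            long newBalance = seen.balance + amount;
            if (table.update(seen, newBalance)) {
                return newBalance;
            }
        }
    }
}
```

Nobody waits on a lock here. A writer that loses the race simply re-reads and tries again, which is why this tends to scale better than row locking when conflicts are rare.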

And the award for the longest most rambling blog post goes to...

Tuesday, September 16, 2008

Book Review: Implementation Patterns - Kent Beck




In Implementation Patterns, noted software guru Kent Beck lays out some of the tricks of the trade that he has picked up along the way to become a master coder. Fans of TDD (and who isn't?) or just good coding practices in general will recognize Mr. Beck as one of the authors of the JUnit testing framework as well as a contributor to Martin Fowler's seminal book Refactoring.

The book itself is fairly lightweight. There are some "patterns" that will likely seem obvious or even trivial. I feel like the term "pattern" is being overused. The book is more of a collection of best practices for software development than a true pattern book in my opinion. For the most part, I found this book interesting from the perspective of hearing an expert such as Kent Beck explain the sorts of decisions he makes while writing code. I found his distinction between writing frameworks and writing applications very interesting and found myself wishing he had spent more time on this subject and left others (the Collections API, for example) for a more rudimentary book on Java.

Nonetheless, it's always good to hear the opinions of someone who has been through it all. There are very few of us out there who can say we've written as successful a framework as JUnit (although we've likely all written our own web framework at this point, haven't we?).

Wednesday, August 13, 2008

Green Bars! Should I check in?

So I've been working on what started out as a small refactor and has ended up as a four-day journey down the rabbit hole. Tonight, I've finally achieved what should be a precursor to any check-in: green bars (or passing unit/integration tests if I'm running from the command line). However, as soon as I'd achieved this milestone I realized that I don't really want to check this code in...or at least not into the trunk. Even though the code works according to the tests, I'm still not confident in it. This is due to a few key reasons:
  1. Do the tests really assert correctness? I know that they're exercising the code and that they're passing, but what are they checking? I find this to be a bit of a problem, especially with "glass box" testing (usually involving mocks). Since I wrote the code and the tests, I feel like the tests may be biased.
  2. Even though I've been through a refactoring effort and fixed some shortcomings of the original API, I still feel like the code isn't "flowing". Code has a feel. When a model/pattern is right, you can feel it (until the next day when you decide it's the worst thing you've ever written...). This code doesn't feel right yet (aka my spider-sense is tingling).
  3. Nobody else has had a look at the code yet. I often find it extremely useful to collaborate with colleagues. I enjoy pair programming, but even when that's not possible, I like to at least get an opinion from other developers. I find it very useful to see the code through other eyes.
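The first worry is easy to show in miniature. In the sketch below (a hypothetical `Discount` class, written without a test framework so it stands alone), both checks produce a green bar today, but only one of them would ever go red if the formula were wrong.

```java
// Hypothetical code under test.
class Discount {
    static int apply(int priceCents, int percent) {
        return priceCents - (priceCents * percent) / 100;
    }
}

class DiscountChecks {
    // Exercises the code but ignores the result, so it can only fail
    // if apply() throws. Any return value at all keeps the bar green.
    static void exercisesOnly() {
        Discount.apply(1000, 10);
    }

    // Pins the expected value, so a broken formula turns the bar red.
    static void assertsResult() {
        int actual = Discount.apply(1000, 10);
        if (actual != 900) {
            throw new AssertionError("expected 900 but got " + actual);
        }
    }
}
```

Mock-heavy tests drift toward the first shape more easily than you'd think, because verifying that a collaborator was called feels like an assertion even when nothing checks the value that came out.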
Whew! There are likely other reasons I'm hesitant to check in. On the other hand, I'm feeling fairly uh....pent up...from not checking in for the last few days. I feel like I've reached a milestone with this code, and any further refinements should use it as a jumping-off point. That's when I remembered the whole point of an SCM in the first place (or at least one of them). Developers often avoid creating branches. I've heard all sorts of reasons, the most popular being "merges are hard". Basically it all boils down to laziness. We have these tools for a reason and we should use them.

So basically I've rambled on and on just to say that I created my own development branch for this refactor and I'll merge it into the trunk when it feels right.