A Modest Proposal on Validation in the Middle Tier
Peter looks at a strategic issue: When to do validation? The answer isn't "Everywhere" but it could have been.
One of the standard questions that I get when designing n-tier applications is "Where should I validate my data?" The short answer is: At the database. The only way to ensure data integrity is to check as much as you can at the database because you can't guarantee that everyone writing code that updates your data will get the validation right.
The longer answer is: wherever possible. Allowing bad data to move from the client through your business objects and into your database is time consuming: It doesn't make sense to make your users wait through a round trip to the database to discover that they've entered a date in the wrong format.
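To make the date example concrete, here is a minimal sketch of a check that could run in the browser before anything is sent to the server. The function name and the choice of YYYY-MM-DD as the expected format are assumptions made for illustration, not anything the application above prescribes.

```typescript
// Hypothetical client-side check: accepts only YYYY-MM-DD strings
// that name a real calendar day. The format is an assumption.
function isValidIsoDate(input: string): boolean {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(input);
  if (match === null) {
    return false; // wrong shape, e.g. "02/29/2024"
  }
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  // Date.UTC silently rolls impossible dates forward (Feb 30 becomes Mar 1),
  // so build the date and confirm it still has the parts we supplied.
  const d = new Date(Date.UTC(year, month - 1, day));
  return (
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day
  );
}
```

A form handler could call this on the field's value and show a message immediately, so the user never waits on a round trip just to learn the format was wrong.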
More importantly, doing all validation in the database server is expensive. Your servers are a shared resource: the more people using your application, the greater the demand on your servers, limiting the maximum number of users you can support. Checking data in the Web browser means that you're using the users' CPU cycles. Effectively, every user who shows up not only adds to your application's burden but also adds to the application's computing power.
Actually, as a consultant who's paid by the hour, I don't feel too bad about that.
But, in addition to being costly, embedding validation code in several places is an accident waiting to happen. The more times you implement the same validation logic, the more likely it is that you'll get it wrong someplace. (We will now skip over the time I inadvertently paid everyone in my organization twice because I got a "greater than/less than" test backwards in a business object.)
The primary danger is that validation code early in the process (e.g. client-side code) errs by being stricter than validation code later in the process (e.g. the server-side code). If the client is incorrectly rejecting entries that the server is perfectly willing to accept, then the system is failing. The reverse (client-side code that's less strict than server-side code) is not a problem: Yes, bad data will slip through the browser and make its way to the server, but it will be rejected at the server. Granted, your server is working too hard (there's that shared resource problem again) but your data retains its integrity.
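One way to guard against the "client stricter than server" failure is to run both rules over the same sample values and list anything the server would take but the client turns away. The sketch below assumes a hypothetical order-quantity field with a server rule of whole numbers from 1 to 100; the rule names and the sample values are invented for the example.

```typescript
type Rule = (value: number) => boolean;

// Hypothetical server-side rule: whole quantities from 1 to 100.
const serverAccepts: Rule = (q) => Number.isInteger(q) && q >= 1 && q <= 100;

// Less strict than the server: lets bad data through, but the server still catches it.
const looseClient: Rule = (q) => q >= 1;

// Stricter than the server because of an off-by-one error: rejects a legal 100.
const strictClient: Rule = (q) => Number.isInteger(q) && q >= 1 && q < 100;

// Returns the samples that the server would accept but the client rejects.
function wronglyRejected(client: Rule, server: Rule, samples: number[]): number[] {
  return samples.filter((v) => server(v) && !client(v));
}

const samples = [0, 1, 50, 99, 100, 101, 2.5];
```

With these samples, `wronglyRejected(looseClient, serverAccepts, samples)` comes back empty, while the same call with `strictClient` returns `[100]`, which is exactly the kind of entry the user should have been allowed to make. A check like this only covers the values you sample, so boundary values are the ones worth including.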
A Modest Proposal
Interestingly enough, then, it appears that the optimal strategy is to validate everything you can at the client (to reduce demand at the shared resource) and then check it again at the database (to protect yourself from incompetent -- or malicious -- front-end developers). If your business object encounters a problem that could have been checked by the client, it should blow up; if there's an issue that can be checked at the database, the middle tier shouldn't attempt to detect it. Only those checks that can't be easily/transparently implemented at the client or the database should be implemented in middle-tier business objects.
And, by "validating everything you can at the client," I mean exactly that. Calling a Web service to validate data is not validating at the client, because you're transferring the validation back to the server.
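The division of labor proposed above might look something like the following in a business object. Everything here is hypothetical: the `Order` shape, the `submitOrder` function, and the credit-limit rule, which is assumed to be a check that needs data the browser doesn't hold and that isn't easily expressed as a database constraint.

```typescript
// All names in this sketch are hypothetical.
interface Order {
  customerId: string;
  quantity: number;
}

// Assumed middle-tier-only rule: needs pricing and credit data the client lacks.
function exceedsCreditLimit(order: Order, unitPrice: number, creditRemaining: number): boolean {
  return order.quantity * unitPrice > creditRemaining;
}

function submitOrder(order: Order, unitPrice: number, creditRemaining: number): string {
  // Client-checkable problem: no friendly message, no repair. Blow up,
  // because reaching this point means the client failed to do its job.
  if (!Number.isInteger(order.quantity) || order.quantity < 1) {
    throw new Error("quantity should have been validated by the client");
  }

  // Database-checkable problem (for example, whether customerId exists):
  // deliberately not tested here; the database is left to reject it.

  // The one check that belongs in the middle tier in this sketch.
  if (exceedsCreditLimit(order, unitPrice, creditRemaining)) {
    return "rejected: credit limit";
  }
  return "accepted";
}
```

The point of the sketch is what's missing: the business object contains one rule of its own and a single tripwire, rather than a second copy of the client's and the database's logic.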
Obviously removing all validation from your business objects and limiting validation at the client has its issues. Response times increase as users wait through a round trip to the database to find that they've got a problem; your database engine is working harder because requests that would otherwise have been denied by middle-tier validation are reaching the database.
However, if you believe that your database is your last defense against bad data, you must do as much validation there as possible. If you believe that scalability is important, then you want to do as much validation as possible at the client. Duplicating validation in the middle tier that is performed in the other tiers creates an opportunity for error.
Logically, then, validation should be performed in the middle tier only if it can't be done in the other two tiers, or if its absence can be shown to be creating a bottleneck.
About the Author
Peter Vogel is a system architect and principal in PH&V Information Services. PH&V provides full-stack consulting from UX design through object modeling to database design. Peter tweets about his VSM columns with the hashtag #vogelarticles. His blog posts on user experience design can be found at http://blog.learningtree.com/tag/ui/.