Practical ASP.NET

A Modest Proposal on Validation in the Middle Tier

Peter looks at a strategic issue: When to do validation? The answer isn't "Everywhere" but it could have been.

One of the standard questions that I get when designing n-tier applications is "Where should I validate my data?" The short answer is: At the database. The only way to ensure data integrity is to check as much as you can at the database because you can't guarantee that everyone writing code that updates your data will get the validation right.

The longer answer is: wherever possible. Allowing bad data to move from the client through your business objects and into your database is time consuming: It doesn't make sense to make your users wait through a round trip to the database to discover that they've entered a date in the wrong format.
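
As a rough illustration of the kind of check that belongs in the browser, here is a minimal TypeScript sketch of a date-format test. The function name isValidIsoDate and the YYYY-MM-DD format are assumptions made for this example; nothing in this column prescribes either.

```typescript
// Hypothetical client-side check: accept only dates written as YYYY-MM-DD
// that also name a real calendar day.
function isValidIsoDate(input: string): boolean {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(input);
  if (match === null) {
    return false; // wrong shape, e.g. "02/12/2010"
  }
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  // Date.UTC rolls impossible days forward (Feb 30 lands in March),
  // so comparing the parts after a round trip catches them.
  const d = new Date(Date.UTC(year, month - 1, day));
  return (
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day
  );
}
```

A check like this runs entirely on the user's machine, so a mistyped date never costs a trip to the server.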

More importantly, doing all validation in the database server is expensive. Your servers are a shared resource: the more people using your application, the greater the demand on your servers, limiting the maximum number of users you can support. Checking data in the Web browser means that you're using the users' CPU cycles. Effectively, every user who shows up not only adds to your application's burden but also adds to the application's computing power.

If this suggests that you should validate data in the Web page (using JavaScript), in the code-behind file before handing the data over to the business objects, in the business objects themselves, and in the database... well, it is. Sadly, it's also a great way to run up the labor (and, as a result, the cost) of building your application.

Actually, as a consultant who's paid by the hour, I don't feel too bad about that.

But, in addition to being costly, embedding validation code in several places is an accident waiting to happen. The more times you implement the same validation logic, the more likely it is that you'll get it wrong someplace. (We will now skip over the time I inadvertently paid everyone in my organization twice because I got a "greater than/less than" test backwards in a business object.)

The primary danger is that validation code early in the process (e.g. client-side code) errs in being stricter than validation code later in the process (e.g. the server-side code). If the client is incorrectly rejecting entries that the server is perfectly willing to accept then the system is failing. The reverse (client-side code that's less strict than server-side code) is not a problem: Yes, bad data will slip through the browser and make its way to the server but it will be rejected at the server. Granted, your server is working too hard (there's that shared resource problem again) but your data retains its integrity.
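
One way to catch a client that has drifted stricter than the server is to run the same sample inputs through both sets of rules and flag anything the client rejects but the server would accept. The sketch below assumes a hypothetical quantity field and two made-up rules; every name in it is illustrative and none comes from a real library.

```typescript
type FieldRule = (value: string) => boolean;

// Hypothetical rules for a "quantity" field. The client rule has drifted:
// it allows at most three digits while the server allows four.
const clientAcceptsQuantity: FieldRule = (v) => /^\d{1,3}$/.test(v);
const serverAcceptsQuantity: FieldRule = (v) => /^\d{1,4}$/.test(v);

// Returns the samples the client turns away even though the server
// would take them: the failure mode described above.
function findOverStrictRejections(
  client: FieldRule,
  server: FieldRule,
  samples: string[]
): string[] {
  return samples.filter((s) => !client(s) && server(s));
}

const overStrict = findOverStrictRejections(
  clientAcceptsQuantity,
  serverAcceptsQuantity,
  ["7", "999", "1000", "abc"]
);
// overStrict is ["1000"]: legal on the server, blocked in the browser.
```

An empty result proves nothing beyond the samples tried, but a non-empty one points directly at a rule that is turning away good data.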

A Modest Proposal

Interestingly enough, then, it appears that the optimal strategy is to validate everything you can at the client (to reduce demand at the shared resource) and then check it again at the database (to protect yourself from incompetent -- or malicious -- front-end developers). If your business object encounters a problem that could have been checked by the client, it should blow up; if there's an issue that can be checked at the database, the middle tier shouldn't attempt to detect it. Only those checks that can't be easily/transparently implemented at the client or the database should be implemented in middle-tier business objects.
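
To make the division of labor concrete, here is a minimal sketch of a middle-tier method under this proposal. It is only an illustration: the Order shape, the submitOrder function and the credit-limit rule are all hypothetical, chosen to stand in for "a check that can't easily be done at the client or the database."

```typescript
interface Order {
  quantity: number;        // assumed rule enforced in the browser: whole number, 1 or more
  unitPrice: number;
  creditRemaining: number; // assumed to be looked up in the middle tier, unknown to the browser
}

type OrderResult =
  | { accepted: true; total: number }
  | { accepted: false; reason: string };

function submitOrder(order: Order): OrderResult {
  // A problem the client could have checked: no friendly message and no
  // repair. Blow up, because the client is either broken or was bypassed.
  if (Math.floor(order.quantity) !== order.quantity || order.quantity < 1) {
    throw new Error("Contract violation: quantity was not validated by the client");
  }

  // The one rule that lives here, because neither end can apply it easily.
  const total = order.quantity * order.unitPrice;
  if (total > order.creditRemaining) {
    return { accepted: false, reason: "Order total exceeds remaining credit" };
  }

  // Anything the database enforces (keys, required columns, ranges)
  // is deliberately not rechecked here.
  return { accepted: true, total: total };
}
```

The exception and the returned rejection are kept distinct on purpose: the first signals a defect somewhere upstream, the second is an ordinary business outcome the user can act on.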

And, by "validating everything you can at the client," I mean exactly that. Calling a Web service to validate data is not validating at the client, because you're transferring the validation back to the server.

Obviously removing all validation from your business objects and limiting validation at the client has its issues. Response times increase as users wait through a round trip to the database to find that they've got a problem; your database engine is working harder because requests that would otherwise have been denied by middle-tier validation are reaching the database.

However, if you believe that your database is your last defense against bad data, you must do as much validation there as possible. If you believe that scalability is important, then you want to do as much validation as possible at the client. Duplicating validation in the middle tier that is performed in the other tiers creates an opportunity for error.

Logically then, validation should be performed in the middle tier only if it can't be done in the other two tiers, or if its absence can be shown to be creating a bottleneck.



About the Author

Peter Vogel is a principal in PH&V Information Services, specializing in ASP.NET development with expertise in SOA, XML, database, and user interface design. His most recent book ("rtfm*") is on writing effective user manuals, and his blog on technical writing can be found at rtfmphvis.blogspot.com.

Reader Comments:

Fri, Feb 12, 2010 Peter Vogel Canada

Fregate: I sympathize with your plight--especially with having to cleanse your data before using it. One of my current clients is having to deal with data coming from a physical device that (every once in a while) will suddenly spew garbage. Your situation is the one that I'd like to find a way to avoid: implementing the same validation at !!every!! level of the application.

Thu, Feb 11, 2010 fregate

In one of my recent projects for a major Health Insurance provider my team implemented the following kinds of validation: client-side validation; service-side validation; and (a HORROR story) validation of data RETRIEVED from the database, because we had to deal with BAD legacy data before convincing the client to perform a major data cleansing exercise. TRUST NO ONE!

Sat, Feb 6, 2010 Peter Vogel Canada

I don't argue that additional validation is needed beyond the client. I'm suggesting that the backup validation should be in the data layer. I'm suggesting that only what validation is absolutely necessary to protect the database AND can't be done in the data layer itself should be done in the middle tier.

Fri, Feb 5, 2010

The problem with client-side validation is how easily it can be bypassed. A malicious user (or just an overly curious one) could easily disable any and all validation done by the browser and send just about anything to your middle tier. Even if all they cause is an unhandled exception to get logged, they could flood your system with writes to this log file. The middle tier HAS to make sure the data is clean. The only purpose of client-side validation is to provide more immediate user feedback; it cannot be trusted for anything else.
