Remember when you were learning the ways of the software industry? I remember one of my more enlightened university lecturers (there were few) introducing me to the concept of GIGO (Garbage In, Garbage Out). The concept basically says that if you load rubbish, faulty or erroneous data into your application, you can expect your application to return garbage in kind. Kind of like data karma.
GIGO falls into the category of 'no-brainer'. But it's probably one of the most ignored fundamental concepts we have. I have seen very few systems where developers make a point of preventing crap from getting in. GIGO is fundamental to a number of basic architectural areas. For example, it is the basis of many security concerns (don't allow invalid data in, or it might be harmful), and it is the basis of data quality. Moreover, I believe that GIGO is fundamental to basic system stability and the robustness of applications, and has a direct correlation to defect counts.
On a technical level we need to ask ourselves a few questions. If you allow potentially harmful data into your application, and you could have prevented it by doing nothing more than basic checks, is that a bad thing or is it an acceptable risk? I think it is a ridiculous risk. So what counts as bad data from an application architecture perspective? Think about the following:
- Allowing null values to permeate into parameters on public methods.
- Allowing strings of invalid size, negative numbers (when only positive values are allowed) or other such data in.
- Allowing data to enter through poor, ambiguous use of typing (i.e. exposing object parameters when a generic or less abstract type parameter would provide more protection).
- Using null values as flags, decision points etc.
- A host of other scenarios.
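To make the second and third points concrete, here is a minimal sketch of rejecting that kind of data at a public boundary. The `Customer` type, the `OrderService.CreateOrder` method and the 50-character limit are all hypothetical, made up purely for illustration.

```csharp
using System;

// Hypothetical types, for illustration only.
public sealed class Customer
{
    public string Name { get; private set; }

    public Customer(string name)
    {
        if (name == null) throw new ArgumentNullException("name");
        // Assumed rule: a name is between 1 and 50 characters long.
        if (name.Length == 0 || name.Length > 50)
            throw new ArgumentException("Name must be 1 to 50 characters long.", "name");
        Name = name;
    }
}

public static class OrderService
{
    // Takes a Customer rather than object, so the compiler rejects anything else.
    public static decimal CreateOrder(Customer customer, int quantity, decimal unitPrice)
    {
        if (customer == null) throw new ArgumentNullException("customer");
        if (quantity <= 0)
            throw new ArgumentOutOfRangeException("quantity", quantity, "Quantity must be positive.");
        if (unitPrice < 0)
            throw new ArgumentOutOfRangeException("unitPrice", unitPrice, "Unit price cannot be negative.");

        // Only validated data reaches this point.
        return quantity * unitPrice;
    }
}
```

A call such as `OrderService.CreateOrder(customer, -1, 9.99m)` fails immediately with an exception naming the offending parameter, rather than quietly producing a negative order total somewhere downstream.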
So what is fundamentally wrong with, say, allowing a null to permeate? A null is the source of many errors. While you can glean some meaning from a null, nulls don't play well if you encounter one unexpectedly. I find it totally unacceptable to find code littered with NullReferenceException errors ("Object reference not set to an instance of an object"), 99% of which are completely avoidable. Look at the following overly simplified code sample. It is highly susceptible to this type of error:
    public void ProcessOrders(List<Order> orders)
    {
        foreach(Order order in orders)
        {
            if(order.OrderLines.Count > 0)
            {
                //Process Line
            }
        }
    }
First of all, orders could be null, in which case the foreach throws a NullReferenceException before any order is processed. Then OrderLines might be null too. I would not go to the level of checking deep down into object hierarchies, but you should definitely check that orders is not null.
    public void ProcessOrders(List<Order> orders)
    {
        // Guard: fail fast with a clear message that names the bad argument.
        if(orders == null) throw new ArgumentNullException("orders");
        foreach(Order order in orders)
        {
            if(order.OrderLines.Count > 0)
            {
                //Process Line
            }
        }
    }
You can easily create utilities to make this a simple and pain-free process. You can do similar things to check that collections are populated (before indexing) to prevent out of range errors and the like. It's all very simple but solidifies code and makes a huge difference when your code goes into the wild.
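As a rough sketch of what such utilities might look like: the `Guard` class and its method names below are my own invention for illustration, not part of any library, and `GuardExample.First` exists only to show a call site.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; the names are illustrative, not from any library.
public static class Guard
{
    // Throws if the argument is null; otherwise returns it unchanged.
    public static T NotNull<T>(T value, string paramName) where T : class
    {
        if (value == null) throw new ArgumentNullException(paramName);
        return value;
    }

    // Throws if the collection is null or empty, so indexing [0] is safe afterwards.
    public static ICollection<T> NotEmpty<T>(ICollection<T> items, string paramName)
    {
        if (items == null) throw new ArgumentNullException(paramName);
        if (items.Count == 0)
            throw new ArgumentException("Collection must contain at least one item.", paramName);
        return items;
    }
}

public static class GuardExample
{
    // Without the guard, an empty list would fail with ArgumentOutOfRangeException on [0].
    public static int First(List<int> numbers)
    {
        Guard.NotEmpty(numbers, "numbers");
        return numbers[0];
    }
}
```

With something like this in place, the null check in ProcessOrders shrinks to a single line, `Guard.NotNull(orders, "orders");`, which makes it cheap enough that there is no excuse to skip it.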
Another big no-no in my book is to pass around really ambiguous chunks of data and then never bother to validate these 'chunks'. For example, passing strings that hopefully contain properly formed and valid XML. There is little excuse for this these days, when you can serialize pretty much anything into a proper type. Passing non-standard data and then using if or case statements to work out what that data is supposed to be is also asking for trouble. It's all about being explicit: in the scenarios where you do have to pass loosely formed or untyped data around, validate it on the other end, and always expect one and only one data format for any given piece of data. Fewer moving parts, less chance for error, simpler validation, fewer bugs.
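If you really are stuck receiving XML in a string, validation on the receiving end could look something like the sketch below. The `OrderMessageReader` class and the message format (a single `order` root element carrying an `id` attribute) are assumptions invented for this example.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

// Hypothetical reader; the expected format is an assumption for illustration.
public static class OrderMessageReader
{
    // Accepts exactly one format: a root element named "order" with an "id" attribute.
    public static XElement ParseOrder(string xml)
    {
        if (string.IsNullOrWhiteSpace(xml))
            throw new ArgumentException("XML payload must not be empty.", "xml");

        XDocument document;
        try
        {
            // Throws XmlException if the text is not well-formed XML.
            document = XDocument.Parse(xml);
        }
        catch (XmlException ex)
        {
            throw new ArgumentException("Payload is not well-formed XML.", "xml", ex);
        }

        XElement root = document.Root;
        if (root == null || root.Name.LocalName != "order")
            throw new ArgumentException("Expected an <order> root element.", "xml");
        if (root.Attribute("id") == null)
            throw new ArgumentException("The <order> element must have an id attribute.", "xml");

        return root;
    }
}
```

Everything past this method can then rely on the payload having the expected shape, instead of every consumer re-checking it (or, more likely, not checking it at all).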
In my experience, if you use a few simple asserts, guards and other defensive techniques, and back them up with a really solid unit test suite, you can reduce your bug count to near zero. It's worth the extra effort to make sure you don't get garbage out by not letting garbage in there in the first place!