Kennie Pontoppidan hosts this month’s T-SQL Tuesday and asks us to write about “the daily (database-related) WTF”. I will admit that occasionally I submit to Schadenfreude. And when I get stuck cleaning up someone else’s mess, my hindsight is 20/20. I can point out all the problems they were too “ignorant” or too “lazy” to fix. And then I remember that “they” was “me”.
And I don’t just mean the proverbial “me” in that we’ve all been there. I mean my actual code. I’ve heard that if you don’t look at code you wrote 6 months ago and wince, you’re not growing as a developer. But just as often, after the initial “why!?” finishes painfully echoing in my brain, I will remember the context– the trade-offs, the constraints, the deadlines. Sometimes, I’ll plan out all of the improvements I’m going to make only to realize they won’t work and that the code is written that way for a reason. (And of course that’s when I write the comments I should have written the first time.)
Years ago when I read The Daily WTF more frequently, it gave me a sinking feeling, especially the comments. I know, don’t read the comments, but that’s often the only chance for learning. See, the article format typically leaves the interpretation of the problem up to the reader. Instead of a vehicle for professional growth, where a problem is presented with a memorable story for impact and a solution for the readers to take with them on their travels, it usually presumes that everyone knows what’s gone wrong and is nodding along. Then the comments pile it on. There’s rarely any teaching. Instead, it’s pointing and laughing or highlighting even bigger problems (the “real” WTF). When folks do propose alternatives, they’re torn apart in a game of one-upmanship.
We should all strive to improve, and we can’t learn without addressing failures, but not all “bad” code is really bad, and sometimes better code isn’t part of the better *solution*– speed to market and competing interests matter. And even if there are objectively superior ways of doing something which don’t take more time, we impugn equally whether it’s due to a lack of knowledge or a lack of care. We were all newbies at one time and we all still write bad code. We should be helping, not hurting, and tone matters. A caustic, elitist technology community is toxic.
So I won’t be throwing stones from my glass house. Instead, I present a current code practice of mine which at first blush might make you think “What the Fudge?”. (That’s what WTF stands for, right?)
Consider a stored procedure signature:
CREATE PROCEDURE dbo.MyProcedure @MyParam1 datetime AS
Often, you’ll want a default value in case one of the parameters isn’t specified. You might be tempted to do something like this:
CREATE PROCEDURE dbo.MyProcedure @MyParam1 datetime = '' AS
But what if you wanted the default date to be the current date? You can’t put any functions in the procedure definition. You could clean that up later in the procedure with something like this:
SELECT @MyParam1 = CASE WHEN @MyParam1 = '' THEN getdate() ELSE @MyParam1 END;
But what if the caller wanted the date to be “empty” (i.e. 1900-01-01)? And what if a NULL is passed?
In our environment, we’ve disallowed NULLs from our table fields. We understand that NULL is actually information– it says that the data is unknown– but we believe that for most data fields, there are non-NULL values which just as effectively represent unknown. Typically, 0’s and empty strings (and the “blank” date 1900-01-01) serve that purpose. And those values are more forgiving during coding (they equal things; they don’t make everything else “unknown”), and we accept the risk of paying little attention to which parts of our logic touched “unknown” values.
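To make that convention concrete, here’s a hypothetical sketch of what a table might look like under it– every column NOT NULL, with the surrogate “unknown” values supplied as defaults (the table and column names here are illustrative, not from any real schema):

CREATE TABLE dbo.Orders
(
    OrderID int NOT NULL,
    -- '1900-01-01' is our "blank" date standing in for unknown:
    OrderDate datetime NOT NULL DEFAULT ('1900-01-01'),
    -- 0 stands in for "no customer known":
    OrderCustomerID int NOT NULL DEFAULT (0),
    -- empty string stands in for "no notes":
    OrderNotes varchar(200) NOT NULL DEFAULT ('')
);

With this in place, joins and comparisons behave predictably (the surrogates equal themselves), at the cost described above: the code no longer distinguishes “unknown” from those literal values.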
That means we have to “clean” or “scrub” NULLs from our parameters at some point, so our little SELECT statement above becomes a little more complicated:
SELECT @MyParam1 = CASE WHEN isNull(@MyParam1,'') = '' THEN getdate() ELSE @MyParam1 END;
Well that’s a little annoying, but it’s not the real problem. Note the repetition of the empty string literal. It’s now in three places. Are you going to remember to switch all three when your surrogate default value indicator changes? If empty string becomes something callers want to specify, then you’ll have to change your empty string default to something like ‘1899-01-01’. But if you only change it in the procedure definition, then you’ve just broken your procedure.
And you probably have this surrogate default value sprinkled all over your code base. Why? Well, there are only three ways to invoke the defaulting behavior from a calling procedure: omitting the parameter entirely, sending in the keyword “default”, or just sending in the default value. So if you only *sometimes* wanted to use that default value from a calling procedure, you’d have to do one of the following.
-- Option 1:
IF @UseParameterDefault <> 0
BEGIN
    EXEC dbo.MyProcedure @MyParam1 = default;
END
ELSE
BEGIN
    EXEC dbo.MyProcedure @MyParam1 = @SpecificParamValue;
END

-- Option 2:
IF @UseParameterDefault <> 0
BEGIN
    EXEC dbo.MyProcedure;
END
ELSE
BEGIN
    EXEC dbo.MyProcedure @MyParam1 = @SpecificParamValue;
END

-- Option 3:
IF @UseParameterDefault <> 0
BEGIN
    SET @SpecificParamValue = '1899-01-01';
END
EXEC dbo.MyProcedure @MyParam1 = @SpecificParamValue;
The first and second options are pretty much the same. They become much larger the more parameters exist in the procedure, and they are prone to bugs if the two parts of the IF block diverge due to bad maintenance. The third option eliminates those issues, but now our magic surrogate default value will proliferate to all of our calling procedures. That greatly increases the fragility of the system with respect to changes to that surrogate default value (in case someday ‘1899-01-01’ also becomes a valid value).
So instead, for our environment, we use NULL as the parameter default, and then we do any defaulting logic as part of the isNull. Here’s what it looks like:
CREATE PROCEDURE dbo.MyProcedure
    @MyParam1 datetime = NULL
AS
SELECT @MyParam1 = isNull(@MyParam1, getdate());
Then all calling procedures can easily send in an actual value or the universally used NULL to indicate that defaulting should occur, and the default value (or logic) is represented only once in the entire system.
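A caller then needs no IF block at all– the decision collapses into the value itself (the variable names here are illustrative):

-- Pass a real value:
EXEC dbo.MyProcedure @MyParam1 = '2016-06-14';

-- Ask for the default without needing to know what it is:
EXEC dbo.MyProcedure @MyParam1 = NULL;

-- Or decide at runtime: if @SpecificParamValue is NULL,
-- the callee applies its own default; otherwise the value is used.
EXEC dbo.MyProcedure @MyParam1 = @SpecificParamValue;

If the default logic ever changes (say, from getdate() to the start of the current month), only the one line inside dbo.MyProcedure changes.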
So far, this pattern might be a little unusual, but nothing probably seems superfluous or silly. Bear with me, there’s more.
Let’s say you have a procedure which has a variety of control parameters (which tell the procedure how to behave– how to apply its logic or what special actions to take), along with some data parameters. Remember that we don’t store any NULLs in tables (a worthy debate for another time), but as you’ve seen we do take advantage of NULL in our logic. That leads us to a procedure like this:
CREATE PROCEDURE dbo.UpdateOrder
    @OrderID int = NULL,
    @OrderDate datetime = NULL,
    @OrderCustomerID int = NULL,
    @DoSpecialThing1 bit = NULL,
    @DoSpecialThing2 bit = NULL
AS
SELECT @OrderID = isNull(@OrderID, 0),
    @OrderDate = isNull(@OrderDate, NULL),
    @OrderCustomerID = isNull(@OrderCustomerID, NULL),
    @DoSpecialThing1 = isNull(@DoSpecialThing1, 0),
    @DoSpecialThing2 = isNull(@DoSpecialThing2, 0);
Wait. What the Frosting? Why do you have NULL in those isNulls? That’s essentially a NOOP. It’s useless.
True, but we want those NULLs to remain. We do something special with them later:
UPDATE dbo.Orders
SET OrderDate = isNull(@OrderDate, OrderDate),
    OrderCustomerID = isNull(@OrderCustomerID, OrderCustomerID)
WHERE OrderID = @OrderID;
With this pattern we’re able to update only those table fields which we want to affect. We can leave the others alone. Otherwise, every time we call this proc, we have to first gather all of the Order fields we don’t care about just so we don’t wipe out any data. And we can’t default the data as part of the NULL scrubbing SELECT because the data varies for each row. We could do it through a separate SELECT which actually used the Orders table, but (1) that’s an extra hit on the table and (2) it’s less concurrent– we might carry over data from that SELECT to the UPDATE when an intervening UPDATE had changed it.
Even if we aren’t doing this sort of fancy UPDATE, we might want to pass those values down to a lower-level procedure and have defaulting occur there. In either case, they need to stay NULL past the initial scrubbing SELECT.
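The pass-through case might look something like this (a sketch; dbo.UpdateOrderForCustomer is a hypothetical wrapper, not a real procedure of ours):

CREATE PROCEDURE dbo.UpdateOrderForCustomer
    @OrderID int = NULL,
    @OrderDate datetime = NULL
AS
SELECT @OrderID = isNull(@OrderID, 0);
-- @OrderDate is deliberately NOT scrubbed here; if the caller sent NULL,
-- it stays NULL so that dbo.UpdateOrder can apply its own defaulting
-- (or, with the UPDATE pattern above, leave the column untouched):
EXEC dbo.UpdateOrder
    @OrderID = @OrderID,
    @OrderDate = @OrderDate;

The defaulting decision lives in exactly one place, no matter how many layers of procedures sit above it.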
Ok, that’s all fine and dandy, but still, why bother with the isNull? It doesn’t do anything.
Strictly speaking, that’s true. It has no programmatic effect. But SQL Server isn’t the only one reading our code. Humans read it, and it tells them a story. Imagine if we had simply omitted those lines. The reader would wonder whether they were omitted for a reason or by mistake. Sure, a comment serves a similar purpose, but this code makes it clear without a colorized context switch, and it keeps consistent formatting for each line. With large lists of parameters, you can line the two lists up in separate columns in Excel to compare them. If you’re really ambitious, you can build code generation tools which generate the SELECT scrubbing line.
My hope is that this example illustrates that even code which seems obviously wrong or pointless can serve a purpose. And that style, context, and environment can shape the right course of action.
By all means, review and critique code, but do so with an open mind and without judgment. A code review is an opportunity for both parties to learn.
Note that performing any manipulation of your procedure parameters might frustrate the query optimizer’s ability to use their specified values during plan generation. Using local variables turns off parameter sniffing, similar to OPTIMIZE FOR UNKNOWN (at least for now). We have not researched what effect simply reassigning the parameter values has. Circumventing parameter sniffing might avoid performance surprises due to unlucky parameter values on first run, but it also sacrifices any benefit if a better plan could have been chosen with clean parameter information. We don’t believe this has been a problem for us so far, but your mileage may vary.
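For contrast, the classic local-variable workaround looks like this (a sketch; dbo.GetOrdersByDate and its query are illustrative):

CREATE PROCEDURE dbo.GetOrdersByDate
    @OrderDate datetime
AS
-- Copying the parameter into a local variable hides its value from
-- the optimizer at compile time, so the plan is built from general
-- density statistics rather than the sniffed first-run value,
-- much like OPTIMIZE FOR UNKNOWN:
DECLARE @LocalOrderDate datetime;
SET @LocalOrderDate = @OrderDate;

SELECT OrderID, OrderCustomerID
FROM dbo.Orders
WHERE OrderDate = @LocalOrderDate;

Whether our scrubbing SELECT, which reassigns the parameter itself rather than a separate variable, behaves the same way is exactly the part we haven’t researched.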
(Hey, I managed to resist the temptation to link to the canonical graphic on this subject. D’oh. Oh, well.)