Chapter 4 — Set-based iteration, the third alternative

“A perceived benefit of iterative code might be that developers with a background in third-generation languages can start coding right away, instead of having to learn a radically different way to do their work.  But that argument would be like someone from the last century suggesting that we hitch horses to our cars so that drivers don’t have to learn how to start the engine and operate the steering wheel”. 

Before I dive into this subject, I just had to share that quote with you from the book that this blog is based on.  If you haven’t yet purchased this book and you are reading this blog, do yourself a huge favor and purchase this book.  The author’s who put this together are not only experts, they write incredibly well… Honestly, when was the last time you laughed at something in a technical book?

Hugo Kornelis wrote chapter three and I enjoyed every moment of it.  When I dove into chapter 4, I was greeted again by Hugo’s thoughts, advice and expertise on declarative and iterative based code.  I’ve typically used the names “set-based” and “cursor-based”.

Just today I was working on an issue for a client that involved nested cursors being called, in a transaction, from another server via a linked server.  When an error would occur, the entire thing would un-wind and the whole ‘batch’ would fail.  Much like Hugo states in his book, this solution was not something that set-based logic can efficiently handle.  So I was left with a good solution that wasn’t working.  I had to think outside of the box and get creative with how to handle this problem.  In this case, I modified the process to work in smaller main batches and added a try / catch block to the nested transaction scope that enabled me to drill into a smaller set of records, one of which was the problem record.  In the process of doing this, I was also attempting to explain it to someone and that actually proved to be more difficult than coming up with the solution.  The important take-away from me sharing the story is to illustrate that the effectiveness of this approach is not just limited to performance.  It can also help with your applications stability and availability by allowing a process to do as much as it can without it all failing.

Last year I had an in-depth conversation with a different client who was adamantly against cursors.  They believed that nothing HAD to be done with cursors.  While I cannot disagree with the truth of that statement, I do disagree with the validity of it.  There are far too many times that a cursor does make sense.  The difficulty with understanding this paradigm lies in truly understanding set-based logic.  Once a solid foundation exists in both, then the third alternative that Hugo defines, provides codes examples for and explains in this chapter make a ton of sense.  The bin packing posts Hugo discusses in this book can be found on his blog site.

I have example after example of where I’ve used this over the years.  I’d love nothing more than to share with you the “secret sauce” that makes this the most viable consideration based on a list of known certainties; but it’s simply based on so many what-if’s and it depends sets of variables that it’s not something I can just spell out.

If you have ever found yourself scratching your head at 3AM waiting for that delete to finish or for that huge update to complete or the year-end financial summary report to finish; take a step back and ask yourself… would this potentially perform better if I thought about incorporating some iterative code into this huge set-based query?  It very well might work better to delete the data one week at a time instead of one year at a time or it might make sense to just update the records where the zip code range is 00000 to 10000 rather than the entire range of 00000 to 99999.  A simple cursor will handle changing the range for you and allow the database to handle things in byte-sized fashions rather than forcing it to consume the entire big mac in one bite.

Another simple example is that cursor that exists today that looks at each record in a table and decides if it should be selected, updated or deleted based on a date (or any) value.  That will take an eternity compared to finding a range of dates to update, select or delete.

This is one of those areas of database technologies that I personally didn’t understand for the first few years of learning.  However, that’s not because it’s complicated.  It’s because I was taught to do things a certain way and then later learned that doing them a different way could sometimes be better.  It wasn’t until I fully understood the value and benefits of both set-based logic and cursor based logic that I saw the value in combining the strength that each had to offer together.   *one caveat.  It’s also possible to combine the weaknesses of each.  Imagine writing a cursor to determine how many days are in a year by selecting the rows out of a date table for an entire year (365 iterations of the code); to then store them in a temp table or a table variable and count the records to provide the result set of “365”.  Seem silly?  It should.  The reason it seems silly is because we know that almost every year, there are 365 days; but what if the data wasn’t intuitive?  What if you had no idea what the answer was?


1 Response to “Chapter 4 — Set-based iteration, the third alternative”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s


Enter your email address to subscribe to this blog and receive notifications of new posts by email.

Join 11 other followers