Author Archive for Jeremy Lowell

14 Sep 10

Chapter 32 – Partitioning for manageability (and maybe performance)

This is the second chapter in a row on partitioning, and it's another excellent one.  Dan Guzman (Blog) takes us on an adventure in this chapter, comparing execution plans for partitioned tables between SQL Server 2005 and 2008.  In addition, he takes the time to spell out some good use cases for squeezing additional performance gains out of partitioning.

I'm glad that he took the time to do that; it's one of the most misunderstood aspects of partitioning.  Partitioning will not always increase your performance.  As a matter of fact, it can really, really degrade your performance if your partition key, storage, and overall design are not well planned.

In my experience, there is almost always a good partition key to be found in the data you are looking to partition.  Often it's a datetime field such as CreatedDate, TransactionDate or InsertedDate… you get the picture.  It's a date that has a natural order to it over time.  That's not to say that all records will always land in perfect order, but more often than not they will.  Imagine using a column like "ModifiedDate" as your partition key: every time someone or some process updated a record, it could end up in a different partition, which could create havoc.  Or even worse, a column like "BirthDate".  Chances are that very, very few of the SELECT statements against the employee table contain a WHERE clause with a small range of birthdays, i.e., "show me all employees who were born from March through June of 1975."  What good would that do except for the monthly query that H.R. runs to send out Happy Birthday cards?
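To make that concrete, here is a minimal sketch of a quarterly scheme keyed on a CreatedDate-style column.  The object names, boundary dates, and filegroups are illustrative only (not from the chapter), and the filegroups are assumed to already exist; the point is simply that new rows carry a recent CreatedDate and so land in the newest partition rather than scattering across old ones.

-- Quarterly boundaries; RANGE RIGHT means each boundary date belongs to the partition on its right.
CREATE PARTITION FUNCTION pf_CreatedDate_Quarterly (datetime)
AS RANGE RIGHT FOR VALUES ('2010-01-01', '2010-04-01', '2010-07-01', '2010-10-01');

-- Four boundaries yield five partitions, so five filegroups are mapped.
CREATE PARTITION SCHEME ps_CreatedDate_Quarterly
AS PARTITION pf_CreatedDate_Quarterly
TO (Y2009Q4FG, Y2010Q1FG, Y2010Q2FG, Y2010Q3FG, Y2010Q4FG);

-- A table created on the scheme; new rows with recent CreatedDate values go to the newest partition.
CREATE TABLE dbo.Orders
(
    OrderID     int      NOT NULL,
    CreatedDate datetime NOT NULL,
    Amount      money    NULL
) ON ps_CreatedDate_Quarterly (CreatedDate);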

In closing, I’d recommend that you read this set of blog posts in conjunction with the chapters on partitioning to get started and learn a few things along the way.  https://sqlperspectives.wordpress.com/category/chapter-31/

01 Sep 10

Chapter 31 – Some practical issues in table partitioning

This chapter is incredibly relevant in my world at the moment.  I've helped several clients implement table partitioning, and just recently I was on-site at a client at 2:00 AM to lay down the new partition scheme and migrate all of the data into the partitioned table.  A relatively simple task, given a lot of planning and testing… the most interesting part of this implementation was making it all happen with zero time-outs on their system.  In fact, not only were there no time-outs, but no transaction took more than three seconds during the operation.

In this chapter, Ron Talmage provides an excellent baseline introduction to partitioning.  He begins the chapter by stating “Table partitioning is an involved topic and can quickly become complex…”.  At the end of the chapter he states “Table partitioning is quite an involved topic…”.  I appreciate that Ron pointed out that this isn’t some old-hat, do it in your sleep type of feature or implementation in SQL Server.  

In my experience, partitioning is an excellent tool for obtaining database scalability and manageability.  It can also be implemented incorrectly, which can actually degrade your performance and cause a management nightmare.  I've been involved in both scenarios, and this chapter begins to lay down the knowledge you need in order to correctly implement table partitioning.

One of the coolest things about partitions, in my opinion, is the ability to have very specific control over the size of your data files.  This can be very, very helpful when you anticipate a need for significant growth or when you first implement table partitioning.  At the beginning of this post, I mentioned that I was just involved in a partitioning exercise that was accomplished against a table while the system was fully online, available, and being used.  There are a number of steps associated with successfully doing this, and the information below shows how this was accomplished (at a high level).

First, choosing the correct partition scheme is absolutely critical.  There are additional sources that Ron pointed out in his chapter that I’d recommend you read up on before implementing partitioning.  

Once we had identified the correct partition key (in this case a datetime field), the partition function was created as a sliding window (more info on that next week).  Now we have to lay down the files (one per partition) and populate them.  This is where it gets fun.  The T-SQL below is where I begin with this effort.  In short, this script figures out what size the partition files should be based on your partition key (in this case the datetime_column from the Your_Table table), creates the filegroups, and then lays down the files you need, sized accordingly.

USE <YourDBName>
GO
SET NOCOUNT ON
GO

-- Step 1: count rows per quarter and derive a filegroup name (e.g. Y2009Q1FG) from the partition key.
-- (Precision widened to decimal(18,2) so very large tables don't overflow the size/rowcount columns.)
select  COUNT(*) as numrows,
        'Y' + CAST(DATEPART(yyyy, datetime_column) as varchar(4))
            + 'Q' + CAST(DATEPART(q, datetime_column) as varchar(2)) + 'FG' as [datetime],
        CAST(null as decimal(18,2)) as numrowdate,      -- total row count (populated below)
        CAST(null as decimal(18,2)) as percentage,      -- this quarter's share of the total
        CAST(null as decimal(18,2)) as currentsize,     -- reserved size of the source table, in KB
        CAST(null as decimal(18,2)) as partitionsize
into #tmp_Transaction
FROM [Your_Table]
group by
        'Y' + CAST(DATEPART(yyyy, datetime_column) as varchar(4))
            + 'Q' + CAST(DATEPART(q, datetime_column) as varchar(2)) + 'FG'
go

-- Step 2: work out each quarter's percentage of the total row count.
update #tmp_Transaction set numrowdate = (select SUM(numrows) from #tmp_Transaction)
go

update #tmp_Transaction
   set percentage = b.total
  from (select numrows / numrowdate as total, [datetime] from #tmp_Transaction) b
 where #tmp_Transaction.[datetime] = b.[datetime]
go

-- Step 3: capture the current reserved size of the table being partitioned (the same table queried above).
create table #space_Transaction
      (name varchar(255), rows int, reserved varchar(55), data varchar(55), index_size varchar(55), unused varchar(55))

insert #space_Transaction
exec sp_spaceused [Your_Table]
go

update #space_Transaction set reserved = REPLACE(reserved, ' KB', '')
go

update #tmp_Transaction set currentsize = (select CAST(reserved as decimal(18,2)) from #space_Transaction)
go

update #tmp_Transaction set currentsize = isnull(currentsize, 0), partitionsize = isnull(partitionsize, 0)
GO

-- ** NOTE ** I've padded the size by 10 percent (the 1.1 below); change it to 1.0 if you want it to be exact.
select  [datetime] as infodatetime_Transaction,
        cast((isnull(currentsize, 150) * 1.1 / 1024) * percentage      as int) as partitionsize_Transaction,        -- file size, MB
        cast((isnull(currentsize, 150) * 1.1 / 1024) * percentage * .1 as int) as partitionsize_Transaction_growth, -- filegrowth, MB
        numrows
INTO #tmp_Transaction2
FROM #tmp_Transaction

-- Step 4: the list of filegroups to create, one per quarter in the sliding window.
create table #filegroup_name (id int identity(1,1), FileGroup varchar(22))
insert #filegroup_name (FileGroup) values ('Y2009Q1FG')
insert #filegroup_name (FileGroup) values ('Y2009Q2FG')
insert #filegroup_name (FileGroup) values ('Y2009Q3FG')
insert #filegroup_name (FileGroup) values ('Y2009Q4FG')
insert #filegroup_name (FileGroup) values ('Y2010Q1FG')
insert #filegroup_name (FileGroup) values ('Y2010Q2FG')
insert #filegroup_name (FileGroup) values ('Y2010Q3FG')
insert #filegroup_name (FileGroup) values ('Y2010Q4FG')
insert #filegroup_name (FileGroup) values ('Y2011Q1FG')
insert #filegroup_name (FileGroup) values ('Y2011Q2FG')
insert #filegroup_name (FileGroup) values ('Y2011Q3FG')
insert #filegroup_name (FileGroup) values ('Y2011Q4FG')

-- Step 5: loop over the filegroups, creating each one and adding an appropriately sized file to it.
DECLARE @SIZE       VARCHAR(20)
DECLARE @FILEGROWTH VARCHAR(20)
DECLARE @PATH       NVARCHAR(255)
DECLARE @DBName     NVARCHAR(255)

SET @PATH   = 'C:\Partitioning_Files\'
SET @DBName = DB_NAME()

DECLARE @SQL       varchar(4000)
      , @counter   int
      , @ErrorCode integer
      , @FGName    varchar(100)
      , @NumRows   int

SET @ErrorCode = 0
SELECT @counter = 1

WHILE @counter <= 12 AND @ErrorCode = 0 BEGIN
      SELECT @FGName = (Select FileGroup from #filegroup_name where @counter = ID)
      SET @ErrorCode = @@Error

      IF @ErrorCode = 0 BEGIN
            SELECT @SQL = N'ALTER DATABASE [' + @DBName + '] ADD FILEGROUP [' + @FGName + ']'
            SET @ErrorCode = @@Error
      END

      IF @ErrorCode = 0 BEGIN
            -- select @SQL   -- uncomment to inspect the generated statement
            EXECUTE (@SQL);
            SET @ErrorCode = @@Error
      END

      IF @ErrorCode = 0 BEGIN
            -- Fall back to defaults when a quarter has no rows yet (e.g. future quarters in the sliding window).
            -- Note: a very small quarter can compute to 0 MB; bump those up manually if needed.
            SELECT @Size       = ISNULL((Select Partitionsize_Transaction        from #tmp_Transaction2 A JOIN #filegroup_name B ON A.infodatetime_Transaction = B.FileGroup and B.ID = @counter), 150)
            SELECT @FileGrowth = ISNULL((Select Partitionsize_Transaction_growth from #tmp_Transaction2 A JOIN #filegroup_name B ON A.infodatetime_Transaction = B.FileGroup and B.ID = @counter), 150)
            SELECT @NumRows    = ISNULL((Select NumRows                          from #tmp_Transaction2 A JOIN #filegroup_name B ON A.infodatetime_Transaction = B.FileGroup and B.ID = @counter), 0)

            SELECT @SQL = N'ALTER DATABASE [' + @DBName + ']
            ADD FILE
                  (NAME       = N''' + @FGName + ''',
                   FILENAME   = N''' + @PATH + @FGName + '.ndf'',
                   SIZE       = ' + @SIZE + 'MB,
                   FILEGROWTH = ' + @FILEGROWTH + 'MB)
            TO FILEGROUP [' + @FGName + ']'
            SET @ErrorCode = @@Error
      END

      IF @ErrorCode = 0 BEGIN
            -- select @SQL   -- uncomment to inspect the generated statement
            EXECUTE (@SQL);
            SELECT @FGName as FileGroupName, @Size as Size_of_File_MB, @FileGrowth as FileGrowth_Size_MB, @NumRows as NumberOfRows
            SET @ErrorCode = @@Error
      END

      IF @ErrorCode = 0 BEGIN
            SELECT @counter = @counter + 1
            SET @ErrorCode = @@Error
      END
END

 

O.K.  Believe it or not, that's the easy part.  Once the files were sized correctly and laid down, it's a matter of creating a table on your partition scheme, in the files you just laid down.  Once that's accomplished, I typically write a loop that populates the newly partitioned table from the transactional table, so as to limit the amount of locking required to read from the production/transactional table.
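The loop itself can be as simple as walking the partition key a day at a time.  A rough sketch follows, assuming a hypothetical dbo.Tablename_Partitioned target table built on the partition scheme and the same datetime_column key; the batch size (one day here) is something you'd tune to the system.

DECLARE @StartDate datetime, @EndDate datetime;

SELECT @StartDate = MIN(datetime_column),
       @EndDate   = MAX(datetime_column)
  FROM dbo.Tablename;

-- Copy one day per iteration so each read against the live table stays short.
WHILE @StartDate <= @EndDate
BEGIN
    INSERT INTO dbo.Tablename_Partitioned
    SELECT *
      FROM dbo.Tablename
     WHERE datetime_column >= @StartDate
       AND datetime_column <  DATEADD(day, 1, @StartDate);

    SET @StartDate = DATEADD(day, 1, @StartDate);
END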

In the case of the partitioning exercise I mentioned above, we chose a sliding window scheme.  That's important to note, because once we had populated most of the data, we deliberately did not insert the current day's data.  The script (not provided here) then created all of the non-clustered indexes we would need, in an aligned fashion.  Then we populated the final day of data and renamed the two tables involved (old huge table to Tablename_OLD, newly partitioned table to Tablename).  By wrapping that in a transaction, we minimized the potential for a very long-running insertion process and kept the system online.
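The cutover amounted to something along these lines, wrapped in one short transaction (the table names and cutover date are placeholders, not the client's actual script):

DECLARE @CutoverDate datetime;
SET @CutoverDate = '20100901';   -- the day that was held back during the bulk copy

BEGIN TRANSACTION;

    -- Top up the partitioned copy with the final day of data.
    INSERT INTO dbo.Tablename_Partitioned
    SELECT * FROM dbo.Tablename WHERE datetime_column >= @CutoverDate;

    -- Swap the names so applications pick up the partitioned table.
    EXEC sp_rename 'dbo.Tablename', 'Tablename_OLD';
    EXEC sp_rename 'dbo.Tablename_Partitioned', 'Tablename';

COMMIT TRANSACTION;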

Voilà!  Sounds easy.  It is, with plenty of thought and testing.

Final thought… and please switch gears with me just a bit… there’s a feature that I desperately want added to partitioning that you can vote on!  Please see:

https://connect.microsoft.com/SQLServer/feedback/details/468517/update-statistics-at-the-partition-level

Happy partitioning, folks, and Ron, thank you for one of the best resources I've ever read regarding partitioning… I truly wish this chapter had been written a long time ago, before I took my first leap into partitioning.

27 Aug 10

Chapter 30 – Reusing space in a table

It’s rare that I read the first paragraph in a chapter of a technical book and smile with fond memories. 

This is one of those rare times.  If you have the book SQL Server MVP Deep Dives and have read this chapter, written by Joe Webb (Blog | Twitter), then you know what I'm talking about.  If you don't have the book and you are reading this… well, uh, go get the book.  Here.  Today.

This is a cool chapter… while I have only used the functionality it covers on a couple of occasions, I can attest to its value.  I can especially attest to its value for some C.O.T.S. packages out there that update their schema with a version upgrade and leave space that is not readily reused.  I'm not going to name names, because that's not the point… the point is that if you are a developer for one of the C.O.T.S. implementations out there, please keep this in mind.  It's an easy, easy thing to be on the lookout for, and it will keep grumpy DBAs such as myself from calling you.

More to the point, if you use SQL Server to store data in some of the larger data types (xml, text, image, etc.) and you have data that falls outside of the IN_ROW_DATA allocation unit, then this chapter, and understanding how SQL Server does and doesn't clean up after itself, is very, very important.

The only other item that I think bears mentioning is that this command will lock your table while it runs.  That's a pretty important thing to keep in mind if you are thinking about running it against your Transactions table in production.  🙂
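For reference, the command I'm dancing around here is DBCC CLEANTABLE (if my memory of the chapter serves); it accepts an optional batch size, which breaks the work into smaller transactions so the exclusive lock is released between batches.  A minimal example, with placeholder database and table names:

-- Reclaim space left behind by dropped variable-length columns,
-- 5,000 rows per transaction so locks are released between batches.
DBCC CLEANTABLE ('YourDBName', 'dbo.Transactions', 5000);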

Lastly, for those of you who haven’t frequented Joe’s blog, you should.  I spent about 45 minutes on it while trying to write this little post… he has some great content!  http://webbtechsolutions.com/blog/

20 Aug 10

Chapter 29 – My favorite DMV’s and why

This chapter is awesome.  Aaron Bertrand (Blog | Twitter) takes the time to explain and provide code examples for 15 of the DMVs introduced in SQL Server 2005 and SQL Server 2008.

I've been using DMVs for a while now, and the amount of information you can glean from them about your environment is incredible.  Coupled with the prior chapter on DDL and DML auditing, this shows Microsoft doing an excellent job of providing more information, in a much more intuitive fashion, than I can remember at any point since SQL Server 6.5.  This is a very, very welcome change, as it allows DBAs, developers, and the like to have visibility into the HOW, the WHAT, the WHERE, the WHY and the WHO of what is happening inside of your databases.

One of my personal favorite DMVs is sys.dm_exec_sql_text.  The reason I love this one so much is that it's typically used via CROSS APPLY alongside other DMVs such as sys.dm_exec_query_stats.  The DMV sys.dm_exec_query_stats provides statistics about query plans: specifically, how many times a plan has been executed and how expensive it is.  By applying sys.dm_exec_sql_text to the plan's sql_handle, you can actually SEE the T-SQL that's being run.  This is incredibly valuable when looking to see whether you have a lot of recompiling going on, or a lot of dynamic T-SQL or LINQ to SQL that isn't reusing query plans.  Addressing some of the big hitters here (from either a cost perspective or a quantity-of-recompiles perspective) can significantly alter your overall database/server performance.
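For anyone who hasn't tried the pairing, it looks something like this (a quick sketch of my own; ordering by total CPU is just one sensible choice):

-- Top 20 cached plans by total CPU, with the text of the statement behind each plan.
SELECT TOP (20)
       qs.execution_count,
       qs.total_worker_time,
       qs.total_elapsed_time,
       qs.plan_generation_num,
       SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                        WHEN -1 THEN DATALENGTH(st.text)
                        ELSE qs.statement_end_offset END - qs.statement_start_offset) / 2) + 1) AS statement_text
  FROM sys.dm_exec_query_stats AS qs
 CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
 ORDER BY qs.total_worker_time DESC;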

Aaron goes on in this chapter to provide a few full-blown examples of his favorite DMVs in use and why he likes them.  He provides all of the code necessary for an improved sp_who2, getting statistics for stored procedures, finding unused stored procedures, finding inefficient and unused indexes, finding inefficient queries, and finding missing indexes.

An outstanding chapter Aaron, thank you.

20 Aug 10

Chapter 28 – Practical Auditing in SQL Server 2008

In chapter 28, Jasper Smith (Blog) introduces us to the built-in auditing features of SQL Server 2008.  This.is.awesome.  For those of you who have had to "build your own" DML and DDL auditing in the past, it was neither intuitive nor an easy "out of the box" solution.

A few years back (O.K., 9 years), I was tasked with creating an audit of all SELECT, UPDATE, INSERT and DELETE statements against tables that contained personal information.  This effort required one continual trace for DML and another for DDL.  Once I had trace files being created, I then wrote additional T-SQL and bcp jobs to insert the data into a set of tables so that the security team could review who was looking at what data, where they were logged in from, and so on.  While it was a great exercise for me from a professional development standpoint, it wasn't as easy as the point-and-click implementation that Jasper walks through in this chapter.

I would highly, highly recommend that every DBA out there set up DDL auditing in their environment.  It can help with so many issues that come up.  I have several clients who run a lot of Commercial Off The Shelf (COTS) packages, and when upgrades happen in these environments and the database slows down or rapidly speeds up, many DBAs are left out in the cold wondering what happened.  This most often means that a call has to be put in to the vendor, and the DBA spends a significant amount of time figuring out what happened and why it had the effect it did on their systems.  If DDL auditing had been set up, it would have been a quick check, and it might not even involve a call to the vendor… score!
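In SQL Server 2008 the bare-bones setup looks roughly like this (the audit name, database name, and file path are placeholders; also keep in mind that database-level audit specifications are an Enterprise Edition feature in 2008):

USE master;
GO
-- Server-level audit target: where the audit records land.
CREATE SERVER AUDIT [DDL_Audit]
TO FILE (FILEPATH = 'C:\SQLAudit\');
GO
ALTER SERVER AUDIT [DDL_Audit] WITH (STATE = ON);
GO
USE YourDBName;
GO
-- Capture CREATE/ALTER/DROP of schema objects in this database.
CREATE DATABASE AUDIT SPECIFICATION [DDL_Audit_Spec]
FOR SERVER AUDIT [DDL_Audit]
ADD (SCHEMA_OBJECT_CHANGE_GROUP)
WITH (STATE = ON);
GO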

Another common scenario: a new release of your custom software goes out, and during the deployment phase a piece of DML code updates too much data (or not enough) and the release has issues.  Having the ability to know what ran, when, by whom, etc. is truly invaluable in these scenarios.

Thanks for a great chapter Jasper!

05 Aug 10

Chapter 27 – Automating SQL Server Management using SMO

Timing… so much in life and career comes down to timing, doesn't it?

Allen White (Blog | Twitter) could not have timed this chapter any better for me.  I've been implementing policy management for a client across many servers, and even multiple domains, and I've been thinking to myself: there has got to be a better way to deploy and manage all of these policies.  Voilà.  SMO.

In this chapter, Allen walks through the prerequisites, introduces SMO, and then dives right into real-life, practical examples of leveraging the SMO libraries via PowerShell.  Again this book does not disappoint: the prior chapter was on PowerShell, and this one uses PowerShell to illustrate the power, agility and scalability of SMO.

There are many pieces of information in this chapter that I didn't know about SMO, but one of them I thought was very interesting… via SMO, when restoring a database, the server's default data and log file paths are available.  There's no other way to get that in a restore process… pretty cool.

If you find yourself using PowerShell and/or managing a large number of SQL Servers, I would dive into this chapter, become familiar with SMO and, in time, find yourself with your nights and weekends available to put your feet up, work on that personal project or get to the honey-do list.

Great chapter Allen, thank you.

22 Jul 10

Chapter 25 – Working with maintenance plans

Maintenance plans generally get a pretty bad rap with a lot of the more seasoned (read: OLDer) DBAs out there.  I think this is due to a myriad of reasons, few of which still warrant the negative viewpoint in SQL Server 2005 (post SP2) and SQL Server 2008.

In this chapter, Tibor Karaszi (Blog | Twitter), goes through the available options of maintenance plans and offers descriptions, use cases and his own opinion of which tasks he prefers and which tasks he really, really dislikes. 

Prior to SQL Server 2008, I did not have a positive opinion of maintenance plans.  Over the last couple of years, that opinion has changed drastically.  One of the reasons for this change is, as Tibor says in the chapter:

 

I hesitate to leave home-grown maintenance jobs with customers who do not have the competency to maintain this code over time.

Competency is one reason that I've run into, but more frequently it's about resource allocation.  In today's database environments, I'm increasingly seeing shops with a lot of database work that needs to be done and simply no time to perfectly fine-tune or write your own re-indexing scripts.  With clients who actually ship their code base/database as part of a product, this is nearly always the case, as environmental and/or customization considerations make it incredibly challenging to build in your own maintenance routines.

With that said, there are clients and databases where maintenance plans just don't cut it.  As with anything, there are limitations to what maintenance plans can do really well; however, the list of things they do well has grown in recent releases.  So, if in the past you threw out the idea of maintenance plans due to a poor experience in SQL Server 2000 or another build pre-2005 SP2, I'd highly recommend that you take them for a spin.



