Chapter 10 – Introduction to XQuery on SQL Server

April 5th, 2010

Chapter 10 Introduction to XQuery on SQL Server

by Michael Coles

When I was asked to pick a topic or two and write chapters for the SQL Server MVP Deep Dives book, I immediately thought of the XML Query language (XQuery)—a topic that still isn’t well understood by a lot of SQL Server professionals. I wrote the book chapter as an introduction to XQuery with a focus on explaining how SQL Server implements it from the beginner’s point of view. In this blog entry I’d like to expand on a topic that I briefly touched on—the XQuery/XPath Data Model (XDM)—which is critical to understanding XQuery.

XDM is actually one of the most misunderstood aspects of SQL Server XML and its XQuery implementation. It’s important because this is how XQuery sees your XML. Whenever SQL Server stores XML data using the XML data type, it’s actually storing it in XDM format. [XDM is defined by the W3C at http://www.w3.org]. The XDM format defines a hierarchy reflecting the structure and content of your XML data.

So why doesn’t SQL Server store your XML data as plain text? There are three answers:

(1) Storing the data in XDM format allows the XQuery processor to perform optimizations that it couldn’t do if it was trying to query raw text;

(2) Storing the data in XDM format strips out unnecessary items in your XML data including insignificant whitespace, XML prolog, and DTDs; and

(3) Most importantly XDM format allows you to use XML schemas to constrain the structure and content of your XML data and to apply data types to your XML data.

I consider this last point the most important. XML schemas allow you to define very flexible structures for your XML data, and provides a guarantee that XML data that doesn’t conform is rejected. In SQL terms XML schemas provide functionality that is comparable to [a subset of] DDL statements like CREATE TABLE as well as check constraints. XML schema is even a small step ahead of T-SQL since it supports regular expression constraints!

XML schemas also let you apply data types to your data. This is important if you want to guarantee that only numbers get stored in integer nodes, for instance. This lets the XQuery processor to optimize calculations on your XML content.

Finally, data typing increases performance and ensures comparisons produce results that make sense. As an example, when you compare “9” to “10” do you expect “9” to be less than “10” or vice versa? It depends on the data types. If the data types are integers you expect (9 < 10) is true; for strings, the character “9” comes after “1” so (“10” < “9”) is true. Can you imagine comparisons in SQL if every column of every table was varchar data? (Yes, I know some people actually do have to deal with this…)

As you explore the XML functionality available in SQL Server 2005, 2008 and beyond, be sure to keep XDM in mind and consider using XML Schema to constrain your data. A very good friend of mine, MVP Jacob Sebastian, has written an excellent (and free!) e-book about XML Schema called The Art of XSD. This book is an invaluable tool for creating XML Schema Collections on SQL Server.

More information is available on XML Schema Collections (SQL Server’s XML Schema implementation) at http://msdn.microsoft.com/en-us/library/ms176009.aspx.


0 Responses to “Chapter 10 – Introduction to XQuery on SQL Server”

  1. Leave a Comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s


Enter your email address to subscribe to this blog and receive notifications of new posts by email.

Join 11 other followers