Thursday, March 29, 2018

More on my Oracle to SQL Server Transition

I'm continuing to adjust to this new world of Microsoft SQL Server.  Remembering my great experiences in the Oracle user community, I've joined PASS, which is the international user group for SQL Server users.  One thing PASS offers is "SQL Saturday" which are one day (yes, on Saturday) training events held by local groups all over the world.  There are a few SQL Saturdays being held somewhere almost every weekend.  On March 24, I drove the 100 miles from my home to Glen Allen, VA (northwest of Richmond) for the Richmond SQL Saturday.  It was great, I got to meet a few people, and learned at least two things that will be useful in the work I'm doing right now.

About Table Storage and Indexes

Like Oracle, it is very important to have a Primary Key constraint on all of your tables in SQL Server.  By default, SQL Server wants to make this a clustered index.  "What is a clustered index?" my Oracle expert friends will ask.  In a clustered index, the data from the row is stored with the index entry.  That means that a table may only have one clustered index.  Oracle experts will recognize this as essentially the same as an Index-Organized Table (IOT).  IOTs aren't used that much in Oracle, but maybe they should be.  If you often query a table by its primary key, or often do UPDATE or DELETE by primary key, you might get a good performance boost by making it an IOT.

At SQL Saturday, Richmond, I went to a great presentation about indexing, "Indexing for Performance" by Jeffrey Garbus.  Mr. Garbus described very well the access path used for various WHERE conditions, with and without indexes, and differences between using a clustered and a non-clustered index.  A non-clustered index is a B-tree index in SQL Server, just like a normal index in Oracle and is used in a similar way.  In both databases, the optimizer decides whether to use an index or to do a full table scan according to similar criteria, and in fact, may decide not to use an index in situations such as the likelihood that more than a certain percentage of the rows will be returned by the query.

One very important difference between an IOT in Oracle and a table with a clustered index in SQL Server is that an IOT is always indexed with its primary key. While the primary key gets an unique clustered index by default in SQL Server, it is possible to specify that the primary key should get an unique non-clustered index.  This leaves it open to create a clustered index on another key, and that index need not be unique.  Since a clustered index makes the database store the rest of the data with the index, only one clustered index may be created on a table.  Mr. Garbus explained why, for certain tables, it might be better to cluster on a parent key.  For instance, in an Order-Entry application, line items for an order are rarely accessed any other way than by the parent order_id.  Therefore, it is often a good idea to cluster the order_line_item table on this foreign key, rather than its primary key (order_line_item_id).

This reminded me of another Oracle feature that I believe may be under-used.  While IOTs can only use a table's primary key, you can actually store master table data like the "order" table physically next to its detail rows in "order_line_item".  Oracle has clustering too, but the syntax and implementation is different.  Check out the CREATE CLUSTER command and its use in the Database Concepts Guide.