Friday, June 29, 2018

Oracle SQL Developer Data Modeler and Microsoft SQL Server

Even though my target database in my new job is Microsoft SQL Server rather than Oracle RDBMS, I still use an Oracle tool for a key part of my work - data modeling.  My tool of preference is Oracle SQL Developer Data Modeler (SDDM).  While there are other data modeling tools out there, and some (ERWin and TOAD Data Modeler are two I've used) are very good, I prefer SDDM.
  • It is free to use. (Thank you, Oracle!)
  • I haven't been allowed administrative authority on my government issued laptop, so I can't install software.  But to install SDDM, you just unzip - which I CAN do.
  • It can reverse engineer from SQL Server - all you have to do is point it at a SQL Server JDBC driver.
  • It supports DDL generation for databases other than Oracle RDBMS - including SQL Server.
  • It understands the difference between Logical, Relational, and Physical Modeling.  Some tools just model databases (relational) - tables, columns, foreign keys.  They call it Entity/Relationship diagramming (logical) but it isn't really.
  • Whatever SDDM can't do, you can probably script - more on this in a later post.
  • There is good information and support available even if you don't have a contract with Oracle Support.  Some resources:
    • Jeff Smith - Oracle's Product manager for the tool.
    • Heli from Finland - Author of an excellent book about SDDM.
    • Dave Schleis - expert on SDDM scripting, especially with Groovy
    • Kent Graziano - especially good at using SDDM to design Data Warehouses.
    • SDDM Forum - The people listed above, other users and even SDDM's developers, especially Phillip Stoyanov answer your questions.

Using SDDM with SQL Server

So, as I said, SDDM works perfectly well with a SQL Server database. All you need to do is register a JDBC driver for it.  While Microsoft does have a JDBC driver that you can download, and it does work with SDDM's stand-alone version, SDDM also comes bundled with Oracle SQL Developer and that DOESN'T work with Microsoft's driver.  Neither does Eclipse, which I'm currently using as my primary IDE.  So instead, I'm using the free open source jTDS JDBC driver for SQL Server and Sybase (Sybase is a close cousin to SQL Server).  You can get jTDS here:  Just download and unzip.  Now open SDDM and select Tools/Preferences and find "Third Party JDBC Drivers":
Or the steps are similar if you are using SDDM embedded in SQL Developer.  Now when you configure a connection to a database, you will have an additional tab for a connection type of SQL Server:
Fill in the blanks, and off you go: you can now import from the SQL Server Database Dictionary to reverse engineer your database into a Relational Design.  Or if you are using SDDM embedded in SQL Developer, you can open a connection to your SQL Server database in SQL Developer, and do many of the normal SQL Developer tasks.  Just one warning - SQL Developer doesn't really understand T-SQL, Microsoft's counterpart to both SQL*Plus style scripting and PL/SQL.  So there are limitations to what you can do in the SQL Worksheet.  But one thing you CAN do is open a relational diagram in SDDM and drag objects onto it from the SQL Server connection in SQL Developer.


SDDM has a pretty good reporting capability and I'm only beginning to learn its full capabilities.  But if you want to write SQL queries or reports against your data model, you will want to export your model to a Reporting Repository.  The Reporting Repository needs to be in an Oracle database - sorry.  This is why I asked my system administrator to set up a server with an Oracle Express Edition (Oracle XE) database, just for my private use for SDDM reports.  Like SDDM, Oracle XE is free to use, and while limited, it has no trouble hosting an SDDM reporting repository.

SQL Server DDL

You can also create a Physical Database design for SQL Server and SDDM will create DDL appropriate for your database - it understands the differences.  Only problem I've had with it, is that I usually need to manually edit the output a bit.  The worst problem is that the "GO" command needs to appear on a separate line and sometimes the newline is missing.  Also, SDDM tends to output UNIX/Linux style newlines, not the standard carriage return/linefeeds expected by Windows.  There is a little bit of a problem between the Schema in Relational Design and its implementation in Physical design - Physical doesn't always pick up the mapping you specified. But this may be something I'm doing wrong.

So - if Oracle isn't a dirty word for you SQL Server database architects, you may want to check out Oracle SQL Developer Data Modeler.  It is a great, cost-free way to design your data model.  I plan to write another post soon to talk about some SDDM scripting in Dave Schleis's favorite language: Groovy.

Friday, June 22, 2018

Data Conversion in Oracle vs. SQL Server

In my last post, I talked about multiple databases in a single instance, and gave SQL Server points for doing this better, and doing it without the extra cost associated with Oracle 12c/18 Multi-Tenant option.  Today, I want to talk about something Oracle does better than SQL Server - data conversions.

By data conversions, I mean mostly conversions from strings that are human readable to numbers and dates and back to strings.  If you are one of those people who stores everything as strings (CHAR or VARCHAR) STOP IT.  This is a BAD idea in both SQL Server and Oracle.  But that's a subject for another post.

First of all, both databases will convert strings to numbers and dates implicitly if you don't include a conversion function.  This works fine - until it gives you unexpected results or awful performance.  I strongly recommend that you use a conversion function EVERY TIME you start with a number or date and want to show it on a report or screen in human readable form and EVERY TIME your get input from users and need to store it in its proper datatype.  This way, the data will be converted in a predictable and efficient way.  So what are the conversion functions?

ANSI SQL Conversion

Both Oracle and SQL Server are pretty compliant with the ANSI SQL standard, and so both support the ANSI SQL CAST function.  The syntax of CAST is:
CAST (expression AS datatype)
You can optionally include a size with the datatype, such as the length of a VARCHAR, or the precision and scale of a number.  This works well in both databases for number conversions between floating point and integer datatypes, or from string to number - just understand that if the string doesn't contain a valid number, CAST will fail.  SQL Server is nice in that it includes a TRY_CAST function that will return NULL for an invalid number rather than fail.  Since CAST is ANSI standard, it is certainly preferred if you are writing code that may need to work on any database that meets the standard.  But while CAST works for string to date and date to string conversions, the format of the string must be ANSI standard date format, which is the same as ISO_8601.  Not so good when you need to convert to or from MM/DD/YYYY - your expression would need to substring the original date to put it in ANSI format.

Proprietary Conversion

So Oracle and SQL Server both have proprietary non-standard functions to convert between strings and dates more flexibly.  SQL Server has:
CONVERT (datatype, expression, style)
Just as with CAST, the datatype parameter can include size information.  The style parameter is an integer that selects the data format of the input or output data from a table in the documentation - CAST and Convert.  Now, I will admit that this list includes the formats that I use the most frequently for dates.  For instance, style 1 is MM/DD/YY, and 101 is MM/DD/YYYY.  But it doesn't include full month names in the selected natural language, such as January 1, 2018 in English or 1 Enero 2018 in Spanish.

Oracle has separate functions for converting a string to a date or a date to a string:
TO_DATE (expression, format, nlsparam)
TO_CHAR (expression, format, nlsparam)
The format parameter is optional and defaults to the default format set up by your DBA in your database.  nlsparam is also optional and defaults to the NLS (natural language system) settings set by your DBA in your database.  I almost never let format default, but almost always let nlsparam default.  However, if you need to support languages other than English, this is what would let it appear as January for English speakers or Enero for Spanish.  Documentation for TO_DATE, and TO_CHAR includes a list of formatting codes.

To me, the formats provided by Oracle are both more flexible and more clear.  Which is easier to remember and easier to read:
CONVERT (VARCHAR(10), my_date, 101) - SQL Server
TO_CHAR (my_date, 'MM/DD/YYYY') - Oracle
Advantage - Oracle.

By the way, SQL Server's CONVERT function also works for numbers, but doesn't have a significant advantage over the CAST function for numbers.  Oracle has TO_NUMBER and TO_CHAR functions for numbers with the same parameters as TO_DATE and TO_CHAR for dates, and more flexible formats for strings converted to and from numbers.
Advantage - Oracle.

Monday, June 4, 2018

Databases vs. Instances

As a long-time Oracle developer and administrator, one of the most interesting differences that I've found as a newbie SQL Server developer and administrator is in the simple basic term, database. 

In simple terms, a database is a set of database files in permanent storage that share a data dictionary, logs and control files.  An instance is a set of running processes that maintain a database and its control files and logs.  While there are in-memory databases where the database is in the server's memory, rather than maintained in permanent storage while its associated instance is running, eventually even this kind of database does get written to permanent storage.

Until Oracle 12c with its multi-tenant option, a database and an instance in Oracle were pretty close to the same thing.  With Oracle Real Application Clusters (RAC), you might have several instances on separate servers accessing a single database, but you would not have several databases under the same instance.  It would not be unusual to run several Oracle databases on the same server, but each database would be maintained by its own instance (or instances in the case of RAC).  Instances might share a network listener for client connections, but the client would indicate the instance to which it wanted to connect, and therefore which database.

But in SQL Server, and a few other relational databases, for example, PostgreSQL, a single instance can run several databases.  There are also some required databases managed by an instance, such as the database that contains the shared data dictionary.  Each database can have its own schemas, users, tables and other database objects.  None of these need to be unique except within a database.  A full unique name for a database object would be databaseName.schemaName.objectName.

This came in handy recently when I needed to create a new test database identical to our existing development database with three separate schemas.  My Oracle-trained brain initially was thinking that I needed a new SQL Server instance, since I didn't have an extra server or database license.  This is possible, but not very common, and it requires the original SQL Server installation media.  But I was reminded that I could just create a new database in the existing instance.  It could have the same schemas and database objects, but a different database name.  And it was just as easy to configure the test version of the application to connect to a different database in the same instance as to a database with the same name in a different instance.

I had two choices available for creating my test database.  It is very easy to use SQL Server Management Studio to back up my development database, and then restore it into a new test database.  But that would also copy the development data instead of creating the tables empty.  Instead, this proved to be a good opportunity to test my DDL scripts.  I have one script for each object to be created, and a master script to call each in the correct order.  This also executes scripts that load data into reference and lookup tables - remember that I consider data for these tables to be part of the source code for the database.

So, chalk this one up as an advantage for SQL Server.  Multi-tenant Option in Oracle 12c (and its successor, Oracle 18) has similar advantages - and may even be a little better if you need to move tenant databases around.  But it is an extra cost option, while multiple databases is a standard part of SQL Server.

Monday, April 16, 2018

About Indexing Character Fields

I had created a few tables in my SQL Server database like this:
CREATE TABLE ref_status(
  status_code VARCHAR(10) NOT NULL,
  description VARCHAR(80) NOT NULL,
  CONSTRAINT ref_status_pk PRIMARY KEY (status_code)
  stuff_id    INT IDENTITY,
  stuff_date  DATETIME,
  stuff_desc  VARCHAR(80),
  status_code VARCHAR(10),
  CONSTRAINT stuff_pk PRIMARY KEY (stuff_id),
  CONSTRAINT stuff_status_fk FOREIGN KEY (status_code)
    REFERENCES ref_status (status_code)
I inserted a few status codes into ref_status, 'NEW', 'IN PROGRESS', and 'PRODUCTION'.  Then the developers got busy, and wrote a user interface to insert data into "stuff".  After testing, there were about 10 rows in "stuff".

Then the lead developer asked me, "Can you update the status codes to make them read 'New', 'In progress', and 'Production' instead?"  "Sure", I said.  Because of the foreign key, I felt I needed to do this in three steps.
  1. Insert new rows into "ref_status"  with the new status codes.
  2. Update "stuff" to change the existing status codes to the values of the new codes.
  3. Delete the old status codes from "ref_status".
But step 1 failed with a duplicate key error.  What's going on?  It turns out that in SQL Server, by default, indexes on character fields are case insensitive.  So 'NEW' and 'New' are duplicate keys (and so would be 'new' or even 'neW').  The correct way to do this was simply:
  1. Update the status_code column in one table.
  2. Update the status_code column in the other table.
It didn't matter which table was updated first, and it didn't cause a foreign key problem because the index on "ref_status" DIDN'T change.

Contrast with Oracle

Of course, the reason my first crack at this failed was that in Oracle - remember that I came from the Oracle world - the indexes are case sensitive.  'NEW' and 'New' are two different keys.  Can you make the indexes case insensitive, like in SQL Server?  Yes, BUT you have to do it with a function-based index.  That means that you can't make a case-insensitive primary key constraint on REF_STATUS.  You have to make an unique function based index like this:
   (NLSSORT (status_code, 'NLS_SORT=BINARY_CI') );
Yet to point a foreign key at REF_STATUS, it needs to have an unique or primary key constraint.  Well, I haven't tried this, but here is what I THINK will work.

Even though, by default, creating an unique constraint will automatically create an unique index by the same name, you CAN create the index first.  Then the unique constraint will use the pre-existing index instead of creating a new one.  So after creating the index as described above, do:
ALTER TABLE ref_status ADD
  CONSTRAINT ref_status_uk UNIQUE (status_code);
That way, Oracle will use the existing index to enforce uniqueness.  This much, I am pretty sure works.  Then, theoretically, you can create the foreign key, STUFF_STATUS_FK from STUFF to REF_STATUS.  If anyone wants to try this, please let me know in the comments if it worked.

Thursday, March 29, 2018

More on my Oracle to SQL Server Transition

I'm continuing to adjust to this new world of Microsoft SQL Server.  Remembering my great experiences in the Oracle user community, I've joined PASS, which is the international user group for SQL Server users.  One thing PASS offers is "SQL Saturday" which are one day (yes, on Saturday) training events held by local groups all over the world.  There are a few SQL Saturdays being held somewhere almost every weekend.  On March 24, I drove the 100 miles from my home to Glen Allen, VA (northwest of Richmond) for the Richmond SQL Saturday.  It was great, I got to meet a few people, and learned at least two things that will be useful in the work I'm doing right now.

About Table Storage and Indexes

Like Oracle, it is very important to have a Primary Key constraint on all of your tables in SQL Server.  By default, SQL Server wants to make this a clustered index.  "What is a clustered index?" my Oracle expert friends will ask.  In a clustered index, the data from the row is stored with the index entry.  That means that a table may only have one clustered index.  Oracle experts will recognize this as essentially the same as an Index-Organized Table (IOT).  IOTs aren't used that much in Oracle, but maybe they should be.  If you often query a table by its primary key, or often do UPDATE or DELETE by primary key, you might get a good performance boost by making it an IOT.

At SQL Saturday, Richmond, I went to a great presentation about indexing, "Indexing for Performance" by Jeffrey Garbus.  Mr. Garbus described very well the access path used for various WHERE conditions, with and without indexes, and differences between using a clustered and a non-clustered index.  A non-clustered index is a B-tree index in SQL Server, just like a normal index in Oracle and is used in a similar way.  In both databases, the optimizer decides whether to use an index or to do a full table scan according to similar criteria, and in fact, may decide not to use an index in situations such as the likelihood that more than a certain percentage of the rows will be returned by the query.

One very important difference between an IOT in Oracle and a table with a clustered index in SQL Server is that an IOT is always indexed with its primary key. While the primary key gets an unique clustered index by default in SQL Server, it is possible to specify that the primary key should get an unique non-clustered index.  This leaves it open to create a clustered index on another key, and that index need not be unique.  Since a clustered index makes the database store the rest of the data with the index, only one clustered index may be created on a table.  Mr. Garbus explained why, for certain tables, it might be better to cluster on a parent key.  For instance, in an Order-Entry application, line items for an order are rarely accessed any other way than by the parent order_id.  Therefore, it is often a good idea to cluster the order_line_item table on this foreign key, rather than its primary key (order_line_item_id).

This reminded me of another Oracle feature that I believe may be under-used.  While IOTs can only use a table's primary key, you can actually store master table data like the "order" table physically next to its detail rows in "order_line_item".  Oracle has clustering too, but the syntax and implementation is different.  Check out the CREATE CLUSTER command and its use in the Database Concepts Guide.

Thursday, January 25, 2018

From Oracle Expert to SQL Server Newbie - First Impressions

Oops, I've done it again - I left you readers hanging when I published the first part of two on generating code for your lookup/reference/code tables.  I promise to get back to it.

But in the meantime:
I have a new job.  Yeah, it was pretty sudden - finished my old job at the end of the year, and jumped directly into the new one at the beginning of the new year.  And the biggest shock is that this old Oracle DBA and Database Architect is going to be designing a database to be implemented in Microsoft SQL Server.  No, I didn't lie about my credentials, but I've worked with this boss before, and he said, "Oh, you'll do fine.  You're a smart guy.  And SQL is SQL, right? RDBMS is RDBMS, right?"

So, here I am, the newbie, posing as an expert.  I've been learning all I can about SQL Server, and trying things out, watching tutorials (Microsoft has some really good video lessons, designed especially for the Oracle DBA transitioning.)  Here are some first impressions:

  • SQL is SQL.  Yes, there are differences, but the basics of the SELECT, INSERT, UPDATE, DELETE, and most of the DDL commands too, are pretty close.
  • T-SQL is NOT PL/SQL.  Think of Transact-SQL as more like a SQL*Plus script with some flow control statements like IF/ELSE.  Some of it is nice, like no difference between a substitution variable and a language variable.  I think (not sure) that a T-SQL variable does act as a bind variable where PL/SQL variables would, but you can do things that PL/SQL would have to put inside an EXECUTE IMMEDIATE.  Cool.
  • I miss PL/SQL packages.
  • All triggers are statement level triggers.  This means that you write triggers very differently. Hard to get used to.  That doesn't mean you can't do row-level stuff - instead of getting a pseudo-record with the contents of the pre-DML row (the :OLD record) and one with the contents of the post-DML row (the :NEW record) you get pseudo-tables containing the old and new rows.  I don't know for sure, but I think this means that you don't deal with mutating table problems.
  • The real kicker is that most triggers are AFTER statement triggers. If you want to do BEFORE stuff, you write an INSTEAD OF trigger.  In Oracle, INSTEAD OF is just for views.  Just like Oracle, the INSTEAD OF trigger needs to do the DML on the underlying table(s).
  • There are two SQL Developer-like GUI tools - the SQL Server Management Studio, and SQL Server Data Tools for Visual Studio.  These are nice, but I wish they had a good SQL formatting tool.
  • SQL Developer CAN talk to a SQL Server database, which is nice. But it doesn't really understand T-SQL, so it can have problems with it.  I did use its formatting tool on some SELECTs, however, and it worked pretty well.  Also, if you export a table as INSERTs - it writes them as Oracle INSERTs - doesn't work for dates because SQL Server doesn't have TO_DATE.
  • Oracle has nothing like the SQL Server Integration Server - this is nice but TOTALLY new to me.
  • SQL Server Reporting Server and its report building tool are MUCH better than SQL Reports from Oracle.  But then, I have always hated Oracle's reporting tool.

Thursday, August 24, 2017

Code Generation for Lookup Tables - Part 1

In my last post, I outlined some of my thoughts about lookup tables, and how they should be structured and organized. Now I want to write some code generators, to make it easier to create and manage lookup tables.  Since I want lookup tables to have the same columns and follow some naming conventions, why should I write the same basic code every time I create a new lookup table?  While my thoughts about lookup tables should probably apply to just about any relational database, stored procedure code is not as portable.  I work in an Oracle database, so all the code I'm going to show you is written in PL/SQL.  But you may be able to do something similar in the procedural language for your database.

An Index to the Lookup Tables

One thing I like to have is a table to store a list of all the lookup tables in the schema.  This isn't absolutely necessary, if you use a naming convention for your lookup table names, since you could write a view of a catalog view, like Oracle's USER_TABLES.  However, I'm going to create a real table named LOOKUP_TABLES:
CREATE TABLE lookup_tables (
  Table_name VARCHAR2(30),
  Decription VARCHAR2(255)

So my code generator is going to insert a row into this table anytime that I create a new lookup table, and delete that row anytime I drop a lookup table.  Once you start using your lookup tables, and start having foreign keys that reference them, you probably aren't going to drop any lookup tables, but you might during development, so I'll include a drop.

The Code Generator Package

In PL/SQL, I tend to write packages for all of my stored procedures, with few or no stand-alone functions or procedures.  While I'm going to show the procedures separately, you may assume that they are all inside a package named LOOKUP_UTIL.  There are two special characteristics of LOOKUP_UTIL.
In all of my databases, I have created a schema named COMMON to hold all database objects that are shared among many application schemas and users.  Most of these have EXECUTE or SELECT granted to PUBLIC, and many have public synonyms.  Since many applications may generate lookup tables, LOOKUP_UTIL would correctly be created in the COMMON schema.  By the way, some lookup tables might be shared between applications – my lookup table for U.S. States and Territories would be a prime example of this.  It might also be best created in COMMON.
 The second characteristic of LOOKUP_UTIL is that it is an invoker's rights package, using the "AUTHID CURRENT_USER" clause.  That means that it executes with the database privileges of the user (or schema) that called it.  It is going to create tables and other database objects – it should only do that in the current user's schema, and using privileges granted to that user, not COMMON, the owner of the package.  Many of the packages in the COMMON schema share this characteristic.

Creating and Dropping Lookup Tables

Here is the code for creating a new lookup table.  It creates the table with the standard columns as I explained in my previous post and adds a row to LOOKUP_TABLES.  The same description to be stored in LOOKUP_TABLES is also used as the table comment.  There are a few things that are here for demonstration purposes that might not be in real code.  For instance, I am using DBMS_OUTPUT to show the generated code in case the code gets an error upon execution.  In the real code, these would probably be in a call to a logging package.  I use LOGGER (, and I highly recommend it.  If you have users who create lookup tables in schemas other than their own, with the CREATE ANY privilege, you may want to add an "owner" parameter.
PROCEDURE new_lookup_table (
  table_name        lookup_tables.table_name%TYPE,
  description       lookup_tables.description%TYPE DEFAULT NULL,
  tablespace_name   user_tablespaces.tablespace_name%TYPE DEFAULT NULL
  v_table_name  lookup_tables.table_name%TYPE;
  create_cmd    VARCHAR2 (2000);
  IF table_name IS NULL THEN
    raise_application_error (
      'Table name of new lookup table may not be null.'
  ELSIF UPPER (table_name) LIKE 'LKUP_%' THEN
    v_table_name  := UPPER (table_name);
    v_table_name  := 'LKUP_' || SUBSTR (UPPER (table_name), 1, 25);

    INSERT INTO   lookup_tables (table_name, description)
         VALUES   (v_table_name, description);
      raise_application_error (
        'Lookup table named ' || v_table_name || ' already exists.'

  create_cmd      :=
       'CREATE TABLE '
    || v_table_name
    || '(
  code              VARCHAR2(10) NOT NULL,
  short_description VARCHAR2(30),
  description       VARCHAR2(255),
  display_sequence  NUMBER,
  active            CHAR(1) DEFAULT ''Y'' NOT NULL,
    || SUBSTR (v_table_name, 1, 27)
    || '_PK PRIMARY KEY (code))'
    || '
    || CASE
         WHEN tablespace_name IS NOT NULL THEN
TABLESPACE ' || tablespace_name

    EXECUTE IMMEDIATE create_cmd;
      DBMS_OUTPUT.put_line (create_cmd);

    'COMMENT ON TABLE ' || v_table_name || ' IS ''' || description || '''';
END new_lookup_table;

The code for dropping a lookup table is similar.  Note that my version includes the CASCADE CONSTRAINTS clause, so that if there are foreign keys referencing the lookup table being dropped, those constraints will be dropped too.  You might prefer not to use this clause and allow the drop to fail if there are references to the table.
PROCEDURE drop_lookup_table (table_name lookup_tables.table_name%TYPE)
  v_table_name  lookup_tables.table_name%TYPE;
  drop_cmd      VARCHAR2 (2000);
  IF table_name IS NULL THEN
    raise_application_error (
      'Table name of lookup table to be dropped may not be null.'
  ELSIF UPPER (table_name) LIKE 'LKUP_%' THEN
    v_table_name  := UPPER (table_name);
    v_table_name  := 'LKUP_' || SUBSTR (UPPER (table_name), 1, 25);

  DELETE FROM   lookup_tables
        WHERE   UPPER (table_name) = v_table_name;

    raise_application_error (
         'Table named '
      || v_table_name
      || ' does not exist in the lookup table catalog.'

  drop_cmd  := 'DROP TABLE ' || v_table_name || ' CASCADE CONSTRAINTS';

      DBMS_OUTPUT.put_line (drop_cmd);
END drop_lookup_table;

Next Time

That's enough generation code for now.  I'll continue in my next post with some reasons for simulating a single table of tables with a view.  I'll show you the code for generating that view, and for generating an API to make the view updatable.