Friday, June 29, 2018

Oracle SQL Developer Data Modeler and Microsoft SQL Server

Even though my target database in my new job is Microsoft SQL Server rather than Oracle RDBMS, I still use an Oracle tool for a key part of my work - data modeling.  My tool of preference is Oracle SQL Developer Data Modeler (SDDM).  While there are other data modeling tools out there, and some (ERWin and TOAD Data Modeler are two I've used) are very good, I prefer SDDM.
  • It is free to use. (Thank you, Oracle!)
  • I haven't been allowed administrative authority on my government issued laptop, so I can't install software.  But to install SDDM, you just unzip - which I CAN do.
  • It can reverse engineer from SQL Server - all you have to do is point it at a SQL Server JDBC driver.
  • It supports DDL generation for databases other than Oracle RDBMS - including SQL Server.
  • It understands the difference between Logical, Relational, and Physical Modeling.  Some tools just model databases (relational) - tables, columns, foreign keys.  They call it Entity/Relationship diagramming (logical) but it isn't really.
  • Whatever SDDM can't do, you can probably script - more on this in a later post.
  • There is good information and support available even if you don't have a contract with Oracle Support.  Some resources:
    • Jeff Smith - Oracle's Product manager for the tool.
    • Heli from Finland - Author of an excellent book about SDDM.
    • Dave Schleis - expert on SDDM scripting, especially with Groovy
    • Kent Graziano - especially good at using SDDM to design Data Warehouses.
    • SDDM Forum - The people listed above, other users and even SDDM's developers, especially Phillip Stoyanov answer your questions.

Using SDDM with SQL Server

So, as I said, SDDM works perfectly well with a SQL Server database. All you need to do is register a JDBC driver for it.  While Microsoft does have a JDBC driver that you can download, and it does work with SDDM's stand-alone version, SDDM also comes bundled with Oracle SQL Developer and that DOESN'T work with Microsoft's driver.  Neither does Eclipse, which I'm currently using as my primary IDE.  So instead, I'm using the free open source jTDS JDBC driver for SQL Server and Sybase (Sybase is a close cousin to SQL Server).  You can get jTDS here: http://jtds.sourceforge.net/.  Just download and unzip.  Now open SDDM and select Tools/Preferences and find "Third Party JDBC Drivers":
Or the steps are similar if you are using SDDM embedded in SQL Developer.  Now when you configure a connection to a database, you will have an additional tab for a connection type of SQL Server:
Fill in the blanks, and off you go: you can now import from the SQL Server Database Dictionary to reverse engineer your database into a Relational Design.  Or if you are using SDDM embedded in SQL Developer, you can open a connection to your SQL Server database in SQL Developer, and do many of the normal SQL Developer tasks.  Just one warning - SQL Developer doesn't really understand T-SQL, Microsoft's counterpart to both SQL*Plus style scripting and PL/SQL.  So there are limitations to what you can do in the SQL Worksheet.  But one thing you CAN do is open a relational diagram in SDDM and drag objects onto it from the SQL Server connection in SQL Developer.

Reporting

SDDM has a pretty good reporting capability and I'm only beginning to learn its full capabilities.  But if you want to write SQL queries or reports against your data model, you will want to export your model to a Reporting Repository.  The Reporting Repository needs to be in an Oracle database - sorry.  This is why I asked my system administrator to set up a server with an Oracle Express Edition (Oracle XE) database, just for my private use for SDDM reports.  Like SDDM, Oracle XE is free to use, and while limited, it has no trouble hosting an SDDM reporting repository.

SQL Server DDL

You can also create a Physical Database design for SQL Server and SDDM will create DDL appropriate for your database - it understands the differences.  Only problem I've had with it, is that I usually need to manually edit the output a bit.  The worst problem is that the "GO" command needs to appear on a separate line and sometimes the newline is missing.  Also, SDDM tends to output UNIX/Linux style newlines, not the standard carriage return/linefeeds expected by Windows.  There is a little bit of a problem between the Schema in Relational Design and its implementation in Physical design - Physical doesn't always pick up the mapping you specified. But this may be something I'm doing wrong.

So - if Oracle isn't a dirty word for you SQL Server database architects, you may want to check out Oracle SQL Developer Data Modeler.  It is a great, cost-free way to design your data model.  I plan to write another post soon to talk about some SDDM scripting in Dave Schleis's favorite language: Groovy.

Friday, June 22, 2018

Data Conversion in Oracle vs. SQL Server

In my last post, I talked about multiple databases in a single instance, and gave SQL Server points for doing this better, and doing it without the extra cost associated with Oracle 12c/18 Multi-Tenant option.  Today, I want to talk about something Oracle does better than SQL Server - data conversions.

By data conversions, I mean mostly conversions from strings that are human readable to numbers and dates and back to strings.  If you are one of those people who stores everything as strings (CHAR or VARCHAR) STOP IT.  This is a BAD idea in both SQL Server and Oracle.  But that's a subject for another post.

First of all, both databases will convert strings to numbers and dates implicitly if you don't include a conversion function.  This works fine - until it gives you unexpected results or awful performance.  I strongly recommend that you use a conversion function EVERY TIME you start with a number or date and want to show it on a report or screen in human readable form and EVERY TIME your get input from users and need to store it in its proper datatype.  This way, the data will be converted in a predictable and efficient way.  So what are the conversion functions?

ANSI SQL Conversion

Both Oracle and SQL Server are pretty compliant with the ANSI SQL standard, and so both support the ANSI SQL CAST function.  The syntax of CAST is:
CAST (expression AS datatype)
You can optionally include a size with the datatype, such as the length of a VARCHAR, or the precision and scale of a number.  This works well in both databases for number conversions between floating point and integer datatypes, or from string to number - just understand that if the string doesn't contain a valid number, CAST will fail.  SQL Server is nice in that it includes a TRY_CAST function that will return NULL for an invalid number rather than fail.  Since CAST is ANSI standard, it is certainly preferred if you are writing code that may need to work on any database that meets the standard.  But while CAST works for string to date and date to string conversions, the format of the string must be ANSI standard date format, which is the same as ISO_8601.  Not so good when you need to convert to or from MM/DD/YYYY - your expression would need to substring the original date to put it in ANSI format.

Proprietary Conversion

So Oracle and SQL Server both have proprietary non-standard functions to convert between strings and dates more flexibly.  SQL Server has:
CONVERT (datatype, expression, style)
Just as with CAST, the datatype parameter can include size information.  The style parameter is an integer that selects the data format of the input or output data from a table in the documentation - CAST and Convert.  Now, I will admit that this list includes the formats that I use the most frequently for dates.  For instance, style 1 is MM/DD/YY, and 101 is MM/DD/YYYY.  But it doesn't include full month names in the selected natural language, such as January 1, 2018 in English or 1 Enero 2018 in Spanish.

Oracle has separate functions for converting a string to a date or a date to a string:
TO_DATE (expression, format, nlsparam)
TO_CHAR (expression, format, nlsparam)
The format parameter is optional and defaults to the default format set up by your DBA in your database.  nlsparam is also optional and defaults to the NLS (natural language system) settings set by your DBA in your database.  I almost never let format default, but almost always let nlsparam default.  However, if you need to support languages other than English, this is what would let it appear as January for English speakers or Enero for Spanish.  Documentation for TO_DATE, and TO_CHAR includes a list of formatting codes.

To me, the formats provided by Oracle are both more flexible and more clear.  Which is easier to remember and easier to read:
CONVERT (VARCHAR(10), my_date, 101) - SQL Server
TO_CHAR (my_date, 'MM/DD/YYYY') - Oracle
Advantage - Oracle.

By the way, SQL Server's CONVERT function also works for numbers, but doesn't have a significant advantage over the CAST function for numbers.  Oracle has TO_NUMBER and TO_CHAR functions for numbers with the same parameters as TO_DATE and TO_CHAR for dates, and more flexible formats for strings converted to and from numbers.
Advantage - Oracle.

Monday, June 4, 2018

Databases vs. Instances

As a long-time Oracle developer and administrator, one of the most interesting differences that I've found as a newbie SQL Server developer and administrator is in the simple basic term, database. 

In simple terms, a database is a set of database files in permanent storage that share a data dictionary, logs and control files.  An instance is a set of running processes that maintain a database and its control files and logs.  While there are in-memory databases where the database is in the server's memory, rather than maintained in permanent storage while its associated instance is running, eventually even this kind of database does get written to permanent storage.

Until Oracle 12c with its multi-tenant option, a database and an instance in Oracle were pretty close to the same thing.  With Oracle Real Application Clusters (RAC), you might have several instances on separate servers accessing a single database, but you would not have several databases under the same instance.  It would not be unusual to run several Oracle databases on the same server, but each database would be maintained by its own instance (or instances in the case of RAC).  Instances might share a network listener for client connections, but the client would indicate the instance to which it wanted to connect, and therefore which database.

But in SQL Server, and a few other relational databases, for example, PostgreSQL, a single instance can run several databases.  There are also some required databases managed by an instance, such as the database that contains the shared data dictionary.  Each database can have its own schemas, users, tables and other database objects.  None of these need to be unique except within a database.  A full unique name for a database object would be databaseName.schemaName.objectName.

This came in handy recently when I needed to create a new test database identical to our existing development database with three separate schemas.  My Oracle-trained brain initially was thinking that I needed a new SQL Server instance, since I didn't have an extra server or database license.  This is possible, but not very common, and it requires the original SQL Server installation media.  But I was reminded that I could just create a new database in the existing instance.  It could have the same schemas and database objects, but a different database name.  And it was just as easy to configure the test version of the application to connect to a different database in the same instance as to a database with the same name in a different instance.

I had two choices available for creating my test database.  It is very easy to use SQL Server Management Studio to back up my development database, and then restore it into a new test database.  But that would also copy the development data instead of creating the tables empty.  Instead, this proved to be a good opportunity to test my DDL scripts.  I have one script for each object to be created, and a master script to call each in the correct order.  This also executes scripts that load data into reference and lookup tables - remember that I consider data for these tables to be part of the source code for the database.

So, chalk this one up as an advantage for SQL Server.  Multi-tenant Option in Oracle 12c (and its successor, Oracle 18) has similar advantages - and may even be a little better if you need to move tenant databases around.  But it is an extra cost option, while multiple databases is a standard part of SQL Server.