Wednesday, October 3, 2018

Source Control for, and Deployment of, Your Database

I just read Jeff Smith's latest blog entry about exporting the DDL for your Oracle database with Oracle SQL Developer.  You can do a similar export with TOAD - both the original version for Oracle and the version for SQL Server.  Other tools have this ability too.  Jeff hints at why you might want to use this feature - to put your database under source control.

I have had the experience of walking into a shop where my predecessor database designer/developer had not done this, and I was shocked.  The application developers - the Java programmers, the JavaScript framework developers, the UI developers - were all very careful to keep their source code in a source code repository, either Subversion or Git, so I did not understand why the database guy/gal hadn't done the same with the code to create the database objects used by the application.  DDL is SOURCE CODE, and it belongs in the source control system.  So if you haven't done it already, follow Jeff's instructions (here) or the instructions for doing the same thing with your favorite database IDE, and then put the resulting DDL code into your source code repository.  If you don't know how to use your shop's source code control tools, ask your fellow developers for help (by the way, this is a nice way to ingratiate yourself with the developers), or ask your favorite search engine.  What? Your shop doesn't do source control?  Convince the powers that be that this is ESSENTIAL to a good application development project, and that there are good tools available at no cost.  Do it NOW - I'll wait.

There is one more thing that goes in source control: scripts to load data into the look-up and reference tables.  These might be SQL scripts with INSERT commands. I do that, and the scripts go in a folder named "Data".  Or in Oracle, you might have data files (maybe CSV) with the SQL*Loader control files to load the data.  Your database IDE like SQL Developer can help you with this if the data is already in your development database.  If reference or look-up tables are ever changed by users, you will need to update this data and check the changes into source control.  You need a plan for propagating the latest changes to all environments.  The key is that you should consider the data in these tables as if it was DDL.
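For example, a minimal data script might look like this (the table, columns, and file name are made up for illustration):
-- Data/ref_status_data.sql - reload a look-up table (hypothetical)
DELETE FROM ref_status;
INSERT INTO ref_status (status_code, description)
  VALUES ('NEW', 'Newly created');
INSERT INTO ref_status (status_code, description)
  VALUES ('IN PROGRESS', 'Work in progress');
COMMIT;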

So now, you have captured all the DDL for the database objects your application uses and put it in source code control.  What next?  Most of the tools that let you export your DDL also create a script to run the DDL.  This would let you re-create your objects in a new database, such as when it is time to deploy to the test database.  We always have at least four environments - development, test, staging (sometimes called pre-production), and production.  I have been in shops where the production database is under the control of a production database administrator (DBA), and the development DBA (often me) does not have rights to run DDL in production.  The process is that when a production release is authorized, the production DBA does the database deployment.  So my deployment script is always thoroughly tested, and written so that even someone who knows NOTHING about the application can run it.  If your tool doesn't write a deployment script, you'll need to write it by hand.  I often use (in Windows) the DIR command to list the DDL files, spooling the output to a file, then edit the file into a script.  And yes, deployment scripts go into source control - mine go in a folder named "Scripts".
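The hand-written result is mostly a list of calls to the individual DDL and data scripts - something like this sketch (SQL*Plus style; the file names are invented):
-- Scripts/myapp_CREATE.sql - create all objects in dependency order
SPOOL myapp_create.log
@Tables/ref_status.sql
@Tables/stuff.sql
@Views/stuff_v.sql
@Data/ref_status_data.sql
SPOOL OFF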

So what about database deployment?  The initial deployment is easy: the deployment script just runs ALL the DDL and data load scripts.  But subsequent deployments are a bit harder.  Views and stored procedures can use the CREATE OR REPLACE syntax (or CREATE OR ALTER in SQL Server).  Once a table has been created and contains data, you can't just DROP it and CREATE a new version.  You need to run ALTER commands to make some changes, and for bigger changes you might need to do something like:
CREATE TABLE my_table_new ...;
INSERT INTO my_table_new (...)
  SELECT ...
    FROM my_table;
DROP TABLE my_table;
RENAME my_table_new TO my_table;
I will usually name my deployment script for each new database release with the name of the release, like "myapp_2.1_DEPLOY.sql".  Since we use JIRA to manage tasks, I will have a script named after the JIRA task that does the changes associated with that task, such as "MYAPP-123.sql".  If the changes involve DDL for an object that can use "CREATE OR REPLACE" syntax, MYAPP-123.sql will simply call the new version of the DDL script for that object.  If not, MYAPP-123.sql will contain the ALTER (and other) commands needed.  When I create the release deployment script, it calls each of the scripts for the JIRA tasks included in that release.  All of these go in the Scripts folder.  I also tend to update the CREATE scripts for tables, even if deployment will not be recreating the table - that way I can create a new empty database without running all the ALTER commands.
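So a release deployment script ends up being little more than a list of calls - something like this sketch (the task numbers are invented):
-- Scripts/myapp_2.1_DEPLOY.sql - run all changes for release 2.1
SPOOL myapp_2.1_deploy.log
@MYAPP-123.sql
@MYAPP-131.sql
@MYAPP-140.sql
SPOOL OFF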

By the way, if you are using Oracle 11gR2 or later, you probably want to use Edition-Based Redefinition in your deployment scripts.  It is not used as often as it should be, and it will let you do your database deployments in production with no downtime.

Hope this inspires you to put your DDL under source control, and gives you some ideas on how to organize your database deployments.

Friday, September 21, 2018

DBMS_OUTPUT Reports in Oracle SQL Developer

Although I've been working a lot with SQL Server databases lately, I've never entirely left the Oracle world.  For instance, I've used SQL Developer Data Modeler (SDDM) to design my database, and exported my models to a reporting repository in an Oracle XE database.  What kind of reporting?  Well, for one thing, Oracle SQL Developer has a very good report generator built into it, and comes with a number of pre-written reports, including some for SDDM.  Here is the report window:

You can also build your own reports.  Have a look at Jeff Smith's blog (http://www.thatjeffsmith.com/) for lots of great information about SQL Developer and SDDM.  Specifically, look at the bottom of this page for links to Jeff's posts about the reporting capability: https://www.thatjeffsmith.com/sql-developer/.  One good way to build your own report is to copy one of the built-in reports and modify it.

But, what if I want something that none of the built-in reports has, or I want to do formatting that the reporting tools don't do natively?  For instance, what if you want to print a report something like this?
  
This is a data dictionary for the HR sample schema, formatted according to my client's wishes for data dictionaries.  There are very good reports from the Oracle data dictionary built into SQL Developer, and I like their master-child formatting, where you can click on a table name and see information about the table's columns; but this is more of a report that you can print off if you need to, or read into an Excel spreadsheet.

No worries.  You can write your own report in PL/SQL to produce HTML that you output with the DBMS_OUTPUT built-in package.  Then in SQL Developer you create it as a PL/SQL DBMS_OUTPUT style report:

The query for this one is:
BEGIN
  DATA_DICTIONARY;
END;
So how did I write this?  You COULD just hand-write the PL/SQL, but since this is going to produce HTML, I like to write it as a PL/SQL Server Page.  I've written extensively about the many uses of PL/SQL Server Pages (PSPs) before; for instance, see this post about using a PSP to produce SpreadsheetML: https://it.toolbox.com/blogs/jjflash/spreadsheetml-in-a-pl-sql-server-page-part-1-of-2-022015.

Basically, a PSP is an HTML page with some PL/SQL embedded in it.  The PL/SQL is used for flow control, like LOOPs, and to embed data from the database into the page.  You compile it with the LOADPSP utility that comes with all Oracle databases, including Oracle XE.  The compiled PSP is actually a stored procedure that uses the PL/SQL Web Toolkit to write to the web - this is the same underlying technology as is used by Oracle Application Express (APEX).  The specific toolkit procedure the compiled PSP uses is HTP.PRN, which outputs all of the HTML.  But we aren't going to write to the web; we want to write our HTML to DBMS_OUTPUT.
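To give you the flavor, here is a minimal sketch of a PSP - not the real data dictionary report; the query and markup are simplified:
<%@ page language="PL/SQL" %>
<%@ plsql procedure="data_dictionary" %>
<html><body>
<h2>Data Dictionary</h2>
<table>
<% FOR t IN (SELECT table_name FROM user_tables ORDER BY table_name) LOOP %>
  <tr><td style="border: 1px solid black"><%= t.table_name %></td></tr>
<% END LOOP; %>
</table>
</body></html>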

So what I did was compile my PSP with LOADPSP, then edited the resulting stored procedure, named DATA_DICTIONARY.  I made three changes:

  1. Since the underlying SQL uses the USER_* data dictionary views, I wanted to make sure it shows the tables owned by the user that executes it.  So I added "AUTHID CURRENT_USER" to the procedure definition.  That makes it an Invoker's Rights procedure.
  2. I did a FIND/REPLACE to change all occurrences of HTP.PRN to DBMS_OUTPUT.PUT.
  3. I found out that when you are writing to the DBMS_OUTPUT buffer, nothing gets written until there is a call to DBMS_OUTPUT.PUT_LINE.  So I added "DBMS_OUTPUT.PUT_LINE('');" as the last line of the procedure (see the sketch below).
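Roughly, the finished procedure looked like this (the generated report body is elided):
CREATE OR REPLACE PROCEDURE data_dictionary
  AUTHID CURRENT_USER                -- change 1: Invoker's Rights
AS
BEGIN
  DBMS_OUTPUT.PUT('<html><body>');   -- change 2: this call was HTP.PRN
  -- ... the generated report body ...
  DBMS_OUTPUT.PUT('</body></html>');
  DBMS_OUTPUT.PUT_LINE('');          -- change 3: flush the buffer
END data_dictionary;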
Recompiled, and it all worked. Almost.  It seems that SQL Developer ignores your <head> section, including any <style> section you might have put there.  There are several possible work-arounds, but what I advise is to put any styling that you want into "style" attributes in the HTML elements to be styled.  Yes, I know this is against HTML best practices, but what can you do?

So I hope this inspires you to try the SQL Developer reporting features.  And if the need arises, write your own reports your way.

Friday, September 14, 2018

SDDM Script to Create SQL Server Journal Tables

In my previous post, I talked about how Oracle SQL Developer Data Modeler (SDDM) is extendable with scripts written in languages that support JSR-223.  One of those languages is Groovy (http://www.groovy-lang.org/) and I showed how to add Groovy to the JVM classpath used by SDDM.  You might want to visit that post to see how.  The reason that I needed to write a script was that I wanted to easily add Journal Tables and the triggers for updating them to my database DDL.  SDDM actually comes with a script for doing this, but the script writes the triggers in PL/SQL for Oracle databases.  As my readers know, the database I was designing was a Microsoft SQL Server database, and the triggers needed to be written in Transact-SQL (T-SQL).

About Journal Tables

So, first of all, you might ask: what is a Journal Table?  A Journal Table is a table that captures a before or after image of every change to a row in the table being monitored, usually with a date/time to show when the change occurred.  The idea is to be able to reconstruct a history of changes to the data in a table.  Because of the overhead, you probably don't want a journal table behind all of your tables, but when you need to track who did it, when it was done, and what exactly was changed, a journal table can be a good solution.  It can also help you recover from changes that shouldn't have been made.  There are other solutions, like Oracle's flashback query capabilities, but some databases can't do that, and some tables require a little more precise control.  By the way, it is often necessary to set permissions on journal tables more stringently than permissions on the tables being journaled, so that hackers can't cover their tracks.

I personally prefer to make these before images, so that you can see what the row looked like before the change.  This would imply that you need before UPDATE and before DELETE triggers that INSERT into the journal table the row of the table being journaled as it exists before the change, but no before INSERT trigger, since before the INSERT the row didn't exist.  But my development lead wanted an after image, which is fine, since SQL Server has no BEFORE triggers - only AFTER (and INSTEAD OF) statement triggers.  Here is an example of a table to be journaled, and a journal table:
CREATE TABLE my_table (
  my_id        INTEGER,
  my_char_data VARCHAR(30)
);
CREATE TABLE my_table_jn (
  my_id        INTEGER,
  my_char_data VARCHAR(30),
  operation    VARCHAR(10),
  date_changed DATETIME
);
If I do (on September 10):
INSERT INTO my_table (my_id, my_char_data)
   VALUES (1,'Example 1');
The after INSERT trigger should do:
INSERT INTO my_table_jn (my_id, my_char_data, operation, date_changed)
VALUES (1, 'Example 1', 'INSERT', CONVERT(datetime,'09/10/2018',101));
If I do (on September 12):
UPDATE my_table SET my_char_data = 'Example2'
  WHERE my_id = 1;
The after UPDATE trigger should do:
INSERT INTO my_table_jn (my_id, my_char_data, operation, date_changed)
VALUES (1, 'Example2','UPDATE', CONVERT(datetime,'09/12/2018',101));
If I do (on September 14):
DELETE my_table
  WHERE my_id = 1;
The after DELETE trigger should do:
INSERT INTO my_table_jn (my_id, my_char_data, operation, date_changed)
VALUES (1, 'Example2','DELETE', CONVERT(datetime,'09/14/2018',101));
The data for DELETE is actually a before image, since after the delete there is no data.  By the way, triggers participate in the underlying transaction, so if the change to my_table is rolled back, so will the INSERT into my_table_jn.
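You can see this for yourself with a quick test, using the tables and triggers above:
BEGIN TRANSACTION;
UPDATE my_table SET my_char_data = 'Oops' WHERE my_id = 1;
ROLLBACK TRANSACTION;
-- The AFTER UPDATE trigger fired, but its INSERT into my_table_jn
-- was rolled back along with the UPDATE - no journal row survives.
SELECT * FROM my_table_jn;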

The Script

As I said before, SDDM includes a script for adding code for Journal Tables to your DDL.  Though I couldn't use Oracle's script as written, it served as an excellent starting point for my version.  It also shows how you get access to the underlying SDDM data.  Here is the first part of the script:
/*
Writes CREATE commands for Journal Table and Triggers for SQL Server.
variable ddlStatementsList should be used to return the list with DDL statements
that are created by script - as shown below:
ddlStatementsList.add(new java.lang.String(ddl));
other available variables:
- model - relational model instance
- pModel - physical model instance
- table - the table in relational model
- tableProxy - table definition in physical model 
 */
Since the original script is written in JavaScript, and mine is written in Groovy, I needed to change the syntax to Groovy, but much is the same or similar, including comments.  Notice that SDDM hands you access points to the SDDM data - listed in the comments above.  But it doesn't tell you how to write to SDDM's log.  Fortunately, Dave Schleis provided the following code:
// get a handle to the application object
def app = oracle.dbtools.crest.swingui.ApplicationView
app.log("Creating DDL for Journal table for ${table.name}");
"ddl" is a variable to hold the code to be added to the ddl being exported for the table to be journaled.  In the original, this was a string variable, but strings in Groovy are immutable.  When you do "ddl = ddl + 'a string'" you are really creating a new string object.  So I changed it to a StringBuilder, which in Groovy and Java is an object to which you can append more data, without the waste of discarding old strings and creating new ones.
StringBuilder ddl;
String lname;
//journal table name suffix 
jnTabSuf = "_jn";
// trigger name suffix
jnAISuf = "_jn_ai";
jnAUSuf = "_jn_au";
jnADSuf = "_jn_ad";
prompt = model.appView.settings.includePromptInDDL;
useSchema = model.appView.settings.isIncludeSchemaInDDL();
if(model.storageDesign.open){
    if(useSchema){
        lname = tableProxy.longName;
    }else{
        lname = tableProxy.name;
    }
}else{
    if(useSchema){
        lname = table.longName;
    }else{
        lname = table.name;
    }
}
Here you will see a major advantage of using Groovy for your DDL Transformation (and other) scripts.  Groovy has a GString type, similar to strings in Java and Javascript, but you can embed variables in your GStrings.  In other languages you would have to concatenate strings.  This is a great space and time saver when the script is really code that writes code.
if(prompt){
    ddl = new StringBuilder("PRINT 'Creating Journal Table for ${lname};'\n");
}else{
    ddl = new StringBuilder("");
}
app.log("Creating Journal Table DDL.");
Most of the rest of the code is appending strings (GStrings) to the ddl variable.  Groovy overloads the "<<" operator to call StringBuilder's append() method, once again saving me a little time and space.  Also notice that I'm using a triple-quoted string in this section of code, which lets me use actual line feeds in place of the "\n" line feed character.  I didn't do this throughout, because I didn't want to fool with the original JavaScript code more than necessary.
ddl <<
"""CREATE TABLE ${lname}${jnTabSuf}
  (${table.name}${jnTabSuf}_id INT IDENTITY(1,1) NOT NULL
  ,operation VARCHAR(10) NOT NULL
""";
cols = table.elements;
Here, I'm looping through the columns from the original table and writing the journal table with the same columns.  I don't add the date_changed column, because my tables already have a last_update_date column.  They also have a last_update_user_id.  You could probably add code that says to add these to the journal table if they don't exist.
cols.each {
    ddl <<
    "  ,$it.name $it.datatypeString";
    if (it.mandatory){
        ddl << " NOT NULL\n";
    }else{
        ddl << "\n";
    }
}
/* Primary key is non-clustered because queries of the Journal table will
 * usually be by the parent table's key.
 */
ddl <<
    "  ,CONSTRAINT ${table.name}${jnTabSuf}_pk\n" +
    "     PRIMARY KEY NONCLUSTERED (${table.name}${jnTabSuf}_id)\n" +
    " );\n" +
    "GO\n\n" + 
/* So instead, we create a clustered index on the parent table's PK.
 * No Foreign key, because we may delete rows of the parent table and still
 * keep the journal of changes to the now-deleted rows, including the datetime of
 * deletion.
 */
"CREATE CLUSTERED INDEX ${table.name}${jnTabSuf}_fki\n" +
    "  ON ${table.name}${jnTabSuf} (${table.name}_id);\n" +
    "GO\n"

if(prompt){
    ddl << "\nPRINT 'Creating Journal Triggers for ${lname};'\n";
}else{
    ddl << "\n";
}
So far, my code has been similar to the original, but my triggers are very different - SQL Server triggers are not at all like Oracle triggers.  Leave a comment if you want me to write a post comparing these SQL Server triggers with the way you would do the same thing in Oracle.
app.log("Creating Journal Table After Insert trigger.");
ddl <<
  "DROP TRIGGER IF EXISTS ${lname}${jnAISuf};\n" +
  "GO\n" +
  "CREATE TRIGGER ${table.name}${jnAISuf}\n" +
  "  ON ${lname}\n" +
  "  AFTER INSERT AS\n" +
  "BEGIN \n" +
  "  INSERT INTO ${lname}${jnTabSuf}\n" +
  "    (operation\n";
cols.each {
    ddl <<
    "    ,$it.name\n";  
}
ddl <<
    "    )\n" +
  "  SELECT 'INSERT' AS operation\n"
cols.each {
    ddl <<
    "    ,$it.name\n";  
}
ddl <<
    "    FROM inserted;\n" +
    "END;\n" +
    "GO\n\n";
The rest of the code is pretty similar.
app.log("Creating Journal Table After Update trigger.");
ddl <<
  "DROP TRIGGER IF EXISTS ${lname}${jnAUSuf};\n" +
  "GO\n" +
  "CREATE TRIGGER ${table.name}${jnAUSuf}\n" +
  "  ON ${lname}\n" +
  "  AFTER UPDATE AS\n" +
  "BEGIN \n" +
  "  UPDATE t\n" +
  "      SET t.last_update_date = CURRENT_TIMESTAMP\n" +
  "    FROM ${lname} AS t\n" +
  "    INNER JOIN inserted AS i\n" +
  "       ON t.${table.name}_id = i.${table.name}_id;\n\n" +
  "  INSERT INTO ${lname}${jnTabSuf}\n" +
  "    (operation\n";
cols.each {
    ddl <<
    "    ,$it.name\n";  
}
ddl <<
  "    )\n" +
  "  SELECT 'UPDATE' AS operation\n"
cols.each {
    ddl <<
    "    ,$it.name\n";  
}
ddl <<
    "    FROM inserted;\n" +
    "END;\n"+
    "GO\n\n"
app.log("Creating Journal Table After Delete trigger.");
ddl <<
  "DROP TRIGGER IF EXISTS ${lname}${jnADSuf};\n" +
  "GO\n" +
  "CREATE TRIGGER ${table.name}${jnADSuf}\n" +
  "  ON ${lname}\n" +
  "  AFTER DELETE AS\n" +
  "BEGIN \n" +
  "  INSERT INTO ${lname}${jnTabSuf}\n" +
  "    (operation\n";
cols.each {
    ddl <<
    "    ,$it.name\n";  
}
ddl <<
  "    )\n" +
  "  SELECT 'DELETE' AS operation\n"
cols.each {
    if (it.name == "last_update_date") {
ddl <<
    "    ,CURRENT_TIMESTAMP AS $it.name\n";  
    } else {
        ddl <<
    "    ,$it.name\n";  
    }
}
ddl <<
    "    FROM deleted;\n" +
    "END;\n"+
    "GO\n"
The last step is to add the ddl variable that I've been building to the DDL that will be exported.  Since my variable is a StringBuilder, not a String, we just need to use its toString method.
ddlStatementsList.add(ddl.toString());
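For reference, here is roughly what the script generates for the After Delete trigger of a hypothetical table named stuff, with columns stuff_id, stuff_desc, and last_update_date:
DROP TRIGGER IF EXISTS stuff_jn_ad;
GO
CREATE TRIGGER stuff_jn_ad
  ON stuff
  AFTER DELETE AS
BEGIN
  INSERT INTO stuff_jn
    (operation
    ,stuff_id
    ,stuff_desc
    ,last_update_date
    )
  SELECT 'DELETE' AS operation
    ,stuff_id
    ,stuff_desc
    ,CURRENT_TIMESTAMP AS last_update_date
    FROM deleted;
END;
GO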

So that's it.  You are welcome to cut and paste this into your own SDDM project.  Hope this was useful.

Friday, August 24, 2018

Scripting Oracle SQL Developer Data Modeler with Groovy

In my last post, I related how I am using Oracle SQL Developer Data Modeler (SDDM) to design a database to be implemented in Microsoft SQL Server.  I mentioned that one of the neat things about SDDM is that you can write scripts to do things that the tool doesn't do natively.  Scripts can be written in any language supported by the Java Scripting API (defined by JSR 223).  Nashorn, the Java library for scripting in JavaScript, is built into Java, so scripting in JavaScript works out of the box.  The SDDM developers have included a number of Nashorn scripts with the tool.  You can use these yourself, or use them as examples from which you can write your own scripts.

SDDM also comes with a good number of scripts written in JRuby - the JVM implementation of the Ruby language.  But to use the JRuby scripts, or scripts written in some other compliant scripting language, you need to copy the Java library (or libraries) for that language to your SDDM classpath.  The easiest way to find a good place to put them is to go to the Help/About page, select the Properties tab, and find the property named java.ext.dirs:

With SDDM shut down, put the library in one of the directories listed in this property.  Then you can re-start SDDM. Mine is in %SDDM_HOME%\jdk\jre\lib\ext where SDDM_HOME is the root directory where you installed SDDM.  This information, and much of what I have learned about scripting SDDM, came from Dave Schleis, especially his blog post, Data Modeler Scripting-101-Lets start at the very beginning.

DDL Transformation Script for Journal Table

What I needed was a script to generate the DDL to create a journal table behind one of my tables, plus the database triggers to automatically write to the journal table every time DML is executed against the base table.  SDDM happens to come with a script, written in JavaScript (for Nashorn) to do this.  If you are designing an Oracle database, you may be able to use this script out of the box.  But I am writing for a SQL Server database, and while the CREATE TABLE for the journal table is almost identical, triggers in SQL Server are VERY different.  Not only are they written in T-SQL, which is quite different from PL/SQL, but triggers all run as statement level triggers, not row level triggers.  So I was glad to have the original script as an example, but I needed to do major changes.

Writing scripts in Groovy

Now, I could have kept the script in JavaScript, but I don't know that language very well.  I don't know Ruby very well either.  Dave Schleis's favorite language is Apache Groovy, which, like JRuby, supports the JSR-223 standard, so some of his examples are in Groovy.  And I've gotten familiar with Groovy because of my past experience with Oracle Application Development Framework (ADF).  ADF's Model component, ADF Business Components, can be extended with scripts written in Groovy.  So I decided to write my DDL Transformation for SQL Server Journal Tables in Groovy.

To write SDDM scripts in Groovy, all you need to do is copy the Groovy library to the SDDM classpath as I described above.  Download a copy of Groovy from http://www.groovy-lang.org/download.html.  While the latest stable version of Groovy is 2.5, it does not contain the library needed to support JSR-223.  So download the binary for the latest 2.4 version - I downloaded apache-groovy-binary-2.4.15.zip.  The library you want is groovy-all-2.4.15.jar, found in the "embeddable" directory.

To write scripts, select Tools/Design Rules and Transformations... from the SDDM menu.  The kind of script I wanted to write is a Table DDL Transformation, which can be run when exporting the DDL for your Relational/Physical design.  Choose the Relational Model for your script.  Then you will see:

I have already added my script.  Notice that there are four possible scripts for each script set - one to add DDL code before the CREATE TABLE commands, one to actually replace the CREATE TABLE commands, one to add code after the CREATE TABLE, and one to add code at the end of the DDL for all the tables being exported.  You don't need to use all four, and in fact, my journal table code is only for After Create.  You won't be able to select a scripting engine until after you start writing the script - I'm not sure why - so start by just adding a comment or a few spaces.  I started by selecting the Journal tables script that comes with SDDM, copying it, and pasting it into my new script.  Then pick a scripting engine.  If you have correctly added the Groovy library to the classpath, "Groovy Scripting Engine" should be one of the choices.  Warning - there is currently (18.1 for me) a bug where SDDM doesn't save the scripting engine choice for Groovy Scripting Engine.  Oddly, it does for Oracle Nashorn, the default, and for JRuby.  The work-around is to go into dr_custom_scripts.xml, find the reference to your script, and change the engine attribute.  I'd make a back-up of this file first.

So in the next post, I'll go through the code for my script and talk a little about testing it.

Friday, June 29, 2018

Oracle SQL Developer Data Modeler and Microsoft SQL Server

Even though my target database in my new job is Microsoft SQL Server rather than Oracle RDBMS, I still use an Oracle tool for a key part of my work - data modeling.  My tool of preference is Oracle SQL Developer Data Modeler (SDDM).  While there are other data modeling tools out there, and some (ERWin and TOAD Data Modeler are two I've used) are very good, I prefer SDDM.
  • It is free to use. (Thank you, Oracle!)
  • I haven't been allowed administrative authority on my government-issued laptop, so I can't install software.  But to install SDDM, you just unzip - which I CAN do.
  • It can reverse engineer from SQL Server - all you have to do is point it at a SQL Server JDBC driver.
  • It supports DDL generation for databases other than Oracle RDBMS - including SQL Server.
  • It understands the difference between Logical, Relational, and Physical Modeling.  Some tools just model databases (relational) - tables, columns, foreign keys.  They call it Entity/Relationship diagramming (logical) but it isn't really.
  • Whatever SDDM can't do, you can probably script - more on this in a later post.
  • There is good information and support available even if you don't have a contract with Oracle Support.  Some resources:
    • Jeff Smith - Oracle's Product manager for the tool.
    • Heli from Finland - Author of an excellent book about SDDM.
    • Dave Schleis - expert on SDDM scripting, especially with Groovy
    • Kent Graziano - especially good at using SDDM to design Data Warehouses.
    • SDDM Forum - The people listed above, other users, and even SDDM's developers, especially Philip Stoyanov, answer your questions.

Using SDDM with SQL Server

So, as I said, SDDM works perfectly well with a SQL Server database. All you need to do is register a JDBC driver for it.  While Microsoft does have a JDBC driver that you can download, and it does work with SDDM's stand-alone version, SDDM also comes bundled with Oracle SQL Developer and that DOESN'T work with Microsoft's driver.  Neither does Eclipse, which I'm currently using as my primary IDE.  So instead, I'm using the free open source jTDS JDBC driver for SQL Server and Sybase (Sybase is a close cousin to SQL Server).  You can get jTDS here: http://jtds.sourceforge.net/.  Just download and unzip.  Now open SDDM and select Tools/Preferences and find "Third Party JDBC Drivers":
The steps are similar if you are using SDDM embedded in SQL Developer.  Now when you configure a connection to a database, you will have an additional tab for a connection type of SQL Server:
Fill in the blanks, and off you go: you can now import from the SQL Server Database Dictionary to reverse engineer your database into a Relational Design.  Or if you are using SDDM embedded in SQL Developer, you can open a connection to your SQL Server database in SQL Developer, and do many of the normal SQL Developer tasks.  Just one warning - SQL Developer doesn't really understand T-SQL, Microsoft's counterpart to both SQL*Plus style scripting and PL/SQL.  So there are limitations to what you can do in the SQL Worksheet.  But one thing you CAN do is open a relational diagram in SDDM and drag objects onto it from the SQL Server connection in SQL Developer.
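If you ever need the raw jTDS connection URL (in Eclipse, for instance), the pattern looks like this - host, port, and database name are placeholders:
jdbc:jtds:sqlserver://myserver:1433/mydatabase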

Reporting

SDDM has a pretty good reporting capability and I'm only beginning to learn its full capabilities.  But if you want to write SQL queries or reports against your data model, you will want to export your model to a Reporting Repository.  The Reporting Repository needs to be in an Oracle database - sorry.  This is why I asked my system administrator to set up a server with an Oracle Express Edition (Oracle XE) database, just for my private use for SDDM reports.  Like SDDM, Oracle XE is free to use, and while limited, it has no trouble hosting an SDDM reporting repository.

SQL Server DDL

You can also create a Physical Database design for SQL Server, and SDDM will create DDL appropriate for your database - it understands the differences.  The only problem I've had with it is that I usually need to manually edit the output a bit.  The worst problem is that the "GO" command needs to appear on a separate line, and sometimes the newline is missing.  Also, SDDM tends to output UNIX/Linux style newlines, not the carriage return/linefeed pairs expected by Windows.  There is a little bit of a problem between the Schema in the Relational Design and its implementation in the Physical design - Physical doesn't always pick up the mapping you specified.  But this may be something I'm doing wrong.

So - if Oracle isn't a dirty word for you SQL Server database architects, you may want to check out Oracle SQL Developer Data Modeler.  It is a great, cost-free way to design your data model.  I plan to write another post soon to talk about some SDDM scripting in Dave Schleis's favorite language: Groovy.

Friday, June 22, 2018

Data Conversion in Oracle vs. SQL Server

In my last post, I talked about multiple databases in a single instance, and gave SQL Server points for doing this better, and doing it without the extra cost associated with Oracle 12c/18 Multi-Tenant option.  Today, I want to talk about something Oracle does better than SQL Server - data conversions.

By data conversions, I mean mostly conversions from human-readable strings to numbers and dates, and back to strings.  If you are one of those people who stores everything as strings (CHAR or VARCHAR), STOP IT.  This is a BAD idea in both SQL Server and Oracle.  But that's a subject for another post.

First of all, both databases will convert strings to numbers and dates implicitly if you don't include a conversion function.  This works fine - until it gives you unexpected results or awful performance.  I strongly recommend that you use a conversion function EVERY TIME you start with a number or date and want to show it on a report or screen in human-readable form, and EVERY TIME you get input from users and need to store it in its proper datatype.  This way, the data will be converted in a predictable and efficient way.  So what are the conversion functions?

ANSI SQL Conversion

Both Oracle and SQL Server are pretty compliant with the ANSI SQL standard, and so both support the ANSI SQL CAST function.  The syntax of CAST is:
CAST (expression AS datatype)
You can optionally include a size with the datatype, such as the length of a VARCHAR, or the precision and scale of a number.  This works well in both databases for number conversions between floating point and integer datatypes, or from string to number - just understand that if the string doesn't contain a valid number, CAST will fail.  SQL Server is nice in that it includes a TRY_CAST function that will return NULL for an invalid number rather than fail.  Since CAST is ANSI standard, it is certainly preferred if you are writing code that may need to work on any database that meets the standard.  But while CAST works for string to date and date to string conversions, the format of the string must be ANSI standard date format, which is the same as ISO_8601.  Not so good when you need to convert to or from MM/DD/YYYY - your expression would need to substring the original date to put it in ANSI format.
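A quick sketch of each case (SQL Server syntax; in Oracle, add FROM dual to each query):
SELECT CAST('123.45' AS DECIMAL(10,2));    -- string to number - works in both
SELECT TRY_CAST('oops' AS DECIMAL(10,2));  -- SQL Server only: NULL, not an error
SELECT CAST('2018-06-22' AS DATE);         -- the date string must be ANSI format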

Proprietary Conversion

So Oracle and SQL Server both have proprietary non-standard functions to convert between strings and dates more flexibly.  SQL Server has:
CONVERT (datatype, expression, style)
Just as with CAST, the datatype parameter can include size information.  The style parameter is an integer that selects the data format of the input or output data from a table in the documentation - CAST and CONVERT.  Now, I will admit that this list includes the formats that I use most frequently for dates.  For instance, style 1 is MM/DD/YY, and 101 is MM/DD/YYYY.  But it doesn't include full month names in the selected natural language, such as January 1, 2018 in English or 1 Enero 2018 in Spanish.

Oracle has separate functions for converting a string to a date or a date to a string:
TO_DATE (expression, format, nlsparam)
TO_CHAR (expression, format, nlsparam)
The format parameter is optional and defaults to the default format set up by your DBA in your database.  nlsparam is also optional and defaults to the NLS (National Language Support) settings set by your DBA in your database.  I almost never let format default, but almost always let nlsparam default.  However, if you need to support languages other than English, this is what would let a month appear as January for English speakers or Enero for Spanish speakers.  The documentation for TO_DATE and TO_CHAR includes a list of formatting codes.
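For instance, here is a sketch of the same date rendered for two languages:
SELECT TO_CHAR (DATE '2018-01-01', 'fmMonth DD, YYYY')
  FROM dual;   -- January 1, 2018 (with English NLS settings)
SELECT TO_CHAR (DATE '2018-01-01', 'fmDD Month YYYY', 'NLS_DATE_LANGUAGE=SPANISH')
  FROM dual;   -- 1 Enero 2018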

To me, the formats provided by Oracle are both more flexible and more clear.  Which is easier to remember and easier to read:
CONVERT (VARCHAR(10), my_date, 101) - SQL Server
TO_CHAR (my_date, 'MM/DD/YYYY') - Oracle
Advantage - Oracle.

By the way, SQL Server's CONVERT function also works for numbers, but doesn't have a significant advantage over the CAST function for numbers.  Oracle has TO_NUMBER and TO_CHAR functions for numbers with the same parameters as TO_DATE and TO_CHAR for dates, and more flexible formats for strings converted to and from numbers.
Advantage - Oracle.

Monday, June 4, 2018

Databases vs. Instances

As a long-time Oracle developer and administrator, one of the most interesting differences that I've found as a newbie SQL Server developer and administrator is in the simple basic term, database. 

In simple terms, a database is a set of database files in permanent storage that share a data dictionary, logs and control files.  An instance is a set of running processes that maintain a database and its control files and logs.  While there are in-memory databases where the database is in the server's memory, rather than maintained in permanent storage while its associated instance is running, eventually even this kind of database does get written to permanent storage.

Until Oracle 12c with its multi-tenant option, a database and an instance in Oracle were pretty close to the same thing.  With Oracle Real Application Clusters (RAC), you might have several instances on separate servers accessing a single database, but you would not have several databases under the same instance.  It would not be unusual to run several Oracle databases on the same server, but each database would be maintained by its own instance (or instances in the case of RAC).  Instances might share a network listener for client connections, but the client would indicate the instance to which it wanted to connect, and therefore which database.

But in SQL Server, and a few other relational databases, for example, PostgreSQL, a single instance can run several databases.  There are also some required databases managed by an instance, such as the database that contains the shared data dictionary.  Each database can have its own schemas, users, tables and other database objects.  None of these need to be unique except within a database.  A full unique name for a database object would be databaseName.schemaName.objectName.
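For example (the database and table names here are made up):
SELECT *
  FROM DevDB.dbo.ref_status;  -- databaseName.schemaName.objectName
You can even join tables across two databases in the same instance this way.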

This came in handy recently when I needed to create a new test database identical to our existing development database with three separate schemas.  My Oracle-trained brain initially was thinking that I needed a new SQL Server instance, since I didn't have an extra server or database license.  This is possible, but not very common, and it requires the original SQL Server installation media.  But I was reminded that I could just create a new database in the existing instance.  It could have the same schemas and database objects, but a different database name.  And it was just as easy to configure the test version of the application to connect to a different database in the same instance as to a database with the same name in a different instance.

I had two choices available for creating my test database.  It is very easy to use SQL Server Management Studio to back up my development database, and then restore it into a new test database.  But that would also copy the development data instead of creating the tables empty.  Instead, this proved to be a good opportunity to test my DDL scripts.  I have one script for each object to be created, and a master script to call each in the correct order.  This also executes scripts that load data into reference and lookup tables - remember that I consider data for these tables to be part of the source code for the database.

So, chalk this one up as an advantage for SQL Server.  The Multi-tenant Option in Oracle 12c (and its successor, Oracle 18) has similar advantages - and may even be a little better if you need to move tenant databases around.  But it is an extra cost option, while multiple databases per instance are a standard part of SQL Server.

Monday, April 16, 2018

About Indexing Character Fields

I had created a few tables in my SQL Server database like this:
CREATE TABLE ref_status(
  status_code VARCHAR(10) NOT NULL,
  description VARCHAR(80) NOT NULL,
  CONSTRAINT ref_status_pk PRIMARY KEY (status_code)
);
CREATE TABLE stuff (
  stuff_id    INT IDENTITY,
  stuff_date  DATETIME,
  stuff_desc  VARCHAR(80),
  status_code VARCHAR(10),
  CONSTRAINT stuff_pk PRIMARY KEY (stuff_id),
  CONSTRAINT stuff_status_fk FOREIGN KEY (status_code)
    REFERENCES ref_status (status_code)
);
I inserted a few status codes into ref_status, 'NEW', 'IN PROGRESS', and 'PRODUCTION'.  Then the developers got busy, and wrote a user interface to insert data into "stuff".  After testing, there were about 10 rows in "stuff".

Then the lead developer asked me, "Can you update the status codes to make them read 'New', 'In progress', and 'Production' instead?"  "Sure", I said.  Because of the foreign key, I felt I needed to do this in three steps.
  1. Insert new rows into "ref_status"  with the new status codes.
  2. Update "stuff" to change the existing status codes to the values of the new codes.
  3. Delete the old status codes from "ref_status".
But step 1 failed with a duplicate key error.  What's going on?  It turns out that in SQL Server, by default, indexes on character fields are case insensitive.  So 'NEW' and 'New' are duplicate keys (and so would be 'new' or even 'neW').  The correct way to do this was simply:
  1. Update the status_code column in one table.
  2. Update the status_code column in the other table.
It didn't matter which table was updated first, and it didn't cause a foreign key problem because the index on "ref_status" DIDN'T change.
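In other words, the fix was something like this (showing just one of the codes):
UPDATE ref_status SET status_code = 'New' WHERE status_code = 'NEW';
UPDATE stuff      SET status_code = 'New' WHERE status_code = 'NEW';
Because 'NEW' and 'New' are the same key to a case-insensitive index, the foreign key stays satisfied the whole time.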

Contrast with Oracle

Of course, the reason my first crack at this failed was that in Oracle - remember that I came from the Oracle world - the indexes are case sensitive.  'NEW' and 'New' are two different keys.  Can you make the indexes case insensitive, like in SQL Server?  Yes, BUT you have to do it with a function-based index.  That means that you can't make a case-insensitive primary key constraint on REF_STATUS.  You have to make a unique function-based index like this:
CREATE UNIQUE INDEX ref_status_uk ON
  ref_status
   (NLSSORT (status_code, 'NLS_SORT=BINARY_CI') );
Yet to point a foreign key at REF_STATUS, it needs to have a unique or primary key constraint.  Well, I haven't tried this, but here is what I THINK will work.

Even though, by default, creating a unique constraint will automatically create a unique index with the same name, you CAN create the index first.  Then the unique constraint will use the pre-existing index instead of creating a new one.  So after creating the index as described above, do:
ALTER TABLE ref_status ADD
  CONSTRAINT ref_status_uk UNIQUE (status_code);
That way, Oracle will use the existing index to enforce uniqueness.  This much, I am pretty sure works.  Then, theoretically, you can create the foreign key, STUFF_STATUS_FK from STUFF to REF_STATUS.  If anyone wants to try this, please let me know in the comments if it worked.

Thursday, March 29, 2018

More on my Oracle to SQL Server Transition

I'm continuing to adjust to this new world of Microsoft SQL Server.  Remembering my great experiences in the Oracle user community, I've joined PASS, which is the international user group for SQL Server users.  One thing PASS offers is "SQL Saturday" - one-day training events (yes, on Saturday) held by local groups all over the world.  There are a few SQL Saturdays being held somewhere almost every weekend.  On March 24, I drove the 100 miles from my home to Glen Allen, VA (northwest of Richmond) for the Richmond SQL Saturday.  It was great; I got to meet a few people, and learned at least two things that will be useful in the work I'm doing right now.

About Table Storage and Indexes

As in Oracle, it is very important to have a Primary Key constraint on all of your tables in SQL Server.  By default, SQL Server wants to make this a clustered index.  "What is a clustered index?" my Oracle expert friends will ask.  In a clustered index, the data from the row is stored with the index entry.  That means that a table may only have one clustered index.  Oracle experts will recognize this as essentially the same as an Index-Organized Table (IOT).  IOTs aren't used that much in Oracle, but maybe they should be.  If you often query a table by its primary key, or often do UPDATE or DELETE by primary key, you might get a good performance boost by making it an IOT.

At SQL Saturday, Richmond, I went to a great presentation about indexing, "Indexing for Performance" by Jeffrey Garbus.  Mr. Garbus described very well the access path used for various WHERE conditions, with and without indexes, and differences between using a clustered and a non-clustered index.  A non-clustered index is a B-tree index in SQL Server, just like a normal index in Oracle and is used in a similar way.  In both databases, the optimizer decides whether to use an index or to do a full table scan according to similar criteria, and in fact, may decide not to use an index in situations such as the likelihood that more than a certain percentage of the rows will be returned by the query.

One very important difference between an IOT in Oracle and a table with a clustered index in SQL Server is that an IOT is always indexed with its primary key.  While the primary key gets a unique clustered index by default in SQL Server, it is possible to specify that the primary key should get a unique non-clustered index.  This leaves it open to create a clustered index on another key, and that index need not be unique.  Since a clustered index makes the database store the rest of the data with the index, only one clustered index may be created on a table.  Mr. Garbus explained why, for certain tables, it might be better to cluster on a parent key.  For instance, in an Order-Entry application, line items for an order are rarely accessed any other way than by the parent order_id.  Therefore, it is often a good idea to cluster the order_line_item table on this foreign key, rather than its primary key (order_line_item_id).
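In DDL, that design looks something like this sketch (the names are assumed):
CREATE TABLE order_line_item (
  order_line_item_id INT IDENTITY NOT NULL,
  order_id           INT NOT NULL,
  item_desc          VARCHAR(80),
  CONSTRAINT order_line_item_pk
    PRIMARY KEY NONCLUSTERED (order_line_item_id)
);
CREATE CLUSTERED INDEX order_line_item_ci
  ON order_line_item (order_id);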

This reminded me of another Oracle feature that I believe may be under-used.  While IOTs can only use a table's primary key, you can actually store master table data like the "order" table physically next to its detail rows in "order_line_item".  Oracle has clustering too, but the syntax and implementation is different.  Check out the CREATE CLUSTER command and its use in the Database Concepts Guide.

Thursday, January 25, 2018

From Oracle Expert to SQL Server Newbie - First Impressions

Oops, I've done it again - I left you readers hanging when I published the first part of two on generating code for your lookup/reference/code tables.  I promise to get back to it.

But in the meantime:
I have a new job.  Yeah, it was pretty sudden - finished my old job at the end of the year, and jumped directly into the new one at the beginning of the new year.  And the biggest shock is that this old Oracle DBA and Database Architect is going to be designing a database to be implemented in Microsoft SQL Server.  No, I didn't lie about my credentials, but I've worked with this boss before, and he said, "Oh, you'll do fine.  You're a smart guy.  And SQL is SQL, right? RDBMS is RDBMS, right?"

So, here I am, the newbie, posing as an expert.  I've been learning all I can about SQL Server, and trying things out, watching tutorials (Microsoft has some really good video lessons, designed especially for the Oracle DBA transitioning.)  Here are some first impressions:


  • SQL is SQL.  Yes, there are differences, but the basics of the SELECT, INSERT, UPDATE, DELETE, and most of the DDL commands too, are pretty close.
  • T-SQL is NOT PL/SQL.  Think of Transact-SQL as more like a SQL*Plus script with some flow control statements like IF/ELSE.  Some of it is nice, like no difference between a substitution variable and a language variable.  I think (not sure) that a T-SQL variable does act as a bind variable where PL/SQL variables would, but you can do things that PL/SQL would have to put inside an EXECUTE IMMEDIATE.  Cool.
  • I miss PL/SQL packages.
  • All triggers are statement-level triggers.  This means that you write triggers very differently, which is hard to get used to.  That doesn't mean you can't do row-level stuff - instead of getting a pseudo-record with the contents of the pre-DML row (the :OLD record) and one with the contents of the post-DML row (the :NEW record), you get pseudo-tables containing the old and new rows.  I don't know for sure, but I think this means that you don't deal with mutating table problems.
  • The real kicker is that most triggers are AFTER statement triggers.  If you want to do BEFORE stuff, you write an INSTEAD OF trigger.  In Oracle, INSTEAD OF is just for views.  Just as in Oracle, the INSTEAD OF trigger needs to do the DML on the underlying table(s).
  • There are two SQL Developer-like GUI tools - the SQL Server Management Studio, and SQL Server Data Tools for Visual Studio.  These are nice, but I wish they had a good SQL formatting tool.
  • SQL Developer CAN talk to a SQL Server database, which is nice.  But it doesn't really understand T-SQL, so it can have problems with it.  I did use its formatting tool on some SELECTs, however, and it worked pretty well.  Also, if you export a table as INSERTs, it writes them as Oracle INSERTs - which doesn't work for dates, because SQL Server doesn't have TO_DATE.
  • Oracle has nothing like SQL Server Integration Services (SSIS) - this is nice but TOTALLY new to me.
  • SQL Server Reporting Services and its report-building tool are MUCH better than Oracle Reports.  But then, I have always hated Oracle's reporting tool.