Saturday, September 21, 2019

SDDM Physical Model for MySQL


As an example, I’ve imported the Relational Design for the HR schema into SDDM from my Oracle 11g XE Database. When you import from a data dictionary, SDDM creates a Relational Model, but it also creates a Physical Model for the database from which you imported. If you import from an unsupported database like MySQL, it will default to Oracle, but you can set the target to one of the other supported databases if you like. But I recommend that you choose from the Tools menu to run the RDBMS Site Editor, and create a new site for your MySQL database. By creating a new Site, you can distinguish your physical models that have been customized for your non-standard database.
This will require you to choose a supported database as the database type. So we have to choose the one that is as close as we can get. I did a lot of research and while there is no perfect solution, as far as I can determine, SQL Server was the best choice. There are some major differences that we will need to deal with, but at least the datatypes are fairly close, because they support the ANSI SQL standard a little better than Oracle. For instance, a text field is VARCHAR, not VARCHAR2. So here is my new RDBMS Site as shown in the Site Editor:



So, now I can create a new Physical Model for my MySQL database. The browser shows my Relational Model, the original Physical Model for my Oracle 11g XE database, and the new Physical Model.

DDL Differences for MySQL

What other adaptations do we need? For my purposes, the main change that I need is to change the DDL for generating surrogate primary keys from a non-repeating sequence of numbers. MySQL does this with the AUTO_INCREMENT attribute of the primary key column. So whenever I mark a column in the Relational Design as an auto-increment column, I want the CREATE TABLE command that is generated to have AUTO_INCREMENT, not the IDENTITY attribute that is typical of SQL Server and Oracle 12 and later, or the generated SEQUENCE and trigger that is typical of Oracle versions before 12c. And I want to use the “Start with” and “Increment” values that I put into the Relational Design, in case I don’t want to use the defaults – 1. Here’s EMPLOYEES.EMPLOYEE_ID marked as Auto Increment:



Table and column comment syntax is different too. In SQL Server, you have to call a built-in procedure to add the comment to the table or column. In Oracle, you use the “COMMENT ON” syntax. But in MySQL, there is a COMMENT attribute of the column or of the table. So I want “Comments in RDBMS” generated with the proper syntax.

Syntax for check constraint is pretty similar to syntax in Oracle or SQL Server, but they aren’t enforced in MySQL before version 8, so I want a SQL comment to warn the developer. There are a few CREATE TABLE clauses that are unique to MySQL, like ENGINE, but the defaults are reasonable, so I leave them out. If you need them, you may have to add some user-defined properties for them to SDDM and then the transformation script needs to use those properties in DDL generation – I may do that in a future version, but right now my script doesn’t have them.
In my next post, I’ll share my SDDM Transformation script, and explain how it works.

Monday, September 2, 2019

SQL Developer Data Modeler and MySQL

While my current job title is Senior Java Full-Stack Developer, my manager has taken note of the fact that I have considerable experience as a database architect, developer, and all-around database guy. So I'm doing the job of a Database Architect, and preparing to migrate an Oracle on-premises database to a MySQL database in the cloud. And of course, this is my first experience with MySQL, under the assumption that “If you know one Relational Database Management System (RDBMS), you know them all.” Well, I know Oracle very well, and three years ago, I learned a little PostgreSQL, and one and a half years ago, I learned SQL Server, so it was back to the docs to learn something about MySQL. If you are interested in my adventure in the MS SQL Server world, check out my posts:
From Oracle Expert to SQL Server Newbie - First Impressions
More on my Oracle to SQL Server Transition
About Indexing Character Fields
Databases vs. Instances
Data Conversion in Oracle vs. SQL Server
Oracle SQL Developer Data Modeler and Microsoft SQL Server

Designing the Target Database

Some of my work on the new project entailed some redesign to add new tables to the existing MySQL database. So I pulled out my favorite database design tool, Oracle SQL Developer Data Modeler (SDDM), and imported from the source Oracle database to a relational model, reverse engineered to the logical model and went to work making some changes. The biggest problem was that the original designers of the Oracle database had made extensive use of Oracle Object-Relational features. In many cases, there were columns with nested tables of objects, where a more conventional relational design would have had child tables. The target database would need to be implemented with the child tables instead.
I'll skip past several weeks of redesign work, to the point where I had my new relational design ready to go. SDDM has a very useful separation of the relational model from the physical model, and you can have more than one physical models mapped to your relational model. The relational model is simply tables, columns, keys and views, etc. - items that are common to all RDBMSs. But the physical model is specific to a particular RDBMS implementation. One physical model might be for an Oracle 11g Express Edition database, which is what I have on my laptop for experiments, testing and basic fooling around with Oracle. One might be for Oracle 12c Release 2, which is what the original database uses. And in my previous job, I had a physical model for MS SQL Server 2012. We were actually using 2017, but 2012 is the latest that SDDM supports.
Unfortunately, SDDM does not have any support for MySQL. This seems strange to me, since when Oracle bought Sun, they also obtained InnoDB, which is the most used storage engine for MySQL. And while MySQL itself is open source, and there is a community edition that is open source and free, Oracle also has a supported edition. And Oracle provides a good deal of developer support for the MySQL project. So whenever you search for MySQL information, you are going to wind up on an Oracle web site. I have seen requests for MySQL support in the SQL Developer Exchange from way back, but there is no sign that Oracle's SDDM developers will be implementing it any time soon.

SDDM DDL Transformation Scripting to the Rescue

Fortunately, what the developers of SDDM DID give us is a very powerful and flexible scripting capability. I've already talked about a script that I wrote for my SQL Server project:
Scripting SQL Developer Data Modeler with Groovy
SDDM Script to Create SQL Server Journal Tables

You can write scripts to add DDL to the DDL that SDDM generates, as I did for Journal tables, but you can also write scripts to run instead of the SDDM native processes for generating DDL. That means that you can add the missing MySQL functionality yourself. That is what I'll be talking about in the next post.

Saturday, August 17, 2019

A Few More Thoughts about Infra

I know, I know, it has been months since I posted.  Personal note: My life got a bit complicated in late February and early March, and then at the end of April, I started a new job.  My work on Infra moved from being my main professional project to being a labor of love that I fit in whenever I can spare some energy from the projects that pay the bills.  So, I'm going to write a few more thoughts on Infra here, and then I'll write a few posts on some items that came out of my current paid gig.

Application Implementation Considerations and Options

With the database design done, engineered to a relational design and some physical implementation scripts generated, my attention turned to the front end implementation of INFRA.

Requirements

INFRA is intended to be available as an application that anyone can install on just about any relational database and just about any application infrastructure. And I don’t want cost to be a major factor for anyone who wants to use it, so I want to look at free to use frameworks and tools. I say “free to use” because I have expertise in Oracle tools and frameworks, and while these are not open source, there are several options that are free to use. Still, many potential users of INFRA will prefer not to be locked into the tools of a single vendor (and I’m afraid they are strongly biased AGAINST Oracle), and thus will prefer open source.
As for requirements, I’d prefer to stick with standards and especially do as much with plain HTML and CSS as possible. I also want good tooling, and am very fond of having the tools do as much of the work as possible. So I want a good IDE with good support for the chosen framework. I have nothing against a good text editor to write short programs where I am hand writing most of the code. For an entire project, however, nothing beats a good IDE, and I have experience with three, NetBeans, Eclipse and Oracle JDeveloper. So here is a little about the choices I considered:

Oracle Application Development Framework (ADF)

ADF has been my professional choice since 2010, and I am still the Lead Moderator for the ADF Enterprise Methodology Group (ADF EMG). So despite the bias against anything from Oracle, I needed to consider ADF, because I can hit the ground running with it. ADF is certainly not open source, which is a mark against it, but there is a subset of ADF called ADF Essentials that is free to use. Unfortunately, the full ADF product only runs on Oracle WebLogic Server (WLS), but ADF Essentials is supported on Glassfish, and is known to work on several other JEE standards compliant application servers, like JBoss and with careful configuration on Apache Tomcat. ADF is a Model/View/Controller (MVC) based framework. It supports the JPA standard for Object Relational Mapping (ORM) for its Model and Java Server Faces (JSF) for its View. The ADF Faces JSF component library has lots of widgets that give you lots of options for layout, data presentation, and input/update forms.
My favorite part of ADF is tooling. You can develop ADF Essentials applications in Oracle JDeveloper, but you do have to be careful to stay away from parts of ADF that require the full tool – i.e. must be deployed on WLS. But JDeveloper has wonderful WYSIWYG support for ADF Faces (and Apache Trinidad) components and can do things like let you drag fields from your ORM onto a page and generate a table component configured to display them, or a form to insert, update or delete data in the database. There have been complaints that JDeveloper is slow and buggy however, and it isn’t in very wide use outside the Oracle user community. As an alternative, you can develop ADF Essentials applications with Eclipse by installing the Oracle Enterprise Pack for Eclipse (OEPE). OEPE doesn’t show ADF Faces components quite as well as JDeveloper, and doesn’t have as much code generation, but it is not bad, and probably a little more stable.
The problems with ADF are: It isn’t open source. Oracle’s support for ADF has been waning, and that means that updates of ADF and of JDeveloper have been very widely spaced. Later versions of JDeveloper have bugs that have not been adequately addressed. It has never gained a large following in the developer community. But to me, the killer that sent me looking for an alternative is that ADF Essentials does not support ADF Security for authorization and authentication of your application. In earlier versions of ADF, this was okay because you COULD configure standard JEE security instead, or use a third party security framework like Spring Security or Apache Shiro. But the latest versions of ADF Essentials broke this, and while I’ve developed a work around, I’m not happy with it.

Other JSF Libraries

Java Server Faces is a JEE standard for the View and Controller, and so is JPA for the Model. Through ADF, I was already familiar with how these work, so a different set of JSF components seemed like a good option. There are a number of libraries around, many of which are open source.

Apache Trinidad

One faces library that appealed to me is Apache Trinidad, which originated as Oracle’s earlier 10g version of ADF Faces (current version is 12c 2.3), so I had actually used this in my first few ADF projects. Oracle opened the source to this library and contributed it to the Apache MyFaces project. This is under active development. One nice thing about Trinidad is that both JDeveloper and Eclipse with OEPE have tooling support for it, so you can get some of the drag and drop and code generation features of these tools without tying yourself to Oracle’s proprietary ADF Faces. Or you can add JSF and the Trinidad library to an Eclipse project without using OEPE.
Unfortunately, Trinidad is also not terribly popular among non-Oracle developers. It has been enhanced to keep up with the latest JSF standard, but not much for functionality.

PrimeFaces 

This is probably the most popular JSF library, with good documentation, good examples and sample code.  There is reasonable support for it in both NetBeans and in Eclipse (I have only used these and Oracle JDeveloper as my IDE.) but there isn't much WYSIWYG support.  Very cool is the ability, like ADF, to have themes, and even to let users switch themes.  And the themes are fairly standard - they are based on JQueryUI and tools for developing them are easily available.  You can even get a good number of themes from PrimeFaces' main developer, PrimeTek.  One interesting thing, is that there are projects for PrimeFaces-like UI that works with Angular - PrimeNG, and with React - PrimeReact.

Conclusion

So this is probably my last post on Infra for the time being. It remains an active project, but I still haven't chosen a UI from one of these or perhaps one of the JavaScript frameworks.  And development has slowed to a crawl.  Hard to come home from a hard day of development, and start up the computer for a private project.  But I've accumulated some nice tidbits on SQL Developer Data Modeler to share with my readers - coming to this blog real soon.

Wednesday, February 27, 2019

Some Thoughts About INFRA Database Implementation


So with the logical database design done, my attention turns to implementation. I certainly intend for the database to be in a Relational Database Management System (RDBMS). Since I used Oracle’s SQL Developer Data Modeler (SDDM) to do the logical design, I used its ability to engineer to a relational design. At this point I don’t want to share my relational design, but you can see its outline in the logical design. It is a fairly straight implementation with the only major additions to what you saw in the logical design being surrogate primary keys for some tables, and intersection tables to implement many to many relationships. I also refined the attributes a bit, which became columns in the tables.

Physical Database

For the database implementation, I want to make sure that INFRA is database independent. I want it not to matter whether it is Oracle, SQL Server, PostgreSQL, DB2, MariaDB, or whatever. As long as it supports ANSI SQL, I’d like it to work with just a little configuration to have the application connect to the database. But it isn’t enough to support ANSI SQL, because some things I need are not part of the ANSI standard. For instance, Oracle, SQL Server and PostgreSQL all support Sequences to set the values of surrogate primary keys, but the code for getting the next value of a Sequence is non-standard and thus different. Look at the differences in the way my SERVICE table is defined for three different RDBMSs.

Oracle XE 11g

CREATE SEQUENCE service_seq;

CREATE TABLE service (
    service_id           NUMBER NOT NULL,
    service_name         VARCHAR2(127 CHAR),
    service_maker        VARCHAR2(127 CHAR),
    service_type_code    VARCHAR2(30 CHAR) NOT NULL,
    product_name         VARCHAR2(127 CHAR),
    product_version      VARCHAR2(127 CHAR),
    environment_code     VARCHAR2(30 CHAR) NOT NULL,
    database_type_code   VARCHAR2(30 CHAR),
    as_type_code         VARCHAR2(30 CHAR),
    server_id            NUMBER NOT NULL
);

ALTER TABLE service ADD CONSTRAINT service_pk PRIMARY KEY ( service_id );

CREATE OR REPLACE TRIGGER service_service_id_trg BEFORE
    INSERT ON service
    FOR EACH ROW
    WHEN ( new.service_id IS NULL )
BEGIN
    :new.service_id := service_seq.nextval;
END;
/

SQL Server 16

CREATE SEQUENCE service_seq AS INT START WITH 1 INCREMENT BY 1;

CREATE TABLE service (
    service_id           INT NOT NULL DEFAULT NEXT VALUE FOR service_seq,
    service_name         VARCHAR(127),
    service_maker        VARCHAR(127),
    service_type_code    VARCHAR(30) NOT NULL,
    product_name         VARCHAR(127),
    product_version      VARCHAR(127),
    environment_code     VARCHAR(30) NOT NULL,
    database_type_code   VARCHAR(30),
    as_type_code         VARCHAR(30),
    server_id            INT NOT NULL
);

ALTER TABLE service ADD CONSTRAINT service_pk PRIMARY KEY ( service_id );

PostgreSQL 9

CREATE SEQUENCE service_seq;

CREATE TABLE service (
    service_id           INT4 DEFAULT nextval('service_seq') NOT NULL,
    service_name         VARCHAR(127),
    service_maker        VARCHAR(127),
    service_type_code    VARCHAR(30) NOT NULL,
    product_name         VARCHAR(127),
    product_version      VARCHAR(127),
    environment_code     VARCHAR(30) NOT NULL,
    database_type_code   VARCHAR(30),
    as_type_code         VARCHAR(30),
    server_id            INT4 NOT NULL
);

ALTER TABLE service ADD CONSTRAINT service_pk PRIMARY KEY ( service_id );

About IDENTITY

Just a note about another way of generating values for surrogate primary keys. SQL Server has the concept of an IDENTITY constraint that generates these values and it is older than Sequences in SQL Server. I prefer not to use them, partly because it is more difficult to insert values with the primary key already set. This is why I have the Oracle trigger set to assign a value from the sequence only if the key is null in the INSERT command.  But I will admit that the fact that my career has been largely using Oracle may also be a factor.  The use of the DEFAULT clause in SQL Server and PostgreSQL does the same – the value is only assigned from the sequence if it is NOT in the INSERT. Oracle 12c and later has an IDENTITY type of column that creates a sequence and assigns values from it behind the scenes. It also has enhanced the DEFAULT clause so you no longer need a trigger if you want an explicit sequence used instead of an implicit one. PostgreSQL does a similar thing by creating a hidden sequence for SERIAL columns.

Benefits of SDDM for Physical Database

The nice thing about using SDDM for generating your DDL for implementing your database is that it understands the differences between a number of RDBMSs and can generate SQL DDL for them from the same relational design. For generated keys, it gives you several options for how to generate – whether triggers, or default values, or Identity columns. And if it doesn’t do what you need to do out of the box, you can write your own generation code. See my post on scripting SDDM and a later example for more information.
Next time, I’ll talk about some application implementation considerations and options.

Friday, February 1, 2019

INFRA Database Design - Application Manager View


Finally, we get to the person who is supervising the overall application development effort. While this person has an interest in the other views, she/he has a particular interest in the work of the developers and the various versions or releases of the application system and its underlying application database(s).

Application System

This represents the product of the developers’ hard work, the actual application modules that show the screens that the users want and need to see. It includes both front-end modules that the users see, and the back-end modules that process the user input and prepare the user output. It can include the actual code behind those services that are being custom coded. It also includes code to talk to the various services whether custom written or not, and to the Application Databases.

Application Version

This may also be called an Application Release. It represents a frozen in time set of code for the modules in the Application System, often numbered according to local standards. For instance, release 1.0 might be the initial release of a system, while 1.2 is the second minor enhancement release. Application Versions are also often identified with a “tag” or “label” in the source control system. Task and Bug management software like JIRA often also have a place to record versions or releases. One goal of INFRA is to be able to coordinate its database with those of various Task Management and/or source control systems.

Application Database

This represents the various database structures and objects that contain and organize data used by one or more Application Systems. It may also include database procedural code like triggers. While an Application System can use objects in more than one Application Database and an Application Database can be used by more than one Application System, as a general rule, the relationship is one database to one application, especially for which application may update the database. Nonetheless, I have designed INFRA to be able to handle the less common many to many situation.

Database Version

This may also be called a Database Release. It represents a frozen in time set of objects in the Application Database, often numbered according to local standards. For instance, release 1.0 might be the initial release of a database, while 1.2 is the second minor enhancement release. It is usual to coordinate the database version numbers with the application version numbers – so that version 2.5 of the application goes with version 2.5 of its database. Even if you don’t do this, it is very important to make sure which version of the application matches which version of the database. Database Versions are also often identified with a “tag” or “label” in the source control system.

Entity Relationship Diagram for Application Manager View


This concludes my series on the database design for INFRA.  I am very interested in what you think of INFRA so far.  What did I get right?  What did I get wrong?  Where would you go from here?  Next time, I'll talk about implementation options.

Wednesday, January 30, 2019

INFRA Database Design - Deployment Manager View


The Deployment Manger role is probably the hardest one to describe, and has the most complicated database design of any part of INFRA. For one of the teams that I have worked with, it was an additional duty for our System Administrator, and the DBA (me) did the database portion of deployments. In fact this is a fairly common arrangement, with one team to deploy the application and another to deploy application database changes. As I said before, you may not have a person with this title, but you certainly have people doing the job.

I have seen bad things happen when Application System deployment and Application Database deployment are not coordinated carefully. These are two parts of the development effort that are often done separately, yet are very dependent on each other. If an application requires storage and retrieval of a new data item, and the database does not yet have the table column to contain this data, the application will start throwing errors. If a database has been changed to require uniqueness for an item, and the application tries to insert duplicate data, the application will start throwing errors.

Another aspect of this role is the DevOps movement, which tries to automate deployments and do regular frequent releases. Here, the Deployment Manager will control and configure continuous integration software like Jenkins. The Application Manager and Database Administrator need to make sure that source control software clearly contains tags to denote which versions of which source code goes with which release. As a Database Architect, I’ve found the database side of this to be particularly sensitive, and because it is often outside of the Deployment Manager’s area of expertise, I will often assist in configuring for database deployments. See my earlier post on Database Source Control and Deployment for more of my thoughts on the subject.

Services

I’ve already touched on Services from the System Administrator’s point of view. From the Deployment Manager’s point of view, Services are the targets of Application System deployments. So we are deploying a particular release or version of an Application System and/or Application Database to a service running on a particular server. A service might be a host for an application system, such as Tomcat, JBoss, or Glassfish, or the application or some service used by an application could be running as a stand-alone service, as in microservices. It might be running under some operating environment, like NodeJS or in a JVM. Or it might be a service that is already running such as an LDAP Directory, with no new deployment needed, except to make sure that configuration files for other parts of the system reference the correct service.

Environments

I haven’t touched much on List of Value (LOV) entities, as they are mostly lookup tables of possible values for attributes in other entities. But Environment is a bit special, and especially important to the Deployment Manager. It is also special for Developers, Testers, and Users. On our teams, we have generally had four or five environments:

Development is for the Developers and Deployment Managers don’t always get involved in deployment to this environment because developers do their own deployment after unit testing. It is okay for things in this environment to be unstable, and to have bugs. We hope to find and fix the bugs here before they go to the other environments. Sometimes, each developer has his/her own development environment, then deployment is done to a shared environment.

Test is for more formal testing. This is also a good place to debug deployment itself, to make sure deployment tools are configured properly. Deployment to Test is less frequent than to Development, but more frequent than later environments. Testers would like a fairly stable test environment because they want bugs they find to be repeatable, so they can clearly identify the version where the bug was found, and the steps needed to reproduce the unwanted behaviour.

Pre-Production and/or User Acceptance Test (UAT) are the final stages before a release goes to Production. In some shops, a different team does production releases than the team that does development and test releases. So pre-production may be where the production team makes sure that the development team has given them everything they need for deployment, and where they can review for standards compliance and security. UAT is where users can get their first look at the application system, and make sure that it meets their needs before it goes to production.

Production is of course where all development teams want their hard work to end up. Sadly, I’ve seen a few application systems that never made it to this environment.

Application Deployment and Database Deployment

These are two separate entities, but very similar. Each is a deployment of a particular version or release of a particular application system or application database to a particular set of services (and of course running on a particular set of servers). You’ll see many to many relationships all over this – since you can have several releases running in different places at the same time. For instance, developers working on release 1.2, while testers are still testing 1.1. An application might be monolithic deployed in one piece to one application server, or consist of several pieces. And there is a relationship of concern to me personally, where a version of the application system is tied to the version of the application database that it expects.

Entity Relationship Diagram for Deployment Manager View



Friday, January 25, 2019

INFRA Database Design – System Administrator View


I’ve only been an assistant System Administrator, but I know what SAs want to see. They want to see the servers, their network addresses and what is running on each one.

Servers

There are two kinds of servers, physical servers and virtual servers. Physical servers are actual computers, pieces of hardware. Thus, we include attributes for the characteristics of the hardware like manufacturer, model number, and serial number. We also include the location of the hardware.  They will either run operating systems that will directly host services, or operating systems like IBM’s venerable VM/CP (I mention this, because I once used it) or VMWare that host virtual machines. Sometimes an operating system that can run services will also run a virtual machine host as a service, like Oracle’s VirtualBox or Microsoft’s Windows Virtual PC. We also use servers that are hosted in the cloud, like AWS or Oracle’s platform as a service. I would not include containers such as Docker or Kubernetes as servers, though there are some parallels. I have a self referencing relationship to show that some servers host other servers. Please note the importance of distinguishing the physical servers from the virtual ones. We often want to know that certain servers are on separate physical machines. For instance, as a DBA, I want to be sure that if the physical device hosting my database fails, the backup data needed to recover it is on another device that survived the crash.

Aliases

Servers have name, but a server may have several names. In fact, even when I list an IP address for a server, it is the primary IP address only. You see, network gateways may in fact have an internal address for a server and an outside address. As far as names go, a Web Server may be configured to show a home page for one application when accessed by one name and for a different one when accessed by another name. The user may never know that they are really the same server.

Services

A Service is a sort of nebulous concept – it can mean all kinds of running software that provide things an application system needs. A Database Instance, which was already explained, is just one specific type of service. Others might include LDAP Directories like OpenLDAP, that provide identity services, Web Servers like Apache that show and get resources using the HTTP protocol, or Application Servers like Glassfish which do similar work but tend to have additional capabilities more suited to running application systems. There are also services that your application system might use that are not operated by your organization, like Bing Maps or Yahoo Finance. I included Application Service as a target for application system deployments for the Deployment Manager view but the two specific types of services in my design are NOT the only types available. This is where I’d put references to Docker or Kubernetes containers – which are useful ways of hosting your services.

Access Ports

One thing that services often do is open certain access ports on your server. SAs want to know which ports are open as each open port is a security concern. And of course others on the development and maintenance teams need to know how to access the services that their tools and application systems are using. Each port is associated with a particular service on a particular server and two services cannot use the same port on the same server. Therefore if Apache Web Server is using ports 80 and 443 on dev.mycompany.com, and Glassfish is also running on that server, Glassfish must use some other ports, like 8080 and 4443.

The full address of a service will include an alias or IP address, a port number, and maybe a resource name as well. Usually, an access port will also be associated with a particular standard protocol used to access the service. Protocols include those such as HTTP and HTTPS for unencrypted and encrypted web services, LDAP for accessing directories, or SMTP for mail servers. There are some protocols that are proprietary as well, such as SQL*Net for Oracle RDBMS. Some ports are standard for certain protocols, such as 80 for HTTP, while others are customary but not required.

Entity Relationship Diagram for System Administrator View



Friday, January 18, 2019

INFRA Database Design - DBA View


What does the DBA want to see? The DBA wants to see the databases, of course.

Database

So the Database Entity is going to be a central player in the part of the INFRA database that supports DBAs. What is a Database? It is a means of storing, retrieving, and organizing the data that is behind an application system. In short, it is your major tool to save the data and change data into information, A Database is managed by a Database Instance which is a special case of a Service – we’ll talk about Services when we look at some of the other views. In an earlier blog post I talked about databases and database instances in more detail. For our purposes here, a Database Instance is the executing software that manages the files that contain your Database. An instance is an instance of a particular vendor’s or source of database software such as Oracle, SQL Server, Cassandra, or whatever, and one of its attributes will be which version of this software is running.

Application Database

Where the Database entity represents a general term, the more specific meaning of the word, database, is the database objects that are created to store and retrieve the data for a particular application system. To distinguish the two, I have chosen to call this the Application Database. For instance, an HR application might have database objects for information about Employees and Departments, while an Order/Entry application might have objects for Customers, Orders and the Order Line Items in those orders. These objects might be stored in any kind of a database as decided by the developers of that application system. Notice that an Application Database can store data for more than one Application System and an Application System can use data in more than one Application Database. This is in recognition that some data is shared between application systems, though usually only one system may maintain the data while others only read it.

Database Version

My Database Version entity is really an Application Database Version – I just left off the word, “application”, for brevity. The version supports the fact that Application Databases change over time, just as the applications that use them do. So a Database is the target of a deployment of one and only one Application Database Version, but can contain versions of more than one Application Databases. As a general rule, it will contain only one version of any single Application Database. Not shown on the diagram below is the relationship between Application System Versions and Database Versions. Usually a version of an application system is developed side by side with a version of its database, so the Deployment Manager needs to know which two versions correspond. You’ll see this when we get to that view.

Entity Relationship Diagram for DBA’s View


Wednesday, January 16, 2019

INFRA Users

So who is the audience for INFRA?  I’ve identified four groups of people, each with its own perspective on the overall infrastructure:

System Administrators

These are the people who manage your physical and virtual servers.  They want to see what hardware they have, what operating systems (OS) are on them, what virtual machines (VM) and the services on each server.  A service can be a database, an application server, a web server, an LDAP directory, or maybe a microservice – anything needed to operate your applications.

Database Administrators (DBA)

Databases are essential parts of any application system – I have never seen an application that didn’t have a database behind it.  Early in my career, a database may have simply been a set of files.  Later, we were using Relational Database Management Systems (RDBMS) like Oracle, SQL Server, DB2, PostgreSQL or My SQL – and in fact, that is what I’m designing INFRA to use.  Today, you may see other forms of databases, like Hadoop or Cassandra. The DBAs are responsible for managing these important resources.  And of course, this is my particular area of expertise.

Deployment Managers

Eventually, your application needs to be deployed to the various services that operate it.  This is more complex than it used to be.  Early in my career, you compiled the application and just ran the compiled code.  But now you may have different pieces – REST services to provide access to databases, data to enter into LDAP directories, HTML, CSS, Javascript and image files to write to directories accessed by a web server, or JAR, EAR, or WAR files to install on an application server. You may not have a job specifically designated as “Deployment Manager”, but believe me, you have people doing this job.

Application Managers

These are the people responsible for the overall development effort for your applications.  They determine who will do the development work, and what will be developed.  In particular, these people decide which changes will go into each release of an application, and which release will be deployed in which environment.

About the Database Design

Considering that my specialty has been as a database architect, administrator and developer, it is only natural that I would begin the application design with a database design.  I consider the Logical Design to be an important part of the requirements definition of an application.  With this, I find out what “things” - formally called “entities” we are going to collect information about, what information is going to be collected, and how the entities are related to one another.  And once this is engineered to a Physical Design and a development database is created, many tools can use the database design to give you a head start toward developing the rest of the application.  For instance, you can often use a relational table to generate a first cut at the forms that the application will use to enter or update data in the table, or the first cut at pages that show this data.

And since I have been a DBA, it is only natural for me to start by designing the part of the database that will contain the information that I would most want to see in my view of INFRA.  That is the subject of my next post.

Tuesday, January 15, 2019

Introducing INFRA – an Application to View Your Infrastructure

On my last two projects, I was the database architect and administrator.  Although our projects were small, we needed to maintain multiple sets of servers, each with different versions of the applications, databases, identity services, and other support services.  We had a development environment, a test environment, a user acceptance test environment, a pre-production (aka staging) environment and of course, production.  While some of these shared servers, others needed separate servers.  Most servers were virtual machines, but not all. So it became difficult to answer questions like:

  • Which server is running our development Oracle database?
  • Is the test application running against the test database?
  • How can we quickly reset the user acceptance test environment for a new round of UAT?
  • The backup files for the production SQL Server database need to be written to separate physical hardware. Which servers with room for these files meet this requirement?
  • Can we take this server down for an upgrade today?
  • What version of the application is on the staging application server?

I’ve been thinking about this problem on and off for several years.  I’ve done some research, and I can’t seem to find a COTS application that stores this information.  There are source control systems that can distinguish between the source code for different versions of an application, and I use this for different versions of database objects.  JIRA, which is an application that we use a lot for task and bug tracking, can list Releases of an application, lets you identify changes for each release, and can read the data in your source control system to tie releases to the source code that was changed.  Virtual Machine (VM) control software used by our system administrators, like the control software for VMWare can identify which virtual servers are running and which physical servers are hosting them.  But nothing ties all of this information together.

So, being between jobs, I decided to start writing a new application to keep the lists of servers, databases, application releases, and other services.  And especially, to keep the relationships between them – what release of the application is on this server, which databases it is using, where are the user ids and passwords that can access this application on this server. The goal is to be able to answer the questions listed above and others like them.

I call my new application, INFRA.  It is a work in progress, but I thought I’d share the process with my readers.  The initial database design is pretty much done, so I’ll start with that.  I’ll be inviting you to share your thoughts too.  Tell me where I’ve gotten it right, where I’ve gotten it wrong.  Tell me if the whole idea is stupid – I’d love to hear that someone has already solved this problem.