JCFlack's Database Development Journal: January 2019

Wednesday, January 30, 2019

INFRA Database Design - Deployment Manager View

The Deployment Manger role is probably the hardest one to describe, and has the most complicated database design of any part of INFRA. For one of the teams that I have worked with, it was an additional duty for our System Administrator, and the DBA (me) did the database portion of deployments. In fact this is a fairly common arrangement, with one team to deploy the application and another to deploy application database changes. As I said before, you may not have a person with this title, but you certainly have people doing the job.

I have seen bad things happen when Application System deployment and Application Database deployment are not coordinated carefully. These are two parts of the development effort that are often done separately, yet are very dependent on each other. If an application requires storage and retrieval of a new data item, and the database does not yet have the table column to contain this data, the application will start throwing errors. If a database has been changed to require uniqueness for an item, and the application tries to insert duplicate data, the application will start throwing errors.

Another aspect of this role is the DevOps movement, which tries to automate deployments and do regular frequent releases. Here, the Deployment Manager will control and configure continuous integration software like Jenkins. The Application Manager and Database Administrator need to make sure that source control software clearly contains tags to denote which versions of which source code goes with which release. As a Database Architect, I’ve found the database side of this to be particularly sensitive, and because it is often outside of the Deployment Manager’s area of expertise, I will often assist in configuring for database deployments. See my earlier post on Database Source Control and Deployment for more of my thoughts on the subject.

Services

I’ve already touched on Services from the System Administrator’s point of view. From the Deployment Manager’s point of view, Services are the targets of Application System deployments. So we are deploying a particular release or version of an Application System and/or Application Database to a service running on a particular server. A service might be a host for an application system, such as Tomcat, JBoss, or Glassfish, or the application or some service used by an application could be running as a stand-alone service, as in microservices. It might be running under some operating environment, like NodeJS or in a JVM. Or it might be a service that is already running such as an LDAP Directory, with no new deployment needed, except to make sure that configuration files for other parts of the system reference the correct service.

Environments

I haven’t touched much on List of Value (LOV) entities, as they are mostly lookup tables of possible values for attributes in other entities. But Environment is a bit special, and especially important to the Deployment Manager. It is also special for Developers, Testers, and Users. On our teams, we have generally had four or five environments:

Development is for the Developers and Deployment Managers don’t always get involved in deployment to this environment because developers do their own deployment after unit testing. It is okay for things in this environment to be unstable, and to have bugs. We hope to find and fix the bugs here before they go to the other environments. Sometimes, each developer has his/her own development environment, then deployment is done to a shared environment.

Test is for more formal testing. This is also a good place to debug deployment itself, to make sure deployment tools are configured properly. Deployment to Test is less frequent than to Development, but more frequent than later environments. Testers would like a fairly stable test environment because they want bugs they find to be repeatable, so they can clearly identify the version where the bug was found, and the steps needed to reproduce the unwanted behaviour.

Pre-Production and/or User Acceptance Test (UAT) are the final stages before a release goes to Production. In some shops, a different team does production releases than the team that does development and test releases. So pre-production may be where the production team makes sure that the development team has given them everything they need for deployment, and where they can review for standards compliance and security. UAT is where users can get their first look at the application system, and make sure that it meets their needs before it goes to production.

Production is of course where all development teams want their hard work to end up. Sadly, I’ve seen a few application systems that never made it to this environment.

Application Deployment and Database Deployment

These are two separate entities, but very similar. Each is a deployment of a particular version or release of a particular application system or application database to a particular set of services (and of course running on a particular set of servers). You’ll see many to many relationships all over this – since you can have several releases running in different places at the same time. For instance, developers working on release 1.2, while testers are still testing 1.1. An application might be monolithic deployed in one piece to one application server, or consist of several pieces. And there is a relationship of concern to me personally, where a version of the application system is tied to the version of the application database that it expects.

Entity Relationship Diagram for Deployment Manager View

Friday, January 25, 2019

INFRA Database Design – System Administrator View

I’ve only been an assistant System Administrator, but I know what SAs want to see. They want to see the servers, their network addresses and what is running on each one.

Servers

There are two kinds of servers, physical servers and virtual servers. Physical servers are actual computers, pieces of hardware. Thus, we include attributes for the characteristics of the hardware like manufacturer, model number, and serial number. We also include the location of the hardware. They will either run operating systems that will directly host services, or operating systems like IBM’s venerable VM/CP (I mention this, because I once used it) or VMWare that host virtual machines. Sometimes an operating system that can run services will also run a virtual machine host as a service, like Oracle’s VirtualBox or Microsoft’s Windows Virtual PC. We also use servers that are hosted in the cloud, like AWS or Oracle’s platform as a service. I would not include containers such as Docker or Kubernetes as servers, though there are some parallels. I have a self referencing relationship to show that some servers host other servers. Please note the importance of distinguishing the physical servers from the virtual ones. We often want to know that certain servers are on separate physical machines. For instance, as a DBA, I want to be sure that if the physical device hosting my database fails, the backup data needed to recover it is on another device that survived the crash.

Aliases

Servers have name, but a server may have several names. In fact, even when I list an IP address for a server, it is the primary IP address only. You see, network gateways may in fact have an internal address for a server and an outside address. As far as names go, a Web Server may be configured to show a home page for one application when accessed by one name and for a different one when accessed by another name. The user may never know that they are really the same server.

Services

A Service is a sort of nebulous concept – it can mean all kinds of running software that provide things an application system needs. A Database Instance, which was already explained, is just one specific type of service. Others might include LDAP Directories like OpenLDAP, that provide identity services, Web Servers like Apache that show and get resources using the HTTP protocol, or Application Servers like Glassfish which do similar work but tend to have additional capabilities more suited to running application systems. There are also services that your application system might use that are not operated by your organization, like Bing Maps or Yahoo Finance. I included Application Service as a target for application system deployments for the Deployment Manager view but the two specific types of services in my design are NOT the only types available. This is where I’d put references to Docker or Kubernetes containers – which are useful ways of hosting your services.

Access Ports

One thing that services often do is open certain access ports on your server. SAs want to know which ports are open as each open port is a security concern. And of course others on the development and maintenance teams need to know how to access the services that their tools and application systems are using. Each port is associated with a particular service on a particular server and two services cannot use the same port on the same server. Therefore if Apache Web Server is using ports 80 and 443 on dev.mycompany.com, and Glassfish is also running on that server, Glassfish must use some other ports, like 8080 and 4443.

The full address of a service will include an alias or IP address, a port number, and maybe a resource name as well. Usually, an access port will also be associated with a particular standard protocol used to access the service. Protocols include those such as HTTP and HTTPS for unencrypted and encrypted web services, LDAP for accessing directories, or SMTP for mail servers. There are some protocols that are proprietary as well, such as SQL*Net for Oracle RDBMS. Some ports are standard for certain protocols, such as 80 for HTTP, while others are customary but not required.

Entity Relationship Diagram for System Administrator View

Friday, January 18, 2019

INFRA Database Design - DBA View

What does the DBA want to see? The DBA wants to see the databases, of course.

Database

So the Database Entity is going to be a central player in the part of the INFRA database that supports DBAs. What is a Database? It is a means of storing, retrieving, and organizing the data that is behind an application system. In short, it is your major tool to save the data and change data into information, A Database is managed by a Database Instance which is a special case of a Service – we’ll talk about Services when we look at some of the other views. In an earlier blog post I talked about databases and database instances in more detail. For our purposes here, a Database Instance is the executing software that manages the files that contain your Database. An instance is an instance of a particular vendor’s or source of database software such as Oracle, SQL Server, Cassandra, or whatever, and one of its attributes will be which version of this software is running.

Application Database

Where the Database entity represents a general term, the more specific meaning of the word, database, is the database objects that are created to store and retrieve the data for a particular application system. To distinguish the two, I have chosen to call this the Application Database. For instance, an HR application might have database objects for information about Employees and Departments, while an Order/Entry application might have objects for Customers, Orders and the Order Line Items in those orders. These objects might be stored in any kind of a database as decided by the developers of that application system. Notice that an Application Database can store data for more than one Application System and an Application System can use data in more than one Application Database. This is in recognition that some data is shared between application systems, though usually only one system may maintain the data while others only read it.

Database Version

My Database Version entity is really an Application Database Version – I just left off the word, “application”, for brevity. The version supports the fact that Application Databases change over time, just as the applications that use them do. So a Database is the target of a deployment of one and only one Application Database Version, but can contain versions of more than one Application Databases. As a general rule, it will contain only one version of any single Application Database. Not shown on the diagram below is the relationship between Application System Versions and Database Versions. Usually a version of an application system is developed side by side with a version of its database, so the Deployment Manager needs to know which two versions correspond. You’ll see this when we get to that view.

Entity Relationship Diagram for DBA’s View

Wednesday, January 16, 2019

INFRA Users

So who is the audience for INFRA? I’ve identified four groups of people, each with its own perspective on the overall infrastructure:

System Administrators

These are the people who manage your physical and virtual servers. They want to see what hardware they have, what operating systems (OS) are on them, what virtual machines (VM) and the services on each server. A service can be a database, an application server, a web server, an LDAP directory, or maybe a microservice – anything needed to operate your applications.

Database Administrators (DBA)

Databases are essential parts of any application system – I have never seen an application that didn’t have a database behind it. Early in my career, a database may have simply been a set of files. Later, we were using Relational Database Management Systems (RDBMS) like Oracle, SQL Server, DB2, PostgreSQL or My SQL – and in fact, that is what I’m designing INFRA to use. Today, you may see other forms of databases, like Hadoop or Cassandra. The DBAs are responsible for managing these important resources. And of course, this is my particular area of expertise.

Deployment Managers

Eventually, your application needs to be deployed to the various services that operate it. This is more complex than it used to be. Early in my career, you compiled the application and just ran the compiled code. But now you may have different pieces – REST services to provide access to databases, data to enter into LDAP directories, HTML, CSS, Javascript and image files to write to directories accessed by a web server, or JAR, EAR, or WAR files to install on an application server. You may not have a job specifically designated as “Deployment Manager”, but believe me, you have people doing this job.

Application Managers

These are the people responsible for the overall development effort for your applications. They determine who will do the development work, and what will be developed. In particular, these people decide which changes will go into each release of an application, and which release will be deployed in which environment.

About the Database Design

Considering that my specialty has been as a database architect, administrator and developer, it is only natural that I would begin the application design with a database design. I consider the Logical Design to be an important part of the requirements definition of an application. With this, I find out what “things” - formally called “entities” we are going to collect information about, what information is going to be collected, and how the entities are related to one another. And once this is engineered to a Physical Design and a development database is created, many tools can use the database design to give you a head start toward developing the rest of the application. For instance, you can often use a relational table to generate a first cut at the forms that the application will use to enter or update data in the table, or the first cut at pages that show this data.

And since I have been a DBA, it is only natural for me to start by designing the part of the database that will contain the information that I would most want to see in my view of INFRA. That is the subject of my next post.

Tuesday, January 15, 2019

Introducing INFRA – an Application to View Your Infrastructure

On my last two projects, I was the database architect and administrator. Although our projects were small, we needed to maintain multiple sets of servers, each with different versions of the applications, databases, identity services, and other support services. We had a development environment, a test environment, a user acceptance test environment, a pre-production (aka staging) environment and of course, production. While some of these shared servers, others needed separate servers. Most servers were virtual machines, but not all. So it became difficult to answer questions like:

Which server is running our development Oracle database?
Is the test application running against the test database?
How can we quickly reset the user acceptance test environment for a new round of UAT?
The backup files for the production SQL Server database need to be written to separate physical hardware. Which servers with room for these files meet this requirement?
Can we take this server down for an upgrade today?
What version of the application is on the staging application server?

I’ve been thinking about this problem on and off for several years. I’ve done some research, and I can’t seem to find a COTS application that stores this information. There are source control systems that can distinguish between the source code for different versions of an application, and I use this for different versions of database objects. JIRA, which is an application that we use a lot for task and bug tracking, can list Releases of an application, lets you identify changes for each release, and can read the data in your source control system to tie releases to the source code that was changed. Virtual Machine (VM) control software used by our system administrators, like the control software for VMWare can identify which virtual servers are running and which physical servers are hosting them. But nothing ties all of this information together.

So, being between jobs, I decided to start writing a new application to keep the lists of servers, databases, application releases, and other services. And especially, to keep the relationships between them – what release of the application is on this server, which databases it is using, where are the user ids and passwords that can access this application on this server. The goal is to be able to answer the questions listed above and others like them.

I call my new application, INFRA. It is a work in progress, but I thought I’d share the process with my readers. The initial database design is pretty much done, so I’ll start with that. I’ll be inviting you to share your thoughts too. Tell me where I’ve gotten it right, where I’ve gotten it wrong. Tell me if the whole idea is stupid – I’d love to hear that someone has already solved this problem.

Upcoming Events

Collaborate 2020

April 19-24, 2020
Mandalay Bay Resort and Casino Las Vegas, NV

This will be the first Collaborate conference since the Independent Oracle Users Group (IOUG) and Quest merged. I haven't been to Collaborate for a few years, because the best content in the Database part of the conference is very DBA-oriented, and I've surrendered my DBA hat. But if you are primarily a DBA or database developer, Collaborate is well worth your time.

ODTUG K'Scope 2020

June 28 - July 2, 2020
Sheraton Boston, Boston, MA

This is my favorite Oracle conference. It is completely devoted to people who make their livings developing applications with Oracle tools and Oracle databases, not just the original Oracle RDBMS, but also APEX, OBIEE, Essbase, Hyperion, My SQL, TimesTen etc. Very technical, little hype.

East Coast Oracle Users Conference (ECO)

October 29-30, 2019
Hilton North Raleigh/Midtown Hotel, Raleigh, NC

This is a very good regional Oracle conference. And it is the closest for me. It has content for Oracle RDBMS DBAs and developers, as well as content for people who work with the Oracle applications. Even though it is regional, it attracts speakers with national reputations and good representation from Oracle.

Oracle OpenWorld

September 16-19, 2019, San Francisco, CA

This is Oracle's big show and this year, they are expanding to events in London, Dubai, and Singapore. If you want to know what Oracle is up to, go to one of these. Oracle Develop is no longer a formal track at OpenWorld. That doesn't mean that you can't find good technical content - you can, especially on User Group Sunday. But I am reconsidering whether there is enough to justify my attendance.