Implementation Aspects of Temporal Databases Nykredit Center for Database Research

Title: Implementation Aspects of Temporal Databases
By: Kristian Torp
Advisor: Christian S. Jensen
Status: Thesis defended on October 27, 1998

Background

This Ph.D. study concerns temporal relational database management systems (DBMSs). Temporal DBMSs add built-in support for storing and querying multiple versions of data to conventional relational DBMSs that only provide built-in support for one (the current) version of data. Multiple versions of data are useful in many application areas such as accounting, budgeting, decision support, financial services, inventory management, medical records, and project scheduling, to name but a few.

Today, temporal support for handling multiple versions of data is typically implemented in an ad-hoc fashion in the application code, and it is implemented anew for each application being developed. Implementing temporal support in the application is time-consuming and difficult using SQL-92 and conventional relational DBMSs. In contrast, a temporal DBMS makes it much easier to handle multiple versions of historical, current, and future data, because the query languages for such DBMSs have been enriched with high-level constructs for exactly these purposes. The code for handling multiple versions of data is thus moved from the applications to the DBMS. This move reduces the complexity of application development and thereby improves the productivity of application programmers.

Given the clear benefits of temporal DBMSs, one would expect these systems to be commercially available already. However, this is not the case, in part because the temporal database research community has mostly focused on the design of temporal data models and query languages. Relatively little attention has been paid to implementation issues. For example, an overview of temporal DBMS implementations from 1995 describes thirteen systems, whereas a temporal database bibliography from 1997 contains 1425 papers. The research community has not provided evidence of the benefits of a temporal DBMS substantial enough for the major DBMS vendors to extend their conventional DBMSs to become temporal.

A number of recent changes have brought temporal DBMSs closer to commercial availability. First, the temporal research community has to a large extent agreed on a single conceptual temporal data model that unifies the many existing temporal data models. Further, in a very large and ambitious effort, a large group of members of the research community has cooperated in designing a complete temporal query language (TSQL2) for this model. The unified conceptual data model and its query language have caused a shift in focus towards implementation aspects of temporal DBMSs.

Second, many DBMS vendors today offer very feature-rich implementations of variants of the relational model. This has led the SQL standards community to include substantial extensions to the relational model in the upcoming SQL3 standard. One of these extensions is a temporal extension that reflects a part of the temporal functionality proposed by the temporal database research community.

Third, there has been a shift in the architecture proposed for implementing a temporal DBMS. There are two overall architectures for developing a temporal DBMS: (1) an integrated architecture, where the DBMS kernel itself is modified to support the temporal functionality, and (2) a layered architecture, where the new temporal functionality is added on top of an existing DBMS without altering the DBMS kernel. Comparing the two, the integrated architecture may lead to better query processing performance than the layered architecture, at the cost of not being as evolutionary as the layered architecture. Previously, it has generally been assumed that, for efficiency reasons, an integrated architecture was needed for implementing a temporal DBMS. However, several temporal DBMS prototypes using a layered architecture have shown that this architecture is a promising approach for implementing a temporal DBMS.

Finally, the effect of improvements in disk storage technology cannot be ignored; currently, disk capacity is increasing by 60% each year, disk prices are dropping by 60% each year, and I/O transfer rates are increasing by 40% each year. These improvements make it possible to store multiple versions of data still more cheaply and efficiently.

Description

This Ph.D. study presents novel techniques for the implementation of significant aspects of temporally enhanced DBMSs. Three general guidelines underlie these techniques. First, the techniques aim to provide better support for managing multiple versions of data than do current DBMSs, e.g., additional temporal functionality or more efficient support for existing functionality. Second, the new techniques should be easily integrated into existing DBMS architectures. Third, when new functionality is provided, the efficiency of the provided techniques, within the architecture chosen, is of concern.

An efficient implementation of the timeslice operator has been developed. The timeslice operator restores a previously current state of a temporal table and is used extensively in temporal DBMSs. The timeslice operator is computed via differential computation. To more efficiently process the timeslice operator, we provide a new B+-tree-like index structure well suited for append-only temporal data.
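The idea of differential timeslice computation can be illustrated with a minimal Python sketch. It is not the algorithm or index structure developed in the study; it only assumes a hypothetical append-only change log (LOG) with integer times, and it shows how a timeslice can be computed from a previously computed, earlier timeslice by replaying only the changes made in between. All names (LOG, apply_changes, timeslice, cache) are assumptions made for the illustration.

```python
from bisect import bisect_right

# Hypothetical append-only change log for one table, ordered by time:
# (time, operation, key, value).
LOG = [
    (1, "insert", "e1", "Sales"),
    (3, "insert", "e2", "Tools"),
    (5, "update", "e1", "Toys"),
    (8, "delete", "e2", None),
    (9, "insert", "e3", "Books"),
]
TIMES = [entry[0] for entry in LOG]


def apply_changes(state, lo, hi):
    """Apply the log entries with lo < time <= hi to a copy of state."""
    result = dict(state)
    start = bisect_right(TIMES, lo)
    stop = bisect_right(TIMES, hi)
    for _, op, key, value in LOG[start:stop]:
        if op == "delete":
            result.pop(key, None)
        else:  # insert or update
            result[key] = value
    return result


def timeslice(t, cache=None):
    """Return the table state that was current at time t.

    cache is an optional (time, state) pair from an earlier timeslice.
    If its time is <= t, only the log entries after it are replayed;
    otherwise the state is rebuilt from the start of the log.
    """
    if cache is not None and cache[0] <= t:
        base_time, base_state = cache
    else:
        base_time, base_state = float("-inf"), {}
    return apply_changes(base_state, base_time, t)
```

For example, `s4 = timeslice(4)` replays two log entries, and `timeslice(9, cache=(4, s4))` then replays only the three entries made after time 4 while producing the same result as `timeslice(9)`. This sketch only moves forward in time from the cached state.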

The study considers how to extend a query language such as SQL-92 with temporal functionality with minimal implementation effort, reusing the services of an existing DBMS. In addition, three different meta-architectures for a layered implementation of temporal DBMSs are proposed. Each meta-architecture contains several specific layered architectures. These specific architectures are evaluated against a set of design criteria.
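The layered approach can be sketched in a few lines of Python, using SQLite as a stand-in for the existing DBMS. This is an illustration only, not one of the architectures proposed in the study: the table emp, its period columns vt_start and vt_end, and the functions timeslice_sql and timeslice are all hypothetical. The point is that the layer translates a temporal request into ordinary SQL and leaves storage and query processing to the unmodified DBMS.

```python
import sqlite3

# Hypothetical table layout: each row carries a half-open period
# [vt_start, vt_end) during which it was valid.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (name TEXT, dept TEXT, vt_start INTEGER, vt_end INTEGER)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [
        ("Ann", "Sales", 1, 5),
        ("Ann", "Toys", 5, 10),
        ("Bob", "Tools", 3, 8),
    ],
)


def timeslice_sql(table, columns):
    """Translate a timeslice request into plain SQL for the underlying DBMS."""
    # Identifiers are assumed to come from trusted code in this sketch;
    # a real layer would parse and validate them.
    cols = ", ".join(columns)
    return (
        f"SELECT {cols} FROM {table} "
        "WHERE vt_start <= :t AND :t < vt_end ORDER BY 1"
    )


def timeslice(table, columns, t):
    """Run the translated query on the unmodified DBMS and return the rows."""
    return conn.execute(timeslice_sql(table, columns), {"t": t}).fetchall()
```

With the sample rows, `timeslice("emp", ["name", "dept"], 4)` returns Ann in Sales and Bob in Tools, whereas the same request for time 9 returns only Ann in Toys. The DBMS kernel is untouched; all temporal logic lives in the translation step.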

The study has proposed techniques for how to correctly and efficiently timestamp versions of data in the presence of transactions. The techniques are relevant to both layered and integrated architectures.
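To make the timestamping problem concrete, the following Python sketch shows one commonly described approach, which is not necessarily the technique proposed in the study: a version written inside a transaction is first tagged with the transaction identifier, and all versions of the transaction receive the same timestamp when the transaction commits. The class VersionStore, its methods, and the logical clock are hypothetical names introduced for the illustration.

```python
import itertools


class VersionStore:
    """Toy version store that stamps versions with their commit time."""

    def __init__(self):
        self.versions = []                  # dicts with key, value, ts, txn
        self._clock = itertools.count(1)    # hypothetical logical clock
        self._txn_ids = itertools.count(1)

    def begin(self):
        """Start a transaction and return its identifier."""
        return next(self._txn_ids)

    def write(self, txn, key, value):
        # The commit time is not known yet, so the timestamp is left open
        # and the version is tagged with the transaction identifier.
        self.versions.append({"key": key, "value": value, "ts": None, "txn": txn})

    def commit(self, txn):
        """Give every version written by txn the same commit timestamp."""
        commit_time = next(self._clock)
        for version in self.versions:
            if version["txn"] == txn and version["ts"] is None:
                version["ts"] = commit_time
        return commit_time

    def abort(self, txn):
        """Discard the unstamped versions written by txn."""
        self.versions = [
            v for v in self.versions
            if not (v["txn"] == txn and v["ts"] is None)
        ]
```

In this sketch, if two transactions interleave their writes, the timestamps follow the order in which the transactions commit rather than the order of the individual writes, and all versions written by one transaction share a single timestamp.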

The use of the temporal variable NOW makes it easier to model now-relative facts. The study defines the semantics of modifications involving NOW. The semantics are not directly implementable in existing temporal data models. For this reason, implementable approximate semantics are proposed.
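A small Python sketch can illustrate what a now-relative fact is. It does not reproduce the semantics of modifications or the approximate semantics defined in the study; it only shows how a stored period that ends at NOW is bound to a concrete time when it is read. The sentinel NOW, the functions bind and valid_at, and the use of closed periods over integer times are assumptions made for the illustration.

```python
NOW = object()  # hypothetical sentinel standing for the variable NOW


def bind(period, reference_time):
    """Replace NOW in a (start, end) period by the given reference time."""
    start, end = period
    start = reference_time if start is NOW else start
    end = reference_time if end is NOW else end
    return start, end


def valid_at(period, t, reference_time):
    """True if the closed period [start, end] contains t once NOW is bound."""
    start, end = bind(period, reference_time)
    return start <= t <= end


# A fact that has held since time 10 and still holds.
employed = (10, NOW)
```

Read at time 15, `bind(employed, 15)` gives the period (10, 15); read at time 20, the same stored tuple gives (10, 20). The stored data does not change as time passes, yet the period it denotes grows with the current time.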

Further reading:

K. Torp, L. Mark, and C. S. Jensen, Efficient Differential Timeslice Computation, IEEE Transactions on Knowledge and Data Engineering, Vol. 10, No. 4, 1998
K. Torp, C. S. Jensen, and M. Böhlen, Layered Temporal DBMSs: Concepts and Techniques
K. Torp, C. S. Jensen, and R. T. Snodgrass, Stratum Approaches to Temporal DBMS Implementation
K. Torp, C. S. Jensen, and R. T. Snodgrass, Effective Timestamping in Databases
