NDB: Temporal Query Processing

Title:	A Middleware Approach to Temporal Query Processing
By:	Giedrius Slivinskas
Advisor:	Christian S. Jensen
Status:	Completed September 7, 2001

Description

Most real-world database applications manage time-referenced data. This holds for financial, medical, and travel applications, among others, and being time-variant is one of Inmon's defining properties of a data warehouse. Recent advances in temporal query languages show that such applications may benefit substantially from a DBMS with built-in temporal support. The potential benefits are several: application code becomes simpler and easier to maintain, which increases programmer productivity, and more data processing can be left to the DBMS, potentially leading to better performance.

In contrast, the built-in temporal support offered by current database products is limited to predefined, time-related data types, e.g., the Informix TimeSeries DataBlade and the Oracle8 TimeSeries cartridge, and to extensibility facilities that let the user define new data types, including temporal ones. What is needed, however, is temporal support that goes beyond data types and extends the query language itself.

Developing a DBMS with built-in temporal support from scratch is a daunting task that may only be accomplished by major DBMS vendors that already have a DBMS to modify and have large resources available. This has led to the consideration of a layered approach where a layer, implementing temporal support, is interposed between the applications and a conventional DBMS. The layer maps temporal SQL statements to regular SQL statements and passes them to the DBMS, which is not altered. With this approach, it is feasible to support a temporal SQL that strictly extends SQL, thus not affecting legacy applications. However, this approach presents difficulties of its own: for example, some temporal operations, such as temporal aggregation or coalescing, are quite inefficient when evaluated using SQL, but can be evaluated efficiently with application code that uses a cursor to access the underlying data.
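The cursor-based evaluation mentioned above can be illustrated with coalescing. The following is a minimal sketch, not the project's implementation: it assumes rows of the hypothetical form (value, start, end) with half-open periods [start, end), delivered in (value, start) order as a DBMS cursor with an ORDER BY clause would deliver them, and it merges overlapping or adjacent periods of value-equivalent rows in a single pass.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical row layout: (value, start, end) with a half-open period [start, end).
Row = Tuple[str, int, int]


def coalesce(rows: Iterable[Row]) -> Iterator[Row]:
    """Merge overlapping or adjacent periods of rows with equal values.

    Assumes the input is sorted by (value, start), as a cursor over
    "ORDER BY value, start" would return it.
    """
    current = None
    for value, start, end in rows:
        if current is not None and current[0] == value and start <= current[2]:
            # Same value and the period touches or overlaps: extend the period.
            current = (value, current[1], max(current[2], end))
        else:
            if current is not None:
                yield current
            current = (value, start, end)
    if current is not None:
        yield current


# Stand-in for rows fetched through a cursor.
fetched = sorted([
    ("Sales", 1, 5), ("Sales", 5, 8), ("Sales", 10, 12),
    ("Ads", 2, 4), ("Ads", 3, 6),
])
coalesced = list(coalesce(fetched))
```

Here the two adjacent "Sales" periods and the two overlapping "Ads" periods are each merged into one row, while the disjoint "Sales" period from 10 to 12 is kept separate. The single sorted pass is what makes this cheap compared with expressing the same operation in plain SQL.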

This Ph.D. project develops the temporal query representation, optimization, and processing framework needed for introducing temporal support both via a stand-alone temporal DBMS and via a layer on top of a conventional DBMS. The layered approach is generalized by moving some of the query evaluation into the layer, which performs temporal operations itself when it can achieve better results than by passing the job to the underlying DBMS. This approach is termed "temporal middleware."
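The idea of dividing work between the DBMS and the middleware can be sketched as a small cost comparison. This is a deliberately simplified illustration under stated assumptions, not the optimizer developed in the project: the plan is a linear pipeline, data starts in the DBMS, and once intermediate results are shipped to the middleware they stay there. The class `Op`, the function `best_split`, and all cost figures are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Op:
    """One operation in a linear plan, with made-up cost estimates."""
    name: str
    dbms_cost: float        # estimated cost of running the operation in the DBMS
    middleware_cost: float  # estimated cost of running it in the middleware
    transfer_cost: float    # estimated cost of shipping its input to the middleware


def best_split(plan: List[Op], final_transfer: float) -> Tuple[int, float]:
    """Return (k, cost): the first k operations run in the DBMS, the rest in the middleware.

    final_transfer is the cost of shipping the final result when every
    operation runs in the DBMS (k == len(plan)).
    """
    best = None
    for k in range(len(plan) + 1):
        cost = sum(op.dbms_cost for op in plan[:k])
        cost += plan[k].transfer_cost if k < len(plan) else final_transfer
        cost += sum(op.middleware_cost for op in plan[k:])
        if best is None or cost < best[1]:
            best = (k, cost)
    return best


plan = [
    Op("selection", dbms_cost=1.0, middleware_cost=5.0, transfer_cost=20.0),
    Op("temporal aggregation", dbms_cost=50.0, middleware_cost=4.0, transfer_cost=3.0),
]
split, total = best_split(plan, final_transfer=1.0)
```

With these invented numbers the cheapest choice is to run the selection in the DBMS, which shrinks the data before transfer, and the temporal aggregation in the middleware.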

During the first year and a half of the project, the foundation for temporal query optimization was developed, including an algebra for temporal query representation, a comprehensive set of transformation rules, and a query plan enumeration algorithm. The algebra enhances existing multiset-based relational algebras by integrating the handling of order and adding temporal support, and the transformation rules are divided into different types according to how they deal with duplicates, order, and time periods. By capturing duplicate removal and retention and order preservation for all queries, as well as coalescing for temporal queries, the foundation formalizes and generalizes existing approaches.
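To make the distinction between rule types concrete, the sketch below compares query results under progressively weaker notions of equivalence: as lists (order and duplicates matter), as multisets (only duplicates matter), and as sets (neither matters), plus a snapshot-based check for time periods. The function names and the (value, start, end) row layout with half-open integer periods are assumptions made for illustration; the precise definitions are those given in the publications listed under further readings.

```python
from collections import Counter


def list_equivalent(a, b):
    return list(a) == list(b)


def multiset_equivalent(a, b):
    return Counter(a) == Counter(b)


def set_equivalent(a, b):
    return set(a) == set(b)


def strongest_equivalence(a, b):
    """Name the strongest of the three equivalences that holds between a and b."""
    if list_equivalent(a, b):
        return "list"
    if multiset_equivalent(a, b):
        return "multiset"
    if set_equivalent(a, b):
        return "set"
    return "none"


def snapshot(rel, t):
    """Multiset of values valid at time point t, for half-open periods."""
    return Counter(value for value, start, end in rel if start <= t < end)


def snapshot_multiset_equivalent(a, b):
    """True if a and b contain the same values at every integer time point."""
    a, b = list(a), list(b)
    points = {t for _, start, end in a + b for t in range(start, end)}
    return all(snapshot(a, t) == snapshot(b, t) for t in points)


original = [("Ann", 1, 5), ("Bob", 2, 4), ("Ann", 1, 5)]
reordered = sorted(original)               # same rows, different order
deduplicated = list(dict.fromkeys(original))  # duplicates removed, order kept

whole = [("Ann", 1, 5)]
split_up = [("Ann", 1, 3), ("Ann", 3, 5)]  # same snapshots, different periods
```

A rule that reorders rows preserves only multiset equivalence, one that removes duplicates preserves only set equivalence, and one that splits or merges periods (as coalescing does) preserves snapshot equivalence without preserving multiset equivalence. An optimizer may apply a rule only when the query's requirements on order, duplicates, and periods permit the weaker guarantee.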

To validate the proposed foundation, a temporal middleware architecture was designed and implemented during the following year. To optimize and process queries, the middleware employs the Volcano extensible query optimizer and the XXL library of query processing algorithms, with Oracle as the underlying DBMS. Volcano was significantly extended with new operators, algorithms, and transformation rules, as well as with support for different types of transformation rules, and the XXL library was augmented with new temporal algorithms. Performance experiments show that performing some of the query processing in the middleware can, in some cases, improve query performance by up to an order of magnitude over performing it all in the DBMS.

Further readings:

	G. Slivinskas, A Middleware Approach to Temporal Query Processing, Ph.D. Thesis. [.pdf]
	G. Slivinskas and C. S. Jensen, Enhancing an Extensible Query Optimizer with Support for Multiple Equivalence Types, in Proceedings of ADBIS East-European Conference on Advances in Databases and Information Systems, Vilnius, Lithuania, 2001. [.pdf]
	G. Slivinskas, C. S. Jensen, and R. T. Snodgrass, Adaptable Query Optimization and Evaluation in Temporal Middleware, in Proceedings of ACM SIGMOD International Conference on Management of Data, Santa Barbara, CA, 2001. [.pdf]
	G. Slivinskas, C. S. Jensen, and R. T. Snodgrass, Adaptable Query Optimization and Evaluation in Temporal Middleware, Time Center technical report. [.ps.gz] [.pdf]
	G. Slivinskas, C. S. Jensen, and R. T. Snodgrass, A Foundation for Conventional and Temporal Query Optimization Addressing Duplicates and Ordering, IEEE Transactions on Knowledge and Data Engineering (special issue with extended versions of best papers from ICDE'2000), Volume 13, Number 1, 2001. [.pdf]
	G. Slivinskas, C. S. Jensen, and R. T. Snodgrass, Query Plans for Conventional and Temporal Queries Involving Duplicates and Ordering, in Proceedings of the 16th International Conference on Data Engineering, San Diego, CA, 2000. [.ps] [.pdf]
	G. Slivinskas, C. S. Jensen, and R. T. Snodgrass, A Foundation for Conventional and Temporal Query Optimization Addressing Duplicates and Ordering, Time Center technical report. [.ps.gz] [.pdf]