Multi-Dimensional Conditional Schema Evolution in Relational Databases Nykredit Center for Database Research

Title: Multi-Dimensional Conditional Schema Evolution in Relational Databases
By: Ole Guttorm Jensen
Advisor: Michael H. Böhlen
Status: Defended September 13, 2004

Description

Change is a fundamental but sometimes neglected aspect of database systems. In particular, changes in the real world often force modifications to the database structure, that is, the schema must evolve. As database systems and other information systems become progressively more interdependent, a change to the database schema is no longer a problem local to the database but can necessitate changes in all systems interacting with the database. The management of change and the ability of the database system to deal with change are essential to developing and maintaining truly useful systems.

This Ph.D. thesis develops a formal foundation for conditional schema evolution in relational databases. A conditional schema change alters the schema of exactly those tuples in a relation that satisfy the change condition. This class of schema changes is more general than the traditional (unconditional) schema changes and the temporal schema changes investigated in existing work. We develop the multi-dimensional conditionally evolving schema as a conceptual model for conditionally evolving schemas. The multi-dimensional conditionally evolving schema is the basis for the practical data structures and algorithms developed in the thesis.
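The idea of a conditional schema change can be illustrated with a minimal Python sketch. It is not the formalism of the thesis: the names `ConditionalChange` and `intended_schema`, the representation of tuples as dictionaries, and the employee example are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable

Row = Dict[str, object]  # hypothetical tuple representation: attribute -> value


@dataclass(frozen=True)
class ConditionalChange:
    """A schema change that applies only to tuples satisfying `condition`."""
    condition: Callable[[Row], bool]
    add: FrozenSet[str] = frozenset()
    drop: FrozenSet[str] = frozenset()


def intended_schema(base: Iterable[str],
                    changes: Iterable[ConditionalChange],
                    row: Row) -> FrozenSet[str]:
    """Apply, in order, every change whose condition the tuple satisfies."""
    schema = set(base)
    for change in changes:
        if change.condition(row):
            schema |= change.add
            schema -= change.drop
    return frozenset(schema)


# Hypothetical example: only employees in Sales get a "commission" attribute.
base = {"name", "dept", "salary"}
changes = [ConditionalChange(lambda r: r.get("dept") == "Sales",
                             add=frozenset({"commission"}))]

sales_row = {"name": "Ann", "dept": "Sales", "salary": 100}
it_row = {"name": "Bob", "dept": "IT", "salary": 120}
print(sorted(intended_schema(base, changes, sales_row)))
print(sorted(intended_schema(base, changes, it_row)))
```

An unconditional schema change is the special case in which the condition is true for every tuple, which is one way to see why the conditional class is the more general one.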

Changing the schema of a populated database leads to mismatches between the intended schema of a tuple and the schema that was used to record the tuple in the first place. The traditional approach of resolving the mismatch by migrating the (non-fitting) tuples to the new schema is lossy. The thesis proposes to keep track of the mismatches at the level of individual tuples, and develops the mismatch extended completed schema for this purpose. A salient feature of this model is that schema changes can be handled with standard tuple updates.
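The following Python sketch illustrates the general idea of recording mismatches per tuple instead of migrating data. It does not reproduce the mismatch extended completed schema; the layout of a stored tuple (`values` plus a `mismatch` annotation), the function names, and the mismatch labels are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical stored tuple: recorded values plus a per-attribute mismatch note.
Stored = Dict[str, dict]


def add_attribute(relation: List[Stored],
                  condition: Callable[[dict], bool],
                  attr: str) -> None:
    """Conditionally add `attr`: qualifying tuples that never recorded it
    are annotated; their recorded values are left untouched."""
    for t in relation:
        if condition(t["values"]) and attr not in t["values"]:
            t["mismatch"][attr] = "intended-but-not-recorded"


def drop_attribute(relation: List[Stored],
                   condition: Callable[[dict], bool],
                   attr: str) -> None:
    """Conditionally drop `attr`: the recorded value is kept (nothing is
    lost) and only the annotation changes."""
    for t in relation:
        if condition(t["values"]) and attr in t["values"]:
            t["mismatch"][attr] = "recorded-but-not-intended"


relation = [
    {"values": {"name": "Ann", "dept": "Sales", "fax": "555"}, "mismatch": {}},
    {"values": {"name": "Bob", "dept": "IT", "fax": "556"}, "mismatch": {}},
]
in_sales = lambda v: v.get("dept") == "Sales"
add_attribute(relation, in_sales, "commission")
drop_attribute(relation, in_sales, "fax")
print(relation[0])
print(relation[1])
```

In this sketch both schema changes reduce to ordinary updates of one field of the affected tuples, and the dropped `fax` value remains available, which is the contrast with a lossy migration.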

The thesis proposes a parametric approach to resolve the mismatches between the intended and recorded schemas of tuples at query time. This allows mismatches to be resolved according to the needs of the application through the specification of a policy. Algorithms for mismatch resolution of relations based on the mismatch extended completed schema are developed in the thesis.
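A rough Python sketch of policy-driven resolution at query time is given below. The policy names (`"null"`, `"reject"`, `"project"`, `"keep"`) and the function `resolve` are assumptions chosen for illustration; the policies actually defined in the thesis may differ.

```python
from typing import Dict, FrozenSet, Optional

Row = Dict[str, object]


def resolve(row: Row,
            intended: FrozenSet[str],
            on_missing: str = "null",
            on_extra: str = "project") -> Optional[Row]:
    """Shape a recorded tuple for a query according to a policy.

    on_missing: "null" fills intended-but-unrecorded attributes with None,
                "reject" leaves the tuple out of the result (returns None).
    on_extra:   "project" hides recorded-but-unintended attributes,
                "keep" passes them through.
    """
    recorded = set(row)
    missing = intended - recorded
    if missing and on_missing == "reject":
        return None
    out = {k: v for k, v in row.items()
           if k in intended or on_extra == "keep"}
    for attr in missing:
        out[attr] = None
    return out


row = {"name": "Ann", "fax": "555"}
intended = frozenset({"name", "email"})
print(resolve(row, intended))
print(resolve(row, intended, on_extra="keep"))
print(resolve(row, intended, on_missing="reject"))
```

Because the stored tuple is never modified, two applications can query the same relation with different policies and each sees the mismatches resolved in the way it needs.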

As the schema evolves, the database system is required to handle legacy tuples. Such tuples are already stored in the database if tuples are not migrated to fit the new schema, and they also come from legacy applications, which continue to issue queries and assertions after a schema change has been committed. The thesis develops efficient algorithms for the classification of tuples. These algorithms, along with the default policies for mismatch resolution, allow for transparent schema evolution, so users and applications need not be aware that the database is evolving.
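To make tuple classification concrete, here is a naive Python sketch. The three labels follow the title of the ADVIS 2002 paper listed under further readings, but the definitions used here are assumptions: a tuple is called current if its attributes match its intended schema after all changes, legacy if they match its intended schema after some earlier prefix of the changes, and invalid otherwise. The sketch is a linear scan, not one of the efficient algorithms developed in the thesis.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Tuple

Row = Dict[str, object]
# Hypothetical change representation: (condition, attributes added, attributes dropped)
Change = Tuple[Callable[[Row], bool], FrozenSet[str], FrozenSet[str]]


def schema_history(base: Iterable[str],
                   changes: List[Change],
                   row: Row) -> List[FrozenSet[str]]:
    """Intended schema of `row` after 0, 1, ..., n changes."""
    schema = frozenset(base)
    history = [schema]
    for condition, add, drop in changes:
        if condition(row):
            schema = (schema | add) - drop
        history.append(schema)
    return history


def classify(base: Iterable[str], changes: List[Change], row: Row) -> str:
    history = schema_history(base, changes, row)
    attrs = frozenset(row)
    if attrs == history[-1]:
        return "current"
    if attrs in history[:-1]:
        return "legacy"
    return "invalid"


base = {"name", "dept"}
changes = [(lambda r: r.get("dept") == "Sales",
            frozenset({"commission"}), frozenset())]

print(classify(base, changes, {"name": "Ann", "dept": "Sales", "commission": 0.1}))
print(classify(base, changes, {"name": "Bob", "dept": "Sales"}))
print(classify(base, changes, {"name": "Di", "dept": "IT", "commission": 0.2}))
```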

In temporal schema versioning, schema changes are related to different time dimensions. The thesis specializes conditional schema evolution to conditions over time-recording attributes. This leads to optimizations in the space complexity of the multi-dimensional conditionally evolving schema, and facilitates a comparison between conditional schema evolution and temporal schema versioning. For transaction time, the two approaches are equivalent. For valid time, they differ: in valid-time schema versioning, a schema change is applied to a single schema version specified by the user, irrespective of the validity of the change and the version, whereas in conditional schema evolution a schema change changes the schema of all segments whose validity overlaps the validity of the change.
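The valid-time behaviour of conditional schema evolution can be sketched in Python as follows. The `Segment` type, the half-open integer intervals, and the decision to split a segment at the boundaries of the change (so that only the overlapping part receives the new schema) are assumptions of this sketch, not details taken from the thesis.

```python
from typing import FrozenSet, List, NamedTuple


class Segment(NamedTuple):
    start: int               # inclusive start of validity
    end: int                 # exclusive end of validity
    schema: FrozenSet[str]


def apply_change(segments: List[Segment],
                 start: int, end: int,
                 add: FrozenSet[str]) -> List[Segment]:
    """Add attributes to every segment whose validity overlaps [start, end)."""
    result = []
    for seg in segments:
        lo, hi = max(seg.start, start), min(seg.end, end)
        if lo >= hi:                      # no overlap: segment is unchanged
            result.append(seg)
            continue
        if seg.start < lo:                # part before the change
            result.append(Segment(seg.start, lo, seg.schema))
        result.append(Segment(lo, hi, seg.schema | add))
        if hi < seg.end:                  # part after the change
            result.append(Segment(hi, seg.end, seg.schema))
    return result


segments = [Segment(0, 10, frozenset({"a"})),
            Segment(10, 20, frozenset({"a", "b"}))]
for seg in apply_change(segments, 5, 15, frozenset({"c"})):
    print(seg.start, seg.end, sorted(seg.schema))
```

Here a single change with validity [5, 15) affects both original segments, whereas in valid-time schema versioning the user would pick one version to change regardless of the intervals involved.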

Further readings:

Ole Guttorm Jensen, Michael H. Böhlen: Evolving Relations. In Database Schema Evolution and Meta-Modeling, 9th International Workshop on Foundations of Models and Languages for Data and Objects of Springer LNCS 2065, pages 115-132, 2000.

Ole Guttorm Jensen, Michael H. Böhlen: Current, Legacy, and Invalid Tuples in Conditionally Evolving Databases. In Second International Conference on Advances in Information Systems, ADVIS 2002, Izmir, Turkey, October 23-25, 2002, Proceedings of Springer LNCS 2457, pages 65-82, 2002.

Ole Guttorm Jensen, Michael H. Böhlen: Lossless Conditional Schema Evolution. In 23rd International Conference on Conceptual Modeling, ER 2004, Shanghai, China, November 2004, Proceedings of Springer LNCS 3288, pages 610-623, 2004.

Ole Guttorm Jensen, Michael H. Böhlen: Multitemporal Conditional Schema Evolution. In Third International Workshop on Evolution and Change in Data Management (ECDM 2004), Proceedings of Springer LNCS 3289, pages 441-456, 2004.
