
In the previous four installments of this special series, Ive explained the challenges of expressing integrity constraints, queries, and modifications on time-varying data in SQL. This final installment looks to the future, examining enhancements to SQL that bring temporal processing to the masses. With just a few additional concepts, SQL can just as easily express temporal queries as it already does for nontemporal queries.
Many knotty problems arise in SQL when time-varying data is involved:
*Avoiding duplicates in a time-varying table requires an aggregate or complex trigger.
*A simple three-line join when applied to time-varying tables explodes to a 37-line query consisting of four SELECT statements or a complex 20-line statement with four CASE expressions.
*A three-line UPDATE of a time-varying table translates into five modification statements totaling 29 lines.
*Maintaining an audit log requires several triggers comprising some three dozen lines.
What is the source of this daunting complexity? While SQL-92 supports time-varying data through the DATE, TIME, and TIMESTAMP data types, the language really recognizes no notion of a time-varying table. SQL also has no concept of current or sequenced constraints, queries, modifications, or views, nor of the critical distinction between valid time (modeling the behavior of the enterprise in reality) and transaction time (capturing the evolution of the stored data). To use the terminology I introduced in this series, SQL supports nonsequenced operations onlywhich, as weve seen, are often the least useful.
Fortunately, the standards committees are now considering specific proposals for temporal support in SQL3 (see the sidebar, Building the Standard, page 48) and vendors are starting to incorporate them into products. Ill summarize these new SQL3 constructs and revisit the applications discussed previously in this series, showing how these constructs greatly simplify writing SQL for time-varying applications.
In the following, when I refer to a SQL3 construct, I mean those constructs introduced in the proposals referenced in the sidebar or those already present in the draft SQL3. I should emphasize that these proposals are still under consideration for SQL3. The constructs may well change; indeed, SQL3 as a whole is being refined as it inches toward publication.
SQL3 adds a new series of data types: PERIOD(DATE), of a day granularity; PERIOD (TIME), of a second granularity; and PERIOD (TIMESTAMP), of a microsecond granularity, with additional variants possible by specifying a precisionfor example, PERIOD (TIMESTAMP(3)) has a granularity of 10-3 seconds (milliseconds) and a range of 9,999 years.
SQL3 also introduces the notion of a table with temporal support. The table can have valid-time support, transaction-time support, or both (bitemporal support). Finally, SQL3 provides facilities for current and sequenced operations, and retains the ability to perform nonsequenced operations.
The first article in this series (Of Duplicates and Septuplets, June 1998) considered a table called NICUStatus and a uniqueness integrity constraint. Lets assume that the application was initially nontemporal in that it captured in the NICUStatus table only the current situationthe infants present in the NICU and their current status:
CREATE TABLE NICUStatus ( Name CHAR(15), Status CHAR(8), UNIQUE (Name, Status) )
We now wish to retain the history of this information, or how the status of infants varied over time. In SQL3, you can meet this goal with an ALTER statement:
ALTER TABLE ADD VALIDTIME PERIOD(DATE)
Here we add valid-time support at the granularity of a day. Each row of the table is now associated with a valid-time period. The rows in the table when we added this valid-time support are associated with the period from now to forever.
SQL3 is temporally upward compatible, meaning that the nontemporal application is unaffected by the addition of temporal support. The benefit of this important property is that the tens of thousands of lines of code associated with the NICU application continue to work as before, without altering a single line of code. The DBMS now interprets the queries in the application code on this valid-time supported table as current queries. To ascertain which infants are currently in serious condition, the query:
SELECT Name FROM NICUStatus WHERE Status = serious
works just as before.
The DBMS now interprets modifications as current modifications, thereby capturing the history. To update Alexis Mays status to fair, the modification:
UPDATE NICUStatus SET Status = fair WHERE Name = Alexis May
still works fine and automatically retains the prior status as a side effect of the table having valid-time support.
Integrity constraints are handled in the same way. The initial uniqueness constraint, specified when the table was created, is considered a current constraint: No infant can (currently) have two identical status values.
Expressing the sequenced analogat no time can an infant have two identical conditionsrequires an aggregate or trigger in SQL-92. In SQL3, the keyword VALIDTIME indicates sequenced statements:
ALTER TABLE NICUStatus ADD CONSTRAINT VALIDTIME UNIQUE (Name, Status)
One line in SQL3 suffices for many in SQL-92. Well repeatedly encounter similarly dramatic reductions in code size as we express our application in SQL3.
In the July 1998 installment (Querying Valid-Time State Tables), I looked at recording the movement of cattle among pens. Here a valid-time table is appropriate as well:
CREATE TABLE LOT_LOC ( FDYD_ID INT, LOT_ID_NUM INT, PEN_ID INT, HD_CNT INT ) AS VALIDTIME PERIOD(DATE)
As before, current queries such as How many head of cattle from lot 219 in feed yard 1 are (currently) in each pen? require no special attention:
SELECT PEN_ID, HD_CNT FROM LOT_LOC WHERE FDYD_ID = 1 AND LOT_ID_NUM = 219
The sequenced variant Give the history of the number of head of cattle from lot 219 in feed yard 1 in each pen is trivial. Just prepend VALIDTIME:
VALIDTIME SELECT PEN_ID, HD_CNT FROM LOT_LOC WHERE FDYD_ID = 1 AND LOT_ID_NUM = 219
The nonsequenced variant How many head of cattle from lot 219 in yard 1 were, at some time, in each pen? is also easy to specify; just prepend NONSEQUENCED VALIDTIME:
NONSEQUENCED VALIDTIME SELECT PEN_ID, HD_CNT FROM LOT_LOC WHERE FDYD_ID = 1 AND LOT_ID_NUM = 219
These two rules (prepend VALIDTIME for sequenced and NONSEQUENCED VALIDTIME for nonsequenced) apply to all nontemporal SQL statements. Consider joins. Anyone who knows SQL could write the current join Which lots are coresident in a pen? in about two minutes (see Listing 1). The sequenced version, Give the history of
requires but a few more seconds to write in SQL3; you simply add one keyword (see Listing 2). You may recall from the July 1998 installment that in SQL-92 this query is 37 lines long! The nonsequenced version Which lots were in the same pen, perhaps at different times? requires adding NONSEQUENCED.
SELECT L1.LOT_ID_NUM, L2.LOT_ID_NUM, L1.PEN_ID FROM LOT_LOC AS L1, LOT_LOC AS L2 WHERE L1.LOT_ID_NUM < L2.LOT_ID_NUM AND L1.FDYD_ID = L2.FDYD_ID AND L1.PEN_ID = L2.PEN_ID |
VALIDTIME SELECT L1.LOT_ID_NUM, L2.LOT_ID_NUM, L1.PEN_ID FROM LOT_LOC AS L1, LOT_LOC AS L2 WHERE L1.LOT_ID_NUM < L2.LOT_ID_NUM AND L1.FDYD_ID = L2.FDYD_ID AND L1.PEN_ID = L2.PEN_ID |
The August 1998 article (Modifying Valid-Time State Tables) considered modifications applied to the following table, with valid-time support, which captures the (changing) gender of lots of cattle:
CREATE TABLE LOT ( LOT_ID_NUM INT, GNDR_CODE CHAR(1) ) AS VALIDTIME PERIOD(DATE)
Current modifications on such tables require no additional keywords:
A new lot of heifers arrives today.
INSERT INTO LOT VALUES (433, h)
Lot 101 is leaving the feed yard.
DELETE FROM LOT WHERE LOT_ID_NUM = 101
The cattle in lot 799 are being steered today.
UPDATE LOT SET GNDR_CODE = s WHERE LOT_ID_NUM = 799
This simple update when expressed in SQL-92 requires an INSERT and two UPDATEs, or 16 lines of SQL-92 code.
By now, you may have guessed (correctly) that the keyword VALIDTIME signals a sequenced modification. The period of applicability is assumed to be over all time; it can also be explicitly stated with a period expression following this keyword:
Lot 426, a collection of heifers, was on the feed yard from March 26 to April 14.
VALIDTIME PERIOD [1998-03-26 - 1998-04-14) INSERT INTO LOT VALUES (426, h)
Here we see the first use of a period literal. The left bracket ( [ ) indicates the period includes March 26; the right parenthetical ( ) ) specifies that the period ends just before (or during) April 14.
Sequenced modifications can apply in the past, in the future, or in the past through the future:
Lot 234 will be absent from the feed yard for the first three weeks of October, when the steering will take place.
VALIDTIME PERIOD [1998-10-01 - 1998-10-22) DELETE FROM LOT WHERE LOT_ID_NUM = 234
Consider The lot was steered only for the month of March, which, while difficult to effect, does illustrate a sequenced update:
VALIDTIME PERIOD [1998-03-01 - 1998-04-01) UPDATE LOT SET GNDR_CODE = s WHERE LOT_ID_NUM = 799
This update, when expressed in SQL-92, requires two INSERT statements and three UPDATE statements, or some 29 lines of code.
The most recent installment of this special series (Transaction-Time State Tables, September 1998) considered transaction timespecifically, capturing the succession of states of the Washington Double Star Catalog:
CREATE TABLE WDS_TT ( RA_Hour INT, RA_Min INT, RA_Sec INT, Dec_Degree INT, Dec_Minute INT, Discoverer CHAR(7), Mag_First DECIMAL(5,2) ) AS TRANSACTIONTIME
Here the precision is automatically assigned by the DBMS but is sufficient to distinguish transactions (and so is probably on the order of microseconds, if the DBMS can sustain a high transaction processing rate).
Remember that maintaining this audit log in SQL-92 requires three onerous triggers, or 37 lines of code; SQL3 requires but two additional keywords. More important, SQL3 will ensure the maintenance of transaction-time semantics so accessing prior states will obtain the correct result. Guaranteeing this outcome in SQL-92 is impossible; anyone with update permission on the table could modify the audit log to indicate a different sequence of changes than had actually occurred.
As before, current queries (as currently known) require nothing extra on tables with temporal support:
Which stars are of magnitude 11 or brighter, as currently known?
SELECT Discoverer FROM WDS WHERE Mag_First <= 11.0
Sequenced queries (When was it recorded) over all time are easy to express in SQL3:
When was it recorded that A1248 had a magnitude other than 10.5?
TRANSACTIONTIME SELECT Mag_First FROM WDS_TT WHERE Discoverer = A 1248 AND Mag_First <> 10.5
When was it recorded that a star had a magnitude equal to that of A1248?
TRANSACTIONTIME SELECT W2.Discoverer, FROM WDS_TT AS W1, WDS_TT AS W2 WHERE W1.Discoverer = A 1248 AND W2.Discoverer <> W1.Discoverer AND W1.Mag_First = W2.Mag_First
This query is twice as long and much more complicated in SQL-92.
Nonsequenced queries, in contrast, are similar in SQL-92 and SQL3. In When was the RA_Sec position of a double star corrected? corrected means we look at two consecutive transaction-time states (that is, their timestamps meet). This query is nonsequenced because it compares states at two different transaction times. Consistent with valid time, to access the transaction time in a nonsequenced operation, use TRANSACTIONTIME() (see Listing 3).
NONSEQUENCED TRANSACTIONTIME SELECT W1.Discoverer, W1.RA_Sec AS Old_Value, W2.RA_Sec AS New_Value, END (TRANSACTIONTIME(W1)) AS When_Changed FROM WDS_TT AS W1, WDS_TT AS W2 WHERE W1.Discoverer = W2.Discoverer AND W1.RA_Sec <> W2.RA_Sec AND TRANSACTIONTIME(W1) MEETS TRANSACTIONTIME(W2) |
As valid time and transaction time are orthogonal, you can easily use them together in SQL3:
CREATE TABLE WDS_B ( Discoverer CHAR(7), Mag_First DECIMAL(5,2) ) AS VALIDTIME PERIOD(DATE) AND TRANSACTIONTIME
With six keywords, we get a table with valid-time as well as transaction-time supportthat is, a bitemporal table.
CREATE VIEW WDS_VT_AS_OF_Jan_1 AS VALIDTIME AND NONSEQUENCED TRANSACTIONTIME SELECT Discoverer, Mag_First FROM WDS_B WHERE TRANSACTIONTIME(WDS_B) OVERLAPS DATE 1994-01-01 |
Listing 4. Give the history as best known at the start of 1994.
For example, as Listing 4 shows, What was the history recorded as of January 1, 1994? is sequenced in valid time (history) and a timeslice in transaction time (recorded as of). List the corrections made on plates taken in the 1920s implies sequenced in valid time (taken in the 1920s) and nonsequenced in transaction time (corrections, expressed with MEETS) (see Listing 5). This query expressed in SQL-92 requires 24 lines of highly complex code.
VALIDTIME AND NONSEQUENCED TRANSACTIONTIME SELECT B1.Discoverer, END(TRANSACTIONTIME(B1)) AS When_Changed FROM WDS_B AS B1, WDS_B AS B2 WHERE B1.Discoverer = B2.Discoverer AND B1.Trans_Stop = B2.Trans_Start AND TRANSACTIONTIME(B1) MEETS TRANSACTIONTIME(B2) AND VALIDTIME(B1) OVERLAPS PERIOD [1920- 01-01 - 1929-12-31] |
| Listing 5. List the corrections made on plates taken in the 1920s. |
Current modifications will automatically track the behavior of valid as well as transaction time:
A photographic plate indicates that the magnitude of A1248 is 10.5.
UPDATE WDS_B SET Mag_Start = 10.5 WHERE Discoverer = A 1248
Current modifications in SQL3 require absolutely no changes when expressed on tables with valid-time, transaction-time, or bitemporal support, as this example illustrates. The same holds for queries, constraints, views, and so on. Expressing this modification in SQL-92that is, manually managing the valid and transaction timestamps in the tableis quite challenging. Of course, sequenced and nonsequenced (in valid-time) modifications are also relevant on a bitemporal table.
For evidence of the advantages of the new SQL3 constructs, try expressing the following in SQL-92. They can be formulated in SQL3 in just a few minutes but would take an SQL-92 expert many hours to express in that language:
LOT_LOC.LOT_ID is a (sequenced) foreign key to LOT, meaning that at every point in time the value of LOT_ID is in LOT at that time.
ALTER TABLE LOT_LOC ADD CONSTRAINT VALIDTIME FOREIGN KEY LOT_ID REFERENCES LOT
Give the history of the number of cattle in pen 1.
VALIDTIME SELECT COUNT(*) FROM LOT_LOC WHERE PEN_ID = 1
Ten head of cattle were added today to lot 219. This is a current update.
UPDATE LOT_LOC SET HD_CNT = HD_CNT + 10 WHERE LOT = 219
When was it recorded that A1248 had a magnitude but no other stars were known of that magnitude? This is sequenced in transaction time (when was it recorded).
TRANSACTIONTIME SELECT Mag_First FROM WDS_B AS W1 WHERE Discoverer = A 1248 AND NOT EXISTS (SELECT * FROM WDS_B AS W2 WHERE W1.Mag_First = W2.Mag_First)
A photographic plate taken in October 1994 indicated that the magnitude of A1248 is 10.5 is sequenced in valid time; SQL3 automatically supplies the semantics of transaction time.
VALIDTIME PERIOD [1994-10-16 - 9999-12-31) UPDATE WDS_B SET Mag_Start = 10.5 WHERE Discoverer = A 1248
To recap, here are five requirements that must be satisfied to claim that a language or DBMS provides temporal support:
*It supports valid time and transaction time in a compatible and orthogonal manner. In particular, the language or DBMS must guarantee the semantics of transaction time, where the state as of a time in the past can be reconstructed.
*It ensures upward compatibility. Existing constructs applied to nontemporal data should operate exactly as before. This requirement is fairly easy to satisfy.
*It ensures temporal upward compatibilityan existing nontemporal application will not be broken when a table acquires temporal support, say via an ALTER TABLE statement. No changes to application code should be required when the history of the enterprise (valid time), the sequence of changes to the data (transaction time), or both are retained. This implies, for example, that a conventional query on tables with temporal support should be interpreted as a current query.
*Sequenced variants should be easy to express for all constructs of the language, including queries, modifications, views, assertions and constraints, and cursors. In particular, complex rewritings of the statement should be unnecessary.
*Nonsequenced variants should also be easy to express. In part, such variants let data with temporal support be converted to and from data without temporal support.
The SQL3 constructs discussed here satisfy these requirements. Heres why:
*You can add valid time to a table via AS VALIDTIME PERIOD, and you can add transaction time with AS TRANSACTIONTIME. Transaction time allows only current modifications, ensuring that timeslices will be correct. You can use either kind of time individually, or together, forming a bitemporal table.
*SQL/Temporal is defined as an upward compatible extension of the other parts of SQL3.
*All conventional queries (modifications, views, assertions, constraints, and cursors) on tables with temporal support are interpreted as current queries (modifications and so on). As an example, when we added valid-time support to the NICUStatus table, the existing code of this applicationperhaps tens of thousands of linesdid not require a single change.
*You can convert a query to a sequenced query in SQL3 simply by prepending the keyword VALIDTIME. This also holds for modifications (VALIDTIME UPDATE, for example), views (CREATE VIEW AS VALIDTIME SELECT), and constraints (VALIDTIME UNIQUE). Of course, this capability also applies to transaction time via the TRANSACTIONTIME keyword.
*Nonsequenced statements require the additional keyword NONSEQUENCED. The valid timestamp associated with a row is accessible via the function VALIDTIME() and the transaction timestamp via TRANSACTIONTIME().
As Ive shown with sample SQL3 statements, these proposed constructs (three new reserved words, VALIDTIME, TRANSACTIONTIME, and NONSEQUENCED, in addition to those already in SQL/Temporal) can greatly simplify application development, often reducing the amount of required SQL code by a factor of 10 or more, while improving the comprehensibility of that code. We can look forward to the day when these constructs are in the SQL standard, and even more important, when they are supported by products.
Rick Snodgrass is a professor of computer science at the University of Arizona. He chairs ACM SIGMOD, has written many papers and several books on temporal databases, and consults on designing and implementing time-varying databases. He is working with the ANSI and ISO SQL3 committees to add temporal support to that language. His book Developing Time-Oriented Applications in SQL will be published by Morgan Kaufmann early in 1999. You can reach Rick at rts@cs.arizona.edu or visit his Web page at www.cs.arizona.edu/people/rts, where you can find code samples from this article.
Building the StandardSQL-86 and SQL-89 have no notion of time. SQL-92 added DATETIME and INTERVAL data types, although no product has yet been validated for conformance to this standard. (Some products are validated at the entry level, which does not include the temporal data types.) However, the temporal database research community has long recognizedas the case studies in this special series illustratethat these data types alone are inadequate. Momentum for a temporal extension to SQL began at the Workshop on an Infrastructure for Temporal Databases in Arlington, Texas in June 1993. Subsequently, the TSQL2 committee formed, producing a preliminary language specification the following January. It completed the final version in September 1994, and a book describing the language and examining in detail its underlying design decisions was distributed at the VLDB International Workshop on Temporal Databases in Zurich in September 1995. The ANSI and ISO SQL3 committees became involved in late 1994. A new part for SQL3, called SQL/ Temporal, was proposed and formally approved by the ISO SQL3 in Ottawa, Canada in July 1995 as Part 7 of the SQL3 draft standard. Jim Melton agreed to edit SQL/Temporal. The first task was to define a PERIOD data type, which is now included in Part 7. Discussions then commenced on adding further temporal support. Two change proposals resulted, one on valid time support and one on transaction time support. The ANSI SQL3 committee (ANSI X3H2) unanimously approved these change proposals for consideration by the ISO SQL3 committee (ISO/IEC JTC 1/SC 21/WG 3 DBL). You can read the full story at www.cs. arizona.edu/people/rts/tsql2.html. In the meantime, the SQL committees decided to focus on Parts 1, 2, 4, and 5 of the SQL3 draft standard. These parts are expected to be finalized as an international standard next year. At that time, the committees will revisit the other parts and move them through the exhaustive process toward standardization.
Suggested Resources Snodgrass, R.T. (editor). The Temporal Query Language TSQL2. Kluwer Academic Publishers, 1995.
Snodgrass, R.T. et al. Adding Valid Time to SQL/Temporal. Change proposal, ANSI X3H2-96-501r2, ISO/IEC JTC 1/SC 21/WG 3 DBL-MAD-146r2, November 1996. At ftp.cs.arizona.edu/ tsql/tsql2/sql3/mad146.pdf.
Snodgrass, R.T. et al. Adding Transaction Time to SQL/Temporal. Change proposal, ANSI X3H2-96-502r2, ISO/ IEC JTC1/SC21/WG3 DBL MAD-147r2, November 1996. At ftp.cs.arizona. edu/tsql/tsql2/sql3/mad147.pdf. |