Article first published by Dataversity. This is the second of a two-part series on bitemporal data modeling.
Bitemporal Modeling is an extremely useful tool for documenting historical data. It allows you to recreate databases as they existed at any point in the past and see whether the records were correct — based on what you know to be true now. This information can not only help enterprises better understand past decisions, but it can also prevent executives from making decisions about the future on the basis of faulty data.
So what scenarios warrant Bitemporal Data Modeling? Bitemporality is useful either when there is a lag in record-keeping or when records need to be revised. Those are the two main ways system time and valid time fall out of sync with each other.
Our first post in this series on bitemporality introduces some foundational concepts using an example illustrating a lag in record-keeping. In that example, the employee’s pay rate changed on Feb. 15 but wasn’t recorded until March 15. Let’s go a step further and see what it looks like when a record needs to be amended.
Correcting the Record
Say we find out April 1 that the employee’s new pay rate, effective Feb. 15, was actually $250 per day, not the $211-per-day rate that we’d previously been told. Based on that information, we’ll add a third row to our table to correct the record:
If we look at the data represented visually in a tree map, we can trace along the horizontal axis to get an understanding of what the record showed at any given point in system time. Even though the $211-per-day rate is no longer valid, it shows up in the data history, allowing us to account for any payroll mistakes that might have occurred between March 15 and April 1.
Industry Examples of Bitemporality
Deutsche Bank lead engineer Anthony Coates gave a talk in 2015 called “Why Banks Care About Bi-temporality.” In that presentation, he identifies one of the primary incentives driving businesses and institutions to adopt Bitemporal Modeling: federal regulations.
In 2010, the United States passed the Dodd-Frank Wall Street Reform and Consumer Protection Act, a law endowing the federal government with the power to regulate financial institutions and hold them accountable for their actions. The U.S. Securities and Exchange Commission now requires that broker-dealers electing to store transactional data electronically do so in a “non-rewriteable and non-erasable format.”
As we’ve seen, Nontemporal Models rewrite old records as new ones are added, and Unitemporal Models “erase” inconsistencies by not taking system time into account. As a result, bitemporal modeling has become an imperative for banks.
Healthcare institutions are also becoming increasingly interested in bitemporality. More accurate record-keeping would help pharmaceutical companies better comply with the U.S. Food and Drug Administration’s new rules regarding international electronic standards for the identification of medicinal products. In addition, there’s potential for bitemporal records to bolster medical research with more complete patient histories.
Other industry verticals — including law enforcement, legal, and energy — are also beginning to invest in Bitemporal Modeling. With all this interest in a complex and costly Data Strategy, software companies are hastening to fill the demand.
What Tools Facilitate Bitemporality
Many databases now support bitemporal tables. SQL Server, for example, introduced support in 2016, and DB2 offered Temporal Data Management support as far back as 2012. MarkLogic, a NoSQL database, has been an outspoken advocate of bitemporality as well.
What databases do to support Bitemporal Models is simple: They keep track of system time for you. Instead of having to log both valid time and system time manually, system time is stored alongside your valid time entries as you write them into the database. Several databases — MarkLogic, for example — allow admins to override that function and exercise some autonomous control over system time logs, but this can lead down a slippery slope as far as data integrity is concerned.
On the front end, Business Intelligence software can also play an important role in bitemporality. Admins can shield business users from a torrent of historical data by filtering on null SysEnd values. Conversely, power users with access to ad hoc operational reporting features can use filtering to isolate data from a period of interest.
Moreover, BI makes it possible to perform complex joins server-side. Not all databases support joining on a range of values, for example, which means that joining two bitemporal tables on a data field between date one and date two would have to happen server-side.
Check out the following resources for more information about bitemporality and how to implement a bitemporal system in your database: