An immutable database is one where data, once written, cannot be deleted or modified. There are numerous reasons why an immutable database is beneficial, and this article walks through some of them.
Martin Kleppmann, who is a serial entrepreneur and self-described database rebel, says a database should be “an always-growing collection of immutable facts” and thinks the only reason databases still have mutable state is inertia*.
He has a good point. Your data is your worldview and your history (be it transactions, people, services, or systems), and if that history changes, your perception of the world becomes warped. We want the data we use to make decisions and run applications to be facts. Only then will we be able to deliver value.
Here are some good reasons why you need an immutable database.
1 Fast recovery
Humans and machines both interact with databases, and both are capable of corrupting data, slowing performance, or causing downtime. Processes can overwrite data, improper patches can wreak havoc, and those fallible humans are capable of the odd error. Then there are malicious attacks that deliberately destroy data. It’s a cruel world.
An immutable database means you can recover faster to a previous state. All of your data is there, so it’s a case of using the database tools to revert to a happier and more stable database.
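To make the idea concrete, here is a minimal sketch of an append-only store in Python. It is a toy, not how TerminusDB or any other product is implemented, and the names `AppendOnlyStore`, `state_at`, and `restore` are made up for illustration. The point it shows is that "recovering" is just replaying the log up to a known-good version, and that even the recovery is recorded as new entries rather than as an overwrite.

```python
from typing import Any, Dict, List, Optional, Tuple

# Marker appended when a key is "deleted"; earlier values stay in the log.
TOMBSTONE = object()


class AppendOnlyStore:
    """Toy key-value store that only ever appends to its log (hypothetical)."""

    def __init__(self) -> None:
        self._log: List[Tuple[str, Any]] = []

    def put(self, key: str, value: Any) -> int:
        self._log.append((key, value))
        return len(self._log)  # version number after this write

    def delete(self, key: str) -> int:
        return self.put(key, TOMBSTONE)

    def state_at(self, version: Optional[int] = None) -> Dict[str, Any]:
        """Replay the first `version` log entries (all of them by default)."""
        entries = self._log if version is None else self._log[:version]
        state: Dict[str, Any] = {}
        for key, value in entries:
            if value is TOMBSTONE:
                state.pop(key, None)
            else:
                state[key] = value
        return state

    def restore(self, version: int) -> int:
        """Recover by appending the writes that bring us back to `version`."""
        target = self.state_at(version)
        current = self.state_at()
        for key in current.keys() - target.keys():
            self.delete(key)
        for key, value in target.items():
            if current.get(key, TOMBSTONE) != value:
                self.put(key, value)
        return len(self._log)


store = AppendOnlyStore()
store.put("post:1", "Hello world")
good = store.put("post:2", "Immutable databases")  # last known-good version
store.put("post:1", "BUY CHEAP PILLS")             # spammer overwrites a post
bad = store.delete("post:2")                       # ...and deletes another
store.restore(good)
print(store.state_at())     # the happy state is back
print(store.state_at(bad))  # the corrupted state is still there to inspect
```

Because nothing is thrown away, the corrupted version remains available afterwards, which matters for the debugging and auditing points later in this article.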
As an example, many years ago spammers used an SQL injection attack to wreck our CTO Gavin’s WordPress website. He had to spend over four hours cleaning the SQL database by hand. Now imagine if this was a massive enterprise database. We could be talking weeks to sort through the data and get the database stable again.
2 Fast and accurate auditing & regulatory reporting
Who doesn’t love auditing? Why’s it all gone silent? Okay, auditing and regulatory reporting can be a nightmare and can cost large firms hundreds of thousands of dollars a year. An immutable database is beneficial for this type of task as the history of yesterday and beyond is right there in the data.
The state of the database as it actually was at a given point in time can be easily retrieved by traveling back in time and surfacing it for reporting. Compare this to what a mutable database, with ad hoc version tracking of rows, would think was actually there, and you can see where that world can warp and cause problems.
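As a rough illustration of "traveling back in time", the sketch below keeps every change to a single value along with its effective date and answers "what was true on this day?" with a binary search. The class `VersionedValue`, its methods, and the figures are all invented for this example and are not taken from any particular database.

```python
import bisect
from datetime import date
from typing import List, Optional


class VersionedValue:
    """History of one value; each change is appended with its effective date."""

    def __init__(self) -> None:
        self._dates: List[date] = []
        self._values: List[float] = []

    def record(self, effective: date, value: float) -> None:
        if self._dates and effective < self._dates[-1]:
            raise ValueError("history is append-only; dates must not go backwards")
        self._dates.append(effective)
        self._values.append(value)

    def as_of(self, when: date) -> Optional[float]:
        """Return the value in force on `when`, or None if nothing was recorded yet."""
        idx = bisect.bisect_right(self._dates, when)
        return self._values[idx - 1] if idx else None


# Hypothetical figures for a regulatory report.
exposure = VersionedValue()
exposure.record(date(2023, 1, 10), 1_000_000.0)
exposure.record(date(2023, 3, 2), 750_000.0)

print(exposure.as_of(date(2023, 2, 28)))  # what the books said at the end of February
print(exposure.as_of(date(2023, 3, 2)))   # what they said after the March update
```

The report for February is answered from the same data that serves today's queries, so there is no separate audit table to keep in sync.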
3 Better debugging
You have implemented migration processes for database upgrades to mitigate downtime, but programs and scripts applying automatic updates still cause issues. Databases can become complex beasts, and these unplanned outages are part and parcel of a delicately balanced data stack.
You may not want to revert to a previous state, as mentioned in point one. You may want to debug the problem and fix it yourself. With an immutable database, you can see precisely what changed anywhere in the database history.
A great example of where things can go wrong, and where you can’t just restore a backup, is the Atlassian outage. An engineer ran a script with the incorrect flags and inputs and inadvertently deleted 400 customers’ data. Not all customers were impacted, so if Atlassian had rolled back, the unaffected customers would have lost days’ worth of data, making a bad situation worse. Instead, they had to painstakingly find and restore each customer’s data from the backup. With an immutable database, they could have queried what had been deleted and restored those customers in no time, saving a lot of face, time, and money. Heck, they could have even spun it into a positive PR story. For more information about the Atlassian outage, our friends at Dolt wrote this great article.
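The "query what had been deleted" step can be sketched as a diff between two versions of the data. This is a simplified illustration using plain dictionaries, with a hypothetical `diff_states` helper and made-up customer names; a real database would expose its own diff tooling.

```python
from typing import Any, Dict, NamedTuple, Tuple


class Diff(NamedTuple):
    added: Dict[str, Any]
    removed: Dict[str, Any]
    changed: Dict[str, Tuple[Any, Any]]  # key -> (old value, new value)


def diff_states(before: Dict[str, Any], after: Dict[str, Any]) -> Diff:
    """Compare two versions of the data and report exactly what differs."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return Diff(added, removed, changed)


# Version just before the faulty script ran (hypothetical customers).
before = {"acme": {"sites": 3}, "globex": {"sites": 1}, "initech": {"sites": 7}}
# Current version: globex was deleted by the script, while acme and a new
# customer kept working after the incident.
after = {"acme": {"sites": 4}, "initech": {"sites": 7}, "umbrella": {"sites": 2}}

d = diff_states(before, after)
print(sorted(d.removed))  # only the customers the script deleted

# Put back just the deleted records; everyone else's recent work is untouched.
repaired = {**after, **d.removed}
```

The targeted repair keeps the unaffected customers' newer data, which is exactly what a blanket rollback to the backup could not do.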
4 Tamperproof protection for fraud & cyber security
Data breaches are a pain. They’re bad for customers, embarrassing, and a nightmare for your engineers. With mutable databases, anyone with access can update your data and the likely outcome is that nobody would notice. It’s like writing a check in pencil.
Because no data is deleted and every update is appended to an immutable database, it’s easy to create log streams to show who changed what and when. This makes fraud and tamper detection a lot easier and can help to protect your data from unwanted manipulation.
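One common way to make such a log tamper-evident is to chain the entries together with hashes, so that editing an old entry breaks every hash after it. The sketch below is an assumption on our part about how you might build this yourself, not a description of any specific product, and the helper names `append_entry` and `verify` are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # stand-in "previous hash" for the very first entry


def _digest(prev_hash: str, body: Dict[str, Any]) -> str:
    """Hash an entry's content together with the hash of the entry before it."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append_entry(log: List[Dict[str, Any]], who: str, what: str, when: str) -> None:
    """Append a who/what/when record linked to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"who": who, "what": what, "when": when}
    log.append({**body, "hash": _digest(prev_hash, body)})


def verify(log: List[Dict[str, Any]]) -> bool:
    """Recompute the chain; any edited entry makes verification fail."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: entry[k] for k in ("who", "what", "when")}
        if entry["hash"] != _digest(prev_hash, body):
            return False
        prev_hash = entry["hash"]
    return True


audit_log: List[Dict[str, Any]] = []
append_entry(audit_log, "alice", "set invoice 17 amount to 100", "2023-05-01T09:00:00Z")
append_entry(audit_log, "bob", "approved invoice 17", "2023-05-01T09:05:00Z")
print(verify(audit_log))  # the untouched log checks out

audit_log[0]["what"] = "set invoice 17 amount to 9000"  # someone edits history
print(verify(audit_log))  # the edit is detected
```

A hash chain only detects tampering. Preventing it still depends on access controls and on keeping a trusted copy of the latest hash somewhere the attacker cannot reach.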
5 Improved database performance
With an immutable database, the only shared mutable state is the ‘label’ of the database, the pointer to its latest version. This means we can run multiple reader and writer processes over the database concurrently, and still edit the database at the same time, without a major impact on the database’s performance.
For example, our Terminators frequently use the command-line tool while the TerminusDB server is running to perform tasks, run tests, and query the database without impacting its performance.
Improved performance means that the applications, analysis, and services relying on the database can operate concurrently to improve productivity.
6 Work collaboratively with your team
Another benefit of immutability is that nodes can communicate using push, pull, and clone, because they only need to tell each other what has changed rather than send the whole database state.
The ability to communicate with the database in this way means multiple people can work on specific tasks concurrently and collaboratively, enabling your data team to work with the same agility as software developers.
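Here is a deliberately simplified sketch of why an append-only history makes push, pull, and clone cheap: each node only has to send the commits the other side is missing. The `Node` class is hypothetical and handles only the fast-forward case. Real systems also have to deal with diverging histories and merges.

```python
from typing import List, Optional, Tuple

Commit = Tuple[str, str]  # (author, description of the change)


class Node:
    """A copy of the database, represented as an append-only list of commits."""

    def __init__(self, commits: Optional[List[Commit]] = None) -> None:
        self.commits: List[Commit] = list(commits or [])

    def clone(self) -> "Node":
        return Node(self.commits)

    def commit(self, author: str, change: str) -> None:
        self.commits.append((author, change))

    def missing_from(self, other: "Node") -> List[Commit]:
        """Commits we have that `other` lacks (fast-forward case only)."""
        shared = min(len(self.commits), len(other.commits))
        if self.commits[:shared] != other.commits[:shared]:
            raise ValueError("histories have diverged; a merge would be needed")
        return self.commits[len(other.commits):]

    def push(self, remote: "Node") -> int:
        delta = self.missing_from(remote)
        remote.commits.extend(delta)
        return len(delta)  # number of commits actually sent

    def pull(self, remote: "Node") -> int:
        delta = remote.missing_from(self)
        self.commits.extend(delta)
        return len(delta)  # number of commits actually received


hub = Node()
hub.commit("gavin", "initial schema")

alice = hub.clone()
bob = hub.clone()

alice.commit("alice", "add customer documents")
sent = alice.push(hub)      # only Alice's new commit travels to the hub
received = bob.pull(hub)    # Bob fetches just what he is missing
print(sent, received, bob.commits == hub.commits)
```

Since old commits never change, a node can tell what the other side already has from the length of the shared history, without comparing the full database state.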
7 Broader data for analytics
Although we cannot predict the future, we can use the past as a guide. Most decent-sized organizations use some form of analytics to make business decisions. Data scientists and analysts use the data available to them to train ML models that make decisions as accurately as possible. Immutable databases give analytics teams a broader picture of the data by providing a complete history, allowing them to run queries that show changes over time.
8 Protect build and deployment processes
Storing CI/CD recipes in an immutable database helps to protect build and deployment processes. The build process shouldn’t be slowed down by these checks, so being able to validate process integrity quickly, by looking up historic data by pipeline ID or digital asset checksum, means CI/CD processes can function efficiently and effectively.
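As a sketch of what looking up history "by pipeline ID or digital asset checksum" could look like, the example below keeps an append-only ledger of build artifacts and their SHA-256 checksums. The `BuildLedger` class, the pipeline IDs, and the artifact contents are all invented for illustration.

```python
import hashlib
from typing import List, Tuple


class BuildLedger:
    """Append-only record of which pipeline run produced which artifact."""

    def __init__(self) -> None:
        self._records: List[Tuple[str, str]] = []  # (pipeline_id, sha256 hex)

    def record(self, pipeline_id: str, artifact: bytes) -> str:
        checksum = hashlib.sha256(artifact).hexdigest()
        self._records.append((pipeline_id, checksum))
        return checksum

    def history(self, pipeline_id: str) -> List[str]:
        """All checksums ever recorded for a pipeline, oldest first."""
        return [c for pid, c in self._records if pid == pipeline_id]

    def find_by_checksum(self, checksum: str) -> List[str]:
        """Pipeline IDs that produced an artifact with this checksum."""
        return [pid for pid, c in self._records if c == checksum]

    def is_known(self, artifact: bytes) -> bool:
        """True if this exact artifact was produced by a recorded pipeline run."""
        checksum = hashlib.sha256(artifact).hexdigest()
        return bool(self.find_by_checksum(checksum))


ledger = BuildLedger()
ledger.record("pipeline-101", b"app-v1.0 binary")
ledger.record("pipeline-102", b"app-v1.1 binary")

# Before deploying, check that the artifact matches something we really built.
print(ledger.is_known(b"app-v1.0 binary"))
print(ledger.is_known(b"app-v1.0 binary with backdoor"))
```

Because entries are never rewritten, an attacker who swaps an artifact cannot quietly update the recorded checksum to match it.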
9 Centralize digital records
Sometimes, when dealing with a record, it makes sense to keep a history of it so that those accessing the data can get more context. A good example of this is employee records, with details such as bonuses, vacation entitlement, and salary. If these are stored in a mutable database, HR or management get a single, current view of that person. An immutable view of that person means they can see when an employee last received a pay rise, by how much, and whether they’re deserving of a bumper payday.
Of course, it’s not all positive. If you’re never going to delete anything from your database, storage costs will be higher, but with low storage prices, this isn’t really too big a deal. We believe that the benefits of immutable databases far outweigh the negatives for a wide range of practical uses.
If you’re interested in immutable databases, check out TerminusDB. It’s an open-source document graph database combining the best of document stores and graph databases. It provides ease of development using JSON documents, with the benefits of leveraging the relationships of the graph into applications and services. Check out some of our TerminusDB use cases here.