8 reasons to version control your database


Version control is synonymous with software development, and most people will automatically think of Git and GitHub. The superpowers it has given software developers have made application builds quicker, more efficient, and more collaborative.

The database world has been slow to follow, but it is getting there. TerminusDB is one database with version control features. There are others, like Dolt, PlanetScale, and Liquibase, that extend the functionality of other databases.

Revision control for databases is a no-brainer. We’ve listed eight beneficial reasons to version control your database. But before we do that, here’s a brief overview of how TerminusDB does version control.

TerminusDB is an immutable graph database that creates a graph from JSON documents. It features collaboration and version control tools that enable you to branch and query data. These tools enable collaborative builds and the creation of applications designed to let many users collaborate asynchronously. Read here for more information about how TerminusDB does version control.

1. Build collaborative applications

Collaborative applications have changed the way people consume and use applications. User-generated content drives a lot of these. Look at Spotify as a prime example. Contributors provide songs, podcasts, and audiobooks. All of these collaborators give consumers a rich application where they can consume just about anything ever released.

Collaborative applications can take many forms. Perhaps you’re creating a knowledge base that captures specialist knowledge from a diverse range of people, or maybe it’s a subscription service for erotic stories. Whatever the use case, version control is a way to embed business logic into the database so that collaborators can work together to add content to your applications.

With version control, you can establish workflows with levels of authority and approval, so that contributions meet a set standard. Because TerminusDB is immutable, it can also store every data entry, whether the application uses that data or not. This is useful for resolving conflicts or when a collaborator wants to revert to a previous version of their work.

The immutable nature of TerminusDB, and of the commit-graph in particular, makes version control features like branch, squash, reset, and rebase possible. It also provides other useful functions for delivering collaborative applications. Diff and patch let users, either directly or via a user interface, compare JSON documents and decide what the data should look like. This is useful when people work on the same assets and conflicts occur.
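To make diff and patch a little more concrete, here is a minimal Python sketch of the idea on JSON-like documents. It is not TerminusDB’s API: the diff and patch functions, the MISSING sentinel, and the sample documents are hypothetical names invented for illustration, and the sketch only compares top-level keys.

```python
MISSING = object()  # hypothetical sentinel meaning "this key is absent"


def diff(before: dict, after: dict) -> dict:
    """Return {key: (old, new)} for every top-level key whose value differs."""
    changes = {}
    for key in before.keys() | after.keys():
        old = before.get(key, MISSING)
        new = after.get(key, MISSING)
        if old != new:
            changes[key] = (old, new)
    return changes


def patch(doc: dict, changes: dict) -> dict:
    """Apply a diff to a copy of doc, refusing if doc does not match the old side."""
    result = dict(doc)
    for key, (old, new) in changes.items():
        if result.get(key, MISSING) != old:
            # Someone else changed this key in the meantime: surface the conflict.
            raise ValueError(f"conflict on {key!r}")
        if new is MISSING:
            result.pop(key, None)
        else:
            result[key] = new
    return result


original = {"title": "Intro", "status": "draft"}
edited = {"title": "Introduction", "status": "draft", "reviewer": "sam"}

changes = diff(original, edited)
assert patch(original, changes) == edited
```

Because patch checks the old value before writing the new one, applying the same changes to a document that somebody else has already edited raises an error instead of silently overwriting their work. That check is where a user interface could step in and ask a person to decide.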

If you’re interested in developing collaborative applications, have a read of our article highlighting the reasons why TerminusDB is the right data toolkit for building collaborative apps.

2. Build collaboratively with version control

Where Git has revolutionized software development, adding version control features to your database provides backend and full-stack developers access to similar tools to build collaboratively.

The ability to create database branches means that changes to schema and the data itself can be achieved quickly and efficiently without impacting day-to-day operations. Workflow pipelines that help developers work together to review, discuss, and approve updates also help to remove silos and encourage people to work together to use their collective skills and experience to build better. 

Version control is an enabler of collaboration. If you look at the DataOps movement that is focused on speed, quality, collaboration, and measurement, you have to ask the question, is this all possible without revision control for your database? 

You can read more about our thoughts on collaboration and version control here, and the relationship between DataOps and collaboration here.

[Image: Version control for data]

3. Establish and maintain good data governance

Cheuk, a former data scientist and our DevRel Associate, recently wrote about the importance of version control for better data governance. In the article, Cheuk described her experience of ending up with multiple versions of data sitting around, with no good explanation of which one was meant to be used for a given experiment.

The way we do version control means that we can add metadata to both branches and commits to provide a narrative about the data, giving users context and, with it, understanding.

The point of data governance is to ensure that data remains accessible, accurate, and secure so that it is available and useful to other users. Version control helps you achieve better data governance not only by adding that narrative, but also by removing the temptation for renegade developers to make changes in isolation that can impact applications and services.

4. Surface data for regulatory compliance

Another benefit of version control features is the ability to satisfy audits and regulatory compliance faster. With many regulations across regions, these tasks can become a logistical nightmare that gives even the most courageous professionals night sweats.

As an immutable database, TerminusDB stores everything. The commit-graph timestamps every change and records what data changed and who changed it. You can then travel back in time to see what the data looked like at any particular moment.
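The idea of an append-only history that you can query at a point in time can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how TerminusDB is implemented: the Commit and History classes, the state_at and who_changed methods, and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Commit:
    author: str
    message: str
    timestamp: datetime
    changes: dict  # key -> new value; None means "deleted" (assumed format)


class History:
    """Append-only list of commits; earlier commits are never overwritten."""

    def __init__(self):
        self._commits = []

    def commit(self, author, message, timestamp, changes):
        if self._commits and timestamp < self._commits[-1].timestamp:
            raise ValueError("commits must be appended in time order")
        self._commits.append(Commit(author, message, timestamp, dict(changes)))

    def state_at(self, moment):
        """Replay every commit made at or before `moment` to rebuild the data."""
        state = {}
        for c in self._commits:
            if c.timestamp > moment:
                break
            for key, value in c.changes.items():
                if value is None:
                    state.pop(key, None)
                else:
                    state[key] = value
        return state

    def who_changed(self, key):
        """List (timestamp, author, message) for each commit that touched `key`."""
        return [(c.timestamp, c.author, c.message)
                for c in self._commits if key in c.changes]


history = History()
jan = datetime(2023, 1, 10, tzinfo=timezone.utc)
feb = datetime(2023, 2, 1, tzinfo=timezone.utc)
mar = datetime(2023, 3, 5, tzinfo=timezone.utc)

history.commit("alice", "Add customer record", jan, {"customer/1": {"risk": "low"}})
history.commit("bob", "Raise risk rating", mar, {"customer/1": {"risk": "high"}})

# What did the data look like on 1 February, before Bob's change?
assert history.state_at(feb) == {"customer/1": {"risk": "low"}}
```

An auditor’s two usual questions, "what did this look like on a given date?" and "who changed it?", map onto state_at and who_changed here. Neither needs a backup or a separate audit table because nothing in the history is ever overwritten.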

Branching the database also means that data can be prepared for regulators, giving them everything they need to satisfy their particular requirements. There’s no need to touch anything in production. All of the commit histories are available from the branch, so auditors can enjoy the rich history of your data. You can use this branch to develop UIs or export to CSV to make audits far easier.

[Image: Commit graph diagram for Regtech]

Have a read about using TerminusDB to build a headless regulatory system.

Headless architecture doesn’t stop at regulatory systems. Organizations can structure all of their data and use it as a headless backend or content management system across various use cases, from websites to customer portals.

5. Development audit from inception

Speaking of audits, they’re not always something to fear. Auditing your builds is important to ensure data quality, accuracy, and security. It can also help later with license and regulatory compliance.

In many application development projects, the audit comes later, mainly because of heavy workloads and tight deadlines, so fine-tuned auditing only happens after much work has gone into the build. But is delaying your application audits causing harm in the long run?

In many app builds, the requirements can be in flux as the application takes shape and users start experimenting and subsequently finding things they’re not quite happy with. By building audit capabilities into applications from inception, application needs and development can proceed along parallel lines to give you a greater chance of improving the quality of the delivered product.

This is what version control gives you. For every piece of development, you have a branch with workflow pipelines. With every commit, you get a snapshot of what the backend looks like, so with appropriate metadata, the lineage and build of your application are documented.

6. Feature development with safety controls

Nothing stands still. We’re constantly in flux. As such, applications need to move with the times. Adding features and enhancements is all part of providing the best service to customers.

Version control features such as branch, squash, reset, and rebase, together with established workflows, allow development teams to build in safety. If something goes wrong, you can revert to the working version.

The ability to thoroughly test on a branch also helps to ensure that when features are pushed, quirks are ironed out and disruption is avoided.
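As a rough illustration of why branch and reset make feature work safer, here is a toy Python model in which a branch is just a name pointing at a commit. The Repo class and its methods are hypothetical and are not TerminusDB’s API. For simplicity, each commit stores a full snapshot, which a real system would be unlikely to do.

```python
class Repo:
    """Toy repository: commits are immutable, branches are movable pointers."""

    def __init__(self):
        self.commits = {}              # commit id -> (parent id, snapshot)
        self.branches = {"main": None}
        self._next_id = 0

    def commit(self, branch, snapshot):
        """Store a new snapshot on top of `branch` and move the branch to it."""
        commit_id = self._next_id
        self._next_id += 1
        self.commits[commit_id] = (self.branches[branch], dict(snapshot))
        self.branches[branch] = commit_id
        return commit_id

    def branch(self, new_name, from_branch="main"):
        """Create a branch that starts at the current head of `from_branch`."""
        self.branches[new_name] = self.branches[from_branch]

    def reset(self, branch, commit_id):
        """Point `branch` back at an earlier commit; no commit is deleted."""
        if commit_id not in self.commits:
            raise KeyError(commit_id)
        self.branches[branch] = commit_id

    def head(self, branch):
        """Return a copy of the data at the tip of `branch`."""
        commit_id = self.branches[branch]
        return {} if commit_id is None else dict(self.commits[commit_id][1])


repo = Repo()
v1 = repo.commit("main", {"price": 10})

# Feature work happens on its own branch and does not touch main.
repo.branch("feature")
repo.commit("feature", {"price": 10, "discount": 0.2})
assert repo.head("main") == {"price": 10}

# A bad change lands on main, then main is reset to the last good commit.
bad = repo.commit("main", {"price": -1})
repo.reset("main", v1)
assert repo.head("main") == {"price": 10}
```

Resetting only moves the branch pointer, so the faulty commit is still stored and can be inspected afterwards to work out what went wrong.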

7. Database disaster recovery (from humans)

[Image: Human error causing lost data]

Do you worry about earthquakes and volcanoes destroying your data? Of course not. You’ve got disaster recovery plans to ensure the availability of your data. What’s more dangerous than an earthquake? Humans, of course!

We could leave it there, but we are talking specifically about your database, not humans in general. Human error leads to lost or inaccurate data, and even with database disaster recovery plans, some of this data can be lost and time wasted.

Version control is a way for you to protect yourself from these pesky humans and their fallibilities. With the commit history, you don’t need to revert to a backup to restore the database. Instead, you can travel to the time of the error-prone commit and fix the issue. Our web architect Francesca wrote a piece talking about database disaster recovery in more detail that you can read here.

8. Academic & research - start with a base and branch forward

Quite often, academic research involves graph databases as they’re perfect for mapping relationships and getting a semblance of meaning from data. Experiments are repetitive and require a base to reference a zillion tests against. In the research world, a lot of time is spent rebuilding a knowledge graph. With a version control database, specifically time-travel and branching, academics and researchers can save bags of time by traveling back to a specific point and beginning new experiments from there.

If you’re interested in having a play with TerminusDB/X, getting started is easy. Either install TerminusDB or sign up to TerminusX and get started in minutes. Join us on Discord for troubleshooting or to discuss your ideas and seek inspiration.