Creating a database with CSVs using TerminusDB console

Hello everybody! I am going to write a tutorial on how to work with CSVs using TerminusDB. That’s right, TerminusDB 4.0 now has the feature in which you can import and export CSVs, Let’s dive into this.

How to create a database with CSVs

  1. Open up the TerminusDB console and create a database
  2. Fill in the Id, Name and description of your database
  3. You will find a button Create Database from CSVs create database with csvs
  4. Click on this button to load which ever CSVs you wish to import to your database, In this blog I am going to use The Bike Share Data
  5. I have added 2 CSVs into my database as shown below different csvs loaded into db
  6. And that’s it, go ahead and click on Create New database, which includes the 2 CSVs in it 😊😊😊😊😊

What happens when you import CSVs into your Database?

On creating our database, Go to the Documents Page

document page view

Each of our CSV files that were loaded on create becomes a separate Document Type. All the columns in the CSV file have been converted into properties and a schema is auto-generated.

Each CSV is of doctype CSV and has rows hanging off them, we can use these auto generated column properties to query the database.

You can also add more CSVs once you have created your database by clicking the Add CSVs

Go to the Schema Page to view the property types which have been generated off the columns

comaprison between excel and loaded csv

Let’s Query the Database

Get CSV Id’s, Query for type scm:CSV

and (
    triple('v:CSV ID', 'type','scm:CSV'),
    triple('v:CSV ID', 'label', 'v:CSV Name')
)

query database to get csv Id

This query simply gives you the ID and name of available CSVs in the Database. Take a note of the CSV Id you’re interested in, also visit the Schema page to get the Property Id’s your interested in as well.

Open a new Query Pane and try the below query

const id="doc:CSV_bike_tutorial-1.csv"

WOQL.and (
  .limit(1).triple('v:CSV ID', 'type', 'scm:CSV')
    .eq('v:CSV ID', id)
    .triple('v:CSV ID', 'scm:csv_row', 'v:CSV Row')
    .triple('v:CSV Row', 'type', 'v:Row Type'),
  .quad('v:Property', 'domain', 'v:Row Type', 'schema/main')
    .quad('v:Property', 'label', 'v:Property Name' ,'schema/main')
)

This query gets all the property Id and name from the schema graph.

Next, let’s query to display a few columns of our CSV, I plan to display Start Station Number, Start Station, End Station Number, End Station and Duration

const id="doc:CSV_bike_tutorial-1.csv"

const startStationNumber = "scm:column_Start%20station%20number"
const StartStation = "scm:column_Start%20station"
const endStationNumber = "scm:column_End%20station%20number"
const endStation = "scm:column_End%20station"
const duration = "scm:column_Duration"

and (
	triple('v:CSV ID', 'type','scm:CSV').eq('v:CSV ID', id)
      .triple('v:CSV ID', 'scm:csv_row', 'v:CSV Row')
      .triple('v:CSV Row', startStationNumber, 'v:Start Station Number')
      .triple('v:CSV Row', StartStation, 'v:Start Station')
      .triple('v:CSV Row', endStationNumber, 'v:End Station Number')
      .triple('v:CSV Row', endStation, 'v:End Station')
      .triple('v:CSV Row', duration, 'v:Duration')
)

All the rows of CSV matching the column names will be displayed on running this query

So yeah that’s it, you can do all sorts of queries on your CSV

How to update CSV

I am going to show an example in which I change the duration of Bike number W22558 to 500 in my CSV file and I am going to update this change. So we go to Document Page and click on Add CSVs and add the CSV of interest

update csv

Since bike_tutorial-1.csv already exists, console will automatically prompt you to update the CSV. Input a commit message of what we have changed and hit on Upload CSV.

The brilliant part about the update is that only diffs are considered and updated which makes update of big sized files more efficient. To understand this, Go to DB Home Page -> Latest Updates

DB Home latest updates

Click on your recent update, The only change on update was the row for bike W22558 of duration 417 was removed and a new row for bike W22558 with duration 500 was added instead. This makes updates of CSVs much more quick 😊😊😊😊😊

comparison between commits

You can also time travel on the database to see the diffs on CSV files over time.

time travel on data

Export your CSV

Now that we have imported our CSVs into our database we can do whatever we like or modify it however we want and Export this CSV back.

Go to Documents, and filter for all CSV types to get a download option as shown and download the CSV.

export csv from document page

So here we go, this tutorial shows a very simple way to import CSVs into TerminusDB and play with them and do all sorts of useful cool queries, whatever you like.

Earlier we had to do all the steps mentioned in TerminusDB: Importing, Reordering & Exporting a CSV not anymore!

Now it’s just a few button clicks and voilà!


A computer engineer, keen to learn :)