Graph Database Performance Benchmark - Neo4j vs TerminusDB

At TerminusDB, we’re committed to pushing the boundaries of graph database performance, and what better way to showcase this commitment than by pitting ourselves against a titan like Neo4j? Our latest venture takes us into the heart of benchmarking, where we’ve harnessed the immense potential of the Wikidata dataset to compare TerminusDB and Neo4j.

The benchmark uses ‘WDBench: A Wikidata Graph Query Benchmark’, which features real-world queries extracted from the public query logs of the Wikidata SPARQL endpoint, letting us test graph databases under realistic conditions. With it, we measured query execution times, checked result accuracy, and evaluated overall completeness on both platforms.

This blog presents a head-to-head comparison of TerminusDB and Neo4j across a spectrum of query scenarios. We will follow it up with a technical blog with an in-depth exploration of graph database benchmarking, delving into graph query optimization, index utilization, and data retrieval techniques. 

Wikidata Graph Query Benchmark 

We record query times, query results, and disk space usage in our TerminusDB versus Neo4j comparison. The benchmark ran 868 successful queries (278 single-hop queries and 590 path queries) on the Wikidata dataset with both databases. Any query that took longer than 60 seconds was counted as a timeout.
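To make the timing rule concrete, here is a minimal sketch of how a per-query timer with a 60-second cutoff could look. It is not the harness used for this benchmark: `run_query` is a hypothetical callable standing in for whatever client sends a query to the database, and the thread-based timeout is just one simple way to enforce the limit.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

TIMEOUT_SECONDS = 60  # cutoff stated in the benchmark description


def time_query(run_query, query, timeout=TIMEOUT_SECONDS):
    """Return (elapsed_ms, result), or (None, None) if the query times out.

    `run_query` is a hypothetical callable (an assumption for this sketch)
    that sends `query` to a database and returns its result.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    start = time.perf_counter()
    future = pool.submit(run_query, query)
    try:
        result = future.result(timeout=timeout)
    except FutureTimeout:
        # The worker thread keeps running in the background; a real harness
        # would also cancel the query on the database side.
        pool.shutdown(wait=False, cancel_futures=True)
        return None, None
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    pool.shutdown(wait=False)
    return elapsed_ms, result
```

A query that returns `(None, None)` would be recorded as a timeout rather than contributing a time.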

We ran the benchmarks using Neo4j 4.4.23 on an AWS r6g.16xlarge instance running Ubuntu 22.04.2 LTS. We used the same machine for the TerminusDB tests.

Disk Usage

We loaded the dataset into TerminusDB, Neo4j, and for the hell of it, MongoDB.

As you can see, TerminusDB is a much more compact database compared to Neo4j, 81% smaller. Incredibly, TerminusDB is also smaller than MongoDB even though it is indexed and provides better search with easy relationships and joins.

Average Query Times

Times are measured in milliseconds, so lower bars mean better performance.

Single-Hop Query – Average Time

Neo4j vs TerminusDB Average Times for Single Hop Queries

Across 278 single-hop graph queries, TerminusDB is 64% faster than Neo4j, clocking in at just over a second at 1,254 ms.

Neo4j suffered two timeouts, whereas TerminusDB had one.
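For readers who want to reproduce this kind of figure from their own timings, the sketch below shows one common way to read "X% faster": the percentage by which one system's mean time is lower than the other's. This post does not spell out the exact formula or how timed-out queries enter the average, so treat both as assumptions. The sample values are made up for illustration and are not benchmark results.

```python
from statistics import mean


def percent_faster(times_a_ms, times_b_ms):
    """Percentage by which A's mean time is lower than B's mean time.

    Assumed definition: (mean_b - mean_a) / mean_b * 100.
    """
    mean_a = mean(times_a_ms)
    mean_b = mean(times_b_ms)
    return (mean_b - mean_a) / mean_b * 100.0


# Made-up timings in milliseconds, for illustration only.
system_a = [100.0, 300.0, 200.0]  # mean is 200 ms
system_b = [400.0, 600.0, 500.0]  # mean is 500 ms
speedup = percent_faster(system_a, system_b)
```

With these made-up numbers, system A's mean is 60% lower than system B's.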

Path Query – Average Time

Across 590 path queries, TerminusDB was again faster than Neo4j, by 51%.

Neo4j had 16 timeouts and TerminusDB had five.

Results Across All Queries

Single-Hop Query

Sorted by Neo4j Time

Neo4j vs TerminusDB - Single Hop Queries Graph Database Benchmark figures

The chart shows Neo4j queries from fastest to slowest and compares TerminusDB speeds for the same queries. Overall, the chart highlights the performance benefits enjoyed by TerminusDB. For legibility, we’ve cropped the results at 30,000 ms.
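As a rough illustration of how such a chart series can be prepared, the sketch below sorts paired timings by one system's time and crops values at 30,000 ms. The record layout (`neo4j` and `terminusdb` keys) is an assumption for this example, not the format of the benchmark output.

```python
CROP_MS = 30_000  # display cutoff mentioned in the text


def chart_series(results, sort_key="neo4j"):
    """Sort paired timings by `sort_key` and crop each value at CROP_MS.

    `results` is a list of dicts with 'neo4j' and 'terminusdb' times in ms
    (a hypothetical layout chosen for this sketch).
    """
    ordered = sorted(results, key=lambda r: r[sort_key])
    return [
        {name: min(r[name], CROP_MS) for name in ("neo4j", "terminusdb")}
        for r in ordered
    ]
```

Passing `sort_key="terminusdb"` gives the same data ordered by TerminusDB time instead.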

Sorted by TerminusDB Time

Neo4j vs TerminusDB - Single Hop Queries Graph Database Benchmark figures Sorted by TerminusDB time from fastest to slowest

This is the same chart but sorted by TerminusDB speed from fastest to slowest.

Here is a breakdown of the overall results for single-hop queries –

We have also broken the query time differences down into time groups. The table below presents the number of single-hop queries for each time group. For example, TerminusDB is 1 to 5 seconds faster for 27 queries, whereas Neo4j is 1 to 5 seconds faster for 12 queries.

The numbers show that TerminusDB is faster for 67% of benchmark queries, compared to Neo4j’s 33%. TerminusDB is also over 1 second quicker for 25% of all queries tested.
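The grouping described above can be sketched as follows. Only the 1 to 5 second group is named in this post; the other bucket boundaries and the `(terminusdb_ms, neo4j_ms)` pair layout are assumptions made for illustration.

```python
from collections import Counter

# Upper bounds in ms. Only the "1 to 5 s" group comes from the text;
# the other boundaries are assumed for this sketch.
BUCKETS = [
    (1_000, "under 1 s"),
    (5_000, "1 to 5 s"),
    (float("inf"), "over 5 s"),
]


def bucket_label(diff_ms):
    """Return the label of the first bucket whose upper bound exceeds diff_ms."""
    for upper, label in BUCKETS:
        if diff_ms < upper:
            return label


def group_differences(pairs):
    """Count queries by (faster system, size of the time difference).

    `pairs` is an iterable of (terminusdb_ms, neo4j_ms) tuples.
    """
    counts = Counter()
    for tdb_ms, neo_ms in pairs:
        if tdb_ms == neo_ms:
            counts[("tie", "0")] += 1
            continue
        winner = "TerminusDB" if tdb_ms < neo_ms else "Neo4j"
        counts[(winner, bucket_label(abs(tdb_ms - neo_ms)))] += 1
    return counts
```

Dividing each system's total count by the number of queries gives the share of queries it wins, which is how a figure like "faster for 67% of queries" can be derived.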

Path Query

Sorted by Neo4j Time

Neo4j vs TerminusDB - Path Queries Graph Database Benchmark figures sorted by Neo4j

The chart shows Neo4j queries from fastest to slowest and compares TerminusDB speeds for the same path queries. Overall, the chart highlights the performance benefits enjoyed by TerminusDB. For legibility, we’ve cropped the results at 30,000 ms.

Sorted by TerminusDB Time

Neo4j vs TerminusDB - Path Queries Graph Database Benchmark figures sorted by TerminusDB time from fastest to slowest

This is the same chart but sorted by TerminusDB speed from fastest to slowest.

Here is a breakdown of the overall results for path queries –

We have also broken the query time differences down into time groups. The table below presents the number of path queries for each time group. For example, TerminusDB is 1 to 5 seconds faster for 194 queries, whereas Neo4j is 1 to 5 seconds faster for 20 queries.

The numbers show that TerminusDB is faster for 67% of benchmark path queries, compared to Neo4j’s 32%. TerminusDB is also over 1 second quicker for 43% of all path queries tested.

Conclusion

TerminusDB compares very favorably against Neo4j. 67% of queries are faster, with large time savings on path queries. TerminusDB keeps versions of everything in extremely compact delta formats (succinct data structures). This sparing use of memory has huge performance benefits, as the benchmark data demonstrates.

We still have more benchmarking to do within WDBench. We will update this blog with the results of the multiple-edge queries when time permits.

In the meantime, we’re putting together a technical blog to explain graph query optimization, index utilization, and data retrieval techniques for this benchmarking project. Sign up for our newsletter, or follow our socials to be notified about its release.