I gave a talk called ‘Bleeding Edge Databases’ at this weekend’s BigDataCampLA. Several attendees asked me to record the talk, so I did (and will link that post below).
Here’s my favorite tweet from the event.
I am regularly invited to preview many new database technologies, so you may be wondering why I chose these three solutions to highlight. First, all three show noticeably better performance in their particular areas, as measured by independent benchmarks. I also take usability into account.
Aerospike – 1 Million TPS (read-only workload) PER SERVER and 40K TPS (read-write) on that same server. Very good integration with client tools and libraries.
AlgebraixData – 1 Billion Triples on ONE NODE and substantially better query response times than any competitor for the core benchmark queries for RDF databases. Their core engine, which is optimized using patented mathematical algorithms, interests me because of the solid performance benchmarks they’ve been able to achieve.
Google BigQuery – 750 Million Rows in 10 seconds, solidly ‘productized’ with usable integration points both in and out, reduced pricing and increased streaming – also nearly ANSI SQL-like queryability.
My screencast includes demos of the first two products using Google Compute Engine (VMs on the Google Cloud). I chose a Linux GCE instance for Aerospike and a Windows GCE instance for AlgebraixData. I prefer to use the Google Cloud for this type of testing for a few reasons:
1) Fast, easy VM spin-up
2) Generous free tier, clear pricing information (i.e. no surprises)
3) As a GDE (Google Developer Expert), I have more usage credits for the Google Cloud than a typical user, so I rarely encounter any fees for quick POC testing
If you are interested in trying out these databases, the instructions on how to get access are in my screencast – enjoy!