I’ve been doing some work lately with Aerospike, the super-fast in-memory database. See previous blog posts here about the speed of this product. Since I started working with Aerospike, the team there has announced that their core product is now open source.
In this blog post, I’ll be covering how to get started developing with Aerospike. There are a couple of considerations as you begin. The first consideration is where you want to host your Aerospike cluster (which can be a single node for initial testing). Aerospike itself runs only on Linux at this time, i.e., not on Windows or Mac. So you have two options for hosting the server – either in the cloud or using virtualization software on your local development machine. I did extensive testing on both methods and prefer to use a cloud-hosted instance at this time. I will cover both approaches in this blog post.
The second consideration is which client language/library you prefer to use. Because there is a good amount of information already available on using Java with Aerospike, I will cover using both Python and .NET (C#) here. As of this writing, Aerospike has client libraries for C/C++, Java, C#, Node.js, C libevent, PHP, Erlang, Python and Perl.
So let’s get started….
Installing Aerospike on the Cloud
I’ve tested two configurations. The first is Google Cloud using Google Compute Engine (I documented the install steps for Aerospike when using this method in a previous blog post – here); the Google Developers Console with a GCE instance hosting Aerospike is shown below. You’ll note that I used an ‘n1-standard-2’ machine type for my testing – 2 CPUs and 7.5 GB RAM. If you are new to the Google Cloud, you can use the code ‘gde-in’ at this link to get $500 usage credit as well.
I also tested using an Amazon Machine Image (AMI) on the AWS EC2 service, linked here. This AMI has Aerospike pre-installed. To use it, simply spin up the linked image (shown below in the screenshot) via AWS. Both the Google and AWS methods are simple and quick for initial testing.
Tip: Be sure to include a firewall rule to open port 3000 for testing your client connectivity for either cloud configuration.
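Once the firewall rule is in place, a plain TCP connection test is a quick way to confirm the port is reachable from your development machine. The sketch below uses only Python’s standard library. The function name `port_is_open` is my own and is not part of any Aerospike tooling. Note that it only shows that something is listening on the port, not that the listener is Aerospike.

```python
import socket


def port_is_open(host, port=3000, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling `port_is_open()` with the external IP address of your hosted instance should return True when the rule is working. A False result usually points to the firewall rule, the wrong IP address, or a server that is not running.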
Installing Aerospike locally using VirtualBox
If you prefer to install Aerospike locally, you can use the instructions found on Aerospike’s web site to do so. They have instructions and links to Vagrant files (wrappers) for Oracle’s VirtualBox so that you can quickly download and start an Aerospike image. Because you are using virtualization technology to host Aerospike itself, you can install this on a local machine running any OS.
If you choose this route, be sure to follow the instructions exactly as listed on the Aerospike site, as there are a number of configuration steps and each must be done in the order listed. I tested both the Mac and Windows instructions. There are 9 install steps for each type; the Mac install steps are shown below.
Part One of install instructions for Mac
Part Two of install instructions for Mac
If you are attempting to install on a Windows machine, be sure to verify that your installation of VirtualBox is able to use hardware virtualization (VT-x or AMD-V; look in the Settings tab), as some Windows machines may need BIOS settings updated in order for this to be possible.
Using Client Libraries with Aerospike
After you’ve set up and tested your Aerospike server, the next step is to select your client library. First I’ll cover using the Python client library.
The instructions provided by Aerospike worked just fine for me, with one small exception as I tested on my Mac. The exception is for the last command (‘sudo pip install aerospike’) – if you do not have ‘pip’ installed on your Mac, then just run ‘sudo easy_install pip’ from Terminal to install it.
On the first page of the Aerospike Python client manual there is a complete sample Python file that you can use to quickly test your client connectivity by inserting a record via the put() method. You can see this file in Sublime in my environment in the screenshot below.
This sample is designed to connect to a local installation of Aerospike. If you are using a cloud-hosted installation, then just change the IP address (shown in line 6) to the external IP address for your hosted instance. Also in the config (line 6) is the default port of 3000.
You may also notice that Aerospike records are addressed via the pattern of namespace, set and key (shown on line 13). You call the put() method (line 16) to write a record and, optionally, the get(key) method (line 22) to read a record.
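For readers who can’t make out the screenshot, here is a minimal sketch of the same connect, put and get pattern. It is not the sample file from the Aerospike manual. The helper names (`make_config`, `make_key`, `put_then_get`) are my own, and the namespace ‘test’, the set ‘demo’ and the bin values in the usage comment are assumptions that you should adjust to your server’s configuration.

```python
def make_config(host="127.0.0.1", port=3000):
    """Build the client config: a list of (address, port) seed hosts."""
    return {"hosts": [(host, port)]}


def make_key(namespace, set_name, user_key):
    """Records are addressed by a (namespace, set, key) tuple."""
    return (namespace, set_name, user_key)


def put_then_get(config, key, bins):
    """Write one record, read it back, and return the bins that were read."""
    import aerospike  # third-party client, installed via pip

    client = aerospike.client(config).connect()
    try:
        client.put(key, bins)                  # bins is a dict of name/value pairs
        _key, _meta, record = client.get(key)  # get() returns (key, metadata, bins)
        return record
    finally:
        client.close()


# Example usage against a running server (namespace and set are assumptions):
#   key = make_key("test", "demo", "user1")
#   print(put_then_get(make_config("127.0.0.1"), key, {"name": "Ada", "age": 36}))
```

For a cloud-hosted instance, pass the external IP address to `make_config()` instead of the local address.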
I found the easiest way to verify record insertion while testing was to use the web console. A sample of the output (with several test records inserted) is shown below on the Definitions tab of the console. The URL format for the web console for a remote installation (including cloud) is as follows:
The example above shows the first IP address being the external IP address for the remotely hosted instance and the second IP address being the internal IP address for that instance.
If you are using a local instance, then the default URL for the console is shown below:
There are also more complete sample files for working with Aerospike using Python to be found on GitHub – here. Support for Python in Aerospike is relatively new and the team is also asking for your feedback if you use this library.
The C# client library (here) is quite rich and conveniently includes a test harness (a Windows Forms application) that allows you to easily connect and test the Aerospike API. Additionally, the sample code includes a benchmark test harness that I found useful.
I tested the library on a Windows 8 machine with Visual Studio 2012 and it built with no issues. I then connected to an instance (the screenshot shows a local connection, but in reality I connected to a cloud-based instance) and was happy to explore the API via this well-written test harness shown below. You’ll notice that the sample includes code for all types of activity (put, get, append, prepend, batch, etc.) and also that the sample includes code for asynchronous processes.
In addition, the sample includes a benchmarking tool, which makes it simple for me to test and benchmark on various vendor clouds (in this case AWS vs Google Cloud). An example of the benchmark application that is part of the C# sample client is also shown below.
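If you want a rough cross-check from Python, you can time any single client call yourself. The sketch below is not the benchmark harness that ships with the C# client, and it measures one thread only. The function name `time_operation` and the statistics it reports are my own choices. You pass in any zero-argument callable, for example a lambda wrapping a put() call.

```python
import statistics
import time


def time_operation(operation, iterations=1000):
    """Call operation() repeatedly and return latency statistics in milliseconds."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        operation()
        latencies.append((time.perf_counter() - start) * 1000.0)
    latencies.sort()
    # Index of the 99th-percentile sample, clamped to the last element.
    p99_index = min(len(latencies) - 1, int(len(latencies) * 0.99))
    return {
        "count": len(latencies),
        "mean_ms": statistics.mean(latencies),
        "median_ms": statistics.median(latencies),
        "p99_ms": latencies[p99_index],
        "max_ms": latencies[-1],
    }
```

Running the same callable against an instance on each cloud gives a like-for-like latency comparison. Keep in mind that the numbers include network round-trip time from wherever the script runs.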
Just for completeness, I’ll include a screenshot of the C# sample code in Visual Studio. You can see there are two projects, AerospikeClient and AerospikeDemo (the latter is set to be the start-up project). The AerospikeDemo project contains the code for the test harness (Windows Form) shown in the previous screenshot. Shown below is the source file ‘Operation.cs’ from the /Main directory. Here you can get a sense of the core database operations (put, get, etc.), which take Bin objects (each Bin is a column name/value pair).
I’ll close by reminding you why I am so interested in trying out Aerospike. You’ll remember, the core product is now free and open source. Also, for relevant big data scenarios, it has the potential to cost much less than other storage methods that can scale to this speed and size. I have found a graphic from Aerospike’s site to be compelling and accurate.
What’s your experience like? If you try out Aerospike, let me know (in the comments section below) how it goes for you.
Fun addition for those of you who have read this far…