After reading the chapter "How Hashes Scale From One To One Million Elements" from the book "Ruby Under A Microscope" (this post is no substitute for reading that chapter), I decided to play around with the subject, run the same kind of test in other languages, and do a small analysis of how each performs. The chosen languages were Perl, Ruby, Python and JavaScript (node.js).
The tests measure the time, in milliseconds, taken to retrieve an element 10,000 times from a hash with N elements, and are based on the tests performed in the book. They were run in the following environment:
- Ubuntu Server 12.04 LTS
- Instance type micro
- Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
- 604,364 KB of memory
- 64-bit platform
Test description
A program was written for each of the languages under test, and the programs were kept as similar to each other as possible. Each one creates hashes whose sizes are powers of 2, with exponents ranging from 1 to 20.
For each hash created, the value for one key (key=target_index) is fetched 10,000 times, and the elapsed time is measured in milliseconds.
The measured values are written to an output file, which is later consumed by a program that generates the graphs.
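The procedure described above can be sketched roughly as follows in Python. This is a minimal illustration and not the actual script from the repository: the function names, the choice of integer keys, and the use of the middle key as the target are all assumptions made for the example.

```python
import time


def build_hash(size):
    # Assumption: keys are the integers 0..size-1; the values are arbitrary.
    return {i: i for i in range(size)}


def time_lookups(table, target_key, repetitions=10000):
    """Return the elapsed time in milliseconds for `repetitions` lookups."""
    start = time.perf_counter()
    for _ in range(repetitions):
        table[target_key]  # the lookup being measured; result is discarded
    return (time.perf_counter() - start) * 1000.0


def run_experiment(max_exponent=20, repetitions=10000):
    """Time lookups on hashes of size 2**1 .. 2**max_exponent."""
    results = []
    for exponent in range(1, max_exponent + 1):
        size = 2 ** exponent
        table = build_hash(size)
        target_key = size // 2  # hypothetical choice of target key
        results.append((size, time_lookups(table, target_key, repetitions)))
    return results


if __name__ == "__main__":
    for size, elapsed_ms in run_experiment():
        print("%d %.3f" % (size, elapsed_ms))
```

The loop overhead is included in the measurement, which is acceptable here because it is the same for every hash size; what matters is how the time changes as N grows.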
Comparing Ruby Versions
The graphs show a clear improvement from Ruby 1.8 to Ruby 1.9, so 1.9 is the way to go.
Time taken to retrieve 10,000 values (ms)
Comparing Perl Versions
Perl has been around for many years, and its performance has been stable at least since version 5.8, as the results show. Still, I had hoped for more commitment from the community to improving these numbers.
Time taken to retrieve 10,000 values (ms)
Comparing Python Versions
The Python versions tested show little variation, so performance is stable across versions. However, Python 3.2.3 shows a slight degradation.
I would also say that its performance is quite interesting.
Time taken to retrieve 10,000 values (ms)
Comparing Languages
Python was the winner among the interpreted languages. node.js also performs very well; however, the timing library used in the node.js script is not accurate enough to give reliable values.
You can easily run the tests in your own environment using my scripts; the source code used in these tests can be found on Github. I used rvm, perlbrew and pythonbrew to switch between versions, and you can do the same if you want. Otherwise, you can simply use your installed versions.
Run the tests:
$ ./experiment1.rb
$ ./experiment1.pl
$ ./experiment1.py
Running each of these scripts creates an output file named "values.#{lang}-#{version}" (e.g. values.perl-5.8.8).
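As an illustration of that naming scheme, here is a small Python sketch that builds such a file name and writes the measurements to it. The helper names and the "size elapsed_ms" line layout are assumptions made for this example; the real scripts may format their output differently.

```python
import platform


def output_filename(lang="python", version=None):
    # Follows the "values.#{lang}-#{version}" pattern described above.
    if version is None:
        version = platform.python_version()  # version of the running interpreter
    return "values.%s-%s" % (lang, version)


def write_results(results, path):
    # Assumed layout: one "size elapsed_ms" pair per line.
    with open(path, "w") as handle:
        for size, elapsed_ms in results:
            handle.write("%d %.3f\n" % (size, elapsed_ms))
```

Deriving the version from the running interpreter means that switching versions with a tool such as pythonbrew produces a separate output file for each one.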
To generate the graph, just run the Ruby script (requires the googlecharts gem):
$ ./chart.rb
This script prints the link to the generated Google Charts image.