Running analytics using Infobright
Sid
05 Mar 2010 16:10
Last month I went to an interesting seminar given by MySQL where one of the presentations was by Infobright. It’s an open source analytics solution and their sales guy was plausible enough to get me to download it.
I’m currently using it for some analytics/Business Intelligence testing, running it against historical data from a data warehouse we’ve built. I’m asking it things that I know relational databases struggle with, or at least need a lot of tuning to handle nicely (e.g. top-n queries to show the most valuable customers).
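To give a flavour of what I mean by a top-n query, here is a minimal sketch. It is not one of our real test queries: the `orders` table, its `customer_id` and `amount` columns, and the sample rows are all made up for illustration, and it uses Python's built-in `sqlite3` as a stand-in so it runs anywhere. The SQL itself is plain enough that something very similar should work against a MySQL-based engine.

```python
import sqlite3


def top_customers(conn, n):
    """Return the n customers with the highest total spend."""
    return conn.execute(
        """
        SELECT customer_id, SUM(amount) AS total_spend
        FROM orders
        GROUP BY customer_id
        ORDER BY total_spend DESC, customer_id
        LIMIT ?
        """,
        (n,),
    ).fetchall()


# Hypothetical sample data, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 120.0), (2, 40.0), (1, 30.0), (3, 200.0), (2, 15.0)],
)

print(top_customers(conn, 2))
```

The query has to aggregate every order before it can rank anything, which is why this kind of question gets expensive on a large fact table.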
At the moment the performance improvements are pretty significant (15x quicker) with only a month’s worth of data. I’m hoping that once we load a quarter’s worth, a year, or several years, the improvements will ramp up even more.
The nice thing about it is that it’s built on MySQL so you can run regular SQL against it and it does all the clever stuff.
The weird thing about it from a relational viewpoint is that there are no indexes or anything like that. It’s a column-oriented database and hence great for group and aggregate functions – the mainstay of analytics.
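As a rough illustration of why a column layout suits aggregates, here is a toy sketch in Python. It says nothing about how Infobright actually stores data internally; the field names and values are invented. The point is only that a column-wise aggregate reads the two columns it needs and never touches the rest.

```python
from collections import defaultdict

# Row-oriented layout: every field of a record is stored together.
rows = [
    {"customer_id": 1, "region": "north", "amount": 120.0},
    {"customer_id": 2, "region": "south", "amount": 40.0},
    {"customer_id": 1, "region": "north", "amount": 30.0},
    {"customer_id": 3, "region": "south", "amount": 200.0},
]

# Column-oriented layout: each column is stored as its own list.
columns = {
    "customer_id": [1, 2, 1, 3],
    "region": ["north", "south", "north", "south"],
    "amount": [120.0, 40.0, 30.0, 200.0],
}


def total_by_region_rows(rows):
    """Sum amount per region, walking whole records one at a time."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


def total_by_region_columns(columns):
    """Sum amount per region, reading only the two columns involved."""
    totals = defaultdict(float)
    for region, amount in zip(columns["region"], columns["amount"]):
        totals[region] += amount
    return dict(totals)


print(total_by_region_rows(rows))
print(total_by_region_columns(columns))
```

Both functions give the same answer. On a wide table with dozens of columns, the second one would skip most of the data.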
Once I’ve done a bit more formal testing I’ll publish the queries we used, along with a comparison against the traditional relational database we currently run on.
Quick update:
Just finished the first round of testing with around 625,000 rows of data, which represents about 1/3 of a year. Infobright was faster on every query, by anywhere from 15x to 30x.

