Vertica on a Stick

By cwardell • June 11th, 2009

Can you believe it? A bootable 16 GB thumb-drive with Linux, a GUI, Vertica AND room for data!?

Yes, I admit it. I am a geek. But how cool is this? Imagine developing, prototyping, and presenting a data warehouse without procuring a server, installing Linux, or commandeering your hard drive. Everything you need is on a small bootable USB thumb-drive.

Welcome VStick

 

There were a few challenges in creating the VStick, but the complications were mostly around finding the right Linux distro.

  • I had to find a thumb-drive that could sustain decent read/write rates.
  • Finding a distro that could be installed on the thumb-drive was easy; getting it to work well was the challenge.
  • The distro had to be small enough to fit on a thumb-drive yet leave room for Vertica and some data.
  • I had to make sure that it had all of the features (GUI, text editor, browser, etc.) that would help with the evaluation and development process.
  • I did a first pass at shutting down unneeded services to free up some system resources.
  • I went through a few load and query cycles to test it out.
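For readers who want to check whether their own thumb-drive can sustain decent read/write rates, here is a minimal sketch in Python. It is not the method used for the VStick; the function name `measure_throughput` and its parameters are hypothetical, and the scratch directory should be pointed at the mounted drive. The read figure will usually be optimistic because the operating system may serve the file from its page cache.

```python
import os
import tempfile
import time


def measure_throughput(directory, size_mb=8, block_kb=256):
    """Write then read back a scratch file; return (write_mb_s, read_mb_s).

    Hypothetical helper for illustration only.
    """
    block = os.urandom(block_kb * 1024)
    blocks = (size_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # push data to the device, not just the cache
        write_s = time.perf_counter() - start

        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(block_kb * 1024):
                pass  # discard the data; only the elapsed time matters
        read_s = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the scratch file
    return size_mb / write_s, size_mb / read_s


if __name__ == "__main__":
    # Assumption: replace the temp directory with the drive's mount point.
    w, r = measure_throughput(tempfile.gettempdir())
    print(f"write {w:.1f} MB/s, read {r:.1f} MB/s")
```

A larger `size_mb` gives a steadier number, at the cost of more wear on the flash drive.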


    Initially, I loaded some sample data, and just about all the queries returned with sub-second response times. I am now in the process of cramming as much data onto the stick as I possibly can. I am excited to see what the performance and compression will be like on the larger data set.
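A load-and-query cycle like this can be timed with a few lines of Python. The sketch below is an illustration, not the test actually run on the VStick: it uses the standard-library `sqlite3` module as a stand-in database so that it runs anywhere, and the `sales` table, `build_sample_db`, and `time_queries` are hypothetical names. The timing function only relies on the generic Python DB-API cursor interface, so in principle a connection from another DB-API driver could be passed in instead.

```python
import sqlite3
import time


def time_queries(conn, queries, repeats=3):
    """Return {sql: best elapsed seconds} over `repeats` runs of each query."""
    timings = {}
    cur = conn.cursor()
    for sql in queries:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            cur.execute(sql)
            cur.fetchall()  # include fetching the result in the measurement
            best = min(best, time.perf_counter() - start)
        timings[sql] = best
    return timings


def build_sample_db(rows=10000):
    """Hypothetical sample table; SQLite stands in for the real database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        ((f"r{i % 10}", i) for i in range(rows)),
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_sample_db()
    results = time_queries(conn, [
        "SELECT COUNT(*) FROM sales",
        "SELECT region, SUM(amount) FROM sales GROUP BY region",
    ])
    for sql, secs in results.items():
        flag = "sub-second" if secs < 1.0 else "slow"
        print(f"{secs * 1000:8.2f} ms  {flag}  {sql}")
```

Taking the best of several runs reduces noise from caching and background activity; reporting the first run as well would show the cold-cache cost, which matters more on a slow flash drive.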

    This little midnight engineering project really highlights the technology of Vertica.

  • The size of the entire Vertica installation is a minuscule 8 MB, so it is very light on resources.
  • The columnar approach and the compression used by the database minimize I/O substantially. So far, the impact of the slower flash drive has gone unnoticed during my tests.
  • If Vertica performs this well on a laptop with a thumb-drive, imagine what it can do on a rack of modern servers.

    Stay tuned: over the next few days I will be loading and querying a larger data set, and I will let you know how it goes. Beyond evaluating and developing on Vertica, I wonder whether there is a market for a data warehouse on a stick. Perhaps the embedded market, such as kiosks?

    Let me know if you have any ideas.

    Regards,

    Charlie

     


    Comments

    Charlie,

    Very interesting posts. For the Vertica on a USB drive, which Linux distro did you use, and how did you download Vertica?

    Thanks,
    Ramesh

    Ramesh,

    I used quite a few Linux distros: Red Hat, Fedora, CentOS, and SUSE. Once I had set up the distro the way I liked it, I downloaded Vertica from the Vertica V-Zone: http://www.vertica.com/v-zone/download_vertica

    I am getting ready to release an ISO image of the pre-configured and formatted VStick. I will be setting up a registration page and support forum, so please stay tuned for further details.

    Regards,
    Charlie

    By Jerry Platz on August 27th, 2009 at 6:38 pm

    Hi Charlie,
    Nice hack. How's this one: a similar concept, only I put it on Linksys wireless routers. This gives me a distributed DBMS capable of running across hundreds of other routers (using the spare Linux cycles that the routers possess). Talk about commodity hardware. (:-) Anyway, keep up the good work.
    JP

    Jerry,

    Definitely an interesting concept. Have you already done this with Vertica? I assume you are talking about the WRT54G, and I suspect that the challenge would be RAM. I love the idea, though, and if you have not done so already, we should think about launching a project.

     
