It’s big, it’s complicated, it’s the big solution provider’s dream. It’s Hadoop, and unfortunately it’s here to stay.
I find it amazing how technologies that are put together with rubber bands and chewing gum become the de facto standard against which all other technologies are measured. Sadly, it’s easy to see why.
Hadoop is complicated, not so much from a Map/Reduce perspective, but because of all the other stuff needed to do anything with it (Pig, Hive, HBase, Map/Reduce, HDFS, ZooKeeper, …). Not only is it complicated, but these layers are built on top of inefficiency. The result is a large big-data ecosystem that is hard to deploy, manage, and maintain.
Hey, but it’s free! Or is it?
Inefficiency sells hardware; complexity sells services. It’s no wonder that the big solution providers are backing it. Unfortunately, you will rapidly learn that free gets pretty expensive, and by the time you realize it, you have already been Hadooped!
The data-warehousing failure rate is still incredibly high. This is in spite of wonderful technologies that take the guesswork out of managing BIG structured data. Hats off to Netezza, Vertica, Teradata, Greenplum, and all the other players who radically shifted that space away from OLTP, ROLAP, MOLAP, and OLAP for analysis.
The one thing we don’t need is inefficiency!
When I work with my clients on big data challenges, I am always educating. The DBAs for the most part come from a world of structured OLTP databases, and the very thought of flat-file processing or NoSQL is completely foreign to them and, to some, an abomination. However, they want to learn. It’s just a crime that everyone is drinking the Hadoop Kool-Aid. I cringe when I see elegant technologies like Vertica and Netezza messaging about how nicely they fit into the Hadoop movement.
Over the course of the next few posts, I will be sharing details about Decooda’s Liquid Data Platform.
- Extremely simple
- Extreme parallelism (even within a map)
- Thread-less (event-based actor model)
- Streaming real-time or batch mode
- Dynamic and flexible
- Integrates with existing ESBs
- SAN, GPFS or HDFS
- Distributed data storage
- Did I mention Simple?
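
For readers who have not met the "thread-less, event-based actor model" idea before, here is a minimal generic sketch in Python. It is not Decooda's code and says nothing about how the Liquid Data Platform is built; the `Actor` class, the `count_words` function, and the word-count task are all made up for illustration. Each actor owns its state and a mailbox and handles one message at a time on a single event loop, so no threads or locks are needed. Note that this shows concurrency on one thread only, not parallelism across cores or machines.

```python
import asyncio


class Actor:
    """A hypothetical minimal actor: private state plus a mailbox, no threads or locks."""

    def __init__(self):
        self.mailbox = asyncio.Queue()
        self.counts = {}  # state touched only by this actor's own loop

    async def run(self):
        # Handle one message at a time until the stop sentinel (None) arrives.
        while True:
            message = await self.mailbox.get()
            if message is None:
                break
            for word in message.split():
                self.counts[word] = self.counts.get(word, 0) + 1


async def count_words(lines, n_actors=2):
    """Spread lines over a few actors, then merge their partial word counts."""
    actors = [Actor() for _ in range(n_actors)]
    tasks = [asyncio.create_task(actor.run()) for actor in actors]

    # Deal lines out round-robin, then tell each actor to stop.
    for i, line in enumerate(lines):
        await actors[i % n_actors].mailbox.put(line)
    for actor in actors:
        await actor.mailbox.put(None)
    await asyncio.gather(*tasks)

    # Merge the per-actor partial counts into one result.
    total = {}
    for actor in actors:
        for word, n in actor.counts.items():
            total[word] = total.get(word, 0) + n
    return total


if __name__ == "__main__":
    print(asyncio.run(count_words(["big data big hype", "big bills"])))
```

The point of the sketch is only that message passing replaces shared state: each actor's dictionary is never touched by anyone else, so there is nothing to lock.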
Please stay tuned and feel free to send or post your comments.