The Word on the Street (Teradata, Netezza, Vertica, ParAccel, Greenplum)
As I venture into my new skunk works project, “developing a sentiment-based analysis platform”, I have started investigating Google Insights for Search to see what the search volumes look like for a few competing database technologies. My interest is primarily in Teradata, Netezza, Vertica, ParAccel, and Greenplum. Since text mining and sentiment analysis are part of my product roadmap, I thought search volume would be an interesting metric to place side by side with sentiment. Although I have not performed a sentiment analysis on the blogosphere or news feeds for this topic, I believe that search metrics can provide some pretty interesting insight.
All three charts span the last 12 months. Teradata is left off the second chart so I can have finer granularity when comparing Netezza, Vertica, ParAccel, and Greenplum. Likewise, Netezza is left off the third chart so I can have more granularity on Vertica, ParAccel, and Greenplum.
It is important to note that according to Google: “The numbers on the graph reflect how many searches have been done for a particular term, relative to the total number of searches done on Google over time. They don’t represent absolute search volume numbers, because the data is normalized and presented on a scale from 0-100. Each point on the graph is divided by the highest point, or 100. When we don’t have enough data, 0 is shown. The numbers next to the search terms above the graph are summaries, or totals.”
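To make Google’s note concrete, here is a minimal Python sketch of how I understand the scaling. It assumes (my reading, not something the note spells out) that when several terms share one chart, every point is divided by the single highest point across all of them. The vendor labels and numbers below are made up purely for illustration and are not real search data.

```python
def normalize_to_100(series_by_term):
    """Rescale all series so the highest point across every term becomes 100."""
    peak = max(max(points) for points in series_by_term.values())
    if peak == 0:
        # Nothing to scale against, so every point stays at 0.
        return {term: [0] * len(points) for term, points in series_by_term.items()}
    return {
        term: [round(100 * p / peak) for p in points]
        for term, points in series_by_term.items()
    }


# Hypothetical relative search shares, invented for illustration only.
shares = {
    "big_vendor": [20, 40, 30],
    "small_vendor_a": [2, 4, 2],
    "small_vendor_b": [4, 2, 4],
}

# Chart with the dominant term included.
with_big = normalize_to_100(shares)

# Chart with the dominant term left off.
without_big = normalize_to_100(
    {term: points for term, points in shares.items() if term != "big_vendor"}
)

print(with_big)
print(without_big)
```

Under that assumption, dropping the dominant term rescales everything against a smaller peak, which is why the charts that omit Teradata (and then Netezza) give finer granularity for the smaller vendors.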
Including Teradata
Needless to say, I was quite surprised to see the number of searches for Teradata. It is interesting to note that the dip in October 08, followed by the sharp decline in January 09, comes after an October 13th wire on “Teradata Breaks the Affordability Barrier to Analyzing Petabyte-Sized Data Warehouses”. What I did notice is that the top searches for Teradata were technical in nature, e.g. “Teradata SQL”, “Teradata date”, and “Teradata Syntax”, with rising searches on “Teradata Spool”, “Teradata Rank”, and “Teradata SAP”.
Given that the majority of the searches are technical in nature, it becomes obvious why the search volume was so much higher than for the other products. Search terms that are more exploratory or comparative are probably far more valuable from a marketing point of view. If I find some time, I may do another post that analyzes the more “inquisitive” terms likely submitted by potential customers.
Omitting Teradata
When omitting Teradata, you get a better sense of where Netezza is relative to the other contenders. It is clear that Netezza is holding on to its position and slightly widening the gap as of late. I do, however, believe that the upward trend is related to the launch of TwinFin, although if I am right, I would have expected to see a higher volume of searches. What is interesting is that, unlike Teradata, the top search terms for Netezza are “Netezza Database”, “Netezza Teradata”, “Netezza Corporation”, and “Netezza data warehouse”. This is excellent news for Netezza, as there seems to be an obvious interest in comparing Netezza and Teradata. I believe this supports my hunch from one of my earlier posts, “Netezza Paved the Way”.
Netezza is well positioned against Teradata. Netezza has a large opportunity here because, unless Teradata is being purchased with a whole host of other applications and logical data models, Netezza should be winning the analytics play. Switching from Teradata to Netezza is a hard sell, even when the Teradata platform is fully amortized. A large portion of Teradata’s revenue is related to professional services, but the nature of the search terms leads me to believe that Teradata is expanding existing environments and may not be seeing new client acquisition (for analytics, that is). This is purely a guess, but it can easily be checked against their filings. Perhaps Netezza should work on a migration strategy to chip away at the existing Teradata stronghold, but it should definitely take Teradata head-on for new client acquisitions that fall into the extreme analytics category.
Omitting Teradata and Netezza
The Greenplum and Vertica trend lines are so close in pattern that it appears they are going head to head on most deals. Vertica does edge out Greenplum by a decent margin on inquisitive searches. The April 09 Greenplum spike seems to be related to the April 7th article “GoldenGate Software and Greenplum Join Forces to Enable High Performance, Real-Time Data Feeds to the Warehouse”.
ParAccel has had a few good announcements lately, but it’s tough to be in 4th position. ParAccel’s recent climb seems to be related to the Price Chopper deal. I also expect to see a climb in search volume in October related to the announcement “Scales-Out – Merkle Scales-Out its ParAccel Consumer Data Warehouse Platform to 20 Terabytes”. From a search perspective, it still trails Vertica and Greenplum by quite a bit.
Vertica has a steady upward climb in search, and all searches seem to be inquisitive in nature, e.g. “Vertica Systems”, “Vertica Database”, and “Sybase Vertica”. Their recent release “Pink OTC Markets Inc. Cuts Reporting Time, Hardware and Programming Costs with Data…” is pushing the climb further upward. It is possible that another few deals would put Vertica in a position to break away and form a Teradata, Netezza, Vertica trifecta.
Summary
My analysis needs to stop here for now. It would be nice to have the time to do a deep dive study on the technologies from a search, attitudinal, perception and sentiment point of view. This quick analysis was pretty interesting but it is just the tip of the iceberg and focuses on a small segment of data. This year there will be about 24 terabytes of social network data and news feeds. Carving out the MPP Database section of the web would prove to be an interesting study.
So in general this quick study leads me to believe:
- Teradata’s search volume is mostly technical in nature, possibly indicating that new customers are hard to come by
- Netezza is being compared against Teradata and the searches for Netezza are inquisitive or comparative
- Vertica is being compared against Greenplum and Sybase and all searches seem to be inquisitive or comparative
Regards, Charlie
Comments
Hello, very interesting analysis, and it has led me to check out Google Insights.
I did see something that may be skewing your analysis quite a bit with regard to Vertica. When you look at the search terms included in the overall analysis for Vertica, you see a number of them that don’t refer to the database company but apparently to an apartment property management company in Canada. That is why Canada shows up as the number 1 region of interest for Vertica.
I don’t know whether there is a way to exclude that to get a more accurate picture. I added my company Infobright to this analysis to see where we stood (right now it looks like we are between Greenplum and ParAccel), but I am not sure what the data would look like if the Vertica numbers excluded non-company searches.
Regards…
Thanks Susan,
Yes, you are right about a community in Miami. When I first saw “NEO VERTICA” it really threw me for a loop and my mind went right to Neoview. It is in fact an up-and-coming phrase (at least from a search standpoint). Looking at the relative numbers on the 100-point scale, it showed a top of about 3, and the searches were conducted primarily in 2008. So I did not take it out, because the gist of the analysis would theoretically be the same. I am now looking at the impact of “Vertica Apartments”, because searches for that phrase spilled over into 2009 and can skew the numbers. I would like to get “Vertica Apartments” out of this analysis and see what it looks like. Stay tuned.
I would, however, want to focus on “REAL” inquiry-based terms to bring some accuracy to this type of study.
Thank you for your comment. I will dive a little deeper to see if I can normalize this a little bit more.
Charlie
Hi again,
I just re-ran your graph selecting only the category “computer and electronics”, and you’ll see that Vertica winds up below Greenplum. Relatively speaking, Greenplum, Vertica, Infobright and ParAccel are quite close, while, as one would expect, Teradata and Netezza are quite a bit higher.
Whether to focus only on the Technology category was a debate I had in my mind last night. I would like to see searches spanning categories such as finance, business, etc., so I left it ALL in. To do this properly, a whole set of filters needs to be identified and subtracted from the search results.
If you leave all the categories ON and subtract out -NEO -APARTMENTS from the Vertica query, it shows the edge as described above.
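As a rough sketch of that kind of subtraction, the Python below drops any search phrase containing an excluded word. The function name and the whole-word, case-insensitive matching are my own assumptions for illustration; this is not how Google applies the minus operator internally. The phrases are the ones mentioned in this post and thread.

```python
def exclude_terms(queries, excluded_words):
    """Keep only queries that contain none of the excluded words.

    Matching is case-insensitive and on whole words (an assumption).
    """
    excluded = {word.lower() for word in excluded_words}
    return [q for q in queries if not excluded & set(q.lower().split())]


# Search phrases mentioned in the post and comments.
vertica_queries = [
    "Vertica Systems",
    "Vertica Database",
    "Sybase Vertica",
    "NEO VERTICA",
    "Vertica Apartments",
]

# Equivalent in spirit to adding -NEO -APARTMENTS to the Vertica query.
kept = exclude_terms(vertica_queries, ["neo", "apartments"])
print(kept)
```

A proper filter set would need many more exclusions than these two, which is the part that still has to be identified by hand.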
Obviously, PROPER analysis needs to be done. The sentiment study I am suggesting would go against the blogosphere, social media, YouTube, and news to categorize the topics and then mine the text for discourse.
A metric on search volumes is very interesting, but it lacks the beef: “Is it directionally correct?”, “Why are they searching?”, “Who is searching?”, “What event prompted the search?”, “What is the sentiment of the results of the search?”, “What’s in the news?”, “What are the bloggers and tweeters saying?”, “What sentiment do the comments of the blogs bring?”
A long way to go, I know. I hope that others with a passion in this space can help interpret the data. The current platforms are maturing in this free text analysis area, but they are expensive and seriously lacking. (More on this to come)
Thanks