<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: &#8230;and this database is Just Right</title>
	<atom:link href="http://bcsolution.com/2009/05/teradata-netezza-vertica-choice/feed/" rel="self" type="application/rss+xml" />
	<link>http://bcsolution.com/2009/05/teradata-netezza-vertica-choice/</link>
	<description>Dashboards to Data Warehouses</description>
	<lastBuildDate>Thu, 12 Aug 2010 14:58:28 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
	<item>
		<title>By: Charlie</title>
		<link>http://bcsolution.com/2009/05/teradata-netezza-vertica-choice/comment-page-1/#comment-144</link>
		<dc:creator>Charlie</dc:creator>
		<pubDate>Thu, 17 Jun 2010 03:41:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.bcsolution.com/?p=238#comment-144</guid>
		<description>Aditya,

Let me start with a disclaimer. I have not had hands-on experience with Aster Data, so I can&#039;t really comment on it as a product. However, the technology is interesting and I hope to eventually kick the tires.

Here is my take. It depends on what you want to do with your cluster. Aster is based on map-reduce, so it opens up opportunities for many interesting things. With map-reduce, you have the ability to run application logic in parallel at the node level. However, it is a row-store database, so the I/O is going to be higher than that of a columnar store like Vertica, and I would expect that, given an equal number of nodes, Vertica will outperform Aster in query response.

Netezza has the ability to push algorithms down into the FPGA, which makes for lightning-fast computations. With their TwinFin product, Netezza makes it relatively easy to move algorithms to the SPU level through user-defined functions. This opens them up to a lot of opportunities to play in the HPC (high performance computing) space.

Vertica has done an awesome job with compression combined with the columnar approach to storing the data. I/O is really minimized and the query response times are incredibly fast. HPC was not their focus, so running application logic at the node level was less of a priority. With that said, there is no reason why you could not have something like Hadoop running alongside your Vertica cluster. You would get map-reduce, incredible compression and the benefits of a column store. With the recent version 4.0 of Vertica, external user-defined functions are possible, so they are closing that application logic gap.

So the question is: what do you want to do? Query? HPC? Both? Do you mind proprietary hardware? Do you want to grow your warehouse incrementally with the smallest possible initial capex? What does your data management staff look like? These are just a few of the many considerations you need to weigh when selecting the technology that suits your business. It will be easier to do a POC on Vertica or Aster, while Netezza will require quite a bit of coordination and qualification before they roll a TwinFin frame into your data center.

Netezza and Vertica have an INCREDIBLE client list. Aster is starting to make strides. 

One last thing: the Patent Office awarded Google a software method patent on MapReduce. What this means for Hadoop I really can&#039;t say, but make sure you investigate that path and ask the right questions.

Lots to consider, as this is an area where you want to be very diligent. I hope this helps. Feel free to contact me if you would like to explore the options more formally.  http://www.bcsolution.com/contact/

Best Regards,
Charlie</description>
		<content:encoded><![CDATA[<p>Aditya,</p>
<p>Let me start with a disclaimer. I have not had hands-on experience with Aster Data, so I can&#8217;t really comment on it as a product. However, the technology is interesting and I hope to eventually kick the tires.</p>
<p>Here is my take. It depends on what you want to do with your cluster. Aster is based on map-reduce, so it opens up opportunities for many interesting things. With map-reduce, you have the ability to run application logic in parallel at the node level. However, it is a row-store database, so the I/O is going to be higher than that of a columnar store like Vertica, and I would expect that, given an equal number of nodes, Vertica will outperform Aster in query response.</p>
<p>Netezza has the ability to push algorithms down into the FPGA, which makes for lightning-fast computations. With their TwinFin product, Netezza makes it relatively easy to move algorithms to the SPU level through user-defined functions. This opens them up to a lot of opportunities to play in the HPC (high performance computing) space.</p>
<p>Vertica has done an awesome job with compression combined with the columnar approach to storing the data. I/O is really minimized and the query response times are incredibly fast. HPC was not their focus, so running application logic at the node level was less of a priority. With that said, there is no reason why you could not have something like Hadoop running alongside your Vertica cluster. You would get map-reduce, incredible compression and the benefits of a column store. With the recent version 4.0 of Vertica, external user-defined functions are possible, so they are closing that application logic gap.</p>
<p>So the question is: what do you want to do? Query? HPC? Both? Do you mind proprietary hardware? Do you want to grow your warehouse incrementally with the smallest possible initial capex? What does your data management staff look like? These are just a few of the many considerations you need to weigh when selecting the technology that suits your business. It will be easier to do a POC on Vertica or Aster, while Netezza will require quite a bit of coordination and qualification before they roll a TwinFin frame into your data center.</p>
<p>Netezza and Vertica have an INCREDIBLE client list. Aster is starting to make strides. </p>
<p>One last thing: the Patent Office awarded Google a software method patent on MapReduce. What this means for Hadoop I really can&#8217;t say, but make sure you investigate that path and ask the right questions.</p>
<p>Lots to consider, as this is an area where you want to be very diligent. I hope this helps. Feel free to contact me if you would like to explore the options more formally.  <a href="http://www.bcsolution.com/contact/" rel="nofollow">http://www.bcsolution.com/contact/</a></p>
<p>Best Regards,<br />
Charlie</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: aditya</title>
		<link>http://bcsolution.com/2009/05/teradata-netezza-vertica-choice/comment-page-1/#comment-143</link>
		<dc:creator>aditya</dc:creator>
		<pubDate>Wed, 16 Jun 2010 18:56:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.bcsolution.com/?p=238#comment-143</guid>
		<description>Hello, 

Can you shed some light on Aster Data as compared to Netezza and Vertica?
Which do you think would really break the ice?</description>
		<content:encoded><![CDATA[<p>Hello, </p>
<p>Can you shed some light on Aster Data as compared to Netezza and Vertica?<br />
Which do you think would really break the ice?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Charlie</title>
		<link>http://bcsolution.com/2009/05/teradata-netezza-vertica-choice/comment-page-1/#comment-142</link>
		<dc:creator>Charlie</dc:creator>
		<pubDate>Wed, 16 Jun 2010 10:18:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.bcsolution.com/?p=238#comment-142</guid>
		<description>Radhika,

When utilizing a columnar DB for your data warehouse, you are typically abstracted from the underlying column store. To the client, the data appears to be stored in tables, rows and columns. You simply write ANSI-standard SQL as you would in any relational database. Then the columnar DB (Vertica in my case) takes care of the details behind the scenes.

Under the hood, the column store DB does not read a block of the table into memory but rather reads ONLY the columns needed to satisfy the query.

This alone minimizes the I/O. Vertica goes quite a few steps further with advanced per-column compression algorithms and a concept of buckets, tremendously bringing down the cost of I/O.

Additionally, their FlexStore technology lets you group columns for single disk reads, allowing finer tuning when trying to minimize disk I/O.</description>
		<content:encoded><![CDATA[<p>Radhika,</p>
<p>When utilizing a columnar DB for your data warehouse, you are typically abstracted from the underlying column store. To the client, the data appears to be stored in tables, rows and columns. You simply write ANSI-standard SQL as you would in any relational database. Then the columnar DB (Vertica in my case) takes care of the details behind the scenes.</p>
<p>Under the hood, the column store DB does not read a block of the table into memory but rather reads ONLY the columns needed to satisfy the query.</p>
<p>This alone minimizes the I/O. Vertica goes quite a few steps further with advanced per-column compression algorithms and a concept of buckets, tremendously bringing down the cost of I/O.</p>
<p>Additionally, their FlexStore technology lets you group columns for single disk reads, allowing finer tuning when trying to minimize disk I/O.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Radhika</title>
		<link>http://bcsolution.com/2009/05/teradata-netezza-vertica-choice/comment-page-1/#comment-141</link>
		<dc:creator>Radhika</dc:creator>
		<pubDate>Wed, 16 Jun 2010 05:34:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.bcsolution.com/?p=238#comment-141</guid>
		<description>My question is: how do we query this type of columnar database from MySQL? What different settings would be required in a normal SELECT query?</description>
		<content:encoded><![CDATA[<p>My question is: how do we query this type of columnar database from MySQL? What different settings would be required in a normal SELECT query?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Charlie</title>
		<link>http://bcsolution.com/2009/05/teradata-netezza-vertica-choice/comment-page-1/#comment-140</link>
		<dc:creator>Charlie</dc:creator>
		<pubDate>Wed, 02 Jun 2010 18:35:56 +0000</pubDate>
		<guid isPermaLink="false">http://www.bcsolution.com/?p=238#comment-140</guid>
		<description>Swetha, 

I believe that the Logical Data Models (Finance, Retail, Travel, etc.) are a way for Teradata to sell &quot;Solutions&quot; as opposed to &quot;Just Technology&quot;. For example, you could very easily buy the NCR self-service checkout registers as seen in Home Depot, knowing that they would feed a retail data model in Teradata.

Teradata has a pretty significant professional services group providing domain expertise not only for the Teradata database but for the vertical market as well.

Teradata handles transactions very well, so you can have your KPIs sit on top of the transactional model. Depending on the size of the data and the number of nodes, you may never need to aggregate your data. In some cases you would just let the parallel processing aggregate on the fly. This is what Teradata calls Real-Time data warehousing.

When you get into the MPP realm, it is often unnecessary to create the KPIs as materialized tables.

I am pretty sure that Teradata will not sell their LDM without their appliance, though. Let me know if you feel I addressed your questions.

Regards,
Charlie</description>
		<content:encoded><![CDATA[<p>Swetha, </p>
<p>I believe that the Logical Data Models (Finance, Retail, Travel, etc.) are a way for Teradata to sell &#8220;Solutions&#8221; as opposed to &#8220;Just Technology&#8221;. For example, you could very easily buy the NCR self-service checkout registers as seen in Home Depot, knowing that they would feed a retail data model in Teradata.</p>
<p>Teradata has a pretty significant professional services group providing domain expertise not only for the Teradata database but for the vertical market as well.</p>
<p>Teradata handles transactions very well, so you can have your KPIs sit on top of the transactional model. Depending on the size of the data and the number of nodes, you may never need to aggregate your data. In some cases you would just let the parallel processing aggregate on the fly. This is what Teradata calls Real-Time data warehousing.</p>
<p>When you get into the MPP realm, it is often unnecessary to create the KPIs as materialized tables.</p>
<p>I am pretty sure that Teradata will not sell their LDM without their appliance, though. Let me know if you feel I addressed your questions.</p>
<p>Regards,<br />
Charlie</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: swetha</title>
		<link>http://bcsolution.com/2009/05/teradata-netezza-vertica-choice/comment-page-1/#comment-139</link>
		<dc:creator>swetha</dc:creator>
		<pubDate>Wed, 02 Jun 2010 16:17:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.bcsolution.com/?p=238#comment-139</guid>
		<description>Hi,
I would like to know more about the Logical Data Models which Teradata has (Retail LDM).
If a client already has an EDW with all the data attributes and, bottom line, all the KPIs,
will they still go for Teradata&#039;s Logical Data Model? If yes, why?
Where will this LDM sit in the EDW?
Sorry if it&#039;s very basic.</description>
		<content:encoded><![CDATA[<p>Hi,<br />
I would like to know more about the Logical Data Models which Teradata has (Retail LDM).<br />
If a client already has an EDW with all the data attributes and, bottom line, all the KPIs,<br />
will they still go for Teradata&#8217;s Logical Data Model? If yes, why?<br />
Where will this LDM sit in the EDW?<br />
Sorry if it&#8217;s very basic.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: cwardell</title>
		<link>http://bcsolution.com/2009/05/teradata-netezza-vertica-choice/comment-page-1/#comment-137</link>
		<dc:creator>cwardell</dc:creator>
		<pubDate>Wed, 19 May 2010 17:47:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.bcsolution.com/?p=238#comment-137</guid>
		<description>Nir,

Have you taken a look at VoltDB?  http://voltdb.com/
It can handle a huge volume of concurrent transactions, and I suspect that there may be some Vertica integration points on the horizon to handle the analytics.</description>
		<content:encoded><![CDATA[<p>Nir,</p>
<p>Have you taken a look at VoltDB?  <a href="http://voltdb.com/" rel="nofollow">http://voltdb.com/</a><br />
It can handle a huge volume of concurrent transactions, and I suspect that there may be some Vertica integration points on the horizon to handle the analytics.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nir</title>
		<link>http://bcsolution.com/2009/05/teradata-netezza-vertica-choice/comment-page-1/#comment-133</link>
		<dc:creator>Nir</dc:creator>
		<pubDate>Wed, 21 Apr 2010 11:40:56 +0000</pubDate>
		<guid isPermaLink="false">http://www.bcsolution.com/?p=238#comment-133</guid>
		<description>I need to develop a reporting and BI solution for a CDR database that receives 50K new CDRs every second.

Can you please share a comparison table of VLDB and DW products that can handle this kind of data?</description>
		<content:encoded><![CDATA[<p>I need to develop a reporting and BI solution for a CDR database that receives 50K new CDRs every second.</p>
<p>Can you please share a comparison table of VLDB and DW products that can handle this kind of data?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: cwardell</title>
		<link>http://bcsolution.com/2009/05/teradata-netezza-vertica-choice/comment-page-1/#comment-122</link>
		<dc:creator>cwardell</dc:creator>
		<pubDate>Fri, 20 Nov 2009 16:46:19 +0000</pubDate>
		<guid isPermaLink="false">http://www.bcsolution.com/?p=238#comment-122</guid>
		<description>Hi Alex,

I did not forget about Sybase IQ. When I first had the opportunity to come across them, I did not see MPP as being the underlying architecture. At that time, they still had many components of their technology that were shared and could be considered a potential bottleneck. Things may have changed since I did my eval, but at that time: Column Store - Yes, Shared Nothing/MPP - No. As a result, I took them off my VLDB and MPP watch list.

Charlie</description>
		<content:encoded><![CDATA[<p>Hi Alex,</p>
<p>I did not forget about Sybase IQ. When I first had the opportunity to come across them, I did not see MPP as being the underlying architecture. At that time, they still had many components of their technology that were shared and could be considered a potential bottleneck. Things may have changed since I did my eval, but at that time: Column Store &#8211; Yes, Shared Nothing/MPP &#8211; No. As a result, I took them off my VLDB and MPP watch list.</p>
<p>Charlie</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: alex</title>
		<link>http://bcsolution.com/2009/05/teradata-netezza-vertica-choice/comment-page-1/#comment-121</link>
		<dc:creator>alex</dc:creator>
		<pubDate>Fri, 20 Nov 2009 16:40:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.bcsolution.com/?p=238#comment-121</guid>
		<description>The strange thing is that you forgot about Sybase IQ.</description>
		<content:encoded><![CDATA[<p>The strange thing is that you forgot about Sybase IQ.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: cwardell</title>
		<link>http://bcsolution.com/2009/05/teradata-netezza-vertica-choice/comment-page-1/#comment-37</link>
		<dc:creator>cwardell</dc:creator>
		<pubDate>Thu, 04 Jun 2009 23:17:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.bcsolution.com/?p=238#comment-37</guid>
		<description>Ash,

You ask some great questions. I am going to do a little research and see what I can find. 

What I do know is that this effort involved three core technologies: DMExpress, HP BladeSystem c-Class, and Vertica. The DMExpress PR is here -&gt; http://www.bcsolution.com/2009/05/syncsort-world-record/ - From what I understand, they were all extremely excited about the results.

I find Syncsort to be an extremely reputable brand. For over 30 years they have provided rock-solid technology in the very large data space. I have only worked with them during the past 10 years but remember their advertising campaigns in the late 80s.

So I believe you will find that Syncsort can stand behind their claims. They have some clients with enormous data deployments that I am sure have run through weekends.

I will see what I can find out.

Charlie</description>
		<content:encoded><![CDATA[<p>Ash,</p>
<p>You ask some great questions. I am going to do a little research and see what I can find. </p>
<p>What I do know is that this effort involved three core technologies: DMExpress, HP BladeSystem c-Class, and Vertica. The DMExpress PR is here -> <a href="http://www.bcsolution.com/2009/05/syncsort-world-record/" rel="nofollow">http://www.bcsolution.com/2009/05/syncsort-world-record/</a> &#8211; From what I understand, they were all extremely excited about the results.</p>
<p>I find Syncsort to be an extremely reputable brand. For over 30 years they have provided rock-solid technology in the very large data space. I have only worked with them during the past 10 years but remember their advertising campaigns in the late 80s.</p>
<p>So I believe you will find that Syncsort can stand behind their claims. They have some clients with enormous data deployments that I am sure have run through weekends.</p>
<p>I will see what I can find out.</p>
<p>Charlie</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ash</title>
		<link>http://bcsolution.com/2009/05/teradata-netezza-vertica-choice/comment-page-1/#comment-30</link>
		<dc:creator>Ash</dc:creator>
		<pubDate>Wed, 03 Jun 2009 19:26:50 +0000</pubDate>
		<guid isPermaLink="false">http://www.bcsolution.com/?p=238#comment-30</guid>
		<description>A few comments on the &quot;ETL World Record&quot;:

1. Who judges this world record and what are the qualifying criteria? Does it just depend on which &quot;independent&quot; analyst is looking at what solution at a particular point in time? 

2. Who held this &quot;world record&quot; previously - does Vertica even know or are they guessing? In which case, do they really know they&#039;ve beaten it?

3. The most obvious point. Nice performance over an hour in a controlled lab environment. But what sort of comparable stats can Vertica produce for a very large production environment (100 TB+) over a sustained period (i.e. weeks/months/years) - assuming they have such a production environment. And not the marketing-speak terms of peak performance, but TB/hr in continued, 24/7, mixed workload. There are systems out there loading 2TB/hr in real time into something as mainstream as Oracle 10gR2 (peaking at 5x that in busy hours), so is this really that remarkable or just more hollow marketing bluster from the niche players?</description>
		<content:encoded><![CDATA[<p>A few comments on the &#8220;ETL World Record&#8221;:</p>
<p>1. Who judges this world record and what are the qualifying criteria? Does it just depend on which &#8220;independent&#8221; analyst is looking at what solution at a particular point in time? </p>
<p>2. Who held this &#8220;world record&#8221; previously &#8211; does Vertica even know or are they guessing? In which case, do they really know they&#8217;ve beaten it?</p>
<p>3. The most obvious point. Nice performance over an hour in a controlled lab environment. But what sort of comparable stats can Vertica produce for a very large production environment (100 TB+) over a sustained period (i.e. weeks/months/years) &#8211; assuming they have such a production environment. And not the marketing-speak terms of peak performance, but TB/hr in continued, 24/7, mixed workload. There are systems out there loading 2TB/hr in real time into something as mainstream as Oracle 10gR2 (peaking at 5x that in busy hours), so is this really that remarkable or just more hollow marketing bluster from the niche players?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: cwardell</title>
		<link>http://bcsolution.com/2009/05/teradata-netezza-vertica-choice/comment-page-1/#comment-20</link>
		<dc:creator>cwardell</dc:creator>
		<pubDate>Mon, 01 Jun 2009 21:54:29 +0000</pubDate>
		<guid isPermaLink="false">http://www.bcsolution.com/?p=238#comment-20</guid>
		<description>No worries about being picky. This is an open forum and your feedback is welcome.

Thank you for your insight.
Charlie</description>
		<content:encoded><![CDATA[<p>No worries about being picky. This is an open forum and your feedback is welcome.</p>
<p>Thank you for your insight.<br />
Charlie</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Lucas</title>
		<link>http://bcsolution.com/2009/05/teradata-netezza-vertica-choice/comment-page-1/#comment-19</link>
		<dc:creator>Lucas</dc:creator>
		<pubDate>Mon, 01 Jun 2009 21:49:05 +0000</pubDate>
		<guid isPermaLink="false">http://www.bcsolution.com/?p=238#comment-19</guid>
		<description>The issue is that the Vertica benchmark is not a &quot;TPC record&quot; as you claim.  If it were, it would be listed on the TPC website, http://tpc.org, which it is not. 
It may be an ETL world record.  It may be a data load record.  But it is certainly not a TPC record.  
Vertica never once mentions &quot;TPC record&quot;.  Read their press release, and take particular notice of the footnote.

I&#039;m sorry for being picky about this but making inaccurate claims is a pet peeve of mine.</description>
		<content:encoded><![CDATA[<p>The issue is that the Vertica benchmark is not a &#8220;TPC record&#8221; as you claim.  If it were, it would be listed on the TPC website, <a href="http://tpc.org" rel="nofollow">http://tpc.org</a>, which it is not.<br />
It may be an ETL world record.  It may be a data load record.  But it is certainly not a TPC record.<br />
Vertica never once mentions &#8220;TPC record&#8221;.  Read their press release, and take particular notice of the footnote.</p>
<p>I&#8217;m sorry for being picky about this but making inaccurate claims is a pet peeve of mine.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: cwardell</title>
		<link>http://bcsolution.com/2009/05/teradata-netezza-vertica-choice/comment-page-1/#comment-17</link>
		<dc:creator>cwardell</dc:creator>
		<pubDate>Mon, 01 Jun 2009 15:02:02 +0000</pubDate>
		<guid isPermaLink="false">http://www.bcsolution.com/?p=238#comment-17</guid>
		<description>Lucas, 

I reread your post and went back to the audit report to review section 4.2 a few more times. I believe your point is valid enough that the blog should be clarified. I have adjusted the blog above. 

Thanks for pointing it out.
Charlie</description>
		<content:encoded><![CDATA[<p>Lucas, </p>
<p>I reread your post and went back to the audit report to review section 4.2 a few more times. I believe your point is valid enough that the blog should be clarified. I have adjusted the blog above. </p>
<p>Thanks for pointing it out.<br />
Charlie</p>
]]></content:encoded>
	</item>
</channel>
</rss>
