hive-user mailing list archives

From Devopam Mittra
Subject Re: Which [open-souce] SQL engine atop Hadoop?
Date Tue, 03 Feb 2015 07:04:51 GMT
hi Samuel,
You may wish to evaluate Presto, which has the added
advantage of being faster than conventional Hive because it runs no MR jobs.
It does depend on the Hive metastore, though, through which it derives the
mechanism to execute queries directly on the source files.
The only downside I found was the absence of complex SQL syntax, which means
creating a lot of intermediate tables for even slightly complicated
calculations (and IMHO, all calculations become complex sooner than we
intend them to).


On Tue, Feb 3, 2015 at 10:30 AM, Samuel Marks wrote:

> Alexander: So would you recommend using Phoenix for all but those kind of
> queries, and switching to Hive+Tez for the rest? - Is that feasible?
> Checking their documentation, it looks like it just might be:
> There is some early work on a Hive + Phoenix integration on GitHub:
> Saurabh: I am sure there are a variety of very good non open-source
> products on the market :) - However in this thread I am only looking at
> open-source options. Additionally I am planning on open-sourcing this
> project I am building using these tools, so it makes even more sense that
> the entire toolset and their dependencies are also open-source.
> Best,
> Samuel Marks
> On Tue, Feb 3, 2015 at 2:33 PM, Saurabh B wrote:
>> This is not open source but we are using Vertica and it works very nicely
>> for us. There is a 1TB community edition but above that it costs money.
>> It has really advanced SQL (analytical functions, etc), works like an
>> RDBMS, has R/Java/C++ SDK and scales nicely. There is a similar option of
>> Redshift available but Vertica has more features (pattern matching
>> functions, etc).
>> Again, not open source so I would be interested to know what you end up
>> going with and what your experience is.
>> On Mon, Feb 2, 2015 at 12:08 AM, Samuel Marks wrote:
>>> Well what I am seeking is a Big Data database that can work with Small
>>> Data also. I.e.: scaleable from one node to vast clusters; whilst
>>> maintaining relatively low latency throughout.
>>> Which fit into this category?
>>> Samuel Marks

Devopam Mittra
Life and Relations are not binary
