hive-user mailing list archives

From Devopam Mittra <devo...@gmail.com>
Subject Re: Hive alternatives?
Date Fri, 06 Nov 2015 07:33:00 GMT
I agree with the suggestions presented already.
You may want to check Presto as an alternative as well.
But please remember, Presto is an added layer on top of Hive and not an
independent alternative.
It simplifies your semantic layer and querying while being faster than Hive.
For OLAP, I will always recommend a pre-calculated aggregate layer and
avoiding ad-hoc analytics as much as feasible.
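
To make the pre-calculated aggregate idea concrete, here is a minimal Python
sketch. It is not Hive code, and the fact rows, column names and helper names
(build_aggregate, sales_for_region) are all made up for illustration. It scans
the raw facts once to build a small rollup, then answers queries from that
rollup instead of rescanning the raw data.

```python
from collections import defaultdict

# Hypothetical raw fact rows: (region, product, day, sales_amount).
FACTS = [
    ("north", "widget", "2015-11-01", 120.0),
    ("north", "widget", "2015-11-02", 80.0),
    ("north", "gadget", "2015-11-01", 50.0),
    ("south", "widget", "2015-11-01", 200.0),
    ("south", "gadget", "2015-11-02", 30.0),
]


def build_aggregate(facts):
    """Scan the raw facts once and pre-compute totals per (region, product)."""
    totals = defaultdict(float)
    for region, product, _day, amount in facts:
        totals[(region, product)] += amount
    return dict(totals)


def sales_for_region(aggregate, region):
    """Answer a query from the small aggregate table, not the raw facts."""
    return sum(v for (r, _p), v in aggregate.items() if r == region)


# Built once (e.g. by a nightly batch job); queried many times afterwards.
AGG = build_aggregate(FACTS)
```

In a real setup the rollup would presumably be a table that is refreshed on a
schedule, so the trade-off is freshness and flexibility against query speed.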

Regards
Dev
On Nov 6, 2015 1:29 AM, "Alex Kamil" <alex.kamil@gmail.com> wrote:

> 1) Apache Phoenix <https://phoenix.apache.org> + Mondrian
> <http://community.pentaho.com/projects/mondrian/>
> 2) Apache Spark <http://spark.apache.org/>
>
> On Thu, Nov 5, 2015 at 2:49 PM, Jörn Franke <jornfranke@gmail.com> wrote:
>
>> First it depends on what you want to do exactly. Second, Hive > 1.2, Tez
>> as an execution engine (I recommend >= 0.8) and ORC as storage format can
>> be pretty quick depending on your use case. Additionally you may want to
>> employ compression, which is a performance boost once you understand how
>> storage indexes and bloom filters work. Additionally, you need to think
>> about how you sort the data. Cf. also
>>
>> https://snippetessay.wordpress.com/2015/07/25/hive-optimizations-with-indexes-bloom-filters-and-statistics/
>>
>> However, you have to rethink how you define your technical data model. A
>> lot of pre-joined data in a big flat table can be more performant when
>> using storage indexes and bloom filters than using standard indexes and
>> dimensional modeling.
>>
>> Besides Tez, you can also use another execution engine in your
>> session (e.g. Spark) if this makes sense.
>>
>> Finally, you have to review how YARN manages resources, including
>> preemption, fair vs. capacity scheduler, etc.
>>
>> Btw, the same also holds for relational database appliances, such as
>> Exadata. The standard approach of dimensional modeling + standard indexes
>> is often no longer the most performant there.
>>
>>
>>
>> > On 05 Nov 2015, at 20:04, Andrés Ivaldi <iaivaldi@gmail.com> wrote:
>> >
>> > Hello,
>> > I was looking at Hive as an OLAP alternative, but I've read that it is
>> > quite slow for that. Does anybody have experience with this, or know of a
>> > Hive alternative for OLAP? Kylin is not an option because we need dynamic
>> > OLAP like ROLAP.
>> >
>> > Regards,
>> >
>> > --
>> > Ing. Ivaldi Andres
>>
>
>
