hadoop-hdfs-user mailing list archives

From "mohit.kaushik" <mohit.kaus...@orkash.com>
Subject Re: Data Storage for Joins and ACID transactions + Hadoop Cluster
Date Mon, 18 Jan 2016 05:18:32 GMT
Hive provides SQL-like functionality over Hadoop, but NoSQL stores do not
support all SQL capabilities well. As the number of joins increases,
performance decreases. Instead, you should try to model your data in one
table to avoid joins. You can try Apache Accumulo, which gives you full
control over the data structure; unlike HBase, you also don't have to
define column families in advance. It's fast and is among the most
scalable tested datastores built on top of Hadoop.
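
To illustrate the "one table" idea, here is a minimal sketch in plain
Python, not Accumulo client code. It assumes two hypothetical inputs
(customers and orders) and pre-joins them at write time into one wide row
per customer, keyed by (column family, column qualifier) in the style of
the Accumulo/HBase data model. The table names, field names, and the
"info"/"order" families are made up for the example.

```python
from collections import defaultdict

# Hypothetical sample data; all names and values are illustrative only.
customers = {"c1": {"name": "Asha", "city": "Pune"}}
orders = [
    {"order_id": "o1", "customer_id": "c1", "total": "250"},
    {"order_id": "o2", "customer_id": "c1", "total": "90"},
]


def denormalize(customers, orders):
    """Pre-join at write time: build one wide row per customer.

    Each row maps (family, qualifier) -> value, so new qualifiers
    (here, one per order) can be added without a fixed schema.
    """
    table = defaultdict(dict)
    for customer_id, attrs in customers.items():
        for field, value in attrs.items():
            table[customer_id][("info", field)] = value
    for order in orders:
        table[order["customer_id"]][("order", order["order_id"])] = order["total"]
    return dict(table)


wide = denormalize(customers, orders)

# A single-row lookup now replaces the query-time join.
row = wide["c1"]
```

The trade-off is that the join cost moves to write time and some data is
duplicated, so this fits best when reads dominate and the access pattern
is known in advance.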

-Mohit Kaushik

On 01/18/2016 10:32 AM, Divya Gehlot wrote:
> Hi,
> Which data store is best for multiple joins at run time in Hadoop?
> I tried Hive, but performance is poor.
> Pointers/guidance appreciated.
>
>
> Thanks,
> Regards,
> Divya
