hadoop-hive-dev mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: real time query option
Date Wed, 23 Jun 2010 14:08:13 GMT
On Wed, Jun 23, 2010 at 2:12 AM, Amr Awadallah <aaa@cloudera.com> wrote:
> For low-latency queries you should either use HBase instead, or consider
> Hive over HBase, see:
> http://www.cloudera.com/blog/2010/06/integrating-hive-and-hbase/
> -- amr
> On 6/22/2010 11:05 PM, jaydeep vishwakarma wrote:
>> Hi,
>> I want to avoid delta time to execute the queries. Every time even when
>> we fetch single row from hive tables it goes to typical map and reduce
>> process. Is there any platform which built on top of HDFS or hive table
>> which help me to get real time query data, I want to avoid filling data
>> to DB.
>> Regards,
>> Jaydeep

Hive is by nature not real time, but it does have some near "real time"
options that you might be able to take advantage of.

If your dataset is small:

set mapred.job.tracker=local;

This will give you a local job with 1 mapper and 1 reducer. There is no
JobTracker startup overhead; everything runs in a thread in the local process.
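
As a rough sketch of what such a session could look like (the table
small_table and its column col are made-up names, and the JobTracker
address in the last line is a placeholder for whatever your cluster uses):

```sql
-- Run the following jobs with the local job runner instead of the cluster.
SET mapred.job.tracker=local;

-- Hypothetical small table; this aggregate still needs a map/reduce job,
-- but it now runs locally with one mapper and one reducer.
SELECT col, COUNT(1) FROM small_table GROUP BY col;

-- Point the session back at the cluster afterwards.
-- Placeholder value: replace with your real JobTracker host:port.
SET mapred.job.tracker=jobtracker.example.com:8021;
```

The setting only applies to the current session, so other users of the
cluster are not affected.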

Option: precompute the result sets you want in real time.

select * from tablea where part=x

Is NOT a map/reduce job (assuming part is a partition column of tablea). So
if you have precomputed tablea, selecting from it will be as fast as Hadoop
can stream it to your client.
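
A minimal sketch of the precompute approach, assuming a hypothetical
source table raw_events with a column key; only tablea and part come from
the query above, and the column names and the literal 'x' are for
illustration:

```sql
-- Result table, partitioned so that each precomputed result set
-- lives in its own partition.
CREATE TABLE tablea (key STRING, cnt BIGINT)
PARTITIONED BY (part STRING);

-- The expensive step: a normal map/reduce job, run ahead of time
-- (for example on a schedule).
INSERT OVERWRITE TABLE tablea PARTITION (part='x')
SELECT key, COUNT(1)
FROM raw_events
GROUP BY key;

-- The cheap step: a plain SELECT * filtered only on the partition
-- column is read directly from the partition's files.
SELECT * FROM tablea WHERE part='x';
```

The trade-off is that the results are only as fresh as the last
INSERT OVERWRITE run.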
